Development of computational methods for structural modeling of pMHC complexes.

Understanding the mechanisms that activate an immune response is essential to many fields in human health, including vaccine development and personalized cancer immunotherapy. A central step in the activation of the adaptive immune response is the recognition, by T lymphocytes, of peptides displayed by a special type of receptor known as the Major Histocompatibility Complex (MHC). Given the key role of MHC receptors in T-cell activation, the computational prediction of peptide binding to MHC has been an important goal for many immunological applications. This problem, however, is much harder than most docking problems in drug discovery, because of the length and flexibility of the peptides. To overcome the high dimensionality of this sampling problem, three strategies have been devised: (i) constrained backbone prediction, (ii) constrained termini prediction, and (iii) incremental prediction. Each strategy has advantages and limitations, and over the years I have made contributions in all three categories (see Antunes et al., 2019). First, I identified allele-specific patterns that were used to develop a constrained backbone prediction tool called DockTope. DockTope was validated for 4 MHC alleles through cross-docking of available crystal structures, and was recently integrated into the IEDB Analysis Resource collection as the first open-access docking-based webserver for modeling pMHC complexes. Later, I started working with DINC, a meta-docking incremental approach suited for docking large ligands. I provided proof-of-concept of its use for general structural prediction of pMHC complexes (i.e., modeling different MHC alleles and peptide lengths). Finally, we implemented APE-Gen, a constrained termini prediction method for fast generation of ensembles of bound conformations of pMHC complexes. This method allows for large-scale structural analysis of pMHC complexes and is applicable to virtually all known HLAs.

Visual representation of the modeling strategies implemented in DINC (left) and APE-Gen (right). DINC starts by selecting a small fragment of the peptide, with only 6 flexible bonds (depicted in green), and uses it as input for the first round of docking to the MHC binding cleft (cross-section view depicted in gray). The best binding modes are selected across multiple parallel docking runs, and the corresponding peptide fragments are expanded by adding a small number of atoms (depicted in red). The expanded fragment is used for the next round of docking, and this incremental process is repeated until the entire ligand has been reconstructed and docked. APE-Gen uses a different approach. Peptide termini templates (backbone) are used for positioning the anchor residues (A). This is followed by the generation of an ensemble of alternative backbone conformations with the random coordinate descent loop-closure tool (B). Finally, full-atom reconstruction of peptide side-chains and energy minimization of the resulting complex are performed for each sampled backbone (C). Modified from Antunes et al., 2018 and Antunes et al., 2019 (to appear).
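The control flow of the incremental strategy described in the caption can be illustrated with a minimal Python sketch. It is not DINC's implementation: the function names, the growth step of 3 bonds per round, the number of poses kept, and the `dock_round` callback (a stand-in for an actual docking run) are all assumptions made for illustration. Only the initial fragment size of 6 flexible bonds comes from the description above.

```python
def incremental_schedule(total_bonds, initial_bonds=6, step=3):
    """Return the fragment size (in flexible bonds) used in each docking round.

    The first fragment has `initial_bonds` flexible bonds; every later round
    grows the fragment by at most `step` bonds (an assumed value) until the
    whole ligand is covered.
    """
    if total_bonds <= 0 or initial_bonds <= 0 or step <= 0:
        raise ValueError("all arguments must be positive")
    sizes = [min(initial_bonds, total_bonds)]
    while sizes[-1] < total_bonds:
        sizes.append(min(sizes[-1] + step, total_bonds))
    return sizes


def incremental_docking(total_bonds, dock_round, keep_best=2, **schedule_args):
    """Run the incremental loop, keeping the best poses after every round.

    `dock_round(fragment_size, seeds)` is a hypothetical, caller-supplied
    function returning a list of (score, pose) tuples. Lower scores are
    treated as better, and the kept poses seed the next round.
    """
    seeds = []
    for size in incremental_schedule(total_bonds, **schedule_args):
        results = sorted(dock_round(size, seeds), key=lambda r: r[0])
        seeds = [pose for _, pose in results[:keep_best]]
    return seeds


def fake_dock_round(size, seeds):
    """Placeholder scorer used only to exercise the loop (not a real docking)."""
    return [(float(i), f"pose-{size}-{i}") for i in (2, 0, 1)]


final_poses = incremental_docking(14, fake_dock_round)
```

In this toy run, a ligand with 14 flexible bonds is covered in four rounds, and the poses kept from the last round are returned. Replacing `fake_dock_round` with a call to a real docking engine is where all the actual work would happen.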

Identification of structural features driving T-cell cross-reactivity.

Using DockTope, I investigated the structural similarity of pMHC complexes presenting cross-reactive and non-cross-reactive variants of the immunodominant peptide NS3-1073, derived from Hepatitis C Virus (HCV). This HLA-A*02:01-restricted peptide was included in a vaccine that was protective only against certain HCV genotypes (cross-genotype reactivity), according to a study previously performed by a German group (Fytili et al., 2008). Applying Principal Component Analysis (PCA) and hierarchical clustering to data extracted from modeled pMHC complexes, I was able to show that the observed patterns of cross-reactivity were mostly driven by structural similarity between the complexes (see Antunes et al., 2011), particularly the topography and charge distribution over the T-cell-interacting surface. Using this knowledge, I performed a virtual screening against a panel of unrelated virus-derived targets, also modeled in the context of HLA-A*02:01. This analysis indicated potential cross-reactivity of the wild-type HCV-derived peptide (NS3-1073) with peptides from Epstein-Barr Virus, Influenza and HIV. Some of these peptides had little or even no sequence identity with the wild-type HCV peptide, making these cross-reactivities impossible to predict using sequence-based analyses. All these targets were later tested with lymphocytes from HCV-infected patients and healthy vaccinated individuals, confirming the predicted cross-reactivities (see Zhang et al., 2015). More importantly, cross-reactivities with these heterologous targets were associated with differential response to vaccination, highlighting the importance of this issue for vaccine design. More recently, I extended these analyses to evaluate previously described “cross-reactivity networks” among virus-derived peptides. I used structure-based clustering of modeled pMHC complexes to help explain apparent inconsistencies in reported cross-reactivities, and proposed testable hypotheses on the implications of pMHC structural similarity for T-cell cross-reactivity and cancer immunotherapy (see Antunes et al., 2017).
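The clustering step mentioned above can be illustrated with a small, self-contained Python sketch of agglomerative (average-linkage) hierarchical clustering. This is a toy under stated assumptions, not the published pipeline: the labels and two-dimensional descriptor vectors are invented placeholders, the PCA step is omitted, and the choice of Euclidean distance with average linkage is an assumption rather than something stated in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(features, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.

    `features` maps a complex label to its numeric descriptor vector.
    Returns a sorted list of clusters, each a sorted list of labels.
    """
    if not 1 <= n_clusters <= len(features):
        raise ValueError("n_clusters must be between 1 and the number of items")
    clusters = [[label] for label in sorted(features)]

    def distance(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(features[a], features[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters)


# Invented descriptors (e.g. two surface-derived values per modeled complex).
toy_features = {
    "pmhc_a": (1.0, 0.20),
    "pmhc_b": (1.1, 0.25),
    "pmhc_c": (3.0, 2.00),
    "pmhc_d": (3.2, 2.10),
}
groups = average_linkage(toy_features, n_clusters=2)
```

With these placeholder values, the two complexes with similar descriptors end up in the same group. In practice one would likely rely on an established library for both PCA and clustering rather than a hand-written loop like this one.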

CrossTope: A structural database for cross-reactivity assessment.

CrossTope, the Structural Data Bank for Cross-Reactivity, is a curated repository of three-dimensional structures of pMHC complexes, focused on immunogenicity, similarity relationships and cross-reactivity prediction. We used DockTope to predict more than 500 unknown pMHC structures, now publicly available through the CrossTope Data Bank.

A new classification method to understand available docking strategies accounting for protein flexibility.

Molecular docking has become an essential tool for research in drug design, and many software packages are currently available. Older applications explored only the flexibility of the ligand, keeping the protein rigid throughout the search. In many cases this approach is not enough to reproduce the correct protein-ligand binding mode, since proteins are extremely flexible and the conformation of the binding site can change. Nowadays, most docking methods consider some level of protein flexibility during the search, and most classification attempts relate these methods to one of the main biomolecular recognition models (induced fit or conformational selection). However, there is a great diversity of docking methods accounting for protein flexibility, and any classification based on a dichotomy between these two theoretical models is bound to fail. Contrary to what is frequently done, I proposed a more algorithmic classification, focusing on the level of protein flexibility accounted for (e.g., implicit or explicit, partial or full). This alternative classification should help new users navigate the diversity of docking approaches, allowing them to choose the one that best suits the research problem they want to investigate (see Antunes et al., 2015).

Identification of structural features involved in resistance to HIV-1 protease-inhibitors.

The Human Immunodeficiency Virus type 1 protease enzyme (HIV-1 PR) is one of the most important targets of antiretroviral therapy used in the treatment of AIDS patients. The success of protease inhibitors (PIs), however, is often limited by the emergence of protease mutations that can confer resistance to a specific drug, or even to multiple PIs. Using molecular docking and molecular dynamics, I evaluated the impact of two unusual mutations (D30V and V32E) on the dynamics of the PR-Nelfinavir complex (see Antunes et al., 2014). These mutations were identified in drug-free HIV-1 patients (from Porto Alegre, Brazil), and involved codons that were previously associated with major drug resistance to Nelfinavir. Both mutations presented structural features that indicate resistance to Nelfinavir, each with a different impact on the interaction with the drug. The D30V mutation triggered a subtle change in the PR structure, which was also observed for the well-known Nelfinavir resistance mutation D30N, while the V32E exchange had a much more dramatic impact on the PR flap dynamics. Moreover, this in silico approach was also able to describe different binding modes of the drug when bound to different proteases, identifying specific features of HIV-1 subtype B and subtype C proteases. A better understanding of the differences among HIV-1 subtypes and of the molecular features involved in drug resistance will allow physicians to prescribe the most effective drug for each individual patient, avoiding treatment failure and promoting durable remission of HIV-1.

Different dynamic behaviors were observed in molecular dynamics simulations of Nelfinavir bound to the wild-type subtype B HIV-1 protease (black) or the subtype B mutant V32E (blue). While the drug-susceptible wild-type is locked into the closed conformation, the drug-resistant mutant can transition to an open state regardless of the presence of the drug. Videos are representative of triplicate 50 ns molecular dynamics trajectories. Modified from Antunes et al., 2014.