Antunes Lab Research

Safer and More Precise Cancer Immunotherapies

T-cell–based immunotherapies are among the most promising approaches for treating cancer. These therapies work by training the immune system to recognize and destroy tumor cells. However, two major challenges remain. First, the identification of the best cancer targets to develop vaccines and T-cell therapies must account for the genetic background of both the tumor and the patient, and finding these personalized targets can be difficult. Second, sometimes the engineered immune response can mistakenly attack healthy tissues, causing severe (even lethal) toxic effects.

Our research focuses on developing computational tools that help scientists to address these challenges, by supporting the design of safer and more precise immunotherapies. Supported by the National Cancer Institute’s Informatics Technology for Cancer Research (ITCR) program, this project integrates bioinformatics, structural biology, and immunology to improve how therapeutic targets are identified and evaluated.

HLA-Arena 2.0: Structure-Guided Analysis for Personalized Immunotherapy

The project also develops HLA-Arena, an integrated computational environment to facilitate immunoinformatics analysis and the integration of sequence-based and structural-based approaches. A description of the original HLA-Arena can be found under “Postdoctoral Research” in this page. Currently, we are implementing the HLA-Arena 2.0 environment, with a completely new architecture and many additional pipelines.

These pipelines enable the modeling and simulation of both peptide–HLA and TCR-peptide-HLA structures, neoantigen discovery, design of altered peptide ligands with increased immunogenicity (e.g., mimotopes), safety predictions, etc.

CrossDome: Predicting Off-target Toxicity for Cellular Immunotherapies

One of the core tools developed in our lab is CrossDome, a computational framework that predicts whether a T-cell therapy designed to target a tumor peptide might also recognize similar peptides in healthy tissues. By analyzing large immunopeptidomics datasets and comparing biochemical features of peptides, CrossDome can identify potential off-target risks before therapies reach clinical testing.

The University website has posted news article on CrossDome here and on our ITCR-funded project here. Our team is currently developing CrossDome 2.0, with a much bigger immunopeptidomics database, and many additional features. We are also developing a webserver interface.

Improving Immunotherapy Design for All Populations

Cancer immunotherapies harness the immune system to recognize and destroy tumor cells. While these treatments have transformed oncology, their effectiveness can vary significantly across individuals and populations. One important reason is genetic diversity in the human immune system, particularly in molecules responsible for presenting antigens to T cells.

Through a pilot program supported by the NIMHD/NIH and the HEALTH–RCMI initiative at the University of Houston, our lab is developing new computational approaches to improve the design of immunotherapies for all patients, including Black and Hispanic patients. These patients have HLA alleles for which less experiemtnal data is available, in turn leading to less reliable immunoinformatics predictions in precision medicine.

This project addresses that gap by developing structure-based computational methods to better predict how tumor antigens interact with diverse HLA variants and how these interactions may activate T cells. By integrating immunogenomics, structural modeling, and computational immunology, the research aims to improve the identify better tumor-targets for the trreatment of these patients of interst, as well as supporting the design of improved cancer vaccines for them.

Mapping Protein Motion in Space and Time: RMSX and Flipbook

Proteins are dynamic molecules. Their biological function often depends not only on their structure, but also on how they move and change shape over time. Molecular dynamics (MD) simulations allow scientists to observe these motions at atomic resolution, generating detailed trajectories that capture how proteins fluctuate, fold, and interact with other molecules.

However, analyzing these simulations can be challenging. Traditional metrics such as root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) summarize structural changes, but they do not clearly reveal when specific regions of a protein move during a simulation. To address this challenge, our lab developed RMSX and Flipbook, an open-source computational toolkit for analyzing and visualizing time-resolved protein dynamics.

Steps of MD trajectory analysis with RMSX and Flipbook.

RMSX: Time-Resolved Analysis of Protein Fluctuations

RMSX extends the classical RMSF metric by calculating residue fluctuations over consecutive time windows of a molecular dynamics simulation. Instead of averaging motion across the entire trajectory, RMSX identifies both where and when conformational changes occur within a protein. This approach allows researchers to pinpoint transient events such as localized structural rearrangements, allosteric signaling between distant residues and conformational shifts triggered by ligand binding or mechanical forces. By combining spatial and temporal information, RMSX provides a much clearer view of how proteins behave dynamically.

Flipbook: Visualizing Molecular Motion

While RMSX quantifies dynamic behavior, Flipbook provides an intuitive way to visualize it. The method converts simulation data into sequential, color-mapped 3D structures, effectively creating a “flipbook” animation of protein motion. Flipbook can display RMSX results or any residue-level metric—directly on protein structures using widely used molecular visualization tools such as UCSF ChimeraX and VMD. This makes it easier to communicate complex molecular motions in publications, presentations, and teaching.

Together, RMSX and Flipbook provide a streamlined pipeline for high-resolution analysis and visualization of biomolecular dynamics. The tools are designed to be open source and freely available, compatible with standard molecular dynamics workflows, and easy to integrate into Jupyter notebooks, Google Colabs and visualization platforms. By helping researchers identify when and where proteins move, this work supports studies in structural biology, drug discovery, enzymology, mechanistic biochemistry and more.

Links: Publication, GitHub, Video Tutorial

Postdoctoral Research (2014-2020)

Development of computational methods for structural modeling of pMHC complexes.

Understanding the mechanisms involved in the activation of an immune response is essential to many fields in human health, including vaccine development and personalized cancer immunotherapy. A central step in the activation of the adaptive immune response is the recognition, by T-cell lymphocytes, of peptides displayed by a special type of receptor known as Major Histocompatibility Complex (MHC). Considering the key role of MHC receptors in T-cell activation, the computational prediction of peptide binding to MHC has been an important goal for many immunological applications. This problem, however, is much harder than most docking problems in drug discovery, given the length and flexibility of the peptide-targets. In order to overcome the high dimensionality of this sampling problem, three strategies have been devised: (i) constrained backbone prediction, (ii) constrained termini prediction, and (iii) incremental prediction. Each of these strategies has advantages and limitations, and over the years I have made contributions in all three categories (see Antunes et. al, 2019). First, I have identified allele-specific patterns that were used to develop a constrained backbone prediction tool called DockTope. DockTope was validated for 4 MHC alleles through cross-docking of available crystal structures, being recently integrated to the IEDB Analysis Resource collection as the first open-acess docking-based webserver for modeling pMHC complexes. Later, I started working with DINC, a meta-docking incremental approach that is suited for docking large ligands. I provided proof-of-concept of its use for general structural prediction of pMHC complexes (i.e., modeling different MHC alleles and peptide lengths). Finally, we implemented a constrained termini prediction method for fast generation of ensembles of bound conformations of pMHC complexes APE-Gen. This method allows for large-scale structural analysis of pMHC complexes, being also applicable to to virtually all known HLAs.

Visual representation of the modeling strategies implemented in DINC (left) and APE-Gen (right). DINC starts by selecting a small fragment of the peptide, with only 6 flexible bonds (depicted in green), and using it as input for the first round of docking to the MHC binding cleft (cross-section view depicted in gray). The best binding modes are selected across multiple parallel docking runs, and the corresponding peptide fragments expanded by adding a small number of atoms (depicted in red). The expanded fragment is used for the next round of docking, and this incremental process is repeated until the entire ligand has been reconstructed and docked. APE-Gen uses a different approach. Peptide termini templates (backbone) are used for positioning the anchor residues (A). This is followed by the generation of an ensemble of alternative backbone conformations, with the random coordinate descent loop-closure tool (B). Finally, full-atom reconstruction of peptide side-chains and energy minimization of the resulting complex are performed for each sampled backbone (C). Modified from Antunes *et. al*, 2018 and Antunes *et. al*, 2020.

Development of a customizable environment for the structural analysis of peptide-HLA complexes.

Using Jupyter Notebook and Docker, we have created a customizable environment, called HLA-Arena, that enables researchers to easily model any class I pHLA complex of interest and perform varied structural analyses. HLA-Arena includes different workflows, defined as separate notebooks, that consist of the following main stages:

Input processing: Available structures of HLA receptors are obtained from the PDB to be used as such or as templates. Unavailable HLA structures are modeled with Modeller, using a HLA sequence and the structure of a similar HLA receptor as template, if these are provided by the user. Alternatively, users can just provide an allele name (e.g., HLA-A*24:02); HLA-Arena will then fetch the proper sequence from IMGT/HLA, and a reasonable template (based on the HLA supertype classification) from the PDB. In addition, binding affinity of peptides can be estimated with MHCflurry 12 to select the most relevant ones.

Peptide docking: Structures of pHLA complexes are modeled with APE-Gen and/or DINC, which only requires the sequence of the target peptide(s) and the HLA structure(s) obtained previously. Modeled structures can also be minimized with a force field, using OpenMM.

Data analysis: A variety of post-processing options for data analysis can be incorporated in a workflow. These include binding mode rescoring or peptide ranking with DINC, and structure visualization with NGL Viewer, among others.

For a smooth user experience, all computational tools involved in HLA-Arena are packaged within a Docker image, therefore eliminating the burden of managing software dependencies. Another advantage of Docker containerization is to make HLA-Arena platform-agnostic. As a result, it can be deployed on a desktop computer or a high-performance computing cluster, across different operating systems. Users can customize available workflows by adding modeling or analysis steps. We plan to continuously expand the capabilities of HLA-Arena by providing support for additional tools.

Research during the COVID-19 pandemic

DINC-COVID webserver for ensemble docking

The novel coronavirus SARS-CoV-2, which causes the respiratory disease COVID-19, went from an outbreak to a world-wide pandemic in just a few months. In response, there have been unprecedented global efforts to develop effective treatments. Among pharmacological targets, proteins involved in the viral replication have been used in several computational studies focused on drug design, drug repurposing and virtual screening. Unfortunately, proposed SARS-CoV-2 inhibitors have not yet impacted the course of the COVID-19 pandemic. Most of these efforts, however, have largely ignored the issue of receptor flexibility. In this context, we have implemented a computational tool for ensemble docking with SARS-CoV-2 proteins, including the main protease (Mpro), papain-like protease (PLpro) and RNA-dependent RNA polymerase (RdRp). DINC-COVID is available as a user friendly webserver, providing plausible binding modes between conformations of a selected ensemble and a user uploaded ligand. These binding modes are sampled with DINC, our parallelized meta-docking tool, and scored with three different scoring functions.

Molecular dynamics of Compstatin analogs

On another study, I used molecular dynamics simulations to uncover the mechanistic properties of several compstatin analogs. Compstatin is a peptide-based drug proven to be a very promising inhibitor of the complement system, an important component of the innate immunity. A recent compstatin analog is being developed as a candidate drug against several pathological conditions, including COVID-19. However, the reasons behind its higher potency and increased binding affinity to complement proteins are not fully clear. I performed simulations involving six analogs alone in solution and two complexes with compstatin bound to complement component 3. These simulations reveal that all the analogs we consider, except the original compstatin, naturally adopt a pre-bound conformation in solution. Interestingly, this set of analogs adopting a pre-bound conformation includes analogs that were not known to benefit from this behavior. We also show that the most recent compstatin analog forms a stronger hydrogen bond network with its complement receptor than an earlier analog.

Identification of broadly-protective SARS-derived peptide-targets

Fortunately, we have now multiple effective vaccines that are already been used to control the ongoing COVID-19 pandemic. Most of these vaccines aim at inducing the production of neutralizing antibodies against envelope proteins. However, envelope proteins are knowingly more susceptible to selection pressure, and therefore more prone to mutations that can quickly lead to resistance to treatment. In other words, even if successful vaccination campaigns are executed in different countries, it is unclear for how long this immunization will be effective. In addition, the large reservoir of SARS-type viruses in the wild highlights the continued risk for new pandemics in the future. Therefore, there is a need for effective vaccination strategies that would protect individuals against a broader range of SARS-like coronaviruses. To address this problem, we have started working on the creation of a new HLA-Arena workflow specifically designed to help the identification of conserved targets across SARS-CoV-like viruses. This project was funded through the IIBR:Informatics:RAPID award mechanism of NSF.

Graduate Research (2009-2014)

Identification of structural features driving T-cell cross-reactivity.

Using DockTope, I have investigated the structural similarity of pMHC complexes presenting cross-reactive and non-cross-reactive variants of the immunodominant peptide NS3-1073, derived from Hepatitis C Virus (HCV). This HLA-A*02:01-restricted peptide was included in a vaccine that was protective only against certain HCV genotypes (cross-genotype-reactivity), according to a study previously performed by a German group (Fytili et al., 2008). Applying Principal Component Analysis (PCA) and hierarchical clustering on data extracted from modeled pMHC complexes, I was able to show that observed patterns of cross-reactivity were mostly driven by structural similarity between the complexes (see Antunes et. al, 2011); particularly, topography and charge distribution over the T-cell-interacting surface. Using this knowledge, I executed a virtual screening against a panel of unrelated viral-derived targets, also modeled in the context of HLA-A*02:01. This analysis indicated potential cross-reactivity of the wild-type HCV-derived peptide (NS3-1073) with peptides from Epstein-Barr Virus, Influenza and HIV. Some of these peptides had little or even no sequence identity with the wild-type HCV peptide, making these cross-reactivities impossible to be predicted using sequence-based analyses. All these targets were later tested with lymphocytes from HCV-infected patients and healthy vaccinated individuals, confirming the predicted cross-reactivities (see Zhang et. al, 2015). More importantly, cross-reactivities with these heterologous targets were associated to differential response to vaccination, highlighting the importance of this issue for vaccine design. More recently, I extended these analyses to evaluate previously described “cross-reactivity networks” among virus-derived peptides. I used structure-based clustering of modeled pMHC complexes to help explain apparent inconsistencies in reported cross-reactivities, and proposed testable hypotheses on the implications of pMHC structural similarity to T-cell cross-reactivity and cancer immunotherapy (see Antunes et. al., 2017).

CrossTope: A structural database for cross-reactivity assessment

The Structural Data Bank for Cross-Reactivity is a curate repository of three-dimensional structures of pMHC complexes, focused on immunogenicity, similarity relationships and cross-reactivity prediction. We used DockTope to predict more than 500 unknown pMHC structures, now publicly available through the CrossTope Data Bank.

A new classification method to understand available docking strategies accounting for protein flexibility.

Molecular Docking became an essential tool for research in drug design, and several different software are currently available. Older applications explored only flexibility of the ligand, while keeping the protein rigid through the entire search. In many cases this approach would not be enough to reproduce the correct protein-ligand binding mode, since proteins are extremely flexible and can change the conformation of the binding site. Nowadays, most docking methods would consider some level of protein flexibility during the search and most classification attempts would relate these different methods to one of the main biomolecular recognition models (induced fit or conformational selection). However, there exists a great diversity of docking methods accounting for protein flexibility, and any classification based on a dichotomy between these two theoretical models is bound to fail. Contrary to what is frequently done, I proposed a more algorithmic classification, focusing on the level of protein flexibility accounted for (e.g. implicit or explicit, partial or full). This alternative classification should help new users to navigate through all the diversity of docking approaches, allowing them to choose the one the best suits the research problem they want to investigate (see Antunes et. al, 2015).

Identification of structural features involved in resistance to HIV-1 protease-inhibitors.

The Human Immunodeficiency Virus type 1 protease enzyme (HIV-1 PR) is one of the most important targets of antiretroviral therapy used in the treatment of AIDS patients. The success of protease-inhibitors (PIs), however, is often limited by the emergence of protease mutations that can confer resistance to a specific drug, or even to multiple PIs. Using molecular docking and molecular dynamics, I evaluated the impact of two unusual mutations (D30V and V32E) over the dynamics of the PR-Nelfinavir complex (see Antunes et. al, 2014). These mutations were identified in drug free HIV-1 patients (from Porto Alegre, Brazil), and involved codons that were previously related to major drug resistance to Nelfinavir. Both studied mutations presented structural features that indicate resistance to Nelfinavir, each one with a different impact over the interaction with the drug. The D30V mutation triggered a subtle change in the PR structure, which was also observed for the well-known Nelfinavir resistance mutation D30N, while the V32E exchange presented a much more dramatic impact over the PR flap dynamics. Moreover, this in silico approach was also able to describe different binding modes of the drug when bound to different proteases, identifying specific features of HIV-1 subtype B and subtype C proteases. A better understanding of the differences among HIV-1 subtypes and the molecular features involved in drug-resistance will allow physicians to prescribe the most effective drug for each individual patient, avoiding treatment failure and promoting durable remission of HIV-1.

Dinler A. Antunes