End-to-end learning of protein-protein interactions
Submission information
Submission Number: 61
Created: Wed, 08/12/2020 - 13:54
Completed: Wed, 08/12/2020 - 15:10
Changed: Wed, 07/06/2022 - 15:10
Submitted by: Galen Collier

Halted
Project Leader
Project Personnel
Project Information
Protein-protein interactions (PPIs) are involved in numerous fundamental biological processes. A model that can reliably predict whether two proteins interact, and how protein variation affects an existing interaction, would open up new avenues for systems biology and for protein design. Current state-of-the-art PPI prediction models rely on sequence similarity to proteins known to interact and have intrinsically limited accuracy for the protein variants of interest in cancer or viral/bacterial infection.
The goal of the project is to train deep learning models for PPI prediction in the absence of structural information about the protein complex. We have recently developed models to predict the structure of any complex formed by two proteins A and B of known structure (see our preprint “Protein-protein docking using learned three-dimensional representations”, https://www.biorxiv.org/content/10.1101/738690v2). We now aim to develop models that generate the structure of the AB complex in a single step, without explicitly searching for the optimal relative orientation of the two proteins, and that predict the binding affinity of proteins A and B directly from their structures. Such models have two main advantages: (1) they are much more computationally efficient, since they avoid a costly grid search over the space of translations and rotations, and (2) they are differentiable, which means they can be used as building blocks for larger neural architectures that, for instance, also predict the structures of the individual proteins A and B themselves.
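To give a rough sense of why an exhaustive search over translations and rotations is costly, the following back-of-the-envelope sketch counts the rigid-body poses in a grid. The function name, the 15-degree angular step, and the 100-point translation box are illustrative assumptions, not figures from the project. Under these assumptions the count comes to roughly 4 × 10^9 poses for a single protein pair.

```python
import math


def count_grid_poses(angular_step_deg: float, box_size: int) -> int:
    """Estimate the number of rigid-body poses in an exhaustive grid search.

    Hypothetical helper for illustration only.
    """
    # The rotation group SO(3) has total volume 8 * pi^2 (in rad^3), so a
    # roughly uniform sampling with the given angular step needs about
    # volume / step^3 rotations.
    step = math.radians(angular_step_deg)
    n_rotations = math.ceil(8 * math.pi ** 2 / step ** 3)
    # Translations are sampled on a cubic grid with box_size points per axis.
    n_translations = box_size ** 3
    return n_rotations * n_translations


if __name__ == "__main__":
    # Assumed settings: 15-degree angular step, 100 x 100 x 100 translations.
    print(count_grid_poses(15.0, 100))
```

Halving the angular step multiplies the rotation count by about eight, which is why finer searches quickly become expensive.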
This project is enabled by the development of TorchProteinLibrary, a computationally efficient library of differentiable primitives for deep neural network models of protein structure (see our preprint “TorchProteinLibrary: A computationally efficient, differentiable representation of protein structure”, https://arxiv.org/abs/1812.01108). The library implements the functionality needed for end-to-end learning of protein structure prediction.
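As a minimal illustration of what a differentiable primitive makes possible, the sketch below recovers a rigid-body pose by gradient descent instead of a grid search. It is a two-dimensional toy in plain Python with hand-derived gradients standing in for automatic differentiation. It does not use or reproduce the TorchProteinLibrary API, and all function names, the learning rate, and the step count are assumptions made for this example.

```python
import math


def transform(points, theta, tx, ty):
    """Rotate 2D points by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def loss_and_grad(points, target, theta, tx, ty):
    """Mean squared distance to target and its gradient w.r.t. the pose."""
    c, s = math.cos(theta), math.sin(theta)
    loss = g_theta = g_tx = g_ty = 0.0
    for (x, y), (u, v) in zip(points, target):
        dx = c * x - s * y + tx - u
        dy = s * x + c * y + ty - v
        loss += dx * dx + dy * dy
        # Derivatives of the transformed coordinates with respect to theta.
        g_theta += 2.0 * (dx * (-s * x - c * y) + dy * (c * x - s * y))
        g_tx += 2.0 * dx
        g_ty += 2.0 * dy
    n = len(points)
    return loss / n, g_theta / n, g_tx / n, g_ty / n


def fit_pose(points, target, lr=0.1, steps=500):
    """Plain gradient descent on the pose, starting from the identity."""
    theta = tx = ty = 0.0
    for _ in range(steps):
        _, g_theta, g_tx, g_ty = loss_and_grad(points, target, theta, tx, ty)
        theta -= lr * g_theta
        tx -= lr * g_tx
        ty -= lr * g_ty
    return theta, tx, ty


if __name__ == "__main__":
    pts = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
    tgt = transform(pts, 0.7, 1.5, -0.5)  # pose to be recovered
    print(fit_pose(pts, tgt))
```

Because the loss is differentiable in the pose parameters, the same gradients could in principle be passed upstream to whatever model produced the input coordinates, which is the property that lets such components be chained into larger architectures.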
Project Information Subsection
Research workflow development: successful training of deep learning models for PPI prediction in the absence of structural information about the protein complex. Communication of the findings through presentations and/or publications.
{Empty}
- Graduate or undergraduate student
- Interested in structural biology research
- Experienced Linux or Unix user
- Comfortable working in a remote Linux environment (HPC cluster)
- Some experience with Python programming
- Structural modeling experience (understanding general concepts) will be helpful
- Familiarity with machine learning concepts will be helpful
{Empty}
Practical applications
{Empty}
Rutgers University–Camden
303 Cooper St
Camden, New Jersey 08102
CR-Rutgers
09/01/2020
No
Already behind
3
Start date is flexible
Effort involved in recruiting and training junior-level research software engineers.
Final Report