[DISCLAIMER] - For the full visual experience, we recommend you tune in through our YouTube channel to see the presented slides.
Try datamol.io - the open source toolkit that simplifies molecular processing and featurization workflows for machine learning scientists working in drug discovery: https://datamol.io/
If you enjoyed this talk, consider joining the Molecular Modeling and Drug Discovery (M2D2) talks live.
Also, consider joining the M2D2 Slack.
Abstract: The ability to modulate pathogenic proteins represents a powerful treatment strategy for diseases. Unfortunately, many proteins are considered “undruggable” by small molecules, and are often intrinsically disordered, precluding the usage of structure-based tools for binder design. To address these challenges, we have developed a suite of algorithms that enable the design of target-specific peptides via protein language model embeddings, without the requirement of 3D structures. First, we train a model that leverages ESM-2 embeddings to efficiently select high-affinity peptides from natural protein interaction interfaces. We experimentally fuse model-derived peptides to E3 ubiquitin ligases and identify candidates exhibiting robust degradation of undruggable targets in human cells. Next, we develop a high-accuracy discriminator, based on the CLIP architecture, to prioritize and screen peptides with selectivity to a specified target protein. As input to the discriminator, we create a Gaussian diffusion generator to sample an ESM-2-based latent space, fine-tuned on experimentally-valid peptide sequences. Finally, to enable de novo generation of binding peptides, we train an instance of GPT-2 with protein interacting sequences to enable peptide generation conditioned on target sequence. Our model demonstrates low perplexities across both existing and generated peptide sequences. Together, our work lays the foundation for programmable protein targeting and editing applications.
Speaker: Pranam Chatterjee
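For listeners unfamiliar with the perplexity metric mentioned in the abstract, here is a minimal sketch of how it is usually computed for a language model. The function name and the toy probabilities are illustrative assumptions and are not taken from the speaker's code: perplexity is the exponential of the average negative log-probability the model assigns to each observed token, so lower values mean the model finds the sequence less surprising.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence, given the probability the model
    assigned to each token that actually occurred (hypothetical helper)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token, then exponentiate.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# A model that guesses uniformly among 4 options at every position.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
# A model that is fairly confident about every token.
confident = perplexity([0.9, 0.8, 0.95])
print(uniform, confident)
```

A uniform guess over k options gives a perplexity of about k, which is why the value is often read as an effective number of choices per position.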
Twitter - Prudencio
Twitter - Jonny
Twitter - datamol.io
Abstract: Trade-offs between accuracy and speed have long limited the applications of machine learning interatomic potentials. Recently, E(3)-equivariant architectures have demonstrated leading accuracy, data efficiency, transferability, and simulation stability, but their computational cost and scaling have generally reinforced this trade-off. In particular, the ubiquitous use of message passing architectures has precluded the extension of accessible length- and time-scales with efficient multi-GPU calculations. In this talk I will discuss Allegro, a strictly local equivariant deep learning interatomic potential designed for parallel scalability and increased computational efficiency that simultaneously exhibits excellent accuracy. After presenting the architecture, I will discuss applications and benchmarks on various materials and chemical systems, including recent demonstrations of scaling to large all-atom biomolecular systems such as solvated proteins and a 44 million atom model of the HIV capsid. Finally, I will summarize the software ecosystem and tooling around Allegro.
Speaker: Albert Musaelian
Abstract: Engineered proteins play increasingly essential roles in industries and applications spanning pharmaceuticals, agriculture, specialty chemicals, and fuel. Machine learning could enable an unprecedented level of control in protein engineering for therapeutic and industrial applications. Large self-supervised models pretrained on millions of protein sequences have recently gained popularity in generating embeddings of protein sequences for protein property prediction. However, protein datasets contain information in addition to sequence that can improve model performance. This talk will cover models that use sequences, structures, and biophysical features to predict protein function or to generate functional proteins.
Speaker: Kevin K. Yang
Abstract: Molecular simulations enable the study of biomolecules and their dynamics on an atomistic scale. A common task is to compare several simulation conditions - like mutations or different ligands - to find significant differences and interrelations between them. However, the large amount of data produced for ever larger and more complex systems often renders it difficult to identify the structural features that are relevant for a particular phenomenon. PENSA is a flexible software package that enables a comprehensive and thorough investigation into biomolecular conformational ensembles. It provides a wide variety of featurizations and feature transformations that allow for a complete representation of biomolecules like proteins and nucleic acids, including water and ion cavities within the biomolecular structure, thus avoiding bias that would come with manual selection of features. PENSA implements various methods to systematically compare the distributions of these features across ensembles to find the significant differences between them and identify regions of interest. It also includes a novel approach to quantify the state-specific information between two regions of a biomolecule which allows, e.g., the tracing of information flow to identify signaling pathways. PENSA is a modular open-source library that also comes with convenient tools for loading data and visualizing results in ways that make them quick to process and easy to interpret. This talk will demonstrate its usefulness in real-world examples by showing how it helps to determine molecular mechanisms efficiently.
Speaker: Martin Vögele
Abstract: Cryptic pockets, which are absent in ligand-free structures and have the potential to be used as drug targets, are often challenging to access through conventional biomolecular simulations due to their slow motions. To overcome this limitation, we have combined AlphaFold and Markov State modelling (MSM) to accelerate the discovery of cryptic pockets. AlphaFold was used to generate a diverse structural ensemble with open or partially open pockets that can serve as starting points for molecular dynamics simulations which were later stitched together using MSM to predict free energy and kinetics associated with cryptic pocket opening. Our approach explored known cryptic pockets, as well as discovered new cryptic pockets which were absent in PDB. Our study highlighted the power of AlphaFold and MSM to discover novel cryptic pockets which can unlock development of next-gen therapeutics.
Speaker: Stephan Thaler
Abstract: Cryptic pockets, which are absent in ligand-free structures and have the potential to be used as drug targets, are often challenging to access through conventional biomolecular simulations due to their slow motions. To overcome this limitation, we have combined AlphaFold and Markov State modelling (MSM) to accelerate the discovery of cryptic pockets. AlphaFold was used to generate a diverse structural ensemble with open or partially open pockets that can serve as starting points for molecular dynamics simulations which were later stitched together using MSM to predict free energy and kinetics associated with cryptic pocket opening. Our approach explored known cryptic pockets, as well as discovered new cryptic pockets which were absent in PDB. Our study highlighted the power of AlphaFold and MSM to discover novel cryptic pockets which can unlock development of next-gen therapeutics.
Speaker: Soumendranath Bhakat
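As a rough illustration of the Markov state model (MSM) step described in the abstract, the sketch below estimates a transition matrix from a discretized trajectory, finds its stationary distribution, and converts that into relative free energies. It is a toy under stated assumptions, not the speaker's pipeline: the two states (0 = pocket closed, 1 = pocket open), the trajectory, and all function names are hypothetical, and the simple row-normalized count estimator stands in for the reversible estimators found in dedicated MSM packages.

```python
import math


def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag time (lag >= 1)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(T, n_iter=1000):
    """Power iteration: repeatedly propagate a distribution through T."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi


def free_energies(pi, kT=1.0):
    """Free energy of each state relative to the most populated one, in units of kT."""
    ref = max(pi)
    return [-kT * math.log(p / ref) for p in pi]


# Hypothetical discretized trajectory: 0 = closed pocket, 1 = open pocket.
dtraj = [0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
T = transition_matrix(dtraj, n_states=2)
pi = stationary_distribution(T)
F = free_energies(pi)
print(T, pi, F)
```

In practice many short simulations, each started from a different structure, contribute counts to the same matrix, which is how independent trajectories can be stitched into one kinetic model.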
Abstract: The fundamental equations that govern Nature at atomistic scales are well understood in terms of quantum mechanics. Solving such equations is not possible apart from very simple systems, yet solutions to this problem represent one of the grand challenges for computational sciences as it would allow an understanding of all properties of molecular systems. We investigate this challenge by solving the sampling and accuracy problems of atomistic simulations using machine learning, physics, and GPUs. Machine learning potentials, as universal many-body function approximators, could deliver the next-generation modeling approach, blurring the boundary between quantum mechanics, molecular mechanics, and coarse-grained simulations into a cohesive methodology. In recent years, incredible progress has been made in transferable molecular representations which can learn effective potential functions. New methods for learning such potentials and even the energetics of the underlying physical systems are now available. However, there are still problems in extending the generalizability, lack of accurate datasets, and handling of charges and charged molecules, all within a speed bound which must be able to handle large systems like protein complexes. In this talk, I will discuss how to advance these scientific problems toward next-generation molecular simulations both in the context of biomolecular simulations (ACEMD/OpenMM) and more general machine learning frameworks (TorchMD, TorchMD-NET).
Speaker: Gianni De Fabritiis
Abstract: Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein function or structure. Existing approaches usually pretrain protein language models on a large number of unlabeled amino acid sequences and then finetune the models with some labeled data in downstream tasks. Despite the effectiveness of sequence-based approaches, the power of pretraining on known protein structures, which are available in smaller numbers only, has not been explored for protein property prediction, though protein structures are known to be determinants of protein function. In this paper, we propose to pretrain protein representations according to their 3D structures. We first present a simple yet effective encoder to learn the geometric features of a protein. We pretrain the protein graph encoder by leveraging multiview contrastive learning and different self-prediction tasks. Experimental results on both function prediction and fold classification tasks show that our proposed pretraining methods outperform or are on par with the state-of-the-art sequence-based methods, while using much less pretraining data.
Speaker: Zuobai Zhang
Abstract: While computational methods have become a mainstay in drug discovery programs, many calculations are too time-consuming to be applied to large datasets. Active learning (AL), a machine learning method used to direct a search iteratively, can enable the application of computationally expensive methods such as relative binding free energy (RBFE) calculations to sets containing thousands of molecules. Moreover, AL can also be applied to virtual screening, enabling the rapid processing of billions of molecules. This presentation will provide an overview of active learning and highlight some applications in drug discovery.
Speakers: Pat Walters & James Thompson
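To make the active learning loop from the abstract concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the scoring function stands in for an expensive calculation such as RBFE, each "molecule" is reduced to a single number, and the surrogate is a simple 1-nearest-neighbour predictor rather than whatever model the speakers use. The point is the shape of the loop: score a small seed set, fit a cheap surrogate, use it to pick the next batch, and repeat.

```python
import random


def expensive_score(x):
    """Hypothetical stand-in for a costly calculation; lower is better."""
    return (x - 0.7) ** 2


def predict(x, labeled):
    """1-nearest-neighbour surrogate: (predicted score, distance to nearest labeled point)."""
    nearest_x, nearest_y = min(labeled.items(), key=lambda kv: abs(kv[0] - x))
    return nearest_y, abs(nearest_x - x)


def active_learning(pool, n_init=5, n_rounds=5, batch_size=5, explore=0.5, seed=0):
    rng = random.Random(seed)
    # Seed the loop with a small random sample scored by the expensive method.
    labeled = {x: expensive_score(x) for x in rng.sample(pool, n_init)}
    for _ in range(n_rounds):
        unlabeled = [x for x in pool if x not in labeled]

        def acquisition(x):
            pred, dist = predict(x, labeled)
            # Prefer low predicted scores, with a bonus for unexplored regions.
            return pred - explore * dist

        # Only the selected batch is sent to the expensive method.
        for x in sorted(unlabeled, key=acquisition)[:batch_size]:
            labeled[x] = expensive_score(x)
    return labeled


pool = [i / 999 for i in range(1000)]
labeled = active_learning(pool)
best_x = min(labeled, key=labeled.get)
print(len(labeled), best_x)
```

With these settings only 30 of the 1000 candidates are ever passed to the expensive function, which is the saving active learning is meant to deliver; the `explore` weight is a hypothetical knob for trading exploitation against exploration.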
Twitter - Prudencio
Twitter - Therence
Twitter - Jonny
Twitter - datamol.io
Abstract: Developing novel bioactive molecules is time-consuming, costly and rarely successful. As a mitigation strategy, we utilize, for the first time, cellular morphology to directly guide the de novo design of small molecules. We trained a conditional generative adversarial network on a set of 30 000 compounds using their cell painting morphological profiles as conditioning. Our model was able to learn chemistry-morphology relationships and influence the generated chemical space according to the morphological profile. We provide evidence for the targeted generation of known agonists when conditioning on gene overexpression profiles, even though no information on biological targets was used during training. Based on a target-agnostic readout, our approach facilitates knowledge transfer between biological pathways and can be used to design bioactives for many targets under one unified framework. Prospective application of this proof-of-concept to larger chemical spaces promises great potential for hit generation in drug and phytopharmaceutical discovery and chemical safety.
Speaker: Paula A. Marin Zapata
Abstract: The process of finding molecules that bind to a target protein is a challenging first step in drug discovery. Crystallographic fragment screening is a strategy based on elucidating binding modes of small polar compounds and then building potency by expanding or merging them. Recent advances in high-throughput crystallography enable screening of large fragment libraries, reading out dense ensembles of fragments spanning the binding site. However, fragments typically have low affinity, so the road to potency is often long and fraught with false starts. Here, we take advantage of high-throughput crystallography to reframe fragment-based hit discovery as a denoising problem – identifying significant pharmacophore distributions from a fragment ensemble amid noise due to weak binders – and employ an unsupervised machine learning method to tackle this problem. Our method screens potential molecules by evaluating whether they recapitulate those fragment-derived pharmacophore distributions. We retrospectively validated our approach on an open science campaign against SARS-CoV-2 main protease (Mpro), showing that our method can distinguish active compounds from inactive ones using only structural data of fragment-protein complexes, without any activity data. Further, we prospectively found novel hits for Mpro and the Mac1 domain of SARS-CoV-2 non-structural protein 3. More broadly, our results demonstrate how unsupervised machine learning helps interpret high-throughput crystallography data to rapidly discover potent chemical modulators of protein function.
Speaker: William McCorkindale
Twitter - Prudencio
Twitter - Therence
Twitter - Jonny
Twitter - Valence Discovery
Abstract: Deep learning models that leverage large datasets are often the state of the art for modelling molecular properties. When the datasets are smaller (less than 2000 molecules), it is not clear that deep learning approaches are the right modelling tool. In this work we perform an extensive study of the calibration and generalizability of probabilistic machine learning models on small chemical datasets. Using different molecular representations and models, we analyse the quality of their predictions and uncertainties in a variety of tasks (binary, regression) and datasets. We also introduce two simulated experiments that evaluate their performance: (1) Bayesian optimization guided molecular design, (2) inference on out-of-distribution data via ablated cluster splits. We offer practical insights into model and feature choice for modelling small chemical datasets, a common scenario in new chemical experiments. We have packaged our analysis into the DIONYSUS repository, which is open sourced to aid in reproducibility and extension to new datasets.
Speaker: Gary Tom
Abstract: In computer-aided drug discovery (CADD), quantitative structure-activity relationship (QSAR) models are trained to predict biological activity from chemical structure. Despite the recent success of applying graph neural networks to this task, important chemical information such as molecular chirality is ignored. To fill this crucial gap, we propose Molecular-Kernel Graph Neural Network (MolKGNN) for molecular representation learning, which features SE(3)-/conformation invariance and interpretability. For our MolKGNN, we first design a molecular graph convolution to capture the chemical pattern by comparing the atom’s similarity with the learnable molecular kernels. Furthermore, we propagate the similarity score to capture the higher-order chemical pattern. To assess the method, we conduct a comprehensive evaluation with nine well-curated datasets spanning numerous important drug targets that feature realistic high class imbalance. The evaluation demonstrates the superiority of MolKGNN over other GNNs in CADD. Meanwhile, the learned kernels identify patterns that agree with domain knowledge, confirming the pragmatic interpretability of this approach. This work was recently accepted by AAAI23.
Speaker: Yunchao (Lance) Liu
Abstract: Deep learning, and in general, auto-differentiation frameworks allow the expression of many scientific problems as end-to-end learning tasks. Common themes in scientific machine learning involve learning surrogate functions of expensive simulators, sampling complex distributions directly or time-propagation of known or unknown differential equation systems efficiently. We will describe our recent work in applying deep-learning surrogates and auto-differentiation techniques in molecular simulations. In particular, we will explore active learning of machine learning potentials with differentiable uncertainty; the use of deep neural network generative models to learn reversible coarse-grained representations of atomic systems; and the application of differentiable simulations for reaction path finding without prior knowledge of collective variables. Overfitting and lack of generalizability are constant challenges in AI for science. We will discuss the scalability of active learning to practical applications in molecular simulations and the sensitivity of deep learning answers to molecular problems, like the fitting of pair potentials from observables.
Speaker: Can Chen
Abstract: Deep learning, and in general, auto-differentiation frameworks allow the expression of many scientific problems as end-to-end learning tasks. Common themes in scientific machine learning involve learning surrogate functions of expensive simulators, sampling complex distributions directly or time-propagation of known or unknown differential equation systems efficiently. We will describe our recent work in applying deep-learning surrogates and auto-differentiation techniques in molecular simulations. In particular, we will explore active learning of machine learning potentials with differentiable uncertainty; the use of deep neural network generative models to learn reversible coarse-grained representations of atomic systems; and the application of differentiable simulations for reaction path finding without prior knowledge of collective variables. Overfitting and lack of generalizability are constant challenges in AI for science. We will discuss the scalability of active learning to practical applications in molecular simulations and the sensitivity of deep learning answers to molecular problems, like the fitting of pair potentials from observables.
Speaker: Rafael Gomez-Bombarelli
Abstract: Accurate 3D molecular information is a keystone for many computational programs but accessing reliable 3D conformers is still challenging. It requires enumerating and optimizing a huge isomer and conformer space, which would overwhelm any traditional computational methods. In light of this, we proposed the Auto3D package for generating low-energy 3D conformers using fast and reliable neural network potentials (NNPs). Given a SMILES, Auto3D returns the low-energy 3D conformers by automatizing the isomer enumeration and duplicate filtering process, 3D building process, geometry optimization process, and ranking process. In conjunction with Auto3D, we developed ANI-2xt NNP, which was trained especially for tautomer-related tasks. These NNPs were used to generate 3D structures and compute molecular properties. In a tautomeric reaction energy calculation task, the ANI-2xt NNP achieved similar accuracy but was several orders of magnitude faster than the reference DFT method.
Speaker: Zhen Liu
Abstract: Recently, artificial intelligence (AI) for drug discovery has raised increasing interest in both the machine learning (ML) and computational chemistry communities. The core problem of AI for drug discovery is molecule representation learning, where the molecule knowledge can be naturally presented in different modalities: chemical formula, molecular graph, geometric conformation, knowledge base, biomedical literature, etc. In this talk, I would like to provide a perspective concentrating on molecule pretraining from topology, geometry, and textual description. Such a unified perspective paves the way for molecule representation interpretation as well as discovery tasks.
Speaker: Shengchao Liu
Abstract: Molecular dynamics (MD) simulation techniques are widely used for various natural science applications. Machine learning (ML) force field (FF) models are increasingly replacing ab-initio simulations by predicting forces directly from atomic structures. Despite significant progress in this area, such techniques are primarily benchmarked by their force/energy prediction errors, even though the practical use case would be to produce realistic MD trajectories. We aim to fill this gap by introducing a novel benchmark suite for ML MD simulation. We curate representative MD systems, including water, organic molecules, peptides, and materials, and design evaluation metrics corresponding to the scientific objectives of respective systems. We benchmark a collection of state-of-the-art (SOTA) ML FF models and illustrate, in particular, how the commonly benchmarked force accuracy is not well aligned with relevant simulation metrics. We demonstrate when and how selected SOTA methods fail, along with offering directions for further improvement. Specifically, we identify stability as a key metric for ML models to improve. Our benchmark suite comes with a comprehensive open-source codebase for training and simulation with ML FFs to facilitate further work.
Speaker: Xiang Fu
Abstract: The universality of thermodynamics and statistical mechanics has led to a language comprehensible to chemists, physicists & others, enabling countless scientific discoveries in diverse fields. In the last decade, a new, arguably common language that everyone seems to speak, but that at least no chemist fully understands, has emerged with the advent of artificial intelligence (AI). It is natural to ask if AI can be integrated with the various theoretical and simulation methods in chemistry for new discoveries. At the same time, this raises many open questions, including: (1) should chemists, who are not fundamentally trained in AI, trust any of the results obtained using AI, and (2) can AI paradigms developed for non-molecular systems with massive training data be directly applied to chemistry with all its quirks, richness, known/unknown laws, and often poor/limited data? In this seminar I will show how such an integration of disciplines can be attained, creating trustable, robust AI frameworks for use by chemists. I will demonstrate such methods on different problems involving protein kinases, riboswitches and crystal polymorph nucleation, where we predict mechanisms at timescales much longer than milliseconds while keeping all-atom/femtosecond resolution. I will conclude with an outlook for future challenges and opportunities, envisioning a new sub-discipline of “Artificial Chemical Intelligence” where chemistry moves hand-in-hand with AI to enable smart molecular discovery, and is not just yet another domain for application of AI.
Speaker: Pratyush Tiwary
Abstract: Recent biological findings indicate that the static lock-and-key theory is no longer applicable and that changes in atomic sites and binding pose can provide important information for understanding drug binding. However, computational cost limits the growth of protein trajectory-related studies, thus hindering the possibility of supervised learning. We present a novel spatial-temporal pre-training method based on the modified Equivariant Graph Matching Networks (EGMN), dubbed ProtMD. It has two specially designed self-supervised learning tasks, an atom-level prompt-based denoising generative task and a conformation-level snapshot ordering task, to capture the flexibility information inside MD trajectories with very fine temporal resolutions. More importantly, we investigate the underlying mechanism behind the success of ProtMD, and further demonstrate a tight correlation between the magnitude of spatial motion of conformation and the extent to which the ligand and the receptor bind with each other.
Speaker: Fang Wu