Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function.
Proteins. 2018 03;86 Suppl 1:302-310
Authors: Kundrotas PJ, Anishchenko I, Badal VD, Das M, Dauzhenka T, Vakser IA
The paper presents analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, biological interface relevance, etc. The scoring function still does not distinguish biological from crystal packing interfaces well, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking, however, is very sensitive to the quality of the individual models. Nevertheless, our newly developed contact potential detected approximate locations of the binding sites.
PMID: 28905425 [PubMed - indexed for MEDLINE]
Lineage space and the propensity of bacterial cells to undergo growth transitions.
PLoS Comput Biol. 2018 08;14(8):e1006380
Authors: Bandyopadhyay A, Wang H, Ray JCJ
The molecular makeup of the offspring of a dividing cell gradually becomes phenotypically decorrelated from the parent cell by noise and regulatory mechanisms that amplify phenotypic heterogeneity. Such regulatory mechanisms form networks that contain thresholds between phenotypes. Populations of cells can be poised near the threshold so that a subset of the population probabilistically undergoes the phenotypic transition. We sought to characterize the diversity of bacterial populations around a growth-modulating threshold via analysis of the effect of non-genetic inheritance, similar to conditions that create antibiotic-tolerant persister cells and other examples of bet hedging. Using simulations and experimental lineage data in Escherichia coli, we present evidence that regulation of growth amplifies the dependence of growth arrest on cellular lineage, causing clusters of related cells to undergo growth arrest in certain conditions. Our simulations predict that lineage correlations and the sensitivity of growth to changes in toxin levels coincide in a critical regime. Below the critical regime, the sizes of related growth-arrested clusters are distributed exponentially, while in the critical regime cluster sizes are more likely to become large. Furthermore, phenotypic diversity can be nearly as high as possible near the critical regime, but for most parameter values it falls far below the theoretical limit. We conclude that lineage information is indispensable for understanding regulation of cellular growth.
PMID: 30133447 [PubMed - indexed for MEDLINE]
Crosstalk and the evolvability of intracellular communication.
Nat Commun. 2017 07 10;8:16009
Authors: Rowland MA, Greenbaum JM, Deeds EJ
Metazoan signalling networks are complex, with extensive crosstalk between pathways. It is unclear what pressures drove the evolution of this architecture. We explore the hypothesis that crosstalk allows different cell types, each expressing a specific subset of signalling proteins, to activate different outputs when faced with the same inputs, responding differently to the same environment. We find that the pressure to generate diversity leads to the evolution of networks with extensive crosstalk. Using available data, we find that human tissues exhibit higher levels of diversity between cell types than networks with random expression patterns or networks with no crosstalk. We also find that crosstalk and differential expression can influence drug activity: no protein has the same impact on two tissues when inhibited. In addition to providing a possible explanation for the evolution of crosstalk, our work indicates that consideration of cellular context will likely be crucial for targeting signalling networks.
PMID: 28691706 [PubMed - indexed for MEDLINE]
Gene ontology improves template selection in comparative protein docking.
Proteins. 2018 Dec 06;:
Authors: Hadarovich A, Anishchenko I, Tuzikov AV, Kundrotas PJ, Vakser IA
Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure, and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although the structural similarity, generally, yields good performance, other characteristics of the interacting proteins (eg, function, biological process, localization, and such) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains - biological process, molecular function and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.
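To make the idea of a combined structural and GO-term score concrete, here is a minimal Python sketch. It is not the scoring function from the paper: the abstract does not give a formula, so the Jaccard overlap per ontology domain, the plain average over the three domains, the linear weighting, and the names `go_score` and `combined_score` are all assumptions made for illustration. The GO identifiers in the usage note are placeholders, not real terms.

```python
# Illustrative only: the similarity measure and weighting are assumptions,
# not the published GO-score.
DOMAINS = ("biological_process", "molecular_function", "cellular_component")


def jaccard(a, b):
    """Jaccard overlap of two term collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def go_score(target_terms, template_terms):
    """Average the per-domain Jaccard overlap across the three GO domains.

    Both arguments map a domain name to a collection of GO term IDs.
    """
    per_domain = [
        jaccard(target_terms.get(d, ()), template_terms.get(d, ()))
        for d in DOMAINS
    ]
    return sum(per_domain) / len(DOMAINS)


def combined_score(structural, go, weight=0.5):
    """Linear blend of a structural similarity score and a GO score.

    `weight` is a hypothetical mixing parameter in [0, 1].
    """
    return (1.0 - weight) * structural + weight * go
```

For example, with placeholder terms, a target annotated `{"biological_process": {"GO:A", "GO:B"}, "molecular_function": {"GO:C"}}` and a template annotated `{"biological_process": {"GO:B"}, "molecular_function": {"GO:C"}, "cellular_component": {"GO:D"}}` overlap by 1/2, 1 and 0 in the three domains, so `go_score` returns 0.5.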
PMID: 30520123 [PubMed - as supplied by publisher]
Evolutionary pathways of repeat protein topology in bacterial outer membrane proteins.
Elife. 2018 Nov 29;7:
Authors: Franklin MW, Nepomnyachyi S, Feehan R, Ben-Tal N, Kolodny R, Slusky JS
Outer membrane proteins (OMPs) are the proteins in the surface of Gram-negative bacteria. These proteins have diverse functions but a single topology: the β-barrel. Sequence analysis has suggested that this common fold is a β-hairpin repeat protein, and that amplification of the β-hairpin has resulted in 8-26-stranded barrels. Using an integrated approach that combines sequence and structural analyses we find events in which non-amplification diversification also increases barrel strand number. Our network-based analysis reveals strand-number evolutionary pathways, including one that progresses from a primordial 8-stranded barrel to 16-strands and further, to 18-strands. Among these are mechanisms of strand number accretion without domain duplication, like a loop-to-hairpin transition. These mechanisms illustrate perpetuation of repeat protein topology without genetic duplication, likely induced by the hydrophobic membrane. Finally, we find that the evolutionary trace is particularly prominent in the C-terminal half of OMPs, implicating this region in the nucleation of OMP folding.
PMID: 30489257 [PubMed - as supplied by publisher]
Intrinsic limits of information transmission in biochemical signalling motifs.
Interface Focus. 2018 Dec 06;8(6):20180039
Authors: Suderman R, Deeds EJ
All living things have evolved to sense changes in their environment in order to respond in adaptive ways. At the cellular level, these sensing systems generally involve receptor molecules at the cell surface, which detect changes outside the cell and relay those changes to the appropriate response elements downstream. With the advent of experimental technologies that can track signalling at the single-cell level, it has become clear that many signalling systems exhibit significant levels of 'noise,' manifesting as differential responses of otherwise identical cells to the same environment. This noise has a large impact on the capacity of cell signalling networks to transmit information from the environment. Application of information theory to experimental data has found that all systems studied to date encode less than 2.5 bits of information, with the majority transmitting significantly less than 1 bit. Given the growing interest in applying information theory to biological data, it is crucial to understand whether the low values observed to date represent some sort of intrinsic limit on information flow given the inherently stochastic nature of biochemical signalling events. In this work, we used a series of computational models to explore how much information a variety of common 'signalling motifs' can encode. We found that the majority of these motifs, which serve as the basic building blocks of cell signalling networks, can encode far more information (4-6 bits) than has ever been observed experimentally. In addition to providing a consistent framework for estimating information-theoretic quantities from experimental data, our findings suggest that the low levels of information flow observed so far in living systems are not necessarily due to intrinsic limitations. Further experimental work will be needed to understand whether certain cell signalling systems actually can approach the intrinsic limits described here, and to understand the sources and purpose of the variation that reduces information flow in living cells.
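The quantities in bits quoted above are mutual-information values between a signal and a cellular response. As a minimal sketch of what such a number means, the Python function below computes mutual information from a discrete joint table of signal and response counts. It assumes a fixed input distribution and already-binned data; it does not maximise over inputs (channel capacity) and is not the estimation framework developed in the paper. The function name and the table layout are illustrative choices.

```python
import math


def mutual_information_bits(joint):
    """Mutual information I(X;Y) in bits from a joint table.

    `joint[i][j]` is the count or probability of signal level i occurring
    together with response level j. The table is normalised internally.
    """
    total = sum(sum(row) for row in joint)
    p = [[v / total for v in row] for row in joint]
    px = [sum(row) for row in p]          # marginal over responses
    py = [sum(col) for col in zip(*p)]    # marginal over signals
    mi = 0.0
    for i, row in enumerate(p):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                 # 0 * log(0) is taken as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi
```

A noiseless system that maps four equally likely signal levels to four distinct responses gives 2 bits, while a response that is statistically independent of the signal gives 0 bits.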
PMID: 30443336 [PubMed]
Structural Basis for Binding of Allosteric Drug Leads in the Adenosine A1 Receptor.
Sci Rep. 2018 Nov 15;8(1):16836
Authors: Miao Y, Bhattarai A, Nguyen ATN, Christopoulos A, May LT
Despite intense interest in designing positive allosteric modulators (PAMs) as selective drugs of the adenosine A1 receptor (A1AR), structural binding modes of the receptor PAMs remain unknown. Using the first X-ray structure of the A1AR, we have performed all-atom simulations using a robust Gaussian accelerated molecular dynamics (GaMD) technique to determine binding modes of the A1AR allosteric drug leads. Two prototypical PAMs, PD81723 and VCP171, were selected. Each PAM was initially placed at least 20 Å away from the receptor. Extensive GaMD simulations using the AMBER and NAMD simulation packages at different acceleration levels captured spontaneous binding of PAMs to the A1AR. The simulations allowed us to identify low-energy binding modes of the PAMs at an allosteric site formed by the receptor extracellular loop 2 (ECL2), which are highly consistent with mutagenesis experimental data. Furthermore, the PAMs stabilized agonist binding in the receptor. In the absence of PAMs at the ECL2 allosteric site, the agonist sampled a significantly larger conformational space and even dissociated from the A1AR alone. In summary, the GaMD simulations elucidated structural binding modes of the PAMs and provided important insights into allostery in the A1AR, which will greatly facilitate the receptor structure-based drug design.
PMID: 30442899 [PubMed - in process]
Natural language processing in text mining for structural modeling of protein complexes.
BMC Bioinformatics. 2018 03 05;19(1):84
Authors: Badal VD, Kundrotas PJ, Vakser IA
BACKGROUND: Structural modeling of protein-protein interactions produces a large number of putative configurations of the protein complexes. Identification of the near-native models among them is a serious challenge. Publicly available results of biomedical research may provide constraints on the binding mode, which can be essential for the docking. Our text-mining (TM) tool, which extracts binding site residues from the PubMed abstracts, was successfully applied to protein docking (Badal et al., PLoS Comput Biol, 2015; 11: e1004630). Still, many extracted residues were not relevant to the docking.
RESULTS: We present an extension of the TM tool, which utilizes natural language processing (NLP) for analyzing the context of the residue occurrence. The procedure was tested using generic and specialized dictionaries. The results showed that the keyword dictionaries designed for identification of protein interactions are not adequate for the TM prediction of the binding mode. However, our dictionary designed to distinguish keywords relevant to the protein binding sites led to considerable improvement in the TM performance. We investigated the utility of several methods of context analysis, based on dissection of the sentence parse trees. The machine learning-based NLP filtered the pool of the mined residues significantly more efficiently than the rule-based NLP. Constraints generated by NLP were tested in docking of unbound proteins from the DOCKGROUND X-ray benchmark set 4. The output of the global low-resolution docking scan was post-processed, separately, by constraints from the basic TM, constraints re-ranked by NLP, and the reference constraints. The quality of a match was assessed by the interface root-mean-square deviation. The results showed significant improvement of the docking output when using the constraints generated by the advanced TM with NLP.
CONCLUSIONS: The basic TM procedure for extracting protein-protein binding site residues from the PubMed abstracts was significantly advanced by the deep parsing (NLP techniques for contextual analysis) in purging of the initial pool of the extracted residues. Benchmarking showed a substantial increase of the docking success rate based on the constraints generated by the advanced TM with NLP.
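The basic pipeline described above (mine residue mentions from text, then use them to re-rank docking output) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' TM tool: the regular expression only recognises three-letter residue codes followed by a number, the scoring is a simple count of mined residues found at a pose's interface, and the names `extract_residues`, `constraint_score` and `rerank` are hypothetical. The NLP context filtering that the paper adds is not modelled here.

```python
import re

# Three-letter codes of the 20 standard amino acids.
THREE_LETTER = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)
# Matches mentions such as "Arg45", "Tyr-102" or "Lys 7" (case-sensitive).
RESIDUE_RE = re.compile(rf"\b({THREE_LETTER})[- ]?(\d+)\b")


def extract_residues(text):
    """Return (residue_name, position) pairs mentioned in the text."""
    return [(m.group(1), int(m.group(2))) for m in RESIDUE_RE.finditer(text)]


def constraint_score(interface_positions, mined_positions):
    """Count how many mined residue positions lie at a pose's interface."""
    return len(set(interface_positions) & set(mined_positions))


def rerank(poses, mined_positions):
    """Sort poses by constraint score, highest first.

    `poses` maps a pose identifier to the residue positions at its interface.
    """
    return sorted(
        poses,
        key=lambda pose_id: constraint_score(poses[pose_id], mined_positions),
        reverse=True,
    )
```

For instance, `extract_residues("Mutation of Arg45 and Tyr-102 abolished binding.")` yields `[("Arg", 45), ("Tyr", 102)]`, and a pose whose interface contains positions 45 and 102 would then rank ahead of a pose containing neither.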
PMID: 29506465 [PubMed - indexed for MEDLINE]
Gaussian accelerated molecular dynamics for elucidation of drug pathways.
Expert Opin Drug Discov. 2018 Oct 29;:1-11
Authors: Bhattarai A, Miao Y
INTRODUCTION: Understanding pathways and mechanisms of drug binding to receptors is important for rational drug design. Remarkable advances in supercomputing and methodological developments have opened a new era for application of computer simulations in predicting drug-receptor interactions at an atomistic level. Gaussian accelerated molecular dynamics (GaMD) is a computational enhanced sampling technique that works by adding a harmonic boost potential to reduce energy barriers. GaMD enables free energy calculations without the requirement of predefined collective variables. GaMD has proven useful in biomolecular simulations, in particular, the prediction of drug-receptor interactions.
AREAS COVERED: Herein, the authors review recent GaMD simulation studies that elucidated pathways of drug binding to proteins including the G-protein-coupled receptors and HIV protease.
EXPERT OPINION: GaMD is advantageous for enhanced simulations of, amongst many biological processes, drug binding to target receptors. Compared with conventional molecular dynamics, GaMD speeds up biomolecular simulations by orders of magnitude. GaMD enables routine drug binding simulations using personal computers with GPUs or common computing clusters. GaMD and, more broadly, enhanced sampling simulations are expected to dramatically increase our capabilities to determine the mechanisms of drug binding to a wide range of receptors in the near future. This will greatly facilitate computer-aided drug design.
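The harmonic boost potential mentioned in the abstract can be written down compactly. The Python sketch below uses the form commonly given in the GaMD literature: when the system potential V falls below a threshold energy E, a boost of ½·k·(E − V)² is added, and no boost is applied otherwise. The abstract itself does not state the formula, so treat this as background rather than a quotation; the function names are illustrative, and how E and k are chosen from simulation statistics is outside the scope of this sketch.

```python
def gamd_boost(v, e, k):
    """Harmonic boost energy added to a potential value `v`.

    `e` is the threshold energy and `k` the harmonic force constant.
    The boost is 0.5 * k * (e - v)**2 below the threshold and zero at or
    above it, so high-energy regions are left unmodified.
    """
    if v >= e:
        return 0.0
    return 0.5 * k * (e - v) ** 2


def boosted_potential(v, e, k):
    """Modified potential: the original value plus the boost."""
    return v + gamd_boost(v, e, k)
```

With `e = 0.0` and `k = 0.5`, a well at `v = -2.0` is raised by 1.0 to -1.0, while any state with `v >= 0.0` is unchanged. Raising the wells more than the barriers is what reduces the effective barrier height.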
PMID: 30371112 [PubMed - as supplied by publisher]