BenchSci Blog

20 Useful Bioinformatic Tools and Resources For Protein Science

Written by Mohamed Helmy, Ph.D. | Jan 25, 2018 2:32:15 PM

In my previous article, I discussed several bioinformatic resources to help with your antibody selection. In this article, I will introduce additional tools and resources for protein science to help you identify the function of the proteins under study. I will categorize these resources by the types of study they can be used for, which include protein interactions and gene regulation, pathways, and post-translational modifications.

Protein Interactions and Gene Regulations

Protein-protein interaction (PPI), the physical contact between proteins, is one of the most commonly studied characteristics of a given protein. Studying PPIs is essential for understanding the role of a particular protein in the cell, since proteins rarely act alone. There are different ways of studying PPIs and, therefore, different types of resources from which the interactions of a given protein can be obtained.

Similar to PPI networks, gene interaction networks (or gene regulatory networks) are also very useful for investigating the function of your gene of interest and for understanding the events that lead to and govern the expression of the mRNA encoding the protein under study.

The major resources for PPI listed here are meta databases, meaning that they host data from different sources and for different organisms:

  • The STRING database is a huge PPI resource with about 1.4 billion interactions between 9.6 million proteins in over 2,000 organisms.
  • IntAct provides literature-curated or independently submitted protein interactions for 104,510 interactors, together with analysis tools for molecular interaction data. IntAct also provides the Complex Portal, a manually curated resource for macromolecular complexes from a number of key model organisms.
  • The Comprehensive Resource of Mammalian Protein Complexes (CORUM) is another resource for protein complex information.
  • The new Human Reference Protein Interactome (HuRI) Project web portal provides pre-publication access to the interaction data of the HuRI project.
  • For gene interactions, the GeneMANIA database is a great resource that hosts ~600 million interactions between 163,599 genes in 9 organisms.
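
Most of these resources let you export interactions as a list of protein pairs with a confidence score. As a rough illustration of how such an export can be explored, the sketch below builds a small undirected network and looks up the direct partners of one protein. The identifiers, the scores, the 0.7 cutoff, and the function names are all invented for this example and do not come from any of the databases above.

```python
from collections import defaultdict

# Toy edge list: (protein_a, protein_b, confidence score in [0, 1]).
# Identifiers and scores are invented for illustration only.
TOY_EDGES = [
    ("P1", "P2", 0.95),
    ("P1", "P3", 0.40),
    ("P2", "P3", 0.80),
    ("P3", "P4", 0.72),
]


def build_network(edges, min_score=0.7):
    """Return an undirected adjacency map keeping edges with score >= min_score."""
    network = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def partners(network, protein):
    """Return the sorted direct interaction partners of a protein."""
    return sorted(network.get(protein, set()))


network = build_network(TOY_EDGES)
print(partners(network, "P3"))  # direct partners of P3 at the 0.7 cutoff
```

Raising or lowering `min_score` trades coverage for reliability: a stricter cutoff keeps fewer, better-supported interactions.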

Pathways

Another common analysis for investigating the role of genes and proteins in cellular processes is pathway enrichment analysis. It aims to identify the genetic and signalling pathways in which a gene or a protein is involved. Its main purpose is to reduce the dimensionality of the data by grouping genes and proteins into pathways, so that the results can be interpreted in the context of biological processes, pathways, and networks. Several tools can be used to identify the pathways that your gene or protein is part of:

  • The most commonly used tools are the g:Profiler and DAVID web servers.
  • Information about a particular pathway can also be obtained from pathway databases such as Reactome, KEGG, and Pathway Commons.
  • Finally, the best way to visualize pathway enrichment results is to use Cytoscape, a platform for network data integration, analysis, and visualization, together with its EnrichmentMap plugin.
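
To give a feel for what "enrichment" means statistically, here is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. It asks how likely it is to see at least the observed overlap between your gene list and a pathway by chance. This is only the textbook idea: the tools listed above apply their own statistics and multiple-testing corrections, and the function name and all numbers in the example are assumptions made for illustration.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_query, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: number of genes in the background set
    n_pathway:    number of background genes annotated to the pathway
    n_query:      number of genes in your list
    n_overlap:    number of genes in your list that are in the pathway
    """
    max_overlap = min(n_pathway, n_query)
    total = comb(n_background, n_query)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Invented numbers: 20,000 background genes, a 100-gene pathway,
# a 50-gene query list, 5 of which fall in the pathway.
print(enrichment_p_value(20000, 100, 50, 5))
```

In practice this test is run once per pathway, so the resulting p-values need a multiple-testing correction before they are interpreted.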

Post-translational Modifications

Post-translational modifications (PTMs) are enzymatic modifications of proteins that occur after protein biosynthesis. They are essential for cell signalling, and several disease states are attributed to changes in the PTMs of specific proteins. Several public resources provide PTM information for a protein or a set of proteins:

  • UniProt provides a dedicated section for PTM and processing.
  • The bioinformatics resource portal, ExPASy, also contains a database for PTMs.
  • PhosphoSitePlus is a resource providing comprehensive information and tools for the study of protein PTMs for over 50,000 proteins with ~500,000 identified PTMs.
  • The PTM Structural Database (PTM-SD) provides access to proteins for which PTMs are both experimentally annotated and structurally resolved.

There are also tools and web servers for investigating the impact of PTMs on cellular processes and their relationship to disease:

  • MIMP is a web server for predicting the impact of mutations on kinase-substrate phosphorylation.
  • ActiveDriverDB is an online database developed to visualize and explore mutations affecting PTM sites in human proteins.
  • BioTools has 36 more resources and tools for PTM data acquisition, analysis and visualization.
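
A simple first step before turning to such tools is to check whether any of your mutations fall close to a known PTM site. The sketch below does this for one protein, using residue positions only. It is not how MIMP or ActiveDriverDB work internally: the function name, the positions, and the default window of 7 flanking residues are assumptions chosen for illustration.

```python
def mutations_near_ptm_sites(ptm_positions, mutation_positions, flank=7):
    """Map each mutation position to the PTM sites within `flank` residues.

    Positions are 1-based residue indices on the same protein sequence.
    Mutations with no PTM site in range are left out of the result.
    """
    hits = {}
    for mut in mutation_positions:
        nearby = [site for site in sorted(ptm_positions) if abs(site - mut) <= flank]
        if nearby:
            hits[mut] = nearby
    return hits


# Toy residue positions, invented for illustration only.
ptm_sites = [15, 46, 120]
mutations = [10, 47, 80]
print(mutations_near_ptm_sites(ptm_sites, mutations))
```

Mutations flagged this way are only candidates; whether they actually change a modification needs a dedicated predictor or experimental follow-up.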

In my next article, I will introduce resources for investigating the relationship between your protein of interest and various diseases, biomarkers, and drug targets. Until then, check out our own bioinformatics tool, which lets you quickly and easily find published data for a given antibody to support your antibody selection.