Towards a better understanding of Protein-Protein Interaction Networks

Date

2014-12-23

Authors

Gutiérrez-Bunster, Tatiana A.

Abstract

Proteins participate in the majority of cellular processes. To determine the function of a protein, it is not sufficient to know only its sequence, its structure in isolation, or how it works individually. We also need to know how the protein interacts with other proteins in biological networks, because most proteins perform their main function through interactions. This thesis sets out to improve the understanding of protein-protein interaction networks (PPINs). To this end, we propose three approaches: (1) Studying measures and methods used in social and complex networks. The methods, measures, and properties of social networks allow us to gain an understanding of PPINs by comparing different families of networks. We studied models that describe social networks to see which of them are useful in describing biological networks. We investigate the similarities and differences in terms of the network community profile and centrality measures. (2) Studying PPINs and their role in evolution. We are interested in the relationship between PPINs and the evolutionary changes between species. We investigate whether centrality measures are correlated with the variability and similarity of orthologous proteins. (3) Studying protein features that are important for evaluating, classifying, and predicting interactions. Interactions can be classified according to different characteristics. One characteristic is the energy (that is, the attraction or repulsion of the molecules) that occurs in interacting proteins. We identify which types of energy values contribute most to predicting protein-protein interactions (PPIs). We argue that the number of energetic features and their contribution to the interactions can be a key factor in predicting transient and permanent interactions.

Contributions of this thesis include: (1) We identified the best community sizes in PPINs. This finding will help to identify important groups of interacting proteins in order to better understand their particular interactions. We furthermore find that the generative model describing biological networks is very different from the model describing social networks. (A generative model is a model for randomly generating observable data.) We showed that the best community size for PPINs is around ten, in contrast to the best community size for social and complex networks (around 100). We revealed differences in terms of the network community profile and correlations of centrality measures. (2) We outline a method to test the correlation of centrality measures with the percentage of sequence similarity and the evolutionary rate of orthologous proteins. We conjectured that a strong correlation exists, but did not obtain positive results for our data. Therefore, (3) we investigate a method to discriminate energetic features of protein interactions, which in turn will improve the PPIN data. The use of multiple data sets makes it possible to identify the energy values that are useful for classifying interactions. For each data set, we applied Random Forest and Support Vector Machine classifiers, the latter with linear, polynomial, radial, and sigmoid kernels. The accuracy obtained in this analysis reinforces the idea that energetic features in the protein interface help to discriminate between transient and permanent interactions.
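
As a rough illustration of what "correlations of centrality measures" can mean in practice, the following minimal sketch computes degree and closeness centrality on a tiny, made-up interaction network and then correlates the two. It is not the procedure used in the thesis: the network `TOY_PPIN`, the protein labels, the choice of these two centrality measures, and the use of the Pearson coefficient are all assumptions made for the example.

```python
from collections import deque
from math import sqrt

# Hypothetical toy network (not thesis data): nodes stand in for proteins,
# edges for interactions. The graph is undirected and connected.
TOY_PPIN = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest-path distances from each node.

    Assumes a connected graph, so every node is reachable from every other.
    """
    n = len(graph)
    result = {}
    for source in graph:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (n - 1) / total if total else 0.0
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


nodes = sorted(TOY_PPIN)
deg = degree_centrality(TOY_PPIN)
clo = closeness_centrality(TOY_PPIN)
r = pearson([deg[p] for p in nodes], [clo[p] for p in nodes])
print(f"Pearson r between degree and closeness centrality: {r:.3f}")
```

The same `pearson` helper could, under the same assumptions, be applied to a centrality vector and a per-protein vector of sequence similarity or evolutionary rate, which is the kind of comparison the second contribution describes.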

Keywords

Protein-Protein Interaction Networks, Data mining, Social network analysis, Protein-protein interaction, Bioinformatics
