Mashup

Compact Integration of Multi-Network Topology for Functional Analysis of Genes
Computation and Biology Group, MIT CSAIL

[Figure: Illustration of Mashup]

Mashup extracts a compact vector representation of each gene or protein that captures its topological patterns across multiple heterogeneous interaction networks. These vector representations, one per gene/protein, can then be plugged directly into off-the-shelf machine learning methods to derive functional insights about genes or proteins.
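As a rough illustration of what "plugging in" the vectors can look like, the sketch below ranks genes by cosine similarity to a query gene. This nearest-neighbor lookup is a simple stand-in chosen for brevity, not the evaluation method used in the publication, and the gene names and two-dimensional vectors are made up for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_genes(
    query: str, vectors: Dict[str, List[float]], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k genes whose vectors are most similar to the query gene's vector."""
    query_vec = vectors[query]
    scored = [
        (gene, cosine_similarity(query_vec, vec))
        for gene, vec in vectors.items()
        if gene != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data; real Mashup vectors have many more dimensions.
toy_vectors = {
    "GENE_A": [1.0, 0.0],
    "GENE_B": [0.9, 0.1],
    "GENE_C": [0.0, 1.0],
}
print(nearest_genes("GENE_A", toy_vectors, k=1))
```

The same dictionary of vectors could instead be passed as a feature matrix to any standard classifier; the similarity ranking is only the smallest self-contained example.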

Reference

Compact Integration of Multi-Network Topology for Functional Analysis of Genes
Hyunghoon Cho, Bonnie Berger, Jian Peng
Cell Systems 3 (6), 2016 [Link]

A previous version of this work appeared in:

Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks
Hyunghoon Cho, Bonnie Berger, Jian Peng
International Conference on Research in Computational Molecular Biology (pp. 62-64), 2015 [Link]

Data and code

A MATLAB implementation of Mashup, example data sets (human and yeast), and evaluation code for gene function prediction using Mashup representations can be downloaded from [here].

Pre-trained vectors

The following text files contain pre-trained Mashup representations of genes in human and several model organisms. Each row contains the feature vector of a gene. The corresponding list of gene names (one per row) is also provided.
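A minimal sketch of reading such a pair of files into a gene-to-vector mapping is given below. It assumes the values in each row are separated by whitespace (spaces or tabs) and that row i of the gene list names the gene in row i of the vector file; neither detail is stated above, so check them against the downloaded files. The function names are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_vectors(
    vector_lines: Iterable[str], name_lines: Iterable[str]
) -> Dict[str, List[float]]:
    """Pair each gene name with the feature vector found on the same row."""
    # Skip blank lines so a trailing newline at the end of a file is harmless.
    names = [line.strip() for line in name_lines if line.strip()]
    # Assumption: values are whitespace-delimited; str.split() handles spaces and tabs.
    vectors = [[float(x) for x in line.split()] for line in vector_lines if line.strip()]

    if len(names) != len(vectors):
        raise ValueError(
            f"{len(names)} gene names but {len(vectors)} vectors; files do not line up"
        )
    if len({len(vec) for vec in vectors}) > 1:
        raise ValueError("rows have differing numbers of dimensions")
    return dict(zip(names, vectors))


def load_vectors(vector_path: str, names_path: str) -> Dict[str, List[float]]:
    """Read a vector file and its gene list from disk (paths supplied by the caller)."""
    with open(vector_path) as vector_file, open(names_path) as names_file:
        return parse_vectors(vector_file, names_file)
```

The row-count check is there because a silent misalignment between the two files would assign vectors to the wrong genes without raising any other error.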

Used in our publication:

Other organisms:

Note: For these four organisms, we used the default setting of 1000 dimensions. Although the downstream performance of our framework is quite robust to this parameter, a different number of dimensions may be more suitable for your application.