Princeton University
Lewis-Sigler Institute for Integrative Genomics

Princeton Center for Quantitative Biology Tools and Resources

The large amount of experimental data generated in Center labs is stored in the PUMAdb, part of the Microarray Core resource. The PUMAdb is a critical resource for archiving, analyzing, and publishing our experimental results in their entirety. Once published, all the data are made public through the PUMAdb.

Having the data in their entirety across different experimental conditions is crucial as we make progress toward our goal of generating accurate cellular models. Of course, we are not the only ones producing these types of data. Thus, there are efforts within the Center to collect data from external sources as well. One such effort collected over 130 expression data sets; an algorithm was developed to query the data set collection for those experiments most informative for a particular set of query genes. The SPELL tool provides a web interface to this collection and analysis method.
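The query-driven idea described above can be illustrated with a minimal sketch. It is not the SPELL implementation: the scoring scheme (weighting each data set by the mean pairwise Pearson correlation of the query genes, then ranking other genes by their weighted correlation to the query) is a simplifying assumption, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dataset_weight(dataset, query):
    """Mean pairwise correlation among the query genes in one data set, floored at 0."""
    present = [g for g in query if g in dataset]
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    mean_r = sum(pearson(dataset[a], dataset[b]) for a, b in pairs) / len(pairs)
    return max(mean_r, 0.0)


def rank_genes(datasets, query):
    """Return (weights, ranked); ranked is a list of (gene, score), best first."""
    weights = {name: dataset_weight(d, query) for name, d in datasets.items()}
    totals, norm = {}, {}
    for name, d in datasets.items():
        w = weights[name]
        present = [q for q in query if q in d]
        if w == 0.0 or not present:
            continue  # uninformative data sets contribute nothing
        for gene, profile in d.items():
            if gene in query:
                continue
            # Average correlation of this gene to the query genes in this data set.
            r = sum(pearson(profile, d[q]) for q in present) / len(present)
            totals[gene] = totals.get(gene, 0.0) + w * r
            norm[gene] = norm.get(gene, 0.0) + w
    scores = {g: totals[g] / norm[g] for g in totals}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return weights, ranked


# Hypothetical toy compendium: data set name -> gene -> expression profile.
toy = {
    "heat_shock": {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [2.0, 4.0, 6.0, 8.1],
        "C": [1.1, 2.1, 2.9, 4.2],
        "D": [4.0, 1.0, 3.0, 2.0],
    },
    "cell_cycle": {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [4.0, 3.0, 2.0, 1.0],
        "C": [2.0, 1.0, 2.0, 1.0],
        "D": [1.0, 2.0, 3.0, 4.0],
    },
}
weights, ranked = rank_genes(toy, ["A", "B"])
```

In this toy example the query genes A and B are co-expressed only in the first data set, so the second data set receives zero weight and gene C, which tracks the query in the informative data set, ranks first.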

Externally generated data are combined with our experimental results and used in data integration methods such as those employed by the Troyanskaya lab. For example, the Troyanskaya lab has developed bioPIXIE, a system to discover interaction networks and pathways using a Bayesian approach.

Researchers at the Center for Quantitative Biology have generated several other public tools and resources for data analysis and visualization. These include:

ChARM : Chromosomal Aberration Region Miner : Chromosomal aberration detection tool.
FIRE : FIRE is a motif discovery and characterization program based on mutual information.
GOLEM : GOLEM is a useful tool that allows the viewer to navigate and explore a local portion of the Gene Ontology (GO) hierarchy. Users can also load annotations for various organisms into the ontology in order to search for particular genes, to limit the display to GO terms relevant to a particular organism, or to quickly search for GO terms enriched in a set of query genes.
GRIFn : GRIFn is a novel system for interactive evaluation of functional genomic data and methods. It allows you to upload your own data, view evaluations in multiple contexts, and compare it with other published high throughput data.
Generic Gene Ontology (GO) Term Finder : This generic ("multi-organism") GO Term Finder web tool finds significant GO terms shared among a list of genes from your organism of choice, helping you discover what these genes may have in common. The implementation of this Generic GO Term Finder depends on the GO-TermFinder software written by Gavin Sherlock and Shuai Weng at Stanford University, made publicly available through the GMOD project.
Generic Gene Ontology (GO) Term Mapper : This generic ("multi-organism") GO Term Mapper web tool maps the granular GO annotations for genes in a list to a set of GO slim terms, allowing you to bin your genes into broad categories. The implementation of this Generic GO Term Mapper uses the map2slim.pl script written by Chris Mungall at the Berkeley Drosophila Genome Project, and some of the modules included in the GO-TermFinder distribution written by Gavin Sherlock and Shuai Weng at Stanford University, made publicly available through the GMOD project.
Inquiry Bioinformatics Suite : The Inquiry Bioinformatics Suite provides commonly used bioinformatics tools such as the EMBOSS software suite, BLAST, and HMMer.
MEFIT : Microarray Experiment Functional Integration Technology : Given any amount of microarray data, MEFIT predicts the probability of a pairwise functional relationship for any gene pair within individual biological functions.
Nearest Neighbor Networks (NNN) : Nearest Neighbor Networks (NNN) is a graph-based algorithm used to cluster genes with similar microarray expression profiles. The NNN clustering method is an alternative to classical techniques such as hierarchical and K-means clustering. NNN generates clusters of functionally related genes with high precision, and the clusters generally represent a broader selection of biological processes than those produced by other methods. NNN performs best on data sets with many conditions and on data sets that are modular (i.e., contain several grouped subsets of conditions). The NNN algorithm is described in Huttenhower et al. 2007 (http://www.biomedcentral.com/1471-2105/8/250) and was developed in the Troyanskaya and Coller labs. The web tool was implemented by Juan Alvarez in the Bioinformatics group at Princeton.
P-POD : Princeton Protein Orthology Database : P-POD displays families of predicted orthologs from P. falciparum, H. sapiens, D. melanogaster, M. musculus, A. thaliana, C. elegans, D. rerio, and S. cerevisiae with an emphasis on providing information about disease-related genes and experimental confirmation of orthology from the literature.
PUMAdb publications page : Published microarray data and web supplements from Princeton researchers
Princeton University Microarray Database (PUMAdb) : The Princeton University MicroArray database (PUMAdb) stores raw and normalized data from microarray experiments, as well as their corresponding image files. In addition, PUMAdb provides interfaces for data retrieval, analysis and visualization. Princeton researchers and their collaborators should register for a database account.
SPELL : Serial Pattern of Expression Levels Locator : SPELL (Serial Pattern of Expression Levels Locator) is a query-driven search engine for large gene expression microarray compendia. Given a small set of query genes, SPELL identifies which datasets are most informative for these genes, then identifies additional genes within those datasets whose expression profiles are most similar to the query set.
Virus Infection Project : The Virus Infection Project (VIP) is a web tool that provides a way to look at information about transcripts during CMV infections.
Yeast Functional Genomics Database (YFGdb) : The goal of YFGdb is to collect and freely disseminate all available yeast functional genomics data, along with requisite analysis tools, to the yeast community and the biomedical research community at large. YFGdb contains data sets from microarray as well as many other genomics/proteomics studies including large-scale interaction and phenotype experiments. YFGdb has been implemented using the Generic Model Organism Database Construction Set as part of the GMOD project.
bioPIXIE : bioPIXIE is a novel system for biological data integration and visualization. It allows you to discover interaction networks and pathways in which your gene(s) of interest participate.
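
Several of the tools listed above look for GO terms that are shared by, or enriched in, a list of genes. One common way to score such enrichment is a one-sided hypergeometric test, sketched below. This is an illustration under stated assumptions, not the code behind any of the listed tools: the function names, term labels, and gene identifiers are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enriched_terms(query, annotations, background):
    """Return [(term, overlap, p_value)] sorted by p-value, smallest first.

    annotations maps a term label to the set of genes annotated to it.
    """
    background = set(background)
    query = set(query) & background
    N, n = len(background), len(query)
    results = []
    for term, genes in annotations.items():
        genes = set(genes) & background
        k = len(query & genes)
        if k == 0:
            continue  # terms with no overlap cannot be enriched
        results.append((term, k, hypergeom_pvalue(k, n, len(genes), N)))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 20 background genes and two made-up terms.
background = [f"g{i}" for i in range(20)]
annotations = {
    "term_x": {"g0", "g1", "g2", "g3"},
    "term_y": {f"g{i}" for i in range(10, 20)},
}
results = enriched_terms(["g0", "g1", "g2", "g10"], annotations, background)
for term, overlap, p in results:
    print(f"{term}\toverlap={overlap}\tp={p:.4f}")
```

Here three of the four query genes carry term_x, which annotates only four of the twenty background genes, so term_x receives a much smaller p-value than term_y. In practice the p-values would also need correction for the number of terms tested.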

With our data collection infrastructure and genomics data integration methods in hand, we have been expanding the types of data that we collect (for example, images and metabolite data) so that we and others, since we make all data and requisite tools available to all, can build richer models of biological processes.

Copyright © 2006-2007 Lewis-Sigler Institute for Integrative Genomics.
Princeton University. All Rights Reserved.