Science for Health
20 June 2011
NIMR scientists, in collaboration with computer scientists from Royal Holloway, University of London, have developed a new method to create visual maps and find patterns of gene activity in genome data.
Measuring gene activity plays a central role in biological knowledge discovery and molecular diagnostics. In recent years, techniques have been developed that can measure the activity of all the genes in a genome simultaneously. These so-called high-throughput genomic techniques, such as microarray analysis and next-generation sequencing, produce large amounts of data. For example, the human genome has approximately 25,000 genes, and a typical experiment might measure the activity of these genes in 10 or more different samples. The analysis of these ‘gene expression’ experiments is challenging because of the scale and complexity of the data.
Often the aim of a study is to find patterns in the data and to identify sets of genes that behave similarly across many samples. The volume of data and the limitations of current analysis methods make this difficult. To address this deficiency, Natascha Bushati, working in the lab of James Briscoe (pictured) at NIMR, collaborated with Chris Watkins, a computer scientist at Royal Holloway, to develop new software tools. Taking advantage of the latest methods from computer science and machine learning, they developed a way to map and visualize gene expression data. The software they produced creates an interactive two-dimensional map in which each gene is represented by a point, and genes with similar expression patterns are located close together. This map can be used to study and explore gene expression data in much the same way that an ordinary map is used to survey a country.
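The idea of placing genes with similar expression patterns close together on a two-dimensional map can be illustrated with a minimal sketch. This is not the visgenex algorithm, which is not described in this article. As a stand-in, it uses principal component analysis, computed by simple power iteration, to project each gene's expression profile onto two axes. The function names and the four toy gene profiles are invented for illustration only.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def standardize(profile):
    """Centre a gene's profile and scale it to unit length,
    so that only the shape of the pattern matters, not its magnitude."""
    mean = sum(profile) / len(profile)
    centred = [x - mean for x in profile]
    norm = math.sqrt(dot(centred, centred)) or 1.0
    return [x / norm for x in centred]


def leading_axis(rows, n_iter=200, seed=0):
    """Estimate the direction of greatest variation in `rows`
    by power iteration on X^T X (a stand-in technique, see lead-in)."""
    rng = random.Random(seed)
    axis = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    for _ in range(n_iter):
        scores = [dot(row, axis) for row in rows]
        update = [sum(s * row[j] for s, row in zip(scores, rows))
                  for j in range(len(axis))]
        norm = math.sqrt(dot(update, update))
        if norm == 0.0:
            break
        axis = [x / norm for x in update]
    return axis


def map_genes(expression):
    """Return one (x, y) point per gene; genes with similar
    expression patterns receive nearby coordinates."""
    rows = [standardize(p) for p in expression]
    n_samples = len(rows[0])
    # Centre each sample (column) across genes before finding the axes.
    col_means = [sum(r[j] for r in rows) / len(rows) for j in range(n_samples)]
    rows = [[r[j] - col_means[j] for j in range(n_samples)] for r in rows]

    axis1 = leading_axis(rows, seed=0)
    x = [dot(r, axis1) for r in rows]
    # Remove the first axis, then find the next strongest direction.
    residual = [[r[j] - xi * axis1[j] for j in range(n_samples)]
                for r, xi in zip(rows, x)]
    axis2 = leading_axis(residual, seed=1)
    y = [dot(r, axis2) for r in residual]
    return list(zip(x, y))


# Hypothetical activity of four genes measured in six samples.
profiles = {
    "gene_a": [1, 2, 3, 4, 5, 6],       # steadily rising
    "gene_b": [2, 4, 6, 8, 10, 12.5],   # rising, same shape as gene_a
    "gene_c": [6, 5, 4, 3, 2, 1],       # steadily falling
    "gene_d": [0, 1, 0, -1, 0, 1],      # oscillating
}
coords = map_genes(list(profiles.values()))
for name, (px, py) in zip(profiles, coords):
    print(f"{name}: ({px:+.2f}, {py:+.2f})")
```

In this toy example the two rising genes land almost on top of each other, while the falling gene is placed far from both. That is the property that makes such a map useful for spotting groups of genes that behave alike.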
A map of gene activity from yeast cells reveals cyclic gene expression over time (T0-T36). Different genes cycle with different phases, but genes with similar phases are grouped together in the plot. The cyclic phases of a small number of genes are highlighted in different colours.
The method provides an intuitive and flexible way to interact with and analyze gene expression data. It reveals the local relationship between pairs of genes and at the same time provides a global view of all the genes. The widespread use of gene expression data in biomedical research means that this new method is widely applicable. The visgenex software is freely available on NIMR's website.
As a biologist, I found working with computer scientists an exciting challenge. We are pleased with the method we developed as it is a significant improvement over what is currently available, and we are confident that it will help provide new insight into the causes of disease and ill health.
Natascha Bushati
The software we have developed brings together the latest algorithms from computer science with cutting-edge techniques in biological research. The explosion of data being generated by the genomic revolution in biology means that interdisciplinary collaborations are increasingly important in order to analyse and understand the results of biological experiments. We intend to continue working with our computer science colleagues to extend this method and develop further tools to support the analysis of genomic data.
James Briscoe
An intuitive graphical visualization technique for the interrogation of transcriptome data.
Natascha Bushati, James Smith, James Briscoe and Christopher Watkins (2011).
Nucleic Acids Research epub ahead of print.
© MRC National Institute for Medical Research
The Ridgeway, Mill Hill, London NW7 1AA