The general principles of how genes are activated and inactivated are known, but our knowledge from a genome-wide perspective is still very limited. Each cell type has a unique set of active genes, regulated by a subset of the approximately 2000 transcription factors and by other nuclear proteins that bind DNA.
Until recently, gene regulation could only be studied in vitro and for parts of genes. We use chromatin immunoprecipitation (ChIP) to study it in vivo. For detection we have developed efficient massively parallel sequencing (ChIP-seq) techniques, which allow us to interrogate the whole genome.
The traditional view of a gene, with a single beginning and end, has been challenged: in addition to the previously known enhancers and other distant regulatory elements, multiple promoters and complex alternative splicing have been found. We therefore annotate all identified DNA-protein interactions relative to everything that is known about the genome.
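As a rough illustration of what annotating a DNA-protein interaction relative to known genome features can involve, the Python sketch below assigns a binding site to its nearest transcription start site (TSS) and labels it as promoter-proximal or distal. It is a minimal sketch and not the group's actual pipeline: the toy gene table, the function names `build_index` and `annotate_peak`, and the 500 bp promoter window are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical toy annotation: (chromosome, TSS position, strand, gene name).
GENES = [
    ("chr1", 1000, "+", "geneA"),
    ("chr1", 5000, "-", "geneB"),
    ("chr2", 2000, "+", "geneC"),
]


def build_index(genes):
    """Group TSS entries by chromosome and sort them by position."""
    index = {}
    for chrom, tss, strand, name in genes:
        index.setdefault(chrom, []).append((tss, strand, name))
    for entries in index.values():
        entries.sort()
    return index


def annotate_peak(index, chrom, summit, promoter_window=500):
    """Annotate a binding-site summit relative to its nearest TSS.

    Returns None if the chromosome has no annotated genes. The offset is
    signed relative to the gene's strand: negative means upstream of the
    TSS, positive means downstream. The promoter window is an assumed
    cutoff, not a value taken from the text.
    """
    entries = index.get(chrom)
    if not entries:
        return None
    positions = [entry[0] for entry in entries]
    i = bisect_left(positions, summit)
    # Only the neighbours on either side of the summit can be the nearest TSS.
    candidates = entries[max(i - 1, 0): i + 1]
    tss, strand, name = min(candidates, key=lambda e: abs(e[0] - summit))
    offset = summit - tss if strand == "+" else tss - summit
    label = "promoter" if abs(offset) <= promoter_window else "distal"
    return {"gene": name, "offset": offset, "class": label}


if __name__ == "__main__":
    idx = build_index(GENES)
    print(annotate_peak(idx, "chr1", 1200))
    print(annotate_peak(idx, "chr1", 4000))
```

A real annotation would draw on a full gene catalogue and on additional feature types such as enhancers, alternative promoters and splice sites; the nearest-TSS rule is only the simplest possible starting point.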
Revealing mechanisms behind diseases
These studies generate massive amounts of data. To explore the information fully, we develop new informatics strategies and collaborate with specialists in the field. The methods can be used to reveal the mechanisms behind common diseases and cancer. We have started to explore this in liver cells and immune cells and have found hundreds of regulatory variants that likely explain associations with common metabolic and autoimmune diseases. We have also characterized a large collection of regulatory variants that are strong candidates for contributing to cancer.