About the ENCODE Data Coordination Center (DCC)

The Encyclopedia of DNA Elements (ENCODE) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of the consortium is to build a comprehensive parts list of the functional elements of the human genome, including elements that act at the protein level (coding genes) and RNA level (non-coding genes), and regulatory elements that control the cells and circumstances in which a gene is active. The discovery and annotation of gene elements is accomplished primarily by sequencing RNA from a diverse range of sources, comparative genomics, integrative bioinformatic methods, and human curation. Regulatory elements are typically investigated through DNase hypersensitivity assays, assays of DNA methylation, and chromatin immunoprecipitation (ChIP) of proteins that interact with DNA, including modified histones and transcription factors, followed by sequencing (ChIP-Seq). The results of ENCODE experiments, collected in the ENCODE DCC database, are displayed on the UCSC Genome Browser. The data can also be downloaded from the ENCODE DCC website in text format.

ENCODE data are now available for the entire human genome. Mouse ENCODE experiments are currently underway, and data on the mouse genome from these experiments will be made available as soon as possible. To access the human ENCODE data, open the Genome Browser, select the March 2006 assembly of the human genome, and go to your region of interest. ENCODE tracks are marked with the NHGRI logo. The bulk of the ENCODE data can be found in the Expression and Regulation track groups, with a few tracks in the Mapping, Genes, and Variation groups. Although most participating research groups have provided several tracks, generally only selected data from each research group are displayed by default. Click the hyperlinked name of a particular track to display a page containing configuration options and details about the methods used to generate the data. See the Genome Browser User's Guide for further information about displaying tracks and navigating in the Genome Browser. To receive notifications of ENCODE data releases and related news by email, subscribe to the encode-announce mailing list.

Data from the earlier ENCODE project pilot phase, which covered approximately 1% of the genome, are available on the May 2004 and March 2006 human assemblies. The ENCODE Pilot Project web pages provide convenient browser access to these regions.

Before publishing research that uses ENCODE data, please read the data release policy, which places some restrictions on the use of data in publications for nine months following the data release.


16 November 2010 - Release of the first ENCODE RNA-seq data on hg19:

We are pleased to announce the release of the first ENCODE RNA-seq data on the GRCh37/hg19 human browser. Two tracks have just been released; these are organized in the new ENC RNA-seq super-track within the browser 'Expression' track group. The super-track provides additional documentation and organization by data type. The two tracks released are:

CSHL Sm RNA-seq: This track depicts next-generation sequencing data for RNAs 20-200 nt in length, isolated from RNA samples from tissues or subcellular compartments of ENCODE cell lines.

GIS RNA-seq: This track shows high-throughput sequencing of RNA samples from tissues or subcellular compartments of cell lines included in the ENCODE Transcriptome subproject.

All of the data that comprise these tracks were originally released on hg18 and have been remapped to hg19.

15 November 2010 - New ENCODE Tutorial at OpenHelix: OpenHelix, together with the UCSC Genome Bioinformatics group, announces a new online tutorial suite to teach users how to access the ENCODE data in the UCSC Genome Browser. Read more.

16 September 2010 - First Production ENCODE Data on hg19 Released: We are pleased to announce the release of the first sets of production ENCODE data on hg19. Read more.

