Eukaryotic Promoter Database


The Eukaryotic Promoter Database (EPD) is a biological database and web resource of eukaryotic RNA polymerase II promoters with experimentally defined transcription start sites. Originally, EPD was a manually curated resource that relied on transcript mapping experiments targeted at individual genes and published in academic journals. More recently, automatically generated promoter collections, derived from electronically distributed high-throughput data produced with the CAGE or TSS-Seq protocols, were added in a dedicated subsection named EPDnew. The EPD web server offers additional services, including an entry viewer that lets users explore the genomic context of a promoter in a UCSC Genome Browser window, and direct links for uploading EPD-derived promoter subsets to associated web-based promoter analysis tools. EPD also provides a collection of position weight matrices for common promoter sequence motifs.

History and Impact

EPD was created in 1986 as an electronic version of a eukaryotic promoter compilation published as a journal article, and it has been updated regularly since then. The database was initially distributed on magnetic tapes as part of the EMBL data library and later via the Internet. The collaboration between EPD and the EMBL data library was cited as a pioneering example of remote nucleotide sequence annotation by domain experts. EPD has played an instrumental role in the development and evaluation of promoter prediction algorithms because it is widely regarded as the most accurate promoter resource. As of November 2014, it had been cited about 2,500 times in the scientific literature. EPD is also covered extensively in textbooks on bioinformatics and systems biology.