DisProt


In molecular biology, DisProt is a curated biological database of intrinsically unstructured proteins. It is a community resource that annotates protein sequences with intrinsically disordered regions reported in the literature. DisProt classifies intrinsic disorder according to the experimental methods used to detect it and three ontologies covering molecular function, transition and binding partners.
Historically, the study of disordered proteins has been hampered by the lack of an organised resource collecting them and their properties together. Release 7 of DisProt contains information on more than 800 proteins. Each protein entry in DisProt is characterised by a DisProt identifier, which takes the form of the prefix DP followed by a five-digit protein identifier. For example, DP00016 refers to the Cyclin-dependent kinase inhibitor 1 protein. Release 8 of DisProt contains more than 1400 non-ambiguous entries and over 3000 disordered protein regions. DisProt 8 also introduced the concept of a stable DisProt region identifier. DisProt has been widely used to train software methods that predict disordered regions in proteins. In addition, DisProt has been used to understand the properties of intrinsically unstructured proteins.
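The protein identifier format described above (the prefix DP followed by five digits) can be checked mechanically. The following is a minimal Python sketch for illustration only. The names `is_disprot_id` and `format_disprot_id` are hypothetical helpers and not part of any DisProt software. The format of the region identifiers introduced in DisProt 8 is not given here, so the sketch does not cover them.

```python
import re

# Pattern for a DisProt protein identifier as described in the text:
# the literal prefix "DP" followed by exactly five ASCII digits.
DISPROT_ID_PATTERN = re.compile(r"DP[0-9]{5}")


def is_disprot_id(text: str) -> bool:
    """Return True if the whole string has the form DP + five digits."""
    return DISPROT_ID_PATTERN.fullmatch(text) is not None


def format_disprot_id(number: int) -> str:
    """Build an identifier string from its numeric part, zero-padded to five digits."""
    if not 0 <= number <= 99999:
        raise ValueError("numeric part must fit in five digits")
    return f"DP{number:05d}"


if __name__ == "__main__":
    print(is_disprot_id("DP00016"))
    print(format_disprot_id(16))
```

This checks only the shape of an identifier. It does not confirm that an entry with that identifier exists in the database.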

Website

The DisProt website provides users with an interface to search by keyword, by free text, or by sequence similarity using BLAST. Users can also browse through the entries by their identifier, detection method or PubMed ID. The entire data set can be downloaded from the website in either CSV or JSON format.
The DisProt web server exposes RESTful endpoints that allow programmatic access to DisProt and retrieval of different data types. The available GET routes return all data for a given DisProt ID, a list of entries of a given type, or a list of the functional terms used for DisProt annotation.
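As an illustration of the first kind of GET route (all data for a given DisProt ID), the following Python sketch builds a request URL and decodes a JSON response using only the standard library. The base URL `https://disprot.example/api` is a placeholder, and the route layout (the identifier appended to the base path) and the JSON response format are assumptions made for the example. The real routes should be taken from the DisProt documentation. The function names `entry_url` and `fetch_entry` are hypothetical.

```python
import json
import re
from urllib.request import urlopen

# Placeholder base URL: an assumption for illustration, not the real endpoint.
BASE_URL = "https://disprot.example/api"


def entry_url(disprot_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for one entry, assuming the route is <base>/<DisProt ID>."""
    if re.fullmatch(r"DP[0-9]{5}", disprot_id) is None:
        raise ValueError(f"not a DisProt identifier: {disprot_id!r}")
    return f"{base_url.rstrip('/')}/{disprot_id}"


def fetch_entry(disprot_id: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """Send a GET request and decode the body, assuming the server returns JSON."""
    with urlopen(entry_url(disprot_id, base_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL that would be requested; no network access is made here.
    print(entry_url("DP00016"))
```

Calling `fetch_entry("DP00016", base_url=...)` with the actual service address would perform the request. The structure of the returned data is not specified here and is left to the caller to inspect.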