Disease-gene associations mined from literature

The DISEASES resource is available for download:

Text mining channel: full filtered
Knowledge channel: full filtered
Experiments channel: full filtered
Integrated channel (experimental): full

The files contain all links in the DISEASES database. All files start with the following four columns: gene identifier, gene name, disease identifier, and disease name. The knowledge files further contain the source database, the evidence type, and the confidence score. The experiments files instead contain the source database, the source score, and the confidence score. Finally, the textmining files contain the z-score, the confidence score, and a URL to a viewer of the underlying abstracts.
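As an illustration of the column layout described above, here is a minimal Python sketch for reading a downloaded file. It assumes the files are tab-separated with no header row and that the confidence score is numeric; neither is stated above, so check both against the actual files. The column names, the `read_channel` function, and the sample row are hypothetical and exist only for this example.

```python
import csv
import io

# The four columns shared by every file, followed by channel-specific columns.
# These names are illustrative labels, not official field names.
COMMON = ["gene_id", "gene_name", "disease_id", "disease_name"]
CHANNEL_COLUMNS = {
    "knowledge": COMMON + ["source_db", "evidence_type", "confidence"],
    "experiments": COMMON + ["source_db", "source_score", "confidence"],
    "textmining": COMMON + ["z_score", "confidence", "url"],
}


def read_channel(handle, channel):
    """Yield one dict per row of an open file for the given channel.

    Assumes tab-separated values without a header line (an assumption).
    Rows whose column count does not match the channel are skipped.
    """
    columns = CHANNEL_COLUMNS[channel]
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(columns):
            continue
        record = dict(zip(columns, row))
        # Assumption: the confidence score parses as a floating-point number.
        record["confidence"] = float(record["confidence"])
        yield record


# Placeholder row, not a real record from the database.
sample = "GENE_ID_1\tGENE1\tDISEASE_ID_1\tExample disease\tExampleDB\tEXAMPLE\t4.0\n"
records = list(read_channel(io.StringIO(sample), "knowledge"))
```

For a real download, pass an open file handle instead of the `io.StringIO` object, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded file.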

Download files from earlier versions are archived on figshare.

The DISEASES tagger and the latest dictionary of human gene and disease names can also be downloaded for local installation on Unix platforms. We also provide a list of PubMed IDs of publications that were excluded because they originate from research paper mills.

Creative Commons License