KEGG annotation of ORFs
======

KEGG is a database of hierarchical biological functions.

- Website: https://www.genome.jp/kegg/
- Citation: Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic acids research. 2021 Jan 8;49(D1):D545-51.

KOfam release 2022-03-01 was used as the reference for annotation.

- Source: https://www.genome.jp/ftp/db/kofam/archives/2022-03-01/

KofamScan v1.3.0 was used to perform annotation. Command:

```
exec_annotation -f detail-tsv --no-report-unannotated -o output.tsv input.faa
```

KEGG release 102.0+ (2022-05-03) was used to construct high-level hierarchies of functional catalogs.

- Source: From the KEGG website using the REST API.

Statistics:

- Total number of matches: 22,571,932
- Number of annotated ORFs: 21,541,087
- Number of KOs ("K"): 10,237
- Number of modules ("M"): 395
- Number of pathways ("ko"): 408
- Number of compounds ("C"): 6,363 *
- Number of drugs ("D"): 109
- Number of glycans ("G"): 246
- Number of reactions ("R"): 4,515
- Number of rclasses ("RC"): 1,552
- Number of diseases ("H"): 527

\* Note: drugs and glycans are included in the compound mappings.

Database files:

- orf-to-ko.map: Mapping of ORFs to KEGG orthologies (KOs). KOs are the entrance to the KEGG system.
- For other files refer to "Typical filename patterns" of ../README.

Collapsing order:

```
        compound
            ^
   EC > reaction > rclass
     \ /    ^
ORF > KO -> module > class
      | \     v
      |  > pathway > class
       \     v
        -> disease
```

Notes:

- Three KOs: K16119, K16120, and K16121 are present in Kofam but obsolete in KEGG. They have no mapping.
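
To illustrate one step of the collapsing order (ORF > KO -> module), here is a minimal Python sketch. It is not the procedure used to build this database. The function name `collapse`, the in-memory dictionaries, and the toy identifiers are all hypothetical, and the rule for one-to-many mappings (each target receives the full count of its source) is an assumption; dividing the count among targets is an equally valid choice.

```python
from collections import defaultdict


def collapse(counts, mapping):
    """Sum source-level counts into target-level counts.

    counts:  dict of source ID -> count
    mapping: dict of source ID -> set of target IDs
    Sources absent from the mapping are dropped. A source mapped to
    several targets adds its full count to each (an assumption).
    """
    out = defaultdict(int)
    for source, n in counts.items():
        for target in mapping.get(source, ()):
            out[target] += n
    return dict(out)


# Toy data with placeholder IDs (not real ORFs, KOs, or modules).
orf_counts = {"orf1": 3, "orf2": 5, "orf3": 2}
orf_to_ko = {"orf1": {"KO_A"}, "orf2": {"KO_A", "KO_B"}}  # orf3 is unannotated
ko_to_module = {"KO_A": {"MOD_X"}, "KO_B": {"MOD_X", "MOD_Y"}}

# Apply the mappings in sequence: ORF -> KO, then KO -> module.
ko_counts = collapse(orf_counts, orf_to_ko)
module_counts = collapse(ko_counts, ko_to_module)
```

The same function can be applied along any other arrow of the diagram (for example KO to pathway, or reaction to compound), provided the corresponding mapping has been loaded into a dictionary of sets.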