Nucleic Acids Res. 2014 Jan;42(Database issue):D199-205.
doi: 10.1093/nar/gkt1076. Epub 2013 Nov 7.

Data, information, knowledge and principle: back to metabolism in KEGG

Minoru Kanehisa et al. Nucleic Acids Res. 2014 Jan.

Abstract

In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but on their own they are less effective at compiling knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process, the KEGG modules, which are defined by Boolean expressions of K numbers, have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated, revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.
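The abstract notes that a module defined as a Boolean expression of K numbers can be evaluated once a genome's genes carry K number annotations. A minimal sketch of such an evaluation follows. It assumes a simplified form of the KEGG module notation in which a space or a plus sign is read as AND and a comma as OR, with AND binding more tightly than OR; the optional-item (minus sign) notation is not handled. The function names and the K numbers in the usage note are illustrative placeholders, not a real module definition.

```python
import re

# Tokens: a K number, a parenthesis, a comma (OR), or a space / plus sign (AND).
TOKEN = re.compile(r"K\d{5}|[(),+ ]")


def tokenize(definition):
    """Split a module definition into tokens, rejecting unknown characters."""
    definition = " ".join(definition.split())  # collapse runs of whitespace
    tokens = TOKEN.findall(definition)
    if "".join(tokens) != definition:
        raise ValueError("unexpected character in definition")
    return tokens


def evaluate(definition, present):
    """Return True if the set of K numbers `present` satisfies the definition."""
    tokens = tokenize(definition)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():
        # Lowest precedence: alternatives separated by commas.
        nonlocal pos
        value = parse_and()
        while peek() == ",":
            pos += 1
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        # Higher precedence: steps or complex subunits joined by space or plus.
        nonlocal pos
        value = parse_atom()
        while peek() in (" ", "+"):
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if tok is not None and tok.startswith("K"):
            pos += 1
            return tok in present
        raise ValueError(f"unexpected token {tok!r}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in definition")
    return result
```

For example, with the hypothetical definition `(K00001,K00002) (K00003+K00004)`, a genome annotated with K00001, K00003 and K00004 satisfies the module, whereas a genome annotated only with K00002 does not. A fuller treatment would also report which steps are missing rather than a single true or false value.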


Figures

Figure 1.
A schematic diagram of genome annotation in KEGG. It consists of two parts: defining KO entries represented by K numbers (right) and assigning K numbers to genes in complete genomes (left). The KO definition is manually done, but the K number assignment is highly computerized (see text).
Figure 2.
Reaction data processing in KEGG. The reaction formula is decomposed into a set of reactant pairs, which are one-to-one relationships between a substrate and a product. Each reactant pair is characterized by its local structure transformation pattern of KEGG atom type changes, called the RDM pattern. Among the reactant pairs that appear in the KEGG pathway maps, distinct RDM patterns are used to define reaction class entries identified by RC numbers. The reaction module is a conserved sequence of RC numbers observed in different pathways. This example shows the RDM pattern (R for reaction center atoms in red, D for difference region atoms in green and M for matched region atoms in blue) of RC00067, which appears in the reaction module RM001 variant 01 (see details in http://www.kegg.jp/kegg/reaction/rmodule.html).
Figure 3.
An example of the modular architecture of the metabolic network. The reaction module RM001 for chain extension of 2-oxocarboxylic acids (large circles with the number of carbons) is used in combination with other reaction modules to generate amino acids (red circles), glucosinolates (green circles) and coenzyme B (blue circle) (see details in http://www.kegg.jp/pathway/map01210).
Figure 4.
The KEGG DRUG D numbers serve to integrate different types of data and information, now including drug labels and drug products. Here, D numbers are used to link the ATC classification to drug products, effectively classifying all drug products in the USA (see http://www.kegg.jp/brite/br08303_ndc).
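The Figure 2 caption describes a reaction module as a conserved sequence of RC numbers observed in different pathways. The sketch below illustrates that idea in its simplest form: it treats each pathway as an ordered list of RC numbers and reports contiguous runs of a fixed length that occur in at least two pathways. This is an illustration under stated assumptions rather than the procedure used to build the KEGG reaction modules. The function name, the pathway names and every RC number except RC00067 are hypothetical placeholders.

```python
from collections import defaultdict


def conserved_runs(pathways, length):
    """Return contiguous RC-number runs of `length` found in two or more pathways."""
    seen = defaultdict(set)
    for name, rcs in pathways.items():
        # Slide a window of the requested length along the pathway.
        for i in range(len(rcs) - length + 1):
            seen[tuple(rcs[i:i + length])].add(name)
    return {run: sorted(names) for run, names in seen.items() if len(names) >= 2}


# Hypothetical pathways; only RC00067 is taken from the text.
pathways = {
    "pathway_A": ["RC00067", "RC90001", "RC90002", "RC90003"],
    "pathway_B": ["RC90004", "RC00067", "RC90001", "RC90002"],
    "pathway_C": ["RC90005", "RC90006"],
}

shared = conserved_runs(pathways, 3)
```

Here `shared` holds a single run, `("RC00067", "RC90001", "RC90002")`, found in `pathway_A` and `pathway_B`. Real pathways are branched networks rather than linear lists, so a practical analysis would need an alignment or graph-based comparison.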

