Data, information, knowledge and principle: back to metabolism in KEGG

Nucleic Acids Res. 2014 Jan;42(Database issue):D199-205. doi: 10.1093/nar/gkt1076. Epub 2013 Nov 7.

Minoru Kanehisa¹, Susumu Goto, Yoko Sato, Masayuki Kawashima, Miho Furumichi, Mao Tanabe

Affiliation

¹ Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan and Life Science Solutions Department, Fujitsu Kyushu Systems Ltd., Sawara-ku, Fukuoka 814-8589, Japan.

PMID: 24214961
PMCID: PMC3965122
DOI: 10.1093/nar/gkt1076

Minoru Kanehisa et al. Nucleic Acids Res. 2014 Jan.

Abstract

In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but on their own they are less effective at compiling knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process, the KEGG modules, which are defined by Boolean expressions of K numbers, have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated, revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.
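As an illustration of what evaluating a module as a Boolean expression of K numbers might look like, the following is a minimal sketch and not KEGG's own implementation. It assumes a simplified definition syntax in which juxtaposition (whitespace) means AND, a comma means OR, parentheses group, and AND binds more tightly than OR. The function name `module_is_complete` and the K numbers in the example are placeholders chosen for illustration and do not describe an actual KEGG module.

```python
import re


def module_is_complete(definition, present):
    """Evaluate a simplified Boolean module definition against a set of K numbers.

    Assumed syntax (an illustration only): whitespace = AND, comma = OR,
    parentheses for grouping, AND binds tighter than OR.
    """
    tokens = re.findall(r"K\d{5}|[(),]", definition)
    pos = 0

    def parse_or():
        # alternatives separated by commas
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        # consecutive operands are all required
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] not in (",", ")"):
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        # a single K number or a parenthesised sub-expression
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of definition")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            return value
        if tok in (",", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in present

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in definition")
    return result


# Placeholder definition: three steps, the last with two alternative routes.
EXAMPLE_DEFINITION = "(K00001,K00002) K00003 (K00004 K00005,K00006)"
genome_a = {"K00002", "K00003", "K00006"}
genome_b = {"K00001", "K00003", "K00004"}
print(module_is_complete(EXAMPLE_DEFINITION, genome_a))
print(module_is_complete(EXAMPLE_DEFINITION, genome_b))
```

Under these assumptions, a genome's set of assigned K numbers is enough to decide whether every step of the module can be satisfied, which is the sense in which annotation with K numbers makes the modules computable.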

Figures

**Figure 1.**
A schematic diagram of genome annotation in KEGG. It consists of two parts: defining KO entries represented by K numbers (right) and assigning K numbers to genes in complete genomes (left). KO entries are defined manually, whereas K number assignment is highly computerized (see text).

**Figure 2.**
Reaction data processing in KEGG. The reaction formula is decomposed into a set of reactant pairs, one-to-one relationships of substrate–product pairs. Each reactant pair is characterized by the local structure transformation pattern, called RDM pattern of KEGG atom type changes. Among the reactant pairs that appear in the KEGG pathway maps, distinct RDM patterns are used to define reaction class entries identified by RC numbers. The reaction module is a conserved sequence of RC numbers observed in different pathways. This example shows the RDM pattern (R for reaction center atoms in red, D for difference region atoms in green and M for matched region atoms in blue) of RC00067, which appears in the reaction module RM001 variant 01 (see details in http://www.kegg.jp/kegg/reaction/rmodule.html).
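The idea of a reaction module as a conserved sequence of RC numbers can be pictured with a small sketch. This is only an assumed simplification (finding the longest contiguous run shared by two ordered lists), not the procedure used to build KEGG reaction modules. The function name `longest_shared_run` and the labels `RC_A` to `RC_G` are placeholders, not real reaction class entries.

```python
def longest_shared_run(a, b):
    """Return the longest contiguous run of items that occurs in both sequences."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the run that ended at the previous pair of positions
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Placeholder reaction-class sequences for two hypothetical pathways.
pathway_1 = ["RC_A", "RC_B", "RC_C", "RC_D", "RC_E"]
pathway_2 = ["RC_F", "RC_B", "RC_C", "RC_D", "RC_G"]
print(longest_shared_run(pathway_1, pathway_2))
```

In this toy setting the shared run plays the role of a candidate module: the same ordered series of reaction classes recurring in pathways that otherwise differ.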

**Figure 3.**
An example of the modular architecture of the metabolic network. The reaction module RM001 for chain extension of 2-oxocarboxylic acids (large circles with the number of carbons) is used in combination with other reaction modules to generate amino acids (red circles), glucosinolates (green circles) and coenzyme B (blue circle) (see details in http://www.kegg.jp/pathway/map01210).

**Figure 4.**
The KEGG DRUG D numbers serve to integrate different types of data and information, now including drug labels and drug products. Here, D numbers are used to integrate the ATC classification and drug products, effectively classifying all drug products in the USA (see http://www.kegg.jp/brite/br08303_ndc).

Cited by

- Frolova MS, Suvorova IA, Iablokov SN, Petrov SN, Rodionov DA. Genomic reconstruction of short-chain fatty acid production by the human gut microbiota. Front Mol Biosci. 2022 Aug 11;9:949563. doi: 10.3389/fmolb.2022.949563. PMID: 36032669.
- Kuwahara H, Alazmi M, Cui X, Gao X. MRE: a web tool to suggest foreign enzymes for the biosynthesis pathway design with competing endogenous reactions in mind. Nucleic Acids Res. 2016 Jul 8;44(W1):W217-25. doi: 10.1093/nar/gkw342. Epub 2016 Apr 29. PMID: 27131375.
- Waller T, Gubała T, Sarapata K, Piwowar M, Jurkowski W. DNA microarray integromics analysis platform. BioData Min. 2015 Jun 25;8:18. doi: 10.1186/s13040-015-0052-6. PMID: 26110022.
- Tao L, Zhang C, Ying Z, Xiong Z, Vaisman HS, Wang C, Shi Z, Shi R. Long-term continuous mono-cropping of Macadamia integrifolia greatly affects soil physicochemical properties, rhizospheric bacterial diversity, and metabolite contents. Front Microbiol. 2022 Oct 6;13:952092. doi: 10.3389/fmicb.2022.952092. PMID: 36274682.
- Matsuki T, Yahagi K, Mori H, Matsumoto H, Hara T, Tajima S, Ogawa E, Kodama H, Yamamoto K, Yamada T, Matsumoto S, Kurokawa K. A key genetic factor for fucosyllactose utilization affects infant gut microbiota development. Nat Commun. 2016 Jun 24;7:11939. doi: 10.1038/ncomms11939. PMID: 27340092.
