Retrieval of Enzyme Category and Subcellular Localization for Use in Metabolic Network Analysis
Eugene W. Hinderer1, Hunter N.B. Moseley2
2University of Louisville
The exponential growth of genomic and transcriptomic sequencing over the last decade has driven a dramatic increase in the size of gene and protein sequence repositories and of derived protein knowledge bases such as UniProt. Integrating this sequence and functional information with data from other high-throughput omics technologies, such as mass spectrometry and nuclear magnetic resonance-based metabolomics, requires new bioinformatics tools that gather and organize relevant biological data automatically. These tools are enabling a systems biochemical approach to the computational modeling of cellular metabolism as metabolic networks. Such representations can aid the study of metabolic processes involved in disease and the discovery of drug targets. Toward this end, we present our ongoing efforts to retrieve subcellular localization and enzyme reaction category annotations from knowledge bases, both for visualizing the subcellular compartmentalization of atom-resolved metabolic networks and for aiding the study of cellular processes under various conditions. The retrieval is accomplished by parsing several annotated databases such as UniProt and cross-referencing their annotations with relevant Gene Ontology (GO) terms, using object-oriented programs written in the Python programming language. This cross-referencing requires extracting distinct sub-directed acyclic graphs (sub-DAGs) of GO terms related to specific subcellular locations/compartments, starting from a set of relevant GO terms and following is_a/part_of relationships in the overall GO directed acyclic graph. Initial analyses show specific patterns of enzymatic reactions in human cells that vary widely with subcellular localization. They also reveal caveats about the type and completeness of annotation in specific public databases.
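The sub-DAG extraction and cross-referencing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that GO edges are available as (child, relation, parent) triples and that a compartment's sub-DAG consists of every term reachable downward from a seed term through is_a or part_of edges. The term names, the protein identifiers, and the functions `extract_subdag` and `assign_compartments` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Toy (child, relation, parent) edges. These are hypothetical stand-ins
# for GO terms, not records parsed from the real ontology.
TOY_EDGES = [
    ("mitochondrial_matrix", "part_of", "mitochondrion"),
    ("mitochondrial_envelope", "part_of", "mitochondrion"),
    ("mitochondrial_inner_membrane", "part_of", "mitochondrial_envelope"),
    ("mitochondrion", "is_a", "organelle"),
    ("nucleus", "is_a", "organelle"),
    ("nucleolus", "part_of", "nucleus"),
]


def extract_subdag(edges, seeds, relations=("is_a", "part_of")):
    """Collect the seed terms plus all descendants reachable through the
    allowed relations, and the edges that connect those terms."""
    children = defaultdict(list)
    for child, rel, parent in edges:
        if rel in relations:
            children[parent].append(child)

    # Breadth-first traversal from the seeds toward more specific terms.
    terms = set(seeds)
    queue = deque(seeds)
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in terms:
                terms.add(child)
                queue.append(child)

    # Keep only edges whose two endpoints both lie inside the sub-DAG.
    sub_edges = [
        (c, r, p) for c, r, p in edges
        if r in relations and c in terms and p in terms
    ]
    return terms, sub_edges


def assign_compartments(protein_terms, compartment_terms):
    """Map each protein to the compartments whose sub-DAG contains at
    least one of the protein's annotated terms."""
    return {
        protein: sorted(
            name for name, terms in compartment_terms.items()
            if terms & set(annotated)
        )
        for protein, annotated in protein_terms.items()
    }


if __name__ == "__main__":
    compartments = {
        "mitochondrion": extract_subdag(TOY_EDGES, ["mitochondrion"])[0],
        "nucleus": extract_subdag(TOY_EDGES, ["nucleus"])[0],
    }
    # Hypothetical protein annotations for illustration only.
    annotations = {
        "protein_A": ["mitochondrial_inner_membrane"],
        "protein_B": ["nucleolus", "mitochondrial_matrix"],
        "protein_C": ["organelle"],
    }
    for protein, found in assign_compartments(annotations, compartments).items():
        print(protein, found)
```

In this sketch, a protein annotated only with a term more general than any seed (here `protein_C`) is assigned to no compartment, which is one simple way that incomplete or coarse annotation becomes visible.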