Background & Summary
The multimodal, three-dimensional (3D) Human Reference Atlas (HRA)1,2 aims to map the healthy, adult human body across scales—from the whole body to the single cell and biomarker levels. Data from different sources (organs, technologies, and labs), analyzed and used following standard operating procedures (SOPs, humanatlas.io/standard-operating-procedures), need to be integrated so they can be explored across scales. The HRA Knowledge Graph (KG) defines and provides the core data structures that are used to store, link, query, and explore HRA data.
KGs are widely used to store and interlink data about relevant entities within a specific domain or task. The Google Knowledge Graph3 supports Google Search, which processes billions of searches daily. Major online retailers such as Amazon4 use KGs to organize products, searches, and media items5. KGs across domains are structured using vocabularies, e.g., Friend of a Friend6, Simple Knowledge Organization System (SKOS)7, and Music Ontology8,9. There are also collaborative efforts for publishing structured data on the web. A widely used, lightweight data format for publishing KG data on the web is JavaScript Object Notation for Linked Data (JSON-LD, json-ld.org). Many biomedical ontologies and metadata standards are provided in JSON-LD, such as those in the Open Biological and Biomedical Ontologies (OBO) Foundry10,11 and the National Center for Biomedical Ontology (NCBO) BioPortal12, and tools exist to convert ontologies to JSON-LD, such as Protégé (protege.stanford.edu)13, ROBOT14, and rdflib (rdflib.readthedocs.io/en/stable). Spearheaded by major technology companies, including Google and Microsoft, the metadata schemas on schema.org promote the structured representation of data on the web and provide a shared vocabulary in various encodings, including Resource Description Framework (RDF, see Box 1) and JSON-LD. These schemas describe entities and relationships in the semantic web and other structured data efforts. An overview of other commonly used vocabularies is available at www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies.
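To make the relationship between JSON-LD and RDF-style triples concrete, the following minimal sketch flattens one JSON-LD node into (subject, predicate, object) triples using only the Python standard library. The record, its example.org identifiers, and the helper name to_triples are hypothetical and are not taken from the HRA KG. The logic handles only a flat node whose context maps terms directly to IRIs, so it illustrates the idea and does not replace a full JSON-LD processor such as rdflib.

```python
import json

# Hypothetical JSON-LD record; identifiers and values are illustrative only.
DOC_TEXT = """
{
  "@context": {
    "name": "http://schema.org/name",
    "url": "http://schema.org/url"
  },
  "@id": "http://example.org/dataset/1",
  "@type": "http://schema.org/Dataset",
  "name": "Example dataset",
  "url": "http://example.org/dataset/1/download"
}
"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_triples(doc):
    """Flatten one flat JSON-LD node into (subject, predicate, object) triples.

    Simplifying assumptions: the node has an "@id", all values are strings,
    and the context maps each term directly to an IRI.
    """
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue  # keywords that do not produce triples themselves
        # "@type" corresponds to rdf:type; other terms are looked up in the context.
        predicate = RDF_TYPE if key == "@type" else context.get(key, key)
        triples.append((subject, predicate, value))
    return triples


triples = to_triples(json.loads(DOC_TEXT))
for triple in triples:
    print(triple)
```

Each printed tuple corresponds to one statement about the node: the context resolves the short term "name" to its schema.org IRI, which is what allows data from different sources to be linked through shared vocabularies.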
This paper presents the HRA KG v2.2, which uses 10 ontologies (see Other Ontologies in Methods) to interlink 33 anatomical structures (ASs), cell types (CTs), plus biomarkers (B) tables (see Box 1), 71 3D reference objects for organs, 22 Functional Tissue Units (FTUs)15, 11,698 single-cell (sc) datasets, and other HRA Digital Objects (DOs). The HRA KG is accessible...




