Date | 17.01.2018 | Size | 445 b. | | #21099 |
Linking two ontologies by detecting semantic correspondences between their representational units
Types of correspondences: equivalence, subsumption, others
Purpose of ontology alignment:
- Creating interoperability between semantically annotated data
- Enriching semantics
- Cross-validation of ontologies
Requirements of ontology alignment:
- comparable scope
- comparable context
- comparable semantic foundations
Outline
Introduction
Methodology
- UMLS SN: formal redefinition
- Interactive Mapping
Assessment
- Ontology Cross-Validation
- NE co-occurrence validation
- UMLS SN cluster consistency
Conclusion
BioTop – a Life Science Upper Ontology
Recent development (starting 2006, Freiburg & Jena)
Goal: to provide formal definitions of upper-level types and relations for the biomedical domain
Uses description logics (OWL-DL)
- 339 classes, 60 relation types
- 373 subclass axioms
- 80 equivalent class axioms, 66 disjoint class axioms
Links to OBO ontologies
Downloadable from: http://purl.org/biotop
UMLS Semantic Network (SN)
- Tree of 135 semantic types (e.g. Tissue, Diagnostic_Procedure)
- 53 associative relationships (e.g. treats, location_of)
- 612 relational assertions (triples), sanctioning the domain and range of relations, e.g. {Tissue; location_of; Diagnostic_Procedure}
- Mainly unchanged in the last 20 years
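As a rough illustration of how the relational assertions above sanction the domain and range of a relation, the following sketch stores triples in a set and tests whether a given combination is asserted. Only the triple {Tissue; location_of; Diagnostic_Procedure} comes from the slide; the names `Triple`, `SN_TRIPLES` and `is_sanctioned` are hypothetical, and inheritance of triples along the type tree is deliberately left out.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One relational assertion: {domain; relation; range}."""
    domain: str
    relation: str
    range: str


# Toy excerpt (assumption): only this triple appears in the slides.
SN_TRIPLES = {
    Triple("Tissue", "location_of", "Diagnostic_Procedure"),
}


def is_sanctioned(domain: str, relation: str, range_: str) -> bool:
    """Return True if the triple is explicitly asserted (no inheritance)."""
    return Triple(domain, relation, range_) in SN_TRIPLES


print(is_sanctioned("Tissue", "location_of", "Diagnostic_Procedure"))  # True
print(is_sanctioned("Diagnostic_Procedure", "location_of", "Tissue"))  # False
```

The lookup is direction-sensitive: swapping domain and range yields a different, unasserted triple.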
Comparison UMLS-SN - BioTop
Methodology
Prerequisite: provide description logics semantics to the UMLS SN: umlssn.owl
- Subsumption ⊑
- Equivalence ≡
Redefinition of UMLS SN semantics
Semantic Types, e.g. Tissue, Diagnostic_Procedure:
- Types extend to classes of individuals
- Subsumption hierarchies = is-a hierarchies (every instance of a child is also an instance of each parent)
- No explicit disjoint partitions
Semantic Relations, e.g. treats, location_of:
- Reified as classes, not represented as OWL object properties
Triples, e.g. {Tissue; location_of; Diagnostic_Procedure}:
- Domain and range restrictions = value restrictions on the roles has-domain and has-range
UMLS SN: Why SRs as classes … and not OWL object properties? (I)
UMLS SN: Why SRs as classes … and not OWL object properties? (II)
Source Representation, Target Representation
All triples including R are defined as subclasses of R:
Affects_Domain_Cell_Component_Range_Physiologic_Function ⊑ Affects ⊓ has_domain.Cell_Component ⊓ has_range.Physiologic_Function
All parents are fully defined by the union of their children:
Brings_About ≡ Produces ⊔ Causes
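The two patterns above (each triple becomes a subclass of its reified relation, and each parent relation is the union of its children) can be sketched as plain string generation. This is only an illustration, not the authors' conversion tool: the function names `reify_triple` and `define_parent` are hypothetical, and because the slide text shows no quantifier in front of the has_domain and has_range restrictions, the `quantifier` parameter defaults to an empty string and is left for the reader to set.

```python
from typing import Iterable


def reify_triple(domain: str, relation: str, range_: str,
                 quantifier: str = "") -> str:
    """Render the subclass axiom for one SN triple.

    The class name follows the pattern seen on the slide:
    <Relation>_Domain_<Domain>_Range_<Range>.
    `quantifier` is an assumption hook (e.g. "∃" or "∀"); the slide
    text as extracted shows none.
    """
    name = f"{relation}_Domain_{domain}_Range_{range_}"
    return (f"{name} ⊑ {relation} "
            f"⊓ {quantifier}has_domain.{domain} "
            f"⊓ {quantifier}has_range.{range_}")


def define_parent(parent: str, children: Iterable[str]) -> str:
    """Render a parent relation as the union of its child relations."""
    return f"{parent} ≡ " + " ⊔ ".join(children)


print(reify_triple("Cell_Component", "Affects", "Physiologic_Function"))
print(define_parent("Brings_About", ["Produces", "Causes"]))
```

With the default arguments, the two calls reproduce the axioms shown on the slide.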
Mapping
Fully manual, using Protégé 4; consistency check with Fact++ and Pellet 1.5, supported by explanation plugin*
Analyzing
- UMLS SN hierarchies and free-text definitions
- BioTop formal and free-text definitions
Iterative check of
- logical consistency (DL classifier)
- domain adequacy (analysis of new entailments)
Mapping workflow
Mapping of UMLS Types
- Direct match (often after content addition to BioTop): sn:Plant ≡ bt:Plant
- Restriction mapping: sn:AnatomicalAbnormality ≡ bt:OrganismPart ⊓ bt:bearerOf.bt:PathologicalCondition
- Union: sn:Gene_Or_Genome ≡ bt:Gene ⊔ bt:Genome
- Out of scope: sn:Daily_Or_Recreational_Activity ⊑ bt:Action ⊓ bt:hasParticipant.bt:Human
- No mapping: sn:Idea_or_concept
Mapping of domain and range
- sn:hasDomain ≡ bt:hasAgent
- sn:hasRange ≡ bt:hasPatient
Mapping of (reified) SN relations
- sn:Affects ≡ bt:Affecting
Linkage of (reified) SN relations to BioTop relations by augmented restrictions:
sn:hasDomain (bt:physicalPartOf (ImmaterialPhysicalEntity ⊔ MaterialEntity)) ⊓ sn:hasRange (bt:hasPhysicalPart (ImmaterialPhysicalEntity ⊔ MaterialEntity))
Formative evaluation of BioTop
Mapping and subsequent classification unveil hidden problems in BioTop:
- Faulty disjointness axioms (e.g. bt:Organic Chemical was disjoint from bt:Carbohydrate)
- Ambiguities: sequence as information entity vs. sequence as molecular structure
- Granularity mismatches: e.g. Chromosome as molecule
Assessment: NE co-occurrences
Named Entity tagging; UMLS concept pairs identified in 15 M PubMed abstracts
Expert rating with a sample of co-occurrences: which are semantically related?
Using SN alone: very low agreement with expert rating
Using SN+BioTop: very few rejections (only 3)
Reasons:
- false-positive rate: expert rating done on NEs (e.g. Superoxide reductase unrelated to Aldehyde), but system judgments at type level: sn:Enzyme related to sn:Organic Chemical
- few rejections: DL’s open world semantics
Assessment: finding incompatible semantic types
Each UMLS concept is categorized by one or more UMLS SN types
397 different SN type combinations
Using UMLS-SN BioTop Bridge: 133 combinations inconsistent, affecting 6116 UMLS concepts
Main reason: hidden ambiguities, e.g. sn:Manufactured Object ⊓ sn:HealthCareRelatedOrganization (e.g. Hospital as building vs. organization).
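The check for incompatible type combinations can be approximated without a DL reasoner. The sketch below is a simplification under stated assumptions: the `toy:` classes, the `IS_A` hierarchy, the single disjointness axiom and the `BRIDGE` table are all hypothetical stand-ins, not the real UMLS-SN BioTop Bridge, and the test only looks for declared disjointness among the superclasses of the mapped classes.

```python
from itertools import combinations
from typing import Iterable, Set

# Hypothetical toy hierarchy (child -> parent), NOT taken from BioTop.
IS_A = {
    "toy:Organization": "toy:SocialEntity",
    "toy:Organism": "toy:MaterialObject",
}

# Hypothetical disjointness axiom between two toy upper-level classes.
DISJOINT = {frozenset({"toy:MaterialObject", "toy:SocialEntity"})}

# Hypothetical bridge from SN types to toy classes.
BRIDGE = {
    "sn:Manufactured Object": "toy:MaterialObject",
    "sn:HealthCareRelatedOrganization": "toy:Organization",
    "sn:Plant": "toy:Organism",
}


def ancestors(cls: str) -> Set[str]:
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in IS_A:
        cls = IS_A[cls]
        result.add(cls)
    return result


def is_consistent(sn_types: Iterable[str]) -> bool:
    """False if the mapped classes inherit two classes declared disjoint."""
    closure: Set[str] = set()
    for sn_type in sn_types:
        closure |= ancestors(BRIDGE[sn_type])
    return not any(frozenset(pair) in DISJOINT
                   for pair in combinations(closure, 2))


hospital = ("sn:Manufactured Object", "sn:HealthCareRelatedOrganization")
print(is_consistent(hospital))                                # False
print(is_consistent(("sn:Plant", "sn:Manufactured Object")))  # True
```

The second call is not flagged only because the toy data declare no disjointness between the two classes involved, which mirrors the open world behaviour noted on the slides: nothing is rejected unless an explicit axiom rules it out.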
Conclusion
Successful alignment between the (legacy) SN and the (novel) BioTop ontology
Necessary: formal re-interpretation of SN
Prospect: join the large amount of data annotated by the SN with the formal rigor of BioTop
Strength: machine inference, consistency checking
Challenge: counteract unwarranted effects of the open world semantics by making exhaustive use of disjoint partitions
More use cases!
Acknowledgements
EC STREP project “BOOTStrep” (FP6 – 028099)
Intramural Research Program of the National Institutes of Health (NIH), US National Library of Medicine
Martin Boeker (Freiburg)
Holger Stenzhorn (Freiburg)
Ontology Stack