Episode 38: Ontologies - effective data sharing
In this episode of the MicroBinfie podcast, the hosts explore how ontologies help standardize and share scientific data across disciplines. This is the second part of our crash course on ontologies, with guests Dr. Emma Griffiths and Dr. João Carriço, and it focuses on techniques and strategies for effective data sharing.
Dr. Griffiths and Dr. Carriço walk through how ontologies can support better data sharing practices. The discussion is relevant to bioinformatics, computer science, and any other field that depends on well-described data.
Guests
- Dr. Emma Griffiths and Dr. João Carriço
Key Topics:
- Basics of ontologies and their importance in data sharing
- Strategies for integrating ontologies into existing data frameworks
- Case studies highlighting successful data sharing using ontologies
- Tips and best practices from leading experts
Importance of Ontologies
- Standardization: Ontologies help standardize information, making shared data interoperable.
- Data Sharing Challenges: To share data effectively, consistent definitions for terms like location and susceptibility are needed.
- Example: The GenomeTrakr initiative uses standardized metadata fields and terms for foodborne pathogens, which makes the shared data easier to analyze.
Application of Ontologies
- TypOn: An ontology developed for defining microbial typing terms such as alleles and loci, improving interoperability when sharing typing schemas.
- Interdisciplinary Use: Ontologies are valuable not just in microbial genomics but also in human genomics and drug discovery, creating rich knowledge bases.
- ARO: The Antibiotic Resistance Ontology supports databases such as CARD, enabling better analysis of antimicrobial resistance data.
Integrating Ontologies into Work
- Start with Metadata Standardization: Use community standards and ontologies (e.g., those listed in the OBO Foundry) to standardize metadata.
- Define Objectives: Clearly identify what questions you want to answer to guide ontology use.
- Graph Databases: Consider using graph databases to implement ontologies for effective querying.
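To illustrate why a graph representation helps with querying, here is a minimal sketch in plain Python. It is not a real graph database and does not use any real ontology: the "EX:" identifiers, labels, and sample names are all made up for illustration. It shows how storing "is_a" relationships lets a query for a broad term also return samples annotated with more specific terms.

```python
from collections import defaultdict

# Hypothetical terms and IDs (the "EX:" prefix is invented for this example).
TRIPLES = [
    ("EX:0003", "is_a", "EX:0002"),  # chicken breast is_a poultry meat
    ("EX:0002", "is_a", "EX:0001"),  # poultry meat is_a food product
    ("EX:0004", "is_a", "EX:0001"),  # lettuce is_a food product
]

# Hypothetical samples, each annotated with the most specific term available.
SAMPLES = {"sample_A": "EX:0003", "sample_B": "EX:0004", "sample_C": "EX:0002"}


def descendants(term_id, triples):
    """Return all terms below term_id in the is_a hierarchy."""
    children = defaultdict(set)
    for subj, pred, obj in triples:
        if pred == "is_a":
            children[obj].add(subj)
    found, stack = set(), [term_id]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def samples_under(term_id):
    """Find samples annotated with term_id or any more specific term."""
    wanted = descendants(term_id, TRIPLES) | {term_id}
    return sorted(name for name, term in SAMPLES.items() if term in wanted)


print(samples_under("EX:0002"))
```

A dedicated graph database would offer the same kind of traversal through its own query language, along with indexing and persistence that this sketch leaves out.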
Tools for Implementing Ontologies
- Text Conversion: Tools like those from EMBL-EBI can highlight ontology terms in text. LexMapr maps short free-text entries to standardized ontology terms.
- Data Collection: It's crucial to define metadata parameters before data collection to improve data quality.
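The general idea of mapping short free-text entries to ontology terms can be sketched as follows. This is a toy illustration only and does not reproduce how LexMapr or the EMBL-EBI tools work internally. The lookup table, the "EX:" identifiers, and the function names `normalize` and `map_entry` are all hypothetical.

```python
import re

# Hypothetical lookup: normalized label or synonym -> (term ID, preferred label).
TERMS = {
    "chicken breast": ("EX:0003", "chicken breast"),
    "breast of chicken": ("EX:0003", "chicken breast"),
    "lettuce": ("EX:0004", "lettuce"),
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def map_entry(text):
    """Return (term ID, preferred label) for a free-text entry, or None."""
    key = normalize(text)
    if key in TERMS:
        return TERMS[key]
    # Naive plural handling: retry without a trailing "s".
    if key.endswith("s") and key[:-1] in TERMS:
        return TERMS[key[:-1]]
    return None  # unmatched entries are left for manual curation


for entry in ["  Chicken-Breast ", "Lettuces", "mystery stew"]:
    print(repr(entry), "->", map_entry(entry))
```

Returning `None` for unmatched entries, rather than guessing, keeps uncertain mappings visible so a curator can review them.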
Developing Specifications
- Resources: The Genomic Standards Consortium provides standardized attribute lists for genomics data.
- Using GEEM: GEEM (the Genomic Epidemiology Entity Mart) is likened to an "Amazon for ontologies," allowing users to browse ontology terms and create specifications for data entry.
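As a rough illustration of what a data entry specification can do once it is defined, the sketch below checks records against a small set of field rules. The field names, rules, and permitted values are invented for this example and are not taken from the Genomic Standards Consortium lists or from GEEM.

```python
import re

# Hypothetical specification: field name -> rules for that field.
SPEC = {
    "collection_date": {"required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    "country": {"required": True, "allowed": {"Canada", "Portugal", "United Kingdom"}},
    "isolation_source": {"required": False},
}


def validate(record, spec=SPEC):
    """Return a list of problems found in one metadata record."""
    problems = []
    for field, rules in spec.items():
        value = record.get(field)
        if value is None or value == "":
            if rules.get("required"):
                problems.append(f"{field}: missing required value")
            continue
        if "pattern" in rules and not re.fullmatch(rules["pattern"], value):
            problems.append(f"{field}: '{value}' does not match the expected format")
        if "allowed" in rules and value not in rules["allowed"]:
            problems.append(f"{field}: '{value}' is not a permitted value")
    # Flag fields that the specification does not define.
    for field in record:
        if field not in spec:
            problems.append(f"{field}: not defined in the specification")
    return problems


record = {"collection_date": "15/01/2020", "country": "canada", "lab": "x"}
for problem in validate(record):
    print(problem)
```

Running checks like these at data entry time, rather than after collection, is one way to act on the advice to define metadata parameters up front.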
Challenges in Ontology Usage
- Complexity: Understanding ontology overlaps and variations is crucial. Different domains may use terms differently, and definitions can vary.
- Validation Issues: Ontologies are built by many different researchers, which can lead to inconsistencies. It's essential to know the context and the ontology's purpose.
Key Points
1. Ontology Fundamentals
- Ontologies help standardize information and make shared data interoperable
- Provide definitions, IDs, synonyms, and relationships between terms
- Enable complex querying and knowledge base development
2. Practical Applications
- Used in diverse fields including microbial genomics, human genomics, and drug discovery
- Examples include GenomeTrakr, TypOn, and the Antibiotic Resistance Ontology (ARO)
- Support resources like the CARD database and its Resistance Gene Identifier (RGI) tool
3. Implementation Strategies
- Start by standardizing metadata using community standards
- Define clear objectives before choosing or developing ontologies
- Use tools like LexMapr and EMBL-EBI text conversion tools
- Consider graph databases for effective data querying
Take-Home Messages
- Ontologies are crucial for creating standardized, shareable scientific data
- Start ontology work by clearly defining the research questions you want to answer
- Invest time in metadata standardization for long-term research benefits