Episode 54: SARS-CoV-2 in Canada and addressing data sharing and privacy
The microbinfie podcast explores the challenges of genomic data sharing and metadata harmonization during the SARS-CoV-2 pandemic in Canada, highlighting the complex interplay between privacy, public health, and national surveillance efforts.
Join guest host Dr. Emma Griffiths as she talks with Dr. Finlay Maguire and Dr. William Hsiao about SARS-CoV-2 genomic epidemiology efforts in Canada.
For more information, visit the CanCOGeN website.
Extra notes
- The podcast discusses the challenges and frameworks for harmonizing metadata in microbial bioinformatics, particularly in the context of the National Genomic Surveillance Database and the CanCOGeN initiative.
- The metadata management plan for CanCOGeN consists of three tiers:
  - Low-risk, de-identified metadata that can be released publicly.
  - Middle-tier metadata useful for national surveillance and tracking, shared among national and provincial labs.
  - Local metadata, maintained by partners and potentially identifiable, which can be used for research after de-identification.
- Synchronizing metadata is difficult because jurisdictions define non-identifiable information differently; legal consultation was needed to clarify these definitions.
- An extensive list of relevant metadata fields for SARS-CoV-2 sampling was established within the PHA4GE consortium (Public Health Alliance for Genomic Epidemiology) and adopted by CanCOGeN.
- The metadata include the reason for sequencing and the conditions of sample collection, both of which are critical for epidemiological interpretation.
- Efforts focus on making metadata publicly available and on keeping the process open-access, so that the community can contribute improvements to the standards.
- The podcast mentions the development of a tool called the DataHarmonizer, designed to standardize data collection, perform validation, and ensure compatibility with public repositories and national reporting.
- The conversation highlights data-sharing challenges in Canada caused by differing privacy laws among provinces, which affect the complete and timely sharing of genomic data with global repositories like GISAID.
- Solutions for the future involve improving the social science aspect of data sharing, to align technical infrastructure with public opinion and ethical considerations.
- The guests recommend centralized data curation and analysis, with enough flexibility for decentralized systems to function through interconnected data sharing and expertise.
- The guests emphasized quality control metrics and good communication as ways to overcome data-sharing hurdles in decentralized health systems.
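The three-tier plan described in these notes can be pictured as a tier label attached to each metadata field, with each audience seeing only the fields at or below its tier. The following is a minimal sketch of that idea, not CanCOGeN's actual implementation: the field names and their tier assignments are invented for illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 1    # low-risk, de-identified; can be released publicly
    NATIONAL = 2  # shared among national and provincial labs
    LOCAL = 3     # kept by the contributing partner; potentially identifiable


# Hypothetical field-to-tier assignment (illustrative only, not the real plan).
FIELD_TIERS = {
    "sample_id": Tier.PUBLIC,
    "collection_date": Tier.PUBLIC,
    "purpose_of_sequencing": Tier.NATIONAL,
    "postal_code": Tier.LOCAL,
}


def view_for(record, max_tier):
    """Return only the fields whose tier is at or below max_tier.

    Fields with no tier assignment are treated as LOCAL, the most
    restrictive tier, so they are never released by accident.
    """
    return {
        field: value
        for field, value in record.items()
        if FIELD_TIERS.get(field, Tier.LOCAL) <= max_tier
    }


record = {
    "sample_id": "S1",
    "collection_date": "2020-03-15",
    "purpose_of_sequencing": "baseline surveillance",
    "postal_code": "A1A 1A1",
    "free_text_note": "not yet classified",
}

public_view = view_for(record, Tier.PUBLIC)
national_view = view_for(record, Tier.NATIONAL)
```

Defaulting unclassified fields to the local tier is a design choice made for this sketch; it reflects the cautious stance the notes describe, where a field is only released once it has been agreed to be non-identifiable.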
Key Points
1. Metadata Management Strategy
- CanCOGeN developed a three-tier metadata management approach
- Tiers range from low-risk public data to locally maintained identifiable information
- Extensive legal consultations required to define non-identifiable metadata
2. Data Sharing Challenges
- Provincial privacy laws create significant barriers to data sharing
- Differences in legal interpretations across Canadian provinces complicate national reporting
- Careful quality control and metadata standardization slow down data release
3. Technical Solutions
- Developed the DataHarmonizer tool for metadata standardization
- Created metadata standards through the PHA4GE consortium
- Implemented validation and export functions for public repositories
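To make the validation and export steps concrete, here is a small standalone sketch of what checking records against a standard and then exporting them for a repository might look like. It does not use or reproduce the DataHarmonizer's own code or API; the required fields, the picklist values, and the export column names are all assumptions made for illustration.

```python
import csv
import io
from datetime import date

# Hypothetical standard: required fields and one controlled vocabulary.
REQUIRED = ("sample_id", "collection_date", "purpose_of_sequencing")
PICKLISTS = {
    "purpose_of_sequencing": {
        "baseline surveillance",
        "cluster investigation",
    },
}

# Hypothetical mapping from internal field names to a repository's headers.
EXPORT_COLUMNS = {"sample_id": "Sample ID", "collection_date": "Collection date"}


def validate(record):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field in REQUIRED:
        if not record.get(field):
            errors.append(f"{field}: missing required value")
    for field, allowed in PICKLISTS.items():
        value = record.get(field)
        if value and value not in allowed:
            errors.append(f"{field}: '{value}' is not in the picklist")
    raw_date = record.get("collection_date")
    if raw_date:
        try:
            date.fromisoformat(raw_date)
        except ValueError:
            errors.append("collection_date: not an ISO 8601 date")
    return errors


def export_csv(records):
    """Write only the valid records, using the repository's column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(EXPORT_COLUMNS.values()))
    for record in records:
        if not validate(record):
            writer.writerow([record[field] for field in EXPORT_COLUMNS])
    return buffer.getvalue()
```

A submitter would run `validate` on each record while entering data, fix whatever it reports, and call `export_csv` once the batch is clean. Keeping the export mapping separate from the validation rules means one curated dataset can be written out in several repository formats.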
Take-Home Messages
- Effective pandemic response requires flexible, collaborative metadata sharing
- Technical tools must be complemented by social and legal frameworks
- Public opinion and trust are crucial in overcoming data-sharing barriers