Less is more
Approaches to biologist-driven analysis
and next-generation sequencing data
Paul Gordon
Genome Canada Bioinformatics Platform
University of Calgary
What am I doing here?
• Next Generation Sequencing
• Next Generation Web
• Future challenges
Genome Canada Bioinformatics Platform
Better tech: less DNA, more sequence
44μm
70nm
PhytoMetaSyn
Sprockets: Hierarchical Gene Models from ESTs
Developed in collaboration with BASF Plant Sciences
Genozymes
Hydrocarbon Metagenomics
Exploring gene expression patterns
CAVEman
• Java 3D-based, world-first complete 3D human body atlas (adult male)
– 2,335 organs, hierarchical organization following Terminologia Anatomica
• Numerous applications involving mapping of genetic and disease data
• More information: http://cave.ucalgary.ca/caveman
Patient MRI stack mapped onto atlas and registered by landmarks
Pharmacokinetics visualization (absorption-distribution-metabolism-excretion of aspirin)
Basic Research
• Archaeal UV-light response
• Large-scale human genome organization
• ING-protein interactions (cancer and ageing-related proteins)
Research Applications
• Kidney transplants: improved rejection diagnostics in Edmonton
• Mad cow disease/chronic wasting disease: live diagnostics
• Desulf.: mechanisms of oil pipeline corrosion and its prevention
DNA Diagnostics Discovery for Mad Cow
Preclinical
Clinical
Preinoculation
Controls
Control animal #6
Ball toy
Photo: S. Czub, CFIA Lethbridge
Next-gen
Motif finding (elk dataset)
61 blood samples
107 million base pairs
432 billion pairwise alignments (657431²)
1082019 25mers or smaller
Uninfected 152317
Infected 3 universal
Infected 132417
Thousands of animal coverage/timepoint combos (CPU intensive)
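The figure of 432 billion pairwise alignments is consistent with an all-against-all comparison of 657,431 sequences, if the slide's parenthetical is read as 657431 squared (an assumption about a superscript lost in extraction). A minimal Python check of that arithmetic, using a hypothetical helper name:

```python
def all_vs_all_alignments(n_sequences: int) -> int:
    """Count ordered pairwise comparisons, self-pairs included (n squared)."""
    return n_sequences * n_sequences


# Assumption: 657431 is the number of sequences compared all-against-all.
n = 657_431
total = all_vs_all_alignments(n)
print(f"{total:,} alignments (~{total / 1e9:.0f} billion)")
```

Because the count grows quadratically, doubling the number of sequences quadruples the number of alignments, which is one reason a job like this is CPU intensive.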
Decypher hardware accelerator
Motif Results
↑ EVI1
↑ PLZF
Retrovirus
PrPSc (+?)
↓ PLZF-controlled genes
Infectious agent
Circulating Nucleic Acids
Endogenous Retrovirus? Consistent with protein-only evidence…
Neurovirulent? (e.g. M.L. Labat 1999)
Possible mode of action?
Virus particles? ~25nm
PrP Amyloid fibres
Vacuole
Manuelidis et al., PNAS 2007
Protected promoters (Motifs A & B)
Feedback
PrP
Integration
Nucleoprotein complexes
Cell death
CNA Export
Carp et al., EMBO J. 2006
Leblanc et al., EMBO J. 2006
Stengel et al., Biochem. Biophys. Res. Commun. 2006
Lee et al., Biochem. Biophys. Res. Commun. 2006
Etc.
Activation
Better tech: less input, more results
Better tech: less DNA, more sequence
Generate Manuscript
Now
Where are we at?
Bioinformatics
Web
Emerging Technologies
Life Sciences
Semantic Web
Source: Gartner Inc.
How software works…
Functions/Rules
Parameters/Input
Results/Output
(article, allele,…)
(Gene name, DNA sequence, QTL…)
The problem with the Web
Once you label me, you negate me.
Søren Kierkegaard
1998 Now
Bluejay
http://bluejay.ucalgary.ca
Comparative genomics
BioMoby linking
Waypoints
Gene expression integration
The task at hand (biologist)
Sequencer Data File (Binary)
ACCGT…
Known proteins
BLAST report (related proteins)
(computer scientist)
DNASequence
NCBI_gi
Sequence_Alignment
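The slide contrasts the biologist's view of the task (sequencer file in, related proteins out) with the computer scientist's view of typed data (DNASequence, NCBI_gi, Sequence_Alignment). As a rough sketch of how typed inputs and outputs allow tools to be chained automatically, the Python below searches for a path between data types. The service names, the registry, and the SequencerDataFile type name are hypothetical illustrations; this is not the BioMoby API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str      # hypothetical service name
    consumes: str  # input data type
    produces: str  # output data type


# Hypothetical registry; only DNASequence, NCBI_gi and Sequence_Alignment
# are type names taken from the slide.
REGISTRY = [
    Service("parse_sequencer_file", "SequencerDataFile", "DNASequence"),
    Service("blast_against_known_proteins", "DNASequence", "Sequence_Alignment"),
    Service("extract_hit_ids", "Sequence_Alignment", "NCBI_gi"),
]


def plan(start, goal, registry=REGISTRY):
    """Breadth-first search for the shortest service chain from start type to goal type."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        dtype, path = queue.popleft()
        if dtype == goal:
            return [svc.name for svc in path]
        for svc in registry:
            if svc.consumes == dtype and svc.produces not in seen:
                seen.add(svc.produces)
                queue.append((svc.produces, path + [svc]))
    return None  # no chain of services connects the two types


print(plan("SequencerDataFile", "NCBI_gi"))
```

With types attached to every input and output, the biologist only needs to name the data in hand and the result wanted; the intermediate steps can be proposed by software.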
Audience
Self-perception of computer skills (scale endpoints: God, Amoeba)
• Taverna self-starters
• Willing to take training
• Capable but fearful
The need for shoehorns
• The current vision of the Semantic Web intends to create a new structure starting up with no reference to its vast, functioning, but more primitive predecessor … things just don’t happen like that
All the Web as Workflows
Seahawk
Proxied Web page
Drag ‘n’ drop
Seahawk prompting
What’s Ahead?
The more a man learns, the more he realizes how little he knows
Semantic Web
http://www.uniprot.org/tissues/229 http://purl.uniprot.org/po/0009009
Take home messages
As tech improves, we can ask better questions
We will need shoehorns to access existing resources for the foreseeable future