Less is more Approaches to biologist-driven analysis and next-generation sequencing data Paul Gordon...

Post on 01-Apr-2015

Less is more

Approaches to biologist-driven analysis and next-generation sequencing data

Paul Gordon

Genome Canada Bioinformatics Platform

University of Calgary

What am I doing here?

• Next Generation Sequencing

• Next Generation Web

• Future challenges

Genome Canada Bioinformatics Platform

Better tech: less DNA, more sequence

44μm

70nm

PhytoMetaSyn

Sprockets: Hierarchical Gene Models from ESTs

Developed in collaboration with BASF Plant Sciences

Genozymes

Hydrocarbon Metagenomics

Exploring gene expression patterns

CAVEman

• Java 3D-based, world-first complete 3D human body atlas (adult male)

– 2,335 organs, hierarchical organization following Terminologia Anatomica

• Numerous applications involving mapping of genetic and disease data

• More information: http://cave.ucalgary.ca/caveman

Patient MRI stack mapped onto atlas and registered by landmarks

Pharmacokinetics visualization (absorption-distribution-metabolism-excretion of Aspirin)

Basic Research

• Archaeal UV-light response

• Large-scale human genome organization

• ING-protein interactions (cancer and ageing-related proteins)

Research Applications

• Kidney transplants: improved rejection diagnostics in Edmonton

• Mad cow disease/chronic wasting disease: live diagnostics

• Desulf.: mechanisms of oil pipeline corrosion and its prevention

DNA Diagnostics Discovery for Mad Cow

Preclinical

Clinical

Preinoculation

Controls

Control animal #6

Ball toy

Photo: S. Czub, CFIA Lethbridge

Next-gen

Motif finding (elk dataset)

61 blood samples

107 million base pairs

432 billion pairwise alignments (657,431²)

1082019 25mers or smaller

Uninfected 152317

Infected 3 universal

Infected 132417

Thousands of animal coverage/timepoint combos (CPU intensive)
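The slide gives only the scale of the motif search, not the method. As a rough illustration of why this kind of search is CPU intensive, here is a minimal sketch of one naive approach: collect every k-mer seen in each sample, keep those present in all infected samples, and discard any seen in an uninfected sample. The function names, the toy reads, and the filtering rule are all assumptions for illustration, not the pipeline used in the talk. The last lines check the slide's alignment count, reading the parenthesised figure as 657,431 squared (also an assumption).

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return every distinct substring of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Union of the k-mers found in all reads of one sample."""
    found: Set[str] = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def infected_only_kmers(
    infected: List[List[str]], uninfected: List[List[str]], k: int
) -> Set[str]:
    """k-mers present in every infected sample and in no uninfected sample."""
    if not infected:
        return set()
    shared = set.intersection(*(sample_kmers(reads, k) for reads in infected))
    for reads in uninfected:
        shared -= sample_kmers(reads, k)
    return shared


# Toy data (hypothetical): each inner list is the reads from one sample.
infected = [["ACGTTGCA", "GGATCC"], ["TTACGTTG"]]
uninfected = [["GGATCCAA"], ["CCCCGGAT"]]

motifs = infected_only_kmers(infected, uninfected, k=5)
print(sorted(motifs))

# Scale check: an all-against-all comparison grows with the square of the
# number of sequences (657,431 is an assumed reading of the slide's figure).
n_sequences = 657_431
all_vs_all = n_sequences ** 2
print(f"{all_vs_all:,} all-against-all comparisons")
```

The quadratic growth in the last step is the reason a hardware accelerator becomes attractive once the sequence count reaches the hundreds of thousands.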

Decypher hardware accelerator

Motif Results

↑ EVI1

↑ PLZF

Retrovirus

PrPsc(+?)

↓ PLZF-controlled genes

Infectious agent

Circulating Nucleic Acids

Endogenous Retrovirus? Consistent with protein-only evidence…

Neurovirulent? (e.g. M.L. Labat 1999)

Possible mode of action?

Virus particles? ~25nm

PrP Amyloid fibres

Vacuole

Manuelidis et al., PNAS 2007

Protected promoters (Motifs A & B)

Feedback

PrP

Integration

Nucleoprotein complexes

Cell death

CNA Export

Carp et al., EMBO J. 2006

Leblanc et al., EMBO J. 2006

Stengel et al., Biochem. Biophys. Res. Commun. 2006

Lee et al., Biochem. Biophys. Res. Commun. 2006

Etc.

Activation

Better tech: less input, more results

Better tech: less DNA, more sequence

Generate Manuscript

Now

Where are we at?

Bioinformatics

Web

Emerging Technologies

Life Sciences

Semantic Web

Source: Gartner Inc.

How software works…

Functions/Rules

Parameters/Input

Results/Output

(article, allele,…)

(Gene name, DNA sequence, QTL…)

The problem with the Web

Once you label me, you negate me.

Søren Kierkegaard

1998 Now

Bluejay

http://bluejay.ucalgary.ca

Comparative genomics

BioMoby linking

Waypoints

Gene expression integration

The task at hand (biologist)

Sequencer Data File (Binary)

ACCGT…

Known Proteins

BLAST Report (related proteins)

(computer scientist)

DNASequence

NCBI_gi

Sequence_Alignment
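The contrast on this slide is between the biologist's view of the task (a sequence goes in, related proteins come out) and the computer scientist's view (typed data objects passed between services). A minimal sketch of the second view follows, assuming the three datatype names on the slide form a chain from `DNASequence` to `NCBI_gi` to `Sequence_Alignment`. The `Service` class, the service names, and the registry are hypothetical and do not reproduce any real service registry's API.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Service:
    name: str      # hypothetical service name
    consumes: str  # datatype the service accepts
    produces: str  # datatype the service returns


# Hypothetical registry; the datatype names come from the slide, the
# services and the order linking them are assumptions.
REGISTRY = [
    Service("search_known_proteins", "DNASequence", "NCBI_gi"),
    Service("align_query_to_hit", "NCBI_gi", "Sequence_Alignment"),
]


def plan(start: str, goal: str, registry: List[Service]) -> Optional[List[Service]]:
    """Breadth-first search for the shortest chain of services from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        datatype, path = queue.popleft()
        if datatype == goal:
            return path
        for service in registry:
            if service.consumes == datatype and service.produces not in seen:
                seen.add(service.produces)
                queue.append((service.produces, path + [service]))
    return None  # no chain of services connects the two datatypes


workflow = plan("DNASequence", "Sequence_Alignment", REGISTRY)
print([service.name for service in workflow or []])
```

Matching services on the datatype they consume is what lets a tool offer the user only the next steps that make sense for the data in hand, without the user needing to know the type names.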

Audience

God

Amoeba

Taverna self-starters

Willing to take training

Capable but fearful

Self-perception of computer skills

The need for shoehorns

• The current vision of the Semantic Web intends to create a new structure starting up with no reference to its vast, functioning, but more primitive predecessor … things just don’t happen like that

All the Web as Workflows

Seahawk

Proxied Web page

Drag ‘n’ drop

Seahawk prompting

What’s Ahead?

The more a man learns, the more he realizes how little he knows

Semantic Web

http://www.uniprot.org/tissues/229 http://purl.uniprot.org/po/0009009

Take home messages

As tech improves, we can ask better questions

We will need shoehorns to access existing resources for the foreseeable future