Meta data and bioinformatics

Bioinformatics is EBI-centred, loosely organised

The term ‘bioinformatics’ was coined by Paulien Hogeweg ~1979

European bioinformatics started, more or less, under Chris Sander at the EMBL ~1988

Bioinformatics is organised chaos with too many α-males

The EBI is the boss, but they are powerless; this schizophrenic situation will not change with its Elixir ESFRI project

EBI’s idea of a wheel-with-spokes ESFRI project is wrong

Bioinformatics cannot function without the EBI, even if the bioinformaticians wanted it to

Having a boss makes it easy to make decisions about ontologies, meta data, etc

Meta data and bioinformatics

Bioinformatics has been dealing with data from day one, and bioinformaticians dealt with interoperability even before email could be sent from the Netherlands to Germany.

Even the databases from the late 1970s had accession codes.

We are all providers. Our users sit in life science labs.

We are about to start second-generation bioinformatics.

Data deposition is obligatory if you want to publish.

Data deposition generates citations (that generate money).

The ‘start’ of centralised databases

The schema…….

Submission – Chambon 1987

An early submission

Scale

[Plot: ‘Stuff’ against years]

The business model

The boss…….

EMBRACE is the work of a cohesive community of information engineers. I have to thank you for that, and thank the Commission for their support. I hope the community persists beyond EMBRACE.

EMBRACE has been the FP6 NoE for database and tool interoperability. Outcome: SOAP, the EMBRACE Web service registry, an ontology for bioinformatics, Grid facilities, and a human network.

Citation

Discovery

Ontology (headed by previous EBI boss…)

Ontologies

Problems

Ontologies aren’t ontologies.

Metadata aren’t metadata.

Too many α-males.

We did the metadata EMBRACE project as a prelude to semantic applications, without really knowing what is meant by ‘semantic’.

Data must be stored. The easiest way of doing this is using big monolithic databases. But with human genome data that simply isn’t possible. Remote data will have to do. But how do we get access?

Do we trust each other internationally with the data? (E.g. one USA database removed anthrax-related data.)

What happens if the EBI has to stop because of a lack of money?

Solutions?

Better search methods:

MRS searches in/with compressed data.

Triplet comparison.

Text analysis technologies based on word vectors.

Search on distributed data.
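
To make the word-vector item concrete, here is a minimal sketch, not the method behind any of the tools named above. It assumes the simplest possible word vector (raw word counts) and ranks a few database descriptions against a query by cosine similarity. The accession codes and descriptions are invented for illustration, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Turn a text into a sparse word-count vector (a Counter)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


def rank(query, records):
    """Return (score, accession) pairs sorted from best to worst match."""
    q = word_vector(query)
    scored = [(cosine(q, word_vector(text)), acc) for acc, text in records.items()]
    return sorted(scored, reverse=True)


# Hypothetical records: accession codes and descriptions are invented.
RECORDS = {
    "DEMO0001": "serine protease inhibitor from human plasma",
    "DEMO0002": "ribosomal protein from yeast",
    "DEMO0003": "protease from bacterial cell wall",
}

if __name__ == "__main__":
    for score, acc in rank("human protease inhibitor", RECORDS):
        print(f"{acc}  {score:.2f}")
```

Raw counts are the crudest choice; weighting rare words more heavily would be the obvious next step, but the ranking idea stays the same.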

Searchability (of the many small databases):

Continuous communication (Elixir) with niche-databasers.

Follow the EBI (if they stay follow-able).

Cross-platform search.

Better science to know what queries are needed (SB):

Deeper hyperlinks.

Wider hyperlinks.

Inference engines (those reaeaeaeaeallllly need metadata)

Summary

Bioinformatics is EBI-centred, loosely organised

The EBI is the boss, but powerless

Bioinformatics cannot function without the EBI

Having a boss makes it easy to make decisions

EBI decides, in consultation, on ontologies, meta data, etc.