Vision and Politics. Graduate course: Advanced...

Post on 18-Dec-2015

Vision and Politics

Graduate course: Advanced Research Topics in Databases

Έλενα Σαρρή

Professor: T. Sellis

A. Silberschatz, M. Stonebraker, J. Ullman, "Database Research: Achievements and Opportunities Into the 21st Century," ACM SIGMOD Record, March 1996.

summary of the workshop held in May of 1995

• Key role in creating technological infrastructure

• Areas of database research: support for multimedia objects, distribution of information, new database applications, workflow and transaction management, and ease of database management and use.

• New capabilities provided by the technology developments in hardware capability, hardware capacity, and communication.

need for industrial support

Points of interest

Two themes:

• New demands require new solutions

• Historically confirmed ability to put ideas to practical use

So the viability of the db research community is vital

The Changing World of Database Management

• A db system is a computerized record keeping system

• Stores and provides access to information

• Basic components: data, hardware, software (considerations: scope, magnitude, complexity)

• Hardware impact: cost, speed, components (multiprocessors), capacity

Every human enterprise includes computerized information

The Case for DBMS Research

Goal of this report:

• financial support of db research is a worthwhile investment

• Illustrating the pay off from funding db research

Recent Research Achievements

• As the 1990 report noted, the great majority of the market belongs to US-owned companies

• Products derived from research prototypes

Outline of the new developments:

Object-Oriented and Object-Relational Database Systems

In 1990 there were few research prototypes of OODBs, and the relationship between OODBs and relational systems was unclear.

A few research prototypes combined features of relational DBMSs (SQL access to simple data types) with OODBs (modelling of complex data) to create ORDBs (object-relational database systems) and DOODs (deductive object-oriented database systems).

Today there is a variety of commercial OODBs. It is a $75M/year market growing at about 50% per year.

Support for New Data Types

• Research attempts of last decade concerning spatial and temporal data types are now part of commercial DBMSs and GIS

Transaction Processing

A DBMS supports coordination of many users of shared information

• Traditional transaction management is not adequate for today’s distributed information systems

New db Applications

EOSDIS: Earth Observing System Data and Information System

EOS is a collection of satellites gathering information about the atmosphere, oceans, and land. They return about 1/3 petabyte of data per year, which is integrated into EOSDIS.

Challenges are:

• Providing on-line access to PB-sized databases

• Supporting thousands of information consumers

• Providing effective mechanisms for browsing and searching for the desired data

Electronic Commerce

There are thousands of projects supporting electronic purchasing of goods. E-commerce involves a very large number of participants interacting over the network.

Unlike EOSDIS, there are many suppliers and many consumers. Among the challenges are:

• Heterogeneous information sources must be integrated

• E-commerce needs reliable, distributed authentication and funds transfer

Health Care Information Systems

Physicians need to draw on different kinds of information, such as:

• Medical records from various hospitals

• Information about drugs

• Procedures

• Diagnostic tools

Transforming the health-care sector will have a major impact on cost and quality. Challenges are:

• Integration of heterogeneous forms of information

• Access control to preserve confidentiality of medical records

• Interfaces to information appropriate for health-care professionals

Digital Publishing

Storage of books & articles in electronic form and delivery through high speed networks offering new features like audio & video.

The education industry draws much closer to publishing and offers facilities like interactive learning. Challenges are:

• Management and delivery of extremely large bodies of data at very high rates

• Protection of intellectual property

Trends That Affect Database Research

Technological trends: exponential improvement over the last 50 years

Improvement by a factor of 10 or more every 10 years in:

· number of machine instructions executable per second

· processor cost

· amount of secondary storage per unit cost

· amount of main memory per unit cost

Improvement in price/performance leads to new products and services.

Over the last few years, also in:

· number of bits transmitted per unit cost

· number of bits transmitted per second

So it is now possible to deal with terabytes and complex queries cost-effectively.

Database Architecture Trends

Changes in db structure and use:

• The relational approach is today ubiquitous (from very large parallel architectures to home computers)

• Client-server architectures will become progressively more common, with database servers accessed remotely over networks

• Traditional data has been joined by various kinds of multimedia data. This trend is fuelling the success of ORDBs

Information Highway

• The number of Web bits carried by the Internet is growing by 15-20% per month, or roughly a factor of 10 per year.

• Databases will play a critical role in this information explosion.
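The monthly and yearly growth figures above can be related by compounding. The following is a minimal sketch, assuming the monthly rate stays constant over the year; the function name `yearly_factor` is illustrative and not taken from the report.

```python
def yearly_factor(monthly_rate: float, months: int = 12) -> float:
    """Compound a constant monthly growth rate over the given number of months."""
    return (1.0 + monthly_rate) ** months


# Compare the two ends of the quoted 15-20% range.
for rate in (0.15, 0.20):
    print(f"{rate:.0%} per month -> x{yearly_factor(rate):.1f} per year")
```

Under that assumption, 15% and 20% per month compound to about 5.4x and 8.9x per year, so the factor of 10 quoted above reads as a round upper figure.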

New Research Directions

• Putting multimedia objects into DBMSs

• Distribution of information

• New uses of db

• New transaction models

• Easy use and management of db

Support for Multimedia Objects

Areas of research in multimedia data:

Tertiary Storage

· a new level of the storage hierarchy

· access is made by buffering selected data in secondary storage, just as access to secondary storage is made by buffering selected data from disk in main memory

Tertiary storage devices are orders of magnitude slower than secondary storage (disks), yet also of vastly greater capacity.
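The layered buffering described above can be sketched as two stacked caches. This is only an illustration under stated assumptions: the class names `BufferedLevel` and `TertiaryStore`, the LRU eviction policy, and the capacities are all hypothetical choices, not something the report prescribes.

```python
from collections import OrderedDict


class BufferedLevel:
    """A small LRU buffer in front of a slower storage level (hypothetical sketch)."""

    def __init__(self, backing, capacity):
        self.backing = backing        # slower level: any object with a read(key) method
        self.capacity = capacity      # number of items this level can buffer
        self.buffer = OrderedDict()   # key -> value, least recently used first
        self.misses = 0               # reads that had to go to the slower level

    def read(self, key):
        if key in self.buffer:
            self.buffer.move_to_end(key)       # mark as most recently used
            return self.buffer[key]
        self.misses += 1
        value = self.backing.read(key)         # fetch from the slower level
        self.buffer[key] = value
        if len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)    # evict the least recently used item
        return value


class TertiaryStore:
    """Stand-in for a tape or optical jukebox: huge, but every read is slow."""

    def __init__(self, data):
        self.data = data
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]


tertiary = TertiaryStore({f"frame{i}": f"pixels{i}" for i in range(1000)})
disk = BufferedLevel(tertiary, capacity=100)   # secondary storage buffers tertiary
memory = BufferedLevel(disk, capacity=10)      # main memory buffers secondary

for _ in range(3):
    memory.read("frame7")
```

Only the first of the three reads reaches the tertiary store; the repeats are served from the main-memory buffer.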

New Data Types

To support multimedia objects

Quality of Service (QoS)

· delivering multimedia data to many users: bottleneck

· different needs (movie / lecture video)

· optimize access based on predicted use

User Interface Support

Requirement for new interfaces other than SQL

Example: querying an image db needs an interface that allows description of color, shape, and other characteristics

Example: for a course video, sample frames, text-based indexes, and segment search

Distribution of Information

The new environment facilitated by the Web requires rethinking the concepts of current distributed database technology

Degree of autonomy

• Db sources connected through a network are owned by different participants (health-care systems, the Web)

• A source may refuse a connection

• Different systems have different capabilities

Accounting and Billing

• Clients pay for each access to remote data

• Querying strategies must take billing rates into account: how much is the client willing to pay?

Security and Privacy

• Flexible authentication and authorization systems

• Sale of information to anonymous users

Replication and Reconciliation

• Nodes are often disconnected

• Data is often duplicated

• Copies are reconciled at connection

• Reconnection is a frequent event

• Need for high-speed protocols and algorithms

• Example: call-routing system
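Reconciling copies at reconnection can be sketched as a merge of two replicas. The example below assumes a last-writer-wins policy with per-entry timestamps, chosen only for brevity; the report does not prescribe a policy, and the routing-table contents are made up.

```python
def reconcile(replica_a, replica_b):
    """Merge two replicas; for each key keep the entry with the newer timestamp.

    Each replica maps key -> (value, timestamp). Last-writer-wins is an
    assumed policy, not one taken from the report.
    """
    merged = dict(replica_a)
    for key, (value, stamp) in replica_b.items():
        if key not in merged or stamp > merged[key][1]:
            merged[key] = (value, stamp)
    return merged


# Two copies of a hypothetical call-routing table that diverged while disconnected.
node_a = {"555-0100": ("switch-1", 10), "555-0101": ("switch-2", 12)}
node_b = {"555-0100": ("switch-3", 15), "555-0102": ("switch-4", 11)}

merged = reconcile(node_a, node_b)
```

A single pass over the smaller replica keeps the merge cheap, which matters when reconnection is frequent; real systems would also need to handle deletions and conflicting updates with equal timestamps.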

Data Integration and Conversion

Information sources have a variety of formats and models

Use of mediators like agents

Information Retrieval and Discovery

Problems with information from informally connected sources and heterogeneous data:

• Changes without notice

• Unclear definitions

• Need for techniques to support searches, as in db technology (indexes)

Data Quality

Different sources with different reliability

• Evaluate and query the reliability or the lineage (origin)

Data Mining

• Extraction of information from large bodies of data

• Decision makers

• Fast response

• Formulate query

• Optimization techniques for complex queries

• Use by non-expert users

Data Warehouses

Huge collections of data mainly used for decision support systems. They hold copies of data from one or more databases.

Issues:

• Tools for data pumps (modules for obtaining updates and translating them)

• Methods for data scrubbing (making data consistent, identifying different representations of the same value)

• Creating a metadictionary (recording how data was obtained)
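Identifying different representations of the same value, as in the data scrubbing item above, can be sketched as normalization followed by an alias lookup. The alias table and the country field are hypothetical examples; a real warehouse would draw such mappings from its own metadictionary.

```python
import re

# Hypothetical alias table: known spellings mapped to one canonical value.
ALIASES = {
    "usa": "United States",
    "us": "United States",
    "united states": "United States",
    "united states of america": "United States",
}


def scrub(value: str) -> str:
    """Return a canonical form of a country field, or the cleaned input if unknown."""
    cleaned = re.sub(r"[^\w\s]", "", value)          # drop punctuation such as dots
    cleaned = re.sub(r"\s+", " ", cleaned).strip()   # collapse runs of whitespace
    return ALIASES.get(cleaned.lower(), cleaned)


rows = ["U.S.A.", "  united   states ", "US", "Greece"]
scrubbed = [scrub(r) for r in rows]
```

Values with no known alias pass through with only whitespace and punctuation cleaned, so the step never discards data it does not recognize.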

Repositories

storing and managing both data and metadata

They must:

• Maintain an evolving set of representations of the same or similar information (a module represented as source code, object code, flow diagram, etc.)

• Support versions (snapshots of an element evolving over time) and configurations (versioned collections of versions)
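The distinction between versions and configurations can be sketched with two small data structures. All names here (`Version`, `Repository`, `commit`, `configuration`) are illustrative assumptions, and for brevity the sketch does not version the configurations themselves.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """A snapshot of one element (for example, a module's source code) at one point in time."""
    element: str
    number: int
    content: str


@dataclass
class Repository:
    """Keeps every version of every element; names are illustrative only."""
    history: dict = field(default_factory=dict)   # element name -> list of Version

    def commit(self, element: str, content: str) -> Version:
        versions = self.history.setdefault(element, [])
        version = Version(element, len(versions) + 1, content)
        versions.append(version)
        return version

    def configuration(self, picks: dict) -> list:
        """A configuration: one chosen version number per element."""
        return [self.history[name][number - 1] for name, number in picks.items()]


repo = Repository()
repo.commit("parser.src", "v1 source")
repo.commit("parser.src", "v2 source")
repo.commit("parser.obj", "v1 object code")

release = repo.configuration({"parser.src": 2, "parser.obj": 1})
```

Here `release` pairs the second snapshot of the source with the first snapshot of the object code, two representations of the same module.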

Ease of use

• Improved interfaces for end users, application programmers, and administrators

• Easier installation and upgrade of db management systems

Database Metatheory: Asking the Big Queries

Christos H. Papadimitriou

University of California San Diego

Theory and its Function  

• In the context of an applied science, theory in the broad sense is the use of significant abstraction in scientific research: the suppression of low-level details of the object or artifact being studied or designed.

How theoreticians deal with complexity:

• (a) They develop mathematical models of the artifact: Turing machines, formal languages, and the relational model.

• (b) They turn abstract models into reality through solutions (typically, algorithms and representational schemes) that are derived from the mathematical models. This function of theory is what we usually mean by “synthesis” or “positive results.” Such results must actually be verified by experiments.

• (c) They analyze the mathematical models to predict the outcome of the experiments (and calibrate the models).

• (d) They explore: they develop and study extensions and alternative applications of the model, and they seek its ultimate limitations.

In exploring, they introduce and apply more and more sophisticated mathematical techniques, and build a theoretical body of knowledge and a mathematical methodology that go beyond the motivating artifact and model.

Exploration is usually guided by aesthetics, taste, and sense of what is “important” and “relevant”.

Uncontroversial, necessary parts of the research and discovery process in any science of the artificial:

(a) model building, (b) synthesis, (c) analysis

• Criticized most, and predictably liked most by theoreticians: exploration

arguments in defense of exploration:  

• (1) It has been historically beneficial to computer science;

• (2) in reasonable doses, it promotes the field’s health and connectivity;

• (3) exploration and proving elegant theorems are natural and attractive activities, and so it would be wrong and futile to repress them.

Drawbacks

(1) It can disorient the field and lead it into crisis when it is disproportionately extensive in comparison to model building, synthesis, and analysis

(2) It will not thrive if it consistently ignores practice

(3) It requires true discipline and honesty in its exposition, especially in avoiding frivolous and unchecked claims of relevance and applicability.

On Negative Results:  

In computer science, as in mathematics, theorems are judged by:

• Elegance

• Depth

• Importance in long-term research

But here also by:

• Whether the result is complexity-reducing (positive) or points out a setback in this regard (negative)

 

• negative results are the only possible self-contained theoretical results

• Positive results (complexity-reducing solutions such as algorithms and representation schemes) must be validated experimentally and can therefore be considered as mere invitations to experiment.

• delimitation is the ultimate success in exploration  

What is “Good Theory”?  

Paul Feyerabend: “Science is an essentially anarchic enterprise. […] There is no idea that is not capable of improving our knowledge. […] The only principle that does not inhibit progress is ‘anything goes’.”

• although there is no such thing as “bad science”, success is an important aspect

• Success is not just an inner process driven by methodology and results but a much more complex predicate of the social dynamics of the field and its environment, and of course open to circumstances and chance.

• Adopting a metaphor from other sciences is essential in computer science and increases an idea's prestige and propagandistic value.

What does this all mean for theoreticians?

• Free-style exploratory theoretical research

• Its success will depend mainly on its propagandistic value, its ability to contaminate its environment, and especially its potential to influence practice

• A theoretician should be an expositor and popularizer, bringing his or her results to the attention of the experimentalist and the practitioner and convincing them of their value by arguments that are measured, rigorous, and credible

• Doing your own experiments helps a lot

• The ultimate success of a scientific idea is, of course, the launching of a victorious scientific revolution

Paradigms and Revolution

The stages of the scientific process according to Thomas Kuhn for natural science:

Immature science → Normal science → Crisis → Revolution

• Long periods of “normal science,” in which the field progresses incrementally within a broadly accepted framework that includes not only scientific assumptions and theories, but also conventions about what are appropriate questions to ask and how further development should proceed. Such a framework is called a paradigm. Copernicus’ model and Einstein’s general theory seem to be the most frequently mentioned paradigms.

• Scientists consider it their duty to defend the paradigm and show that it works

Natural Science

• But cruel facts that do not fit in the paradigm accumulate, despite the community’s ingenious efforts to sweep them under the rug; the paradigm creaks and staggers, and we enter a stage of “science in crisis”

• New kinds of ingenuity and imagination develop and compete. Eventually, and typically, one of them triumphs and becomes the next paradigm; this is the stage of “scientific revolution”

• ultimate success of a scientific idea is, of course, the launching of a victorious scientific revolution

Adaptation to applied science and the sciences of the artificial

• Kuhn’s model assumes a static, eternal object of study

• The sciences of the artificial study artifacts, which keep changing while being studied, or because they are being studied: a tight closed-loop interaction between a science and its object

• In the case of computer science, the stages of Kuhn’s model are much accelerated

Crises in natural science are caused by the accumulation of anomalies, observations of the objective reality that cannot fit the current paradigm. In contrast, in computer science we have no objective reality against which to judge our scientific work.

The operational analog of falsifiability in computer science

“Research units” (researchers, papers, research groups, results, or subfields) influence each other

Connection

Autistic behavior is the exception that tests the wisdom of the rule

Most of theory is within a few hops from practice, and vice-versa.

Bottom snapshot:

• The local situation seems unchanged (say, the average degree is the same)

• Connectivity is low

• Tangents and introverted components are the rule

• The little connectivity that exists is via long paths
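The point that the average degree can stay the same while connectivity collapses can be illustrated with two toy graphs. This is a sketch only: the six "research units" and their links are invented, and the crisis graph is an extreme case in which theory and practice are fully disconnected rather than joined by long paths.

```python
from collections import deque


def hops(graph, start):
    """Breadth-first search: number of hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_degree(graph):
    """Mean number of links per research unit."""
    return sum(len(adj) for adj in graph.values()) / len(graph)


# Hypothetical research units: T* are theory, P* are practice.
healthy = {   # one ring: every unit reaches every other
    "T1": ["T2", "P3"], "T2": ["T1", "P1"], "P1": ["T2", "P2"],
    "P2": ["P1", "T3"], "T3": ["P2", "P3"], "P3": ["T3", "T1"],
}
crisis = {    # two introverted triangles: same degrees, no path between them
    "T1": ["T2", "T3"], "T2": ["T1", "T3"], "T3": ["T1", "T2"],
    "P1": ["P2", "P3"], "P2": ["P1", "P3"], "P3": ["P1", "P2"],
}

print(average_degree(healthy), average_degree(crisis))
print(hops(healthy, "T1"))
print(hops(crisis, "T1"))
```

Both graphs give every unit exactly two links, yet in the second one no practice unit is reachable from theory at all.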

Practitioners stop communicating with relevant theory

Interaction becomes unpleasant, with an unfriendly, defensive style

The field is in crisis.

Revolution as in natural science

Practitioners (having given up on theory) develop and use their own abstractions, models, and mathematical techniques

while theoreticians make their own attempts to reconnect to practice (responding to “pressures” from within their community and outside).

The uninspiring practical problems and the unresponsive theoretical work that triggered the crisis become less central, and new small research traditions blossom.

Well-targeted exploratory theory connects several of them, and a new healthy state emerges from the ashes. A successfully championed new research paradigm may then take over

Why the relational model counts as a paradigm in computer science

(1) It was a powerful and attractive proposal (whose plausibility was expertly supported by theoretical arguments)

(2) it was explicitly open-ended, a whole framework for research problems, applications, and experiments;

(3) it came as the result of a crisis (or was it “immature science”?);

(4) it was indeed followed by a period of normal science.

we are now in the blues of a crisis, or even in the flames of an on-going revolution

A Brief History of Practice

Ancient Greek tradition strongly favors theory over practice (Aristotle).

Before the last century, an inventor could become famous only if he was a moonlighting major theoretician or artist (Archimedes, Aristarchus, Leonardo da Vinci) or if his invention helped in the spreading of theoretical knowledge (Gutenberg).

Practice starts obtaining a measure of respectability with Galileo (1564-1642), and later under the influence of the British empiricist philosophers.

However, only after James Watt (1736-1819) did sophisticated theoretical knowledge come to the assistance of practice and invention, thus launching the industrial age and the traditions of applied science and engineering.

• Respect for practice is universal today

• Theory and practice have collaborated for two centuries, with theory dominating important domains in applied science due to its academic prowess and prestige

• Serious and systematic ideological attack against the value and necessity of theory in applied science seems to be a novel and disturbing phenomenon of the last decade or so.

• The history of computer science is a miniature of the history of science

• The strongest influence came from mathematics (and less from electrical engineering and physics),