CMSC424: Database Design

28
CMSC424, Spring 2005 CMSC424: Database Design Lecture 5

description

CMSC424: Database Design. Lecture 5. Review: Relational Algebra. Relational Algebra Operators Select (  ) Project (  ) Set Union (U) Set Difference (-) Cartesian Product (  ) Rename (  ) These are called fundamental operations. 3. Outer Joins ( ). - PowerPoint PPT Presentation

Transcript of CMSC424: Database Design

Page 1: CMSC424: Database Design

CMSC424, Spring 2005

CMSC424: Database Design

Lecture 5

Page 2: CMSC424: Database Design

CMSC424, Spring 2005

Review: Relational Algebra

Relational Algebra Operators1. Select ()2. Project ()3. Set Union (U)4. Set Difference (-)5. Cartesian Product ()6. Rename ()

These are called fundamental operations

Page 3: CMSC424: Database Design

CMSC424, Spring 2005

Relational AlgebraRedundant Operators

4. Update ( )

2. Division ( )

1. Natural Join ( )

3. Outer Joins ( )

Page 4: CMSC424: Database Design

CMSC424, Spring 2005

Natural Join

Idea: match tuples on common attributes

A B C D

1223

αααβ

+--+

10102010

E B D

‘a’‘a’‘b’‘c’

ααββ

10201010

r s

A B C D E

12233

αααββ

+--++

1010201010

‘a’‘a’‘a’‘b’‘c’

=

Relation1 Relation2Notation:

Page 5: CMSC424: Database Design

CMSC424, Spring 2005

Division

Query: Find customers who have accounts in all branches in Brooklyn

r1 all branches in Brooklynr2 associate customers with branches they

have accounts in

Now what ? Use the division operator

Relation1 Relation2Notation:

Idea: expresses “for all” queries

Page 6: CMSC424: Database Design

CMSC424, Spring 2005

bname lno amt

DowntownRedwood

Perry

L-170L-230L-260

300040001700

loan =cname lno

JonesSmithHayes

L-170L-230L-155

borrower =

=bname lno amt cname

DowntownRedwood

L-170L-230

30004000

JonesSmith

Join result loses… any record of Perry any record of Hayes

Outer Joins

Motivation:

loan borrower =

Page 7: CMSC424: Database Design

CMSC424, Spring 2005

bname lno amt

DowntownRedwood

Perry

L-170L-230L-260

300040001700

loan =cname lno

JonesSmithHayes

L-170L-230L-155

borrower =

bname lno amt cname

DowntownRedwood

Perry

L-170L-230L-260

300040001700

JonesSmith

• preserves all tuples in left relation1. Left Outer Join ( )

┴ = NULL

Outer Joins

loan borrower =

Page 8: CMSC424: Database Design

CMSC424, Spring 2005

bname lno amt cname

DowntownRedwood

L-170L-230L-155

30004000

JonesSmithHayes

bname lno amt

DowntownRedwood

Perry

L-170L-230L-260

300040001700

loan =cname lno

JonesSmithHayes

L-170L-230L-155

borrower =

• preserves all tuples in right relation2. Right Outer Join ( )

┴ = NULL

Outer Joins

loan borrower =

Page 9: CMSC424: Database Design

CMSC424, Spring 2005

bname lno amt

DowntownRedwood

Perry

L-170L-230L-260

300040001700

loan =cname lno

JonesSmithHayes

L-170L-230L-155

borrower =

• preserves all tuples in both relations3. Full Outer Join ( )

┴ = NULL

Outer Joins

bname lno amt cname

DowntownRedwood

Perry┴

L-170L-230L-260L-155

300040001700

JonesSmith

┴Hayes

loan borrower =

Page 10: CMSC424: Database Design

CMSC424, Spring 2005

1. Deletion: r r – s e.g., account account – σbname=Perry (account)(deletes all Perry accounts)

2. Insertion: r r se.g., branch branch {(Waltham, Boston, 7M)}(inserts new branch with bname = Waltham, bcity = Boston, assets = 7M)

3. Update: r πe1,…,en (r)e.g., account πbname,acct_no,bal*1.05 (account)(adds 5% interest to account balances)

Update

Identifier QueryNotation:

Common Uses:

Page 11: CMSC424: Database Design

CMSC424, Spring 2005

Extended Relational Algebra

1. Generalized projection

2. Aggregates

Page 12: CMSC424: Database Design

CMSC424, Spring 2005

e1,…,en (Relation)

e1,…,en can include arithmetic expressions – not just attributes

cname limit balance

JonesTurner

50003000

20002500

credit =

π cname, limit - balance (credit) = cname limit-balance

JonesTurner

3000500

Generalized Projection

Notation:

Example

Then…

Page 13: CMSC424: Database Design

CMSC424, Spring 2005

e1,…,en (Relation)

e1,…,en can include arithmetic expressions – not just attributes

cname limit balance

JonesTurner

50003000

20002500

credit =

π cname, limit - balance as limitbalance (credit) = cname limitbalance

JonesTurner

3000500

Generalized Projection

Notation:

Example

Then…

Page 14: CMSC424: Database Design

CMSC424, Spring 2005

Aggregate Functions and Operations

Aggregation function takes a collection of values and returns a single value as a result.

avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values

Page 15: CMSC424: Database Design

CMSC424, Spring 2005

Aggregate Operation – Example

Relation r:A B

C

77310

g sum(c) as sumC (r)sum-C

27

Page 16: CMSC424: Database Design

CMSC424, Spring 2005

Aggregate Functions and OperationsGeneral form:

G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E)E is any relational-algebra expressionG1, G2 …, Gn is a list of attributes on which to group (can be empty)

Each Fi is an aggregate function

Each Ai is an attribute name

Page 17: CMSC424: Database Design

CMSC424, Spring 2005

Aggregate Operation – Example

Relation account grouped by branch-name:

branch-name g sum(balance) (account)

branch-name account-number balance

PerryridgePerryridgeBrightonBrightonRedwood

A-102A-201A-217A-215A-222

400900750750700

branch-name balancePerryridgeBrightonRedwood

13001500700

Page 18: CMSC424: Database Design

CMSC424, Spring 2005

Other Theoretical Languages

Relational CalculusNon-proceduralTuple relational calculus

ExamplesSafety

Domain relational calculus

Page 19: CMSC424: Database Design

CMSC424, Spring 2005

Review: Query Languages

Theoretical Use Practical Use

Relational Algebra

Formal semantics of practical QL’s

Language Expressivity

Internal query representation for query optimizers

TRC Foundation for SQL

SQL-92 - Standard for Relational DB Query Languages

SQL-99 - Standard for Object-Relational DB Query Languages

OQL - Standard for Object-Oriented DB Query Languages

XQuery - Standard for XML-based DB Query Languages

Page 20: CMSC424: Database Design

CMSC424, Spring 2005

SQL - Introduction

Standard DML/DDL for relational DB’s

• DML = Data Manipulation Language (queries, updates)• DDL = Data Definition Language (create tables, indexes, …)

• View definition• Security (Authorization)• Integrity constraints• Transactions

• Early 70’s, IBM system R project (SEQUEL)• Later, become standard (Structured Query Language)

Also includes

History

Page 21: CMSC424: Database Design

CMSC424, Spring 2005

SQL: Basic StructureSELECT A1, ….., An

FROM r1, ….., rm

WHERE P

Equivalent to:

A1,A2,…,An (σP (r1 … rn ))

Page 22: CMSC424: Database Design

CMSC424, Spring 2005

A Simple SELECT-FROM-WHERE Query

Similar to

SELECT bnameFROM loanWHERE amt > 1000

Can instead write :SELECT DISTINCT bnameFROM loanWHERE amt > 1000

(removes duplicates from result)

We will discuss bag algebra a bit later

bname ( amt > 1000 (loan) )

bnameRedwoodPerryDowntownPerry

Why preserve duplicates?

Duplicates are retained(i.e., result not a set)

But not quite

Page 23: CMSC424: Database Design

CMSC424, Spring 2005

Another SELECT-FROM-WHERE Query

Similar to

SELECT cname, balanceFROM depositor, accountWHERE depositor.acct_no =

account.acct_no

Can also write

SELECT d.cname, a.balanceFROM depositor as d,

account as aWHERE d.acct_no = a.acct_no

(neccessary for self-joins)

cname balanceJohnsonSmithHayesTurner

JohnsonJones

Lindsay

500700400350900750700

Note:

cname, balance (depositor account )

Returns:

Page 24: CMSC424: Database Design

CMSC424, Spring 2005

The SELECT Clause

• Equivalent to (generalized) projection, despite name

e.g: SELECT *FROM loan

e.g: SELECT bname, acct_no, balance*1.05

FROM account

• Can use ‘*’ to get all attributes

• Can write SELECT DISTINCT to eliminate duplicates• Can write SELECT ALL to preserve duplicates (default)• Can include arithmetic expressions

Page 25: CMSC424: Database Design

CMSC424, Spring 2005

(or , depending on WHERE clause)

The FROM Clause

• Equivalent to cartesian product ()

e.g: FROM borrower, loan

• Computes borrowerloan

• Identifies borrower, loan columns in result, allowing one to write

WHERE borrower.lno = loan.lno

e.g: FROM borrower as b, loan as l

allows one to write

WHERE b.lno = l.lno

• Binds tuples in relations to variable names

Page 26: CMSC424: Database Design

CMSC424, Spring 2005

The WHERE Clause

• Equivalent to Selection, despite name

1. Simpleattribute relop attribute (or constant)

(relop: =, <>, <, >, <=, >=)

2. Complex (using AND, OR, NOT, BETWEEN)e.g: SELECT lno

FROM loanWHERE amt BETWEEN 90000 AND 100000

is the same as…

SELECT lnoFROM loanWHERE amt >= 90000 AND amt <= 100000

• WHERE predicate can be:

Page 27: CMSC424: Database Design

CMSC424, Spring 2005

Data Definition Language

Allows specification of relation schema as well as:Attribute domainsIntegrity constraintsSecurity and authorization informationCreation of Indexes…

Page 28: CMSC424: Database Design

CMSC424, Spring 2005

DDL

CREATE TABLE branch(branch-name char(15) not null, branch-city char(30), assets integer, primary key (branch-name), check (assets >= 0))

DROP TABLE branch

ALTER TABLE branch ADD zipcode integer