of 44 /44
- Relational Model and Algebra Κεφάλαιο 3 ΣΧΕΣΙΑΚΟ ΜΟΝΤΕΛΟ
• date post

20-Dec-2015
• Category

## Documents

• view

220

0

Embed Size (px)

### Transcript of YV - Relational Model and Algebra 110 Κεφάλαιο 3 ΣΧΕΣΙΑΚΟ ΜΟΝΤΕΛΟ.

YV - Relational Model and Algebra 1

Κεφάλαιο 3Κεφάλαιο 3

ΣΧΕΣΙΑΚΟ ΜΟΝΤΕΛΟ

YV - Relational Model and Algebra 2

DATABASE SYSTEMS: The Relational Model and Relational Database Systems

DATABASE SYSTEMS: The Relational Model and Relational Database Systems

OUTLINE

– Informal and Formal Definition of the Model

Structures, Constraints, Operations– Relational Algebra– Relational Calculus – The languages SQL and QBE– Views - Integrity Constraints using SQL– Normalization and Relational Database Design– Relational Database Systems

YV - Relational Model and Algebra 3

Relational Model: Informal DefinitionRelational Model: Informal Definition

Proposed in 1970 by E.F. Codd (“A relational model for large shared data banks”, CACM), as a theory of a database model

Spurred tremendous research in the database field and became the most popular logical data model - many relational DBMSs are today available on nearly all platforms

A relational database is a set of relations RELATION: A table of values. Each column in the table has a

header, called an attribute (field). Each row in the table is called a tuple (record) and stands for an entity or a relationship.

YV - Relational Model and Algebra 4

Formal Definition Formal Definition

STRUCTURES– Only one kind: relations (which have a name)

A domain D is a set of values , D= {d1, d2 , ..., dn}

e.g., DOMAIN OF NAMES = the set of all names

DOMAIN of WEIGHT = the set of all weights

CHAR STRINGS from 1 to 10 in length, etc.

An attribute A names a property of interest in a relation and takes its values from some associated domain D(A).

e.g., EMPLOYEE_NAME, WEIGHT, etc.Attributes are the column names (headers) in a relation (Notation: R.A, or R[A] where R is the relation name)

YV - Relational Model and Algebra 5

Structure Definitions (2)Structure Definitions (2)

A relation schema R is the name and attributes of a relation, with the underlying domains for the attributes. When obvious, the domains are ignored

Notation: R(A1 , A2 , ... An)

The degreee n of a relation R is the number of attributes in R

A database schema S is a set of relation schemas.

Notation: S = {R1 , R2 , ... Rm }

e.g., COMPANY = { EMPLOYEE, PROJECT, ... }

YV - Relational Model and Algebra 6

Structure Definitions (3)Structure Definitions (3)

-- A tuple t of a relation R(A1 , A2 , ... An) is an (ordered) set of values t = <v1 , v2 , ... vn >, where each vi is an element of the domain D(Ai).

-- A relation instance r(R), simply, relation, is a set of tuples

r(R) = { t1 , t2 , ... tk }

alternatively, it is a subset of the Cartesian product

r(R) D(A1) x D(A2) x ... x D(An)

-- The cardinality of R is the number of tuples in r(R), it is denoted by CARDR

-- A relational database is a set of relations (instances)

YV - Relational Model and Algebra 7

Characteristics of RelationsCharacteristics of Relations

ORDERING of attributes in a relational schema is essential ORDERING of tuples in a relation is not important Every tuple is stored only ONCE in a relation (it is a set) A value may appear MULTIPLE TIMES in a column and is

considered ATOMIC (indivisible) - at times this is referred as First Normal Form (1-NF) relation

A special value, called NULL, is used to represent values that are inapplicable or unknown to the database– e.g, the PhoneNumber value of someone without a phone,

Notation: component value of a tuple t, t[Ai] = vi

YV - Relational Model and Algebra 8

Constraints in the ModelConstraints in the Model

CONSTRAINTS– Three kinds of inherent to the model constraints: KEY,

ENTITY INTEGRITY, and REFERENTIAL INTEGRITY.– Three basic explicit constraints: DOMAIN, COLUMN and

USER-DEFINED (some other explicit constraints, like the Functional Dependencies, will be discussed later.)

KEY CONSTRAINTS: The various keys, as defined for Entities and Relationships, hold in the Relational Model.

– Note that a key is a property of the relational schema (not a property of the relation)

YV - Relational Model and Algebra 9

Inherent Constraints (1)Inherent Constraints (1)

– A set of attributes SK of a relation schema R for which each tuple in any relation instance r(R) must have unique value(s) is a

superkey. That is, for distinct t1 and t2, t1[SK] t2[SK]

For instance, SSN of EMPLOYEE, NAME and ADDRESS of EMPLOYEE, SSN and NAME of EMPLOYEE, etc.

– A candidate key K is a minimal superkey (that is, no subset of the attributes in K is a superkey). K is also called key.

For instance, SSN is a candidate key for EMPLOYEE, but the combination {SSN, NAME} is not.

– A primary key PK is one of the candidate keys that is agreed to serve as an identifier for the relation (primary keys are usually distinguished by underlining)

For instance, SSN is the primary key of relation EMPLOYEE.

YV - Relational Model and Algebra 10

Inherent Constraints (2)Inherent Constraints (2)

ENTITY INTEGRITY: The primary key attributes PK in a relation schema R cannot have NULL values in any tuple of a relation instance r(R).

t[PK] NULL, for all t in r(R)

– The reason behind the above constraint is that a primary key is used to identify a tuple in a relation.

– Note that more attributes in R may be constrained to have no NULLS by explicit constraints.

YV - Relational Model and Algebra 11

Inherent Constraints (3)Inherent Constraints (3)

REFERENTIAL INTEGRITY: These constraints involve TWO relations and are used to specify a relationship among tuples of the two relations. They are also called foreign keys.

– A foreign key FK is a set of one or more attributes of a relation R1 that forms a primary key for another relation R2.

A tuple t1 in r(R1) is said to reference tuple t2 in r(R2), IF

t1[FK] = t2[FK]

For instance, for the relation WORKING-ON the attribute SSN is a foreign key (it is the primary key of EMPLOYEE).

YV - Relational Model and Algebra 12

Explicit Constraints (1)Explicit Constraints (1)

DOMAIN CONSTRAINTS: They are the rules defined in the domain definition and inherited by columns (attributes) based on that domain.– A domain can be defined, together with all its integrity rules (e.g.,

the domain of integers having all rules that apply to integers). They are usually the basic data types.

– The ideal support is through strong data typing (very rare)

COLUMN CONSTRAINTS: They are additional to the domain constraints and maintain values in a column.– Column rules go beyond the rules inherited by the domain (e.g.,

the column of small integers or integers between 1 and 10, etc. that further restrict the domain of integers.)

– In many systems, support is given with a CHECK option

YV - Relational Model and Algebra 13

Explicit Constraints (2)Explicit Constraints (2)

USER-DEFINED CONSTRAINTS: Any integrity rule, not among the ones discussed before, is classified as user-defined.

To enforce certain business rules, integrity constraints of arbitrary complexity are required.

Such constraints are expressed either procedurally or declaratively (preferred way)

Several mechanisms can be used to implement the enforcement of such rules: stored procedures, triggers, methods (for object-oriented systems)

Generally, relational DBMS are weak in enforcing rules

YV - Relational Model and Algebra 14

The COMPANY Database in the Relational Model - Schema

The COMPANY Database in the Relational Model - Schema

EMPLOYEE ( SSN, Name, BirthDate, Address, Sex, Salary, SupSSN, DNumber)

DEPARTMENT ( DNumber, DName, MgrSSN, MgrStartDate)

PROJECT ( PNumber, PName, Location, DNumber)

DEPT_LOCATION ( DNumber, DLocation )

WORKS_ON ( SSN, PNumber, HoursPW)

DEPENDENT ( SSN, DependName, Sex, BirthDate, Relationship)

YV - Relational Model and Algebra 15

The COMPANY Database in the Relational Model - Instance

The COMPANY Database in the Relational Model - Instance

.

SSN Name BDate Address Sex Salary SupSSN DNumber1234 john 9.1.55 kifisia m 30000 3344 53344 frank 8.9.45 athina m 55000 8886 59998 alice 7.6.50 ekali f 25000 9876 49876 jenny 2.6.41 patra f 43000 8886 46668 rama 5.8.56 korinth m 38000 3334 54534 joyce 3.7.62 kiato f 25000 3334 59879 jack 2.3.59 maroussi m 25000 9876 48886 james 1.1.40 psihico m 60000 NULL 1

DEPARTMENT

EMPLOYEE

DNumber DLocation1 kolonaki4 zografou5 gkizi5 patisia5 kolonaki

DEPT_LOCATION

YV - Relational Model and Algebra 16

The COMPANY Database in the Relational Model - InstanceThe COMPANY Database in the Relational Model - Instance

.SSN PNumber HoursPW1234 1 321234 2 73344 2 103344 3 103344 10 103344 20 109998 30 309998 10 109876 30 209876 20 156668 3 404534 1 204534 2 209879 10 359879 30 58886 20 NULL

PNumber PName Location DNumber1 prodx gkizi 52 prody patisia 53 prodz kolonaki 510 computer zografou 420 reorganize kolonaki 130 benefits zografou 4

WORKS_ONPROJECT

ESSN Name Sex BDate Relation3334 aliki f 1976 daught3334 theo m 1973 son3334 joy f 1948 spouse9876 abner m 1932 spouse1234 mike m 1978 son1234 aliki f 1979 daught1234 elisa f 1957 spouse

DEPENDENT

YV - Relational Model and Algebra 17

Definition of the Model: Operations Definition of the Model: Operations

OPERATIONS– We can distinguish them in (a) UPDATE, (b) RETRIEVAL

UPDATE operations on Relations– INSERT a tuple– DELETE a tuple– MODIFY a tuple

Integrity constraints should not be violated by the execution of any update operation. For this, updates may propagate to cause other updates automatically.– e.g., when a tuple of EMPLOYEE is deleted, all tuples in

WORKING_ON which have the same value for SSN are also deleted (non-existent employees cannot work on projects!)

YV - Relational Model and Algebra 18

Operations: Relational LanguagesOperations: Relational Languages

RETRIEVAL operations on Relations– There are two flavors:– (a) RELATIONAL ALGEBRA -- somewhat procedural, tells

how to compute the result– (b) RELATIONAL CALCULUS -- somewhat declarative,

tells what properties the result should have

No database system supports the two flavors of retrieval languages in their pure forms. This is because the issues of “ease of use”, “convenience”, etc., play an essential role in user interaction. Yet, the languages supported in DBMSs have their roots in either relational algebra or calculus.

YV - Relational Model and Algebra 19

Relational LanguagesRelational Languages

Query languages: Allow manipulation and retrieval of data from a database.

Relational model supports simple, powerful QLs:– Strong formal foundation based on logic.– Allows for much optimization.

Query Languages != programming languages!– QLs not expected to be “Turing complete”.– QLs not intended to be used for complex calculations.– QLs support easy, efficient access to large data sets.

YV - Relational Model and Algebra 20

Relational Algebra OperationsRelational Algebra Operations

RELATIONAL ALGEBRA A set of operators each of which maps one or more

relations into a new relation (the algebra is CLOSED). The operators, just as in arithmetic algebra, can be nested,

since the result of each operation is itself a relation. There are two types of operators:

– traditional (regular) set operators » union, intersection, difference, ...

– database specific set operators » projection, selection, join, ...

YV - Relational Model and Algebra 21

Relational Algebra: Set OpsRelational Algebra: Set Ops

Traditional Set Operators– Union, Intersection, Difference, Cartesian Product– For the first three operators to apply, we must have UNION

COMPATIBILITY between the two operand relations. That is:

R1 ( A1 , A2 , ... , An ) and R2 ( B1 , B2 , ... , Bn ) must have the same number of attributes and the domains of the corresponding attributes must be compatible, i.e.,

D(Ai) = D(Bi) , for i = 1, 2, ..., n

– By convention, the resulting relation for these operators has the same attribute names as the first operand relation R1

YV - Relational Model and Algebra 22

Relational Algebra : Example DatabaseRelational Algebra : Example Database

Database Schema

STUDENT (SName, SAge) , relation schema R

INSTRUCTOR (IName, IAge) relation schema S

Database Instance

R (STUDENT) S (INSTRUCTOR)

SName SAgemary 20jack 22paul 25barb 22john 24

IName IAgehelen 22paul 25barb 22kris 24

CARDR = 5

CARDS = 4

YV - Relational Model and Algebra 23

Relational Algebra: Set Ops (2)Relational Algebra: Set Ops (2)

UNION - Put all the tuples of two relations in one relation– Notation: R S

– Formally: R S = { t | t is in R or t is in S }

– Example: STUDENT INSTRUCTOR

SName SAgemary 20jack 22paul 25barb 22john 24helen 22kris 24

CARDRS <= CARDR + CARDS

SName SAgemary 20jack 22paul 25barb 22john 24

IName IAgehelen 22paul 25barb 22kris 24

=

YV - Relational Model and Algebra 24

Relational Algebra: Set Ops (3)Relational Algebra: Set Ops (3)

INTERSECTION - Put the common tuples of two relations in one relation– Notation: R S

– Formally: R S = { t | t is in R and t is in S }

– Example: STUDENT INSTRUCTOR

SName SAgepaul 25barb 22

CARDRS <= max(CARDR , CARDS )

SName SAgemary 20jack 22paul 25barb 22john 24

IName IAgehelen 22paul 25barb 22kris 24

=

YV - Relational Model and Algebra 25

Relational Algebra: Set Ops (4)Relational Algebra: Set Ops (4)

SET DIFFERENCE - Select the tuples of the first relation which are not members of the second relation– Notation: R S– Formally: R S = { t | t is in R and t is not in S }– Example: STUDENT INSTRUCTOR

SName SAgemary 20jack 22john 24

CARDRS <= CARDR

SName SAgemary 20jack 22paul 25barb 22john 24

IName IAgehelen 22paul 25barb 22kris 24

=

YV - Relational Model and Algebra 26

Relational Algebra: Set Ops (5)Relational Algebra: Set Ops (5)

CARTESIAN PRODUCT - Combine each tuple of one relation with each tuple of the other– Notation: R S– Formally: R S = { t | t is the concatenation of a tuple

in R with a tuple in S }– Example: STUDENT INSTRUCTOR

SName SAge Iname IAgemary 20 helen 22mary 20 paul 25mary 20 barb 22mary 20 kris 24jack 22 helen 22jack 22 paul 25jack 22 barb 22jack 22 kris 24paul 25 helen 22

paul 25 paul 25paul 25 barb 22paul 25 kris 24barb 22 helen 22barb 22 paul 25barb 22 barb 22barb 22 kris 24john 24 helen 22john 24 paul 25john 24 barb 22john 24 kris 24

CARDRS = CARDR x CARDS

YV - Relational Model and Algebra 27

Relational Algebra Operators: SELECTION

Relational Algebra Operators: SELECTION

SELECTION - Selects the subset of tuples of a relation that satisfy a certain condition (qualification) c , which is an arbitrary Boolean expression on the attributes of R

(“horizontal” subset of R)

– Notation: c (R) or R[c]

– Formally: c (R) = { t | t is in r(R) and condition c holds for t }

– Examples: DNumber = 4 (EMPLOYEE), Salary>30000

(EMPLOYEE)

(Salary>30000 AND DNumber = 4 ) OR DNumber = 5 (EMPLOYEE),

EMPLOYEE [ Dnumber = 4 ], EMPLOYEE [ Salary > 30000 ]

YV - Relational Model and Algebra 28

Relational Algebra Operators: SELECTION (2)

Relational Algebra Operators: SELECTION (2)

– Selection is both commutative and associative

(a) c1 ( c2 (R) ) = c2 ( c1 (R) )

(b) c1 ( c2 (R) ) = c1 AND c2 (R) = c1 , c2 (R)

(c) c1 ( c2 ( c3 (R) ) ) ) = c2 ( c3 ( c1 (R) ) ) )

– Example Result: DNumber = 4 (EMPLOYEE)

SSN Name BDate Address Sex Salary SupSSN DNumber9998 alice 7.6.50 ekali f 25000 9876 49876 jenny 2.6.41 patra f 43000 8886 49879 jack 2.3.59 maroussi m 25000 9876 4

All Employees in department number 4

YV - Relational Model and Algebra 29

Relational Algebra Operators: PROJECTIONRelational Algebra Operators: PROJECTION

PROJECTION - Keeps only certain attributes (specified by a list L) and eliminates the other attributes of a relation R and also all duplicate tuples (“vertical” subset of R)– Notation: L (R) or R[L]

– Formally: L (R) = { t[L] | t is in r(R) and L R }

– Example: Location (PROJECT), or PROJECT[Location]

Locationgkizipatisiakolonakizografou

All Locations where projects are

YV - Relational Model and Algebra 30

Relational Algebra Operators: JOINSRelational Algebra Operators: JOINS

There are several types of JOINs - all combining two relations to form a new one:– (theta) join, equality join, natural join, semi-join, outer join

THETA (CONDITION) JOIN: Connect tuples from two relations that match (satisfy a Boolean condition c) on certain attributes– A theta-join is equivalent to a Cartesian product followed by a

selection on the condition c.

– Notation: R c S or R [ c ] S

– The resulting relation has ALL the attributes of R and of S

R c S c R S ( )

YV - Relational Model and Algebra 31

Relational Algebra Operators: THETA JOINRelational Algebra Operators: THETA JOIN

All department-department combinations where the first department’s number is greater than the second’s

DNumber DName MgrSSN MgrSD1 headqrtr 8886 28.6.735 research 3334 22.3.785 research 3334 22.3.78

YV - Relational Model and Algebra 32

Relational Algebra Operators: EQUALITY JOINRelational Algebra Operators: EQUALITY JOIN

EQUALITY JOIN: Connect tuples from two relations that match (have equal values) on certain attributes. This is exactly like THETA JOIN, except that the condition c is only allowed to have equalities.– Notation: R c S or R [ c ] S

– Example: WORKS_ON HoursPW = DNumber PROJECT

SSN PNumber HoursPW9879 30 59879 30 59879 30 5

PNumber PName Location DNumber1 prodx gkizi 52 prody patisia 53 prodz kolonaki 5

A totally MEANINGLESS Relation

YV - Relational Model and Algebra 33

Relational Algebra Operators: NATURAL JOINRelational Algebra Operators: NATURAL JOIN

NATURAL JOIN: Connect tuples from two relations that match (have equal values) on all common attributes. In the result, the common attributes are kept only once– Notation: R S or R [ X = X ] S

– Example: DEPARTMENT DEPT_LOCATION

DNumber DName MgrSSN MgrSD DLocation5 research 3334 22.3.78 gkizi5 research 3334 22.3.78 patisia5 research 3334 22.3.78 kolonaki4 admin 9876 14.1.85 zografou1 headqrtr 8886 28.6.73 kolonaki

YV - Relational Model and Algebra 34

Relational Algebra Operators: SEMI--JOINRelational Algebra Operators: SEMI--JOIN

SEMI-JOIN: Select the subset of one relation that joins with another. A semi-join is equivalent to a join followed by a projection.– Notation: R c S or R < c ] S

SSN Name BDate Address Sex Salary SupSSN DNumber3344 frank 8.9.45 athina m 55000 8886 59876 jenny 2.6.41 patra f 43000 8886 48886 james 1.1.40 psihico m 60000 NULL 1

Semi-joins are USEFUL in distributed database operations

YV - Relational Model and Algebra 35

Relational Algebra Operators: OUTER--JOINRelational Algebra Operators: OUTER--JOIN

Motivation: In a regular join operation, tuples in relations R or S that do not have matching tuples in the other relation do not appear in the result.

In some queries, all tuples in R (or S) must appear in the result - when no matching tuples are found, NULLs are placed for the missing attribute values.– Notation: R S

OUTER-JOINs are distinguished in:– Left outer join (all tuples in R appear in the result)– Right outer join (all tuples in S appear)– Full outer join (all tuples in R and S appear)

YV - Relational Model and Algebra 36

Relational Algebra Operators: DIVISIONRelational Algebra Operators: DIVISION

DIVISION: Given relations R(X,Y) and S(Y), where X, Y are sets of attributes, a tuple t is a member of the division (denoted: (R S)[X] ) IF for all tS in S there exist tR in R, such that: tR [Y] = tS [Y] and tR [X] = t [X]

– Analogy with number arithmetic:

The quotient q of a/b is the largest number s.t. qb <= a

The quotient Q of R S is the maximal relation s.t. Q X S R

YV - Relational Model and Algebra 37

Relational Algebra Queries (1)Relational Algebra Queries (1)

A series of queries in relational algebra are presented in the sequel, using an example relational database that involves SAILORS who RESERVE some BOATS.

SAILORS (Sid, SName, Rating)

BOATS (Bid, BName, Color)

RESERVE (Sid, Bid, Date)

YV - Relational Model and Algebra 38

Relational Algebra Queries (2)Relational Algebra Queries (2)

QUERY1: Find the names of sailors who have reserved boat number 2

( RESERVE [Bid=2] [Sid=Sid] SAILORS ) [SName]

SName ( Bid=2 RESERVE Sid=Sid SAILORS )

QUERY2: Find the names of sailors who have reserved a red boat

( BOAT [Color=red] [Bid=Bid] RESERVE [Sid=Sid] SAILORS ) [SName]

SName ( Color=red BOAT Bid=Bid RESERVE Sid=Sid SAILORS )

YV - Relational Model and Algebra 39

Relational Algebra Queries (3)Relational Algebra Queries (3)

QUERY3: Find the colors of the boats reserved by eleni

(SAILORS [SName=eleni] [Sid=Sid] RESERVE [Bid=Bid] BOATS) [Color]

Color ( SName=eleni SAILORS Sid=Sid RESERVE Bid=Bid BOATS )

QUERY4: Find the names of the sailors who have reserved at least one boat

( RESERVE [Sid=Sid] SAILORS ) [SName]

SName ( RESERVE Sid=Sid SAILORS )

YV - Relational Model and Algebra 40

Relational Algebra Queries (4)Relational Algebra Queries (4)

QUERY5: Find the names of sailors who have reserved a red or a green boat

SName ( Color=red BOATS Color=green BOATS )

Bid=Bid RESERVE Sid=Sid SAILORS )

QUERY6: Find the names of sailors who have reserved both a red and a green boat

SName ( ( Sid ( Color=red BOATS Bid=Bid RESERVE )

Sid ( Color=green BOATS Bid=Bid RESERVE ) ) Sid=Sid SAILORS )

YV - Relational Model and Algebra 41

Relational Algebra Queries (5)Relational Algebra Queries (5)

QUERY7: Find the names of sailors who have reserved all boats

SName ( ( Sid, Bid RESERVE / Bid BOATS ) Sid=Sid SAILORS )

QUERY8: Find the names and ratings of sailors who have reserved all red boats

SName, Rating ( Sid, Bid RESERVE / Bid (Color=red BOATS ) )

Sid=Sid SAILORS )

YV - Relational Model and Algebra 42

There are several properties that hold in a relational algebra expression (commutatitivity, associativity, etc.)– Examples:

c1 ( L (R) ) = L ( c1 (R) )

c1 ( R c2 S ) = c1 ( R ) c2 S

c1 (R S) = c1(R) c1(S)

....

Such properties are very useful in query optimization

YV - Relational Model and Algebra 43

COMPLETE SET OF OPERATIONS– The set of operators {} is called a complete

set of relational algebra operations. The implication is that ALL other operators can be described as a sequence of the above operators.

– For example, the division operator can be described as:

R / S = X(R) ( (X (R) S) R )

where X are the non-common attributes in R and S

– Equivalently, it is expressed as:

(R / S) [X] = R[X] - ( ( R[X] x S ) - R )[X]

YV - Relational Model and Algebra 44

Relational Algebra CompetenessRelational Algebra Competeness

There are several combinations of relational algebra operators that define a complete set.

Any Query Language equivalent to a complete set of operations is called RELATIONALLY COMPLETE– NOTE: This does not imply that the language is adequate to do

all database operations (e.g., a good language must support aggregates, many forms of joins, built-in functions,...)

An interesting operator -which goes beyond the expressive power of the relational set of operators as defined by Codd- is that of transitive closure. This is a form of recursion in relational databases and is very useful in many applications.