CS 320: Database Management Systems

Functional Dependencies

Ge (Frank) Xia, CS, Lafayette College

Adapted from lecture notes by Prof. Ullman @ Stanford

Functional Dependencies

• Meaning of FD's
• Keys and Superkeys
• Inferring FD's

Functional Dependencies

• X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, they must also agree on the attribute A.
- » Say "X -> A holds in R."
- » Convention: ..., X, Y, Z represent sets of attributes; A, B, C, ... represent single attributes.
- » Convention: no set formers in sets of attributes, just ABC, rather than {A, B, C}.

Example

• Drinkers(name, addr, beersLiked, manf, favBeer)
• Reasonable FD's to assert:
- 1. name -> addr
- 2. name -> favBeer
- 3. beersLiked -> manf
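
The definition can be checked mechanically against a concrete instance. The sketch below is only an illustration: the function name `fd_holds` and the sample rows are made up here, not taken from the lecture. Note that a passing check only shows the instance does not violate the FD; asserting an FD is a statement about every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every two rows that agree on all attributes in lhs
    also agree on the single attribute rhs."""
    seen = {}  # lhs values -> rhs value first seen with them
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # two tuples agree on lhs but differ on rhs
        seen.setdefault(key, row[rhs])
    return True


# Hypothetical instance of Drinkers(name, addr, beersLiked, manf, favBeer).
drinkers = [
    {"name": "Ann", "addr": "12 Elm St", "beersLiked": "Bud",
     "manf": "A.B.", "favBeer": "WickedAle"},
    {"name": "Ann", "addr": "12 Elm St", "beersLiked": "WickedAle",
     "manf": "Pete's", "favBeer": "WickedAle"},
    {"name": "Bo", "addr": "9 Oak Ave", "beersLiked": "Bud",
     "manf": "A.B.", "favBeer": "Bud"},
]

print(fd_holds(drinkers, ["name"], "addr"))        # name -> addr
print(fd_holds(drinkers, ["beersLiked"], "manf"))  # beersLiked -> manf
print(fd_holds(drinkers, ["name"], "manf"))        # name -> manf is violated by Ann's rows
```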

Example Data

FD's With Multiple Attributes

• No need for FD's with > 1 attribute on right.
- » But it is sometimes convenient to combine FD's as a shorthand.
- » Example: name -> addr and name -> favBeer become name -> addr favBeer.

• > 1 attribute on left may be essential.
- » Example: bar beer -> price

Keys of Relations

• K is a superkey for relation R if K functionally determines all of R .
• K is a key for R if K is a superkey, but no proper subset of K is a superkey.

Example

• Drinkers(name, addr, beersLiked, manf, favBeer)
• {name, beersLiked} is a superkey because together these attributes determine all the other attributes.
- » name -> addr favBeer
- » beersLiked -> manf

Example, Cont.

• {name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey.
- » name doesn't -> manf ; beersLiked doesn't -> addr .
• There are no other keys, but lots of superkeys.
- » Any superset of {name, beersLiked} .
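
The superkey and key definitions can be tested by brute force on this small schema. The sketch below assumes the attribute-closure computation that these notes describe later under "Closure Test"; the helper names (`closure`, `is_superkey`, `is_key`) are hypothetical, and FD's are written as (left side, right side) pairs of attribute sets.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(k, schema, fds):
    # K is a superkey if it determines every attribute of the relation.
    return closure(k, fds) == set(schema)


def is_key(k, schema, fds):
    # Removing one attribute at a time is enough: any proper subset that is
    # a superkey is contained in some such set, which is then a superkey too.
    return is_superkey(k, schema, fds) and not any(
        is_superkey(set(k) - {a}, schema, fds) for a in k
    )


schema = ("name", "addr", "beersLiked", "manf", "favBeer")
fds = [({"name"}, {"addr", "favBeer"}), ({"beersLiked"}, {"manf"})]

keys = [set(c)
        for size in range(1, len(schema) + 1)
        for c in combinations(schema, size)
        if is_key(c, schema, fds)]
print(keys)  # the only key found is {name, beersLiked}
```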

E/R and Relational Keys

• Keys in E/R concern entities .
• Keys in relations concern tuples .
• Usually, one tuple corresponds to one entity, so the ideas are the same.
• But in poor relational designs, one entity can become several tuples, so E/R keys and relational keys are different.

Example Data

Where Do Keys Come From?

• Just assert a key K .
- » The only FD's are K -> A for all attributes A.
• Assert FD's and deduce the keys by systematic exploration.
- » E/R model gives us FD's from entity-set keys and from many-one relationships.

More FD's From "Physics"

• Example: "no two courses can meet in the same room at the same time" tells us: hour room -> course .

Inferring FD's

• We are given FD's X_1 -> A_1 , X_2 -> A_2 , ..., X_n -> A_n , and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD's.
- » Example: If A -> B and B -> C hold, surely A -> C holds, even if we don't say so.
• Important for design of good relation schemas.

Inference Test

Inference Test - (2)

• Start by assuming two tuples agree on all the attributes of Y, then use the given FD's to infer that these tuples must also agree in certain other attributes.
- » If B is one of these attributes, then Y -> B is true.
- » Otherwise, the two tuples, with any forced equalities, form a two-tuple relation that proves Y -> B does not follow from the given FD's.
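
The two-tuple argument can be sketched in code as follows. This is an illustrative sketch, not the lecture's own procedure: the function name `two_tuple_test` is hypothetical, FD's are (left side, single attribute) pairs, and attribute names are assumed to be single characters so that a string such as "AB" can stand for the set {A, B}.

```python
def two_tuple_test(schema, fds, y, b):
    """Decide whether Y -> B follows from fds.

    Returns (True, None) if it follows, otherwise (False, [t1, t2]) where
    t1 and t2 form a two-tuple relation that satisfies every given FD
    but violates Y -> B.
    """
    agree = set(y)  # attributes on which the two tuples are known to agree
    changed = True
    while changed:
        changed = False
        for lhs, a in fds:
            # If the tuples agree on all of X, the FD X -> A forces agreement on A.
            if set(lhs) <= agree and a not in agree:
                agree.add(a)
                changed = True
    if b in agree:
        return True, None
    # Build the counterexample: equal where forced, different everywhere else.
    t1 = {a: 0 for a in schema}
    t2 = {a: 0 if a in agree else 1 for a in schema}
    return False, [t1, t2]


fds = [("A", "B"), ("B", "C")]
print(two_tuple_test("ABC", fds, "A", "C"))  # A -> C follows
print(two_tuple_test("ABC", fds, "B", "A"))  # B -> A does not; a counterexample is returned
```

In the second call the two tuples agree on B and C but differ on A, so both given FD's hold in the two-tuple relation while B -> A fails.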

Closure Test

• An easier way to test is to compute the closure of Y, denoted Y⁺.
• Basis: Y⁺ = Y.
• Induction: Look for an FD whose left side X is a subset of the current Y⁺. If the FD is X -> A, add A to Y⁺. Repeat until no more attributes can be added.
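
The basis and induction steps translate almost directly into a loop. The following is a minimal sketch under the same conventions as the text (single-attribute right sides); the names `closure` and `follows` are chosen here for illustration, and attribute names are assumed to be single characters.

```python
def closure(y, fds):
    """Compute Y+ for a set of attributes y under the given FD's.

    fds is a list of (lhs, a) pairs, where lhs is a collection of
    attributes and a is a single attribute.
    """
    closed = set(y)  # Basis: Y+ = Y
    changed = True
    while changed:   # Induction: repeat until nothing new can be added
        changed = False
        for lhs, a in fds:
            if set(lhs) <= closed and a not in closed:
                closed.add(a)
                changed = True
    return closed


def follows(y, b, fds):
    # Y -> B follows from the given FD's exactly when B is in Y+.
    return b in closure(y, fds)


fds = [("A", "B"), ("B", "C")]
print(sorted(closure("A", fds)))  # A+ contains A, B and C
print(follows("A", "C", fds))     # so A -> C follows
print(follows("C", "A", fds))     # but C -> A does not
```

This simple loop rescans every FD on each pass, which is quadratic in the worst case; that is fine for classroom-sized examples.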

In a picture

Finding All Implied FD's

• Motivation : "normalization," the process where we break a relation schema into two or more schemas.
• Example: ABCD with FD's AB -> C, C -> D, and D -> A.
- » Decompose into ABC , AD . What FD's hold in ABC ?
- » Not only AB -> C , but also C -> A !

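Why?

• C -> D and D -> A are given, so C -> A follows by transitivity; since C and A both belong to ABC, this FD holds in the projection.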

Basic Idea

1. Start with given FD's and find all nontrivial FD's that follow from the given FD's.
- » Nontrivial = left and right sides disjoint.
2. Restrict to those FD's that involve only attributes of the projected schema.

Simple Algorithm

1. For each set of attributes X , compute X ⁺ .
2. Add X -> A for all A in X ⁺ - X .
3. However, drop XY -> A whenever we discover X -> A .
- » Because XY -> A follows from X -> A in any projection .
4. Finally, use only FD's involving projected attributes.
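
The four steps above can be sketched as follows. This is an exponential-time illustration for small schemas only; the name `project_fds` is hypothetical, FD's are (left side, single attribute) pairs, and attribute names are assumed to be single characters so that left sides can be printed as strings like "AB".

```python
from itertools import combinations


def closure(attrs, fds):
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, a in fds:
            if set(lhs) <= closed and a not in closed:
                closed.add(a)
                changed = True
    return closed


def project_fds(schema, fds, projected):
    """Return the minimal nontrivial FD's that hold in the projection."""
    schema = sorted(schema)
    found = []  # (frozenset lhs, attribute) pairs, smaller left sides first
    # Step 1: compute X+ for each X, skipping the empty set and the full set.
    for size in range(1, len(schema)):
        for x in combinations(schema, size):
            x = frozenset(x)
            # Step 2: X -> A for each A in X+ - X ...
            for a in sorted(closure(x, fds) - x):
                # Step 3: ... unless a subset of X already determines A.
                if not any(lhs <= x and rhs == a for lhs, rhs in found):
                    found.append((x, a))
    # Step 4: keep only FD's that mention projected attributes alone.
    keep = set(projected)
    return {("".join(sorted(lhs)), a)
            for lhs, a in found if lhs <= keep and a in keep}


# ABCD with AB -> C, C -> D, D -> A, projected onto ABC.
print(project_fds("ABCD", [("AB", "C"), ("C", "D"), ("D", "A")], "ABC"))
```

Visiting left sides in order of increasing size is a design choice made here so that step 3 can be applied on the fly: by the time XY is examined, any X -> A has already been recorded.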

A Few Tricks

• No need to compute the closure of the empty set or of the set of all attributes.
• If we find X⁺ = all attributes, then the closure of any superset of X is also all attributes.

Example

• ABC with FD's A -> B and B -> C . Project onto AC .
- » A ⁺ = ABC ; yields A -> B , A -> C .
- » We do not need to compute AB ⁺ or AC ⁺ .
- » B ⁺ = BC ; yields B -> C .
- » C ⁺ = C ; yields nothing.
- » BC ⁺ = BC ; yields nothing.

Example, Cont.

• Resulting FD's: A -> B , A -> C , and B -> C .
• Projection onto AC : A -> C .
- » Only FD that involves a subset of { A , C }.

A Geometric View of FD's

• Imagine the set of all instances of a particular relation.
• That is, all finite sets of tuples that have the proper number of components.
• Each instance is a point in this space.

Example: R(A,B)

An FD is a Subset of Instances

• For each FD X -> A there is a subset of all instances that satisfy the FD.
• We can represent an FD by a region in the space.
• Trivial FD = an FD that is represented by the entire space.
- » Example: A -> A .

Example: A -> B for R(A,B)

Representing Sets of FD's

• If each FD is a set of relation instances, then a collection of FD's corresponds to the intersection of those sets.
- » Intersection = all instances that satisfy all of the FD's.

Example

Implication of FD's

• If an FD Y -> B follows from FD's X_1 -> A_1, ..., X_n -> A_n, then the region in the space of instances for Y -> B must include the intersection of the regions for the FD's X_i -> A_i.
- » That is, every instance satisfying all the FD's X_i -> A_i surely satisfies Y -> B .
- » But an instance could satisfy Y -> B , yet not be in this intersection.
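
The geometric view can be made concrete on a tiny slice of the space. The sketch below is an illustration under stated assumptions: it fixes a relation R(A, B, C) whose components range over {0, 1}, so there are only 2^8 = 256 instances to enumerate, and the names `region` and `satisfies` are chosen here rather than taken from the lecture.

```python
from itertools import combinations, product

ATTRS = "ABC"
TUPLES = list(product((0, 1), repeat=len(ATTRS)))  # the 8 possible tuples


def all_instances():
    """Yield every instance (set of tuples) over the tiny domain."""
    for size in range(len(TUPLES) + 1):
        for inst in combinations(TUPLES, size):
            yield frozenset(inst)


def satisfies(inst, lhs, rhs):
    idx = ATTRS.index
    for t, u in combinations(inst, 2):
        agree_left = all(t[idx(x)] == u[idx(x)] for x in lhs)
        if agree_left and t[idx(rhs)] != u[idx(rhs)]:
            return False
    return True


def region(lhs, rhs):
    """The set of instances (points in the space) that satisfy lhs -> rhs."""
    return {inst for inst in all_instances() if satisfies(inst, lhs, rhs)}


everything = set(all_instances())
given = region("A", "B") & region("B", "C")  # intersection of the given FD's
target = region("A", "C")

print(region("A", "A") == everything)  # a trivial FD covers the entire space
print(given <= target)                 # A -> C includes the intersection
print(len(given), len(target))         # the inclusion is proper
```

For example, the instance containing just the tuples (0, 0, 0) and (0, 1, 0) satisfies A -> C but not A -> B, so it lies in the region for A -> C yet outside the intersection.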

Example