Adapted from lecture notes by Prof. Ullman @ Stanford
Functional Dependencies
Meaning of FD's
Keys and Superkeys
Inferring FD's
Functional Dependencies
X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they must also agree on the attribute A.
» Say "X -> A holds in R."
» Convention: ..., X, Y, Z represent sets of attributes; A, B, C, ... represent single attributes.
» Convention: no set formers in sets of attributes, just ABC, rather than {A, B, C}.
Example
Drinkers(name, addr, beersLiked, manf, favBeer)
Reasonable FD's to assert:
1. name -> addr
2. name -> favBeer
3. beersLiked -> manf
Example Data
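One way to make the asserted FD's concrete is to test them against a small instance. The sketch below is illustrative only: the rows are invented for this example (they are not the data from the original slide), and fd_holds is a hypothetical helper name. It checks the definition directly: any two tuples that agree on the left side must agree on the right side.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given list of dict rows."""
    seen = {}  # maps each left-side value combination to its right-side values
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight, otherwise returns the stored one
        if seen.setdefault(key, val) != val:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# Invented sample instance of Drinkers(name, addr, beersLiked, manf, favBeer).
drinkers = [
    {"name": "Ann", "addr": "1 Elm St", "beersLiked": "Pale Ale",
     "manf": "BrewCo", "favBeer": "Stout"},
    {"name": "Ann", "addr": "1 Elm St", "beersLiked": "Stout",
     "manf": "MaltWorks", "favBeer": "Stout"},
    {"name": "Bob", "addr": "9 Oak Ave", "beersLiked": "Pale Ale",
     "manf": "BrewCo", "favBeer": "Pale Ale"},
]

print(fd_holds(drinkers, ["name"], ["addr"]))        # True
print(fd_holds(drinkers, ["name"], ["favBeer"]))     # True
print(fd_holds(drinkers, ["beersLiked"], ["manf"]))  # True
print(fd_holds(drinkers, ["name"], ["beersLiked"]))  # False: Ann likes two beers
```

Note that an instance can only refute an FD. An FD is an assertion about every legal instance of the relation, so passing this check on one instance does not establish it.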
FD's With Multiple Attributes
No need for FD's with > 1 attribute on right.
» But sometimes convenient to combine FD's as a shorthand.
» Example: name -> addr and name -> favBeer become name -> addr favBeer
> 1 attribute on left may be essential.
Example: bar beer -> price
Keys of Relations
K is a superkey for relation R if K functionally determines all of R.
K is a key for R if K is a superkey, but no proper subset of K is a superkey.
Example
Drinkers(name, addr, beersLiked, manf, favBeer)
{name, beersLiked} is a superkey because together these attributes determine all the other attributes.
» name -> addr favBeer
» beersLiked -> manf
Example, Cont.
{name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey.
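The superkey and key claims for Drinkers can be checked mechanically. The sketch below is one possible way to do it, assuming the three FD's asserted earlier are the only ones; it relies on the attribute-closure computation described later in these notes, and the function names (closure, is_superkey, is_key) are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def is_superkey(k, all_attrs, fds):
    """K is a superkey if it determines all attributes of the relation."""
    return closure(k, fds) == set(all_attrs)


def is_key(k, all_attrs, fds):
    """K is a key if it is a superkey and no proper subset is a superkey."""
    k = set(k)
    if not is_superkey(k, all_attrs, fds):
        return False
    # Dropping one attribute at a time is enough: any smaller superkey is
    # contained in one of these subsets, and supersets of superkeys are superkeys.
    return not any(is_superkey(k - {a}, all_attrs, fds) for a in k)


ATTRS = ["name", "addr", "beersLiked", "manf", "favBeer"]
# Each FD is (left side as a tuple of attributes, single right-side attribute).
FDS = [(("name",), "addr"), (("name",), "favBeer"), (("beersLiked",), "manf")]

print(is_superkey({"name", "beersLiked"}, ATTRS, FDS))      # True
print(is_key({"name", "beersLiked"}, ATTRS, FDS))           # True
print(is_superkey({"name"}, ATTRS, FDS))                    # False
print(is_key({"name", "beersLiked", "addr"}, ATTRS, FDS))   # False: superkey, not minimal
```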
Usually, one tuple corresponds to one entity, so E/R keys and relational keys coincide.
But in poor relational designs, one entity can become several tuples, so E/R keys and relational keys are different.
Example Data
Where Do Keys Come From?
Just assert a key K.
» The only FD's are K -> A for all attributes A.
Assert FD's and deduce the keys by systematic exploration.
» E/R model gives us FD's from entity-set keys and from many-one relationships.
More FD's From "Physics"
Example: "no two courses can meet in the same room at the same time" tells us: hour room -> course.
Inferring FD's
We are given FD's X_1 -> A_1, X_2 -> A_2, ..., X_n -> A_n, and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD's.
» Example: If A -> B and B -> C hold, surely A -> C holds, even if we don't say so.
Important for design of good relation schemas.
Inference Test
Inference Test - (2)
To test whether Y -> B follows, start with two tuples that agree on all attributes of Y. Use the given FD's to infer that these tuples must also agree in certain other attributes.
» If B is one of these attributes, then Y -> B is true.
» Otherwise, the two tuples, with any forced equalities, form a two-tuple relation that proves Y -> B does not follow from the given FD's.
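The two-tuple test can be sketched in code. This is an illustrative version under stated assumptions: attributes are single characters, each FD is a (left side, right attribute) pair, and two_tuple_test is a hypothetical name. It tracks the set of attributes on which the two tuples are forced to agree and, when B is not among them, builds the two-tuple counterexample relation using 0 for agreed values and 0/1 for the rest.

```python
def two_tuple_test(all_attrs, fds, y, b):
    """Return (True, None) if Y -> B follows, else (False, counterexample)."""
    agree = set(y)  # the two tuples start out agreeing exactly on Y
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the tuples agree on the whole left side, the FD forces
            # them to agree on the right side as well.
            if set(lhs) <= agree and rhs not in agree:
                agree.add(rhs)
                changed = True
    if b in agree:
        return True, None
    # Counterexample: equal where agreement was forced, different elsewhere.
    t1 = {a: 0 for a in all_attrs}
    t2 = {a: 0 if a in agree else 1 for a in all_attrs}
    return False, [t1, t2]


FDS = [("A", "B"), ("B", "C")]

print(two_tuple_test("ABC", FDS, "A", "C"))
# (True, None)
print(two_tuple_test("ABC", FDS, "C", "A"))
# (False, [{'A': 0, 'B': 0, 'C': 0}, {'A': 1, 'B': 1, 'C': 0}])
```

In the second call the two tuples agree on C but differ on A, while still satisfying A -> B and B -> C (their left sides differ), so C -> A does not follow.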
Closure Test
An easier way to test is to compute the closure of Y, denoted Y+.
Basis: Y+ = Y.
Induction: Look for an FD's left side X that is a subset of the current Y+. If the FD is X -> A, add A to Y+.
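The basis and induction steps translate almost directly into a loop. The following is a minimal sketch, assuming attributes are single characters and each FD is given as a (left side, right attribute) pair; the names closure and follows are hypothetical. The induction step is repeated until no FD adds anything new.

```python
def closure(y, fds):
    """Compute Y+ for the attribute set y under the FD's in fds."""
    result = set(y)  # Basis: Y+ = Y
    changed = True
    while changed:   # Induction: repeat until Y+ stops growing
        changed = False
        for lhs, rhs in fds:
            # The left side X is a subset of the current Y+, so add A.
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def follows(y, b, fds):
    """Y -> B follows from fds exactly when B is in Y+."""
    return b in closure(y, fds)


FDS = [("A", "B"), ("B", "C")]

print(sorted(closure("A", FDS)))  # ['A', 'B', 'C']
print(sorted(closure("B", FDS)))  # ['B', 'C']
print(follows("A", "C", FDS))     # True
print(follows("C", "A", FDS))     # False
```

This simple version rescans all FD's on every pass, which is fine for small examples; faster bookkeeping is possible but not needed here.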
In a picture
Finding All Implied FD's
Motivation: "normalization," the process where we break a relation schema into two or more schemas.
Example: ABCD with FD's AB -> C, C -> D, and D -> A.
» Decompose into ABC, AD. What FD's hold in ABC?
» Not only AB -> C, but also C -> A!
Why? C -> D and D -> A hold in ABCD, so C -> A follows by transitivity, and both C and A remain in ABC.
Basic Idea
1. Start with given FD's and find all nontrivial FD's that follow from the given FD's.
» Nontrivial = left and right sides disjoint.
2. Restrict to those FD's that involve only attributes of the projected schema.
Simple Algorithm
1. For each set of attributes X, compute X+.
2. Add X -> A for all A in X+ - X.
3. However, drop XY -> A whenever we discover X -> A.
» Because XY -> A follows from X -> A in any projection.
4. Finally, use only FD's involving projected attributes.
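The four steps can be sketched as follows. This is an illustrative, exponential-time version under stated assumptions: attributes are single characters, FD's are (left side, right attribute) pairs, and project_fds is a hypothetical name. Subsets are visited by increasing size so that a smaller left side is always discovered before any of its supersets, which makes step 3 a simple lookup.

```python
from itertools import combinations


def closure(attrs, fds):
    """Compute the closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def project_fds(all_attrs, fds, projected):
    """Return the nontrivial, left-minimal FD's that hold in the projection."""
    found = []  # (frozenset left side, right attribute) pairs
    # Sizes 1 .. n-1: the empty set and the full attribute set are skipped.
    for size in range(1, len(all_attrs)):
        for combo in combinations(sorted(all_attrs), size):
            x = frozenset(combo)
            for a in sorted(closure(x, fds) - x):       # steps 1 and 2
                # Step 3: skip X -> A if a proper subset already gives A.
                if any(lhs < x and rhs == a for lhs, rhs in found):
                    continue
                found.append((x, a))
    keep = set(projected)
    # Step 4: keep only FD's whose attributes all survive the projection.
    return [(lhs, rhs) for lhs, rhs in found if lhs <= keep and rhs in keep]


FDS = [("AB", "C"), ("C", "D"), ("D", "A")]
for lhs, rhs in project_fds("ABCD", FDS, "ABC"):
    print("".join(sorted(lhs)), "->", rhs)
# C -> A
# AB -> C
```

For ABCD projected onto ABC, the sketch reports C -> A and AB -> C; BC -> A is dropped because C -> A already covers it.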
A Few Tricks
No need to compute the closure of the empty set or of the set of all attributes.
If we find X+ = all attributes, then the closure of any superset of X is also all attributes.
Example
ABC with FD's A -> B and B -> C. Project onto AC.
» A+ = ABC; yields A -> B, A -> C.
» We do not need to compute AB+ or AC+.
» B+ = BC; yields B -> C.
» C+ = C; yields nothing.
» BC+ = BC; yields nothing.
Example, Cont.
Resulting FD's: A -> B, A -> C, and B -> C.
Projection onto AC: A -> C.
» Only FD that involves a subset of {A, C}.
A Geometric View of FD's
Imagine the set of all instances of a particular relation.
That is, all finite sets of tuples that have the proper number of components.
Each instance is a point in this space.
Example: R(A,B)
An FD is a Subset of Instances
For each FD X -> A there is a subset of all instances that satisfy the FD.
We can represent an FD by a region in the space.
Trivial FD = an FD that is represented by the entire space.
» Example: A -> A.
Example: A -> B for R(A,B)
Representing Sets of FD's
If each FD is a set of relation instances, then a collection of FD's corresponds to the intersection of those sets.
» Intersection = all instances that satisfy all of the FD's.
Example
Implication of FD's
If an FD Y -> B follows from FD's X_1 -> A_1, ..., X_n -> A_n, then the region in the space of instances for Y -> B must include the intersection of the regions for the FD's X_i -> A_i.
» That is, every instance satisfying all the FD's X_i -> A_i surely satisfies Y -> B.
» But an instance could satisfy Y -> B, yet not be in this intersection.
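The geometric view can be checked by brute force on a tiny space. The sketch below is only an illustration and makes a simplifying assumption not in the notes: attribute values are restricted to {0, 1}, so R(A,B,C) has 8 possible tuples and 2^8 = 256 instances. Each FD's region is the set of instances satisfying it; the names region, satisfies, and SPACE are hypothetical.

```python
from itertools import combinations, product

ATTRS = "ABC"
# Assumption: a two-value domain, which keeps the space of instances finite.
TUPLES = list(product((0, 1), repeat=len(ATTRS)))


def all_instances():
    """Yield every instance, i.e. every set of tuples, as a frozenset."""
    for n in range(len(TUPLES) + 1):
        for combo in combinations(TUPLES, n):
            yield frozenset(combo)


def satisfies(instance, lhs, rhs):
    """True if the FD lhs -> rhs holds in this instance."""
    left = [ATTRS.index(a) for a in lhs]
    right = ATTRS.index(rhs)
    seen = {}
    for t in instance:
        key = tuple(t[i] for i in left)
        if seen.setdefault(key, t[right]) != t[right]:
            return False
    return True


def region(lhs, rhs):
    """The set of instances (points in the space) that satisfy lhs -> rhs."""
    return {inst for inst in all_instances() if satisfies(inst, lhs, rhs)}


SPACE = set(all_instances())
given = region("A", "B") & region("B", "C")  # intersection of the given FD's
implied = region("A", "C")

print(len(SPACE))                  # 256
print(given <= implied)            # True: the region for A -> C includes the intersection
print(implied <= given)            # False: some instances satisfy only A -> C
print(region("A", "A") == SPACE)   # True: a trivial FD covers the entire space
```

An instance such as {(0,0,0), (1,0,1)} lies in the region for A -> C but outside the intersection, because it violates B -> C.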