Chapter 3 defines the Relational Data Model and presents a top-down
methodology for design of relational databases. The theoretical
foundations of the relational model were introduced by E. Codd in
1970. By late 1980s this model had become the state of the art in
commercial database management.
3.1. Time Invariant Attributes and Keys
Time-invariant attribute - An attribute A is time-invariant if once an
object x becomes related by A to a value y, the object x will
forever be related by A to y, as long as x exists.
There are no time-invariant attributes in the natural user world.
Even if the laws of physics or society do not allow for an
attribute to change in time, the attribute may change in the
perceived real world due to discoveries of errors in earlier
perception. For example, a social security number could be
wrongly reported and then corrected. Thus, time-invariance is
defined only in implementational restrictions. Such restrictions
are unavoidable in the relational database design. The
methodology of relational schema design that is presented below
has among its goals the minimization of the negative effect of
such implementational restrictions.
Example 3-1.
None of the attributes given in the previous examples is truly
time-invariant. The following attributes are the next closest
thing: they change only when an error is discovered.
birth-year of PERSON;
year and season of QUARTER.
The following attributes may sometimes change in time "in the
real world." Thus, declaring them as time-invariant would be a
stronger implementational restriction.
last-name and first-name of PERSON;
name of COURSE.
The following attributes have a high probability of change in
time; no reasonable database design would restrict them as
time-invariant.
address of PERSON;
final-grade of ENROLLMENT.
Keys
1. Single-attribute key
A time-invariant attribute of a category is called its key if it
is 1:1 and total. That means that the values of the attribute
can be used to identify the objects of the category.
Example 3-2.
If we assume that the attribute name of COURSE is time-
invariant, then it is probably also a key since:
a. It is total, provided we do not want our database to
contain courses without names
b. It is 1:1, that is, no two courses have the same
name, and no course has two names.
Due to the time-invariance requirement, no attribute is really a
key in the natural user's world. Thus, the property of a key is
defined only in implementational restrictions, which are
unavoidable in the relational database design. Also, the
requirement of totality is very rarely an integrity constraint
imposed by the logic of the user world but rather is an
implementational restriction.
Example 3-3.
Would the attribute social-security-number of the category
PERSON be its key? Let us assume that we have already imposed
an implementational restriction of time-invariance on this
attribute (thus making it very hard and dangerous for our
users to correct errors).
To be a key, this attribute must be one-to-one and total. It
is indeed one-to-one. But in the real world it is not really
total: there are some persons who do not have social security
numbers before they are reported to our database. The
totality can be imposed as an implementational restriction
with one of the following practical provisions:
a. Persons who do not have social security numbers
would be assigned the dummy default number 0, which
would be called a "social security number" for the
purposes of the database. But then this attribute
would no longer be one-to-one since two persons may
have the same number 0.
b. For persons who do not have real social security
numbers, generate dummy temporary numbers in such a
way that all these numbers are different and not in
the range of possible real social security numbers.
For example, if the real s.s. numbers may never
begin with the digit 9, then begin all the dummy
numbers with this digit.
Apart from the unnaturalness and "cheating" which
are bound to result in misinterpretation of the
computer's reports, there is a serious technical
problem: What if a person did not have a real s.s.
number at first, but later received one, and this
new number is a valuable piece of information which
the user wants to keep in the database? If we allow
replacement of the old dummy number by the new real
one, then the time-invariance requirement is
violated.
c. Yet another possibility is to disallow recording of
information about persons who do not have s.s.
numbers. I am afraid that our client might be
unhappy with such a restriction.
Example 3-4.
We can generate a new artificial attribute to serve as a key.
Let id# be a new artificial attribute of the category PERSON,
generated in such a way that when new persons enter the
database, they are assigned arbitrary meaningless numbers
which have not been used before. (When a person is deleted
from the database, the number is not reused.) This attribute
would be a key.
Convention: In this text, we shall name the attributes constrained to
be keys with the suffix -key.
Example 3-5.
name-key of COURSE
The name of the attribute name-key implicitly defines a
constraint or implementational restriction "This attribute is
a key of its domain, the category COURSE."
2. Multiattribute key
The following definition extends the concept key to a collection
of attributes.
Key of a category - a collection of total time-invariant attributes
f ,f ,..., f whose domain is that category and which satisfy two
1 2 n
requirements:
,...(i) Forranyicollectioneof values, x
1 n
than one object y of the category s.t.
x = y.f and x = y.f and...and x = y.f
1 1 2 2 n n
(ii) No proper subcollection of these attributes always satisfies
(i).
Practically, requirement (i) means that the collection of
attributes is sufficient to identify every object of the
category. Requirement (ii) means that the collection is minimal:
if one of the attributes is not known then the remaining
attributes might not provide sufficient information to identify
every object of the category.
Example 3-6.
(the-year, the-season) is the only key of the category
QUARTER.
Either attribute alone does not identify each object. Together
they do.
Convention: In this text, when a category is constrained to have
exactly one key, and the key is composed of several attributes,
we shall name these attributes with the suffix -in-key.
Example 3-7.
the-year-in-key and the-season-in-key of QUARTER
Note:
(i) In the real world a category usually has no key. Thus, the
existence of a key is usually not an integrity constraint
but rather an implementational restriction. This
restriction will be imposed when unavoidable due to
limitations of a DBMS or a database model, especially the
relational model.
(ii) Existence of a key makes every object of the category
identifiable with the values of the key and eliminates the
necessity to refer to abstract objects.
Example 3-8.
Courses can be identified with strings name-key. Quarters can
be identified with pairs (the-year-in-key, the-season-in-key).
(iii)A category which has no key may still have all its objects
completely identifiable (using different relations and their
combinations for different objects), but the identification
would not be uniform.
Example 3-9.
The category DEPARTMENT does not have a key (it does not have
any attribute at all; the relation NAME is not an attribute
because it is 1:m). However, if the relation department-name
is total, then, being 1:m, it distinguishes between every two
departments according to their names.
If the relation department-name is not total, then we might
have a problem distinguishing between two departments that
have no names at all. In that case, it is possible that the
"anonymous" departments can be distinguished by their
relationships of works-in with instructors.
Example 3-10.
We have not defined any key for the category INSTRUCTOR.
However, most instructors have unique names and can be
identified by their names. Few instructors have common names,
and their identification requires names of their departments
in addition to their first and last names. Some of the
instructors have nonunique names but do not work for any
department at all. These rare persons can be identifiable by
their names in conjunction with their birth years or with
courses they teach or with their addresses (whichever of this
information is available and sufficient for identification).
This identification is nonuniform because different
instructors are identified by different relations.
The nonuniform identification is quite common and satisfactory
in the real world and in the Binary Model, but the
implementational restrictions of the Relational Model will
necessitate a uniform identification of the objects of a
category.
(iv) When a key is composed of several attributes it is still one
key.
(v) A category may theoretically have several keys. However,
since categories in the real world rarely have even one key,
the existence of more than one key would be an unnecessarily
strong implementational restriction, which is not required
by database management systems. Thus, the possibility of
multiple keys will be ignored in this text.
3.2. Relational Schemas Defined
A binary schema is called table-oriented or a relational schema if:
(i) All the abstract categories of the schema have keys.
(ii) All the abstract categories are pairwise disjoint.
(iii)The only relations are attributes.
Thus, all the information in a relational schema is represented by
attributes of categories.
Example 3-11.
Figure (on the last page of this book) could be a
relational schema for the university application, provided:
o All the categories are restricted to having keys as
shown.
o There are no persons but students and instructors.
o The categories INSTRUCTOR and STUDENT are disjoint.
We shall see later that this schema can be used even without
imposing the severe restriction of disjointness.
Example 3-12.
Consider the category COURSE-ENROLLMENT. In the relational
schema it has no relations to the categories STUDENT and
COURSE-OFFERING. Instead, it has attributes giving
essentially the same information. Thus, instead:
[] the-student - relation from COURSE-ENROLLMENT to
STUDENT (m:1)
the category has the attribute:
[] student-id-in-key - attribute of COURSE-ENROLLMENT,
range: Integer (m:1)
Table-declaration, relation-declaration - any subschema of a
relational schema having only one abstract category with all of
its attributes and their ranges. The name of the table is the
name of that category.
Example 3-13.
Here is the table-declaration QUARTER.
[] QUARTER - category
[] year-in-key - attribute of QUARTER, range:
1980..1995 (m:1)
[] season-in-key - attribute of QUARTER, range: String
(m:1)
Instantaneous table, instantaneous relation - an instantaneous
database viewed under a table-declaration. This means that an
instantaneous table is a part of an instantaneous relational
database containing all the objects of one table and all their
relationships (attributes).
Representation of an instantaneous table - a printable table whose
title is the name of the category, the names of the columns are
the attributes, and for every object in the category there is a
row (called a tuple) composed of the values of the attributes of
that object.
Example 3-14.
The following is an instantaneous table of STUDENT. (Of
course, in this example the instantaneous table is
unrealistically small. It would contain thousands of tuples
for a normal university.)
STUDENT
|id-key| last-name| first-name| birth-year| address | major-dept- | minor-dept-|
| | | | | | main-name | main-name |
+------+-----------+------------+------------+-------------+------------------+-------------+
|12345 | Victory | Elizabeth | 1966 | 100 Sun St.| Computer Science| Economics |
|12348 | Howards | Jane | 1965 | 200 Dorms | Arts | Economics |
|43532 | Wood | Michael | 1964 | 110 Dorms | Arts | Economics |
Figure 3-1. An instantaneous table.
Note:
(i) There cannot be two identical rows in an instantaneous table
because this would imply that there are two objects with the
same values of the key.
(ii) The order of tuples (rows) is immaterial; the order of
columns is immaterial.
| |
Example 3-15. |
| ------- - -- |
| |
| The following figure represents the very same instantaneous |
| table as the previous example. |
| |
| |
STUDENT
|birth-year| address | major-dept- | id-key| last-name| first-name| minor-dept-|
| | | main-name | | | | main-name |
|1964 | 110 Dorms | Arts | 43532 | Wood | Michael | Economics |
|1965 | 200 Dorms | Arts | 12348 | Howards | Jane | Economics |
|1966 | 100 Sun St.| Computer Science| 12345 | Victory | Elizabeth | Economics |
Figure 3-2. Some columns and rows have been moved in the
representation without changing the instantaneous table.
Representation of an instantaneous relational database - a collection
of instantaneous tables.
| |
Example 3-16. |
| ------- - -- |
| |
| The following figure is a representation of an instantaneous |
database for the schema of Figure . This instantaneous |
| ---
| database represents the same state of the application's real |
world as the binary instantaneous database of Figure on |
| ---
page . |
| --- |
| |
STUDENT
|id-key| last-name| first-name| birth-year| address | major-dept- | minor-dept-|
| | | | | | main-name | main-name |
+------+-----------+------------+------------+-------------+------------------+-------------+
|12345 | Victory | Elizabeth | 1966 | 100 Sun St.| Computer Science| Economics |
|12348 | Howards | Jane | 1965 | 200 Dorms | Arts | Economics |
|43532 | Wood | Michael | 1964 | 110 Dorms | Arts | Economics |
INSTRUCTOR
|id-key | last-name | first-name | birth-year | address |
+-------+------------+-------------+-------------+------------------+
|11332 | Brown | George | 1956 | 112 Lucky Dr. |
|14352 | Whatson | Mary | 1953 | 231 Fortune Dr. |
|24453 | Blue | John | 1950 | 536 Orange Dr. |
DEPARTMENT DEPARTMENT NAMING
| main-name-key | | name-key | main name |
+----------------+ |-----------------+------------------+
|Computer Science| | CS | Computer Science|
|Mathematics | | Math | Mathematics |
|Physics | | Physics | Physics |
|Arts | | Mathematics | Mathematics |
|Economics | | Computer Science| Computer Science|
------------------ | Arts | Arts |
| Economics | Economics |
Figure 3-3. An instantaneous database for the relational schema of
the university application.
WORK
| instructor-id-in-key| department-main-name-in-key|
|---------------------+-----------------------------+
| 11332 | Computer Science |
| 11332 | Mathematics |
| 14352 | Physics |
| 24453 | Mathematics |
COURSE QUARTER
| name-key | | year-in-key| season-in-key|
+----------+ |------------+---------------+
Databases 1990 Fall
|Football | | 1990 | Winter |
|Gastronomy| | 1990 | Spring |
COURSE OFFERING
|instructor-id-in-key| course-name-in-key| year-in-key| season-in-key|
+--------------------+--------------------+-------------+---------------+
|11332 | Databases | 1990 | Fall |
|11332 | Football | 1990 | Fall |
|11332 | Gastronomy | 1990 | Fall |
COURSE ENROLLMENT
|instructor-id-| course-name-| year- | season-| student-id-| final-grade|
| in-key | in-key | in-key| in-key | in-key | |
+--------------+--------------+--------+---------+-------------+-------------+
|11332 | Gastronomy | 1990 | Fall | 12345 | 100 |
|11332 | Gastronomy | 1990 | Fall | 12348 | 70 |
|11332 | Databases | 1990 | Fall | 12348 | 80 |
Figure 3-3. Continued.
An alternative representation of relational schemas (linear,
nongraphic):
table name [ attribute : range,..., attribute : range ]
...
table name [ attribute : range,..., attribute : range ]
| |
Example 3-17. |
| ------- - -- |
| |
Here is a linear representation of the schema of Figure : |
| --- |
| |
| DEPARTMENT [ main-name-key: String] |
| |
| |
| DEPARTMENT-NAMING [ name-key: String, main-name: String] |
| |
| |
| STUDENT [ id-key: Integer, last-name: String, first-name: |
| String, birth-year: 1870..1990, address: String, major- |
| department-main-name: String, minor-department-main-name: |
| String] |
| |
| |
| INSTRUCTOR [ id-key: Integer, last-name: String, first-name: |
| String, birth-year: 1870..1990, address: String] |
| |
| |
| WORK [ instructor-id-in-key: Integer, department-main-name- |
| in-key: String] |
| |
| |
| QUARTER [ year-in-key: 1980..1995, season-in-key: String] |
| |
| |
| COURSE [ name-key: String] |
| |
| |
| COURSE-OFFERING [ instructor-id-in-key: Integer, course-name- |
| in-key: String, year-in-key: 1980..1995, season-in-key: |
| String] |
| |
| |
| COURSE-ENROLLMENT [ instructor-id-in-key: Integer, course- |
| name-in-key: String, year-in-key: 1980..1995, season-in- |
| key: String, student-id-in-key: Integer, final-grade: |
| 0..100] |
| |
| |
3.3. Implementational Restrictions: Pros and Cons
Consequences of implementational restrictions in the Relational Model:
o The schemas often deviate from the real world.
o The schemas are often unnatural, inflexible, and redundant.
o The integrity constraints are often underrepresented in the
schemas.
o The queries are usually harder to specify.
| |
| Example 3-18. |
| ------- - -- |
| Instead of |
| (i WORKS-IN d) |
| ----- -- |
| |
| we have to say: |
| |
| |
(exists w in WORK: |
| ---- |
| |
i. ID-key = w. INSTRUCTOR-ID-in-key and |
| -- --- ---------- -- -- --- |
| |
d. MAIN-NAME-key = w. DEPARTMENT-MAIN-NAME-in-key) |
| ---- ---- --- ---------- ---- ---- -- --- |
| |
Purposes of implementational restrictions in the Relational Model:
o Exclusion of nonattribute relations,
exclusion of intersecting categories,
exclusion of sub- and supercategories -- allow for readable
and simple representation of the instantaneous database as
everyday tables.
o Totality and uniqueness of keys -- allow:
- A standard printable representation for every object
- Readable reference to objects of one table from objects
of another table
- Unambiguous definition of simple updates
o Time-invariance of keys -- prevents inconsistent update of
keys.
If the values of the key of an object are updated, then all
the references to this object throughout the instantaneous
database become wrong. The human user cannot be relied upon
to find and update all the references. The database
management system is normally unaware of these references
and thus cannot update them automatically, unless it is a
very advanced system having a high-level support for the
so-called referential integrity.
Many database management systems do not explicitly require
the time-invariance but do not provide a high-level support
for the referential integrity either. This does not mean
that those systems do not need the time-invariance.
Rather, this means that they do not check for the time-
invariance, and, without warning to the user, they corrupt
the database when the user modifies a value of a key.
Note: It is hard to say whether the severe implementational
restrictions can really be justified by the benefits of the
model.
Totality of nonkey attributes
Many relational database management systems require a further
implementational restriction: they require that all the
attributes be total.
Relational DBMS that do not require the totality of attributes
allow null values of attributes.
The restriction of the totality of the nonkey attributes
pragmatically requires a modification of the meaning of an
attribute which is nontotal in the real world. A special value is
identified which can never be a value of the attribute in the
real world. This value is assigned to the objects that do not
have any real value of the attribute.
| |
Example 3-19. |
| ------- - -- |
| |
| We can assign the dummy grade "-1" to the enrollments which do |
| not have any real grade. Thus we convert: |
| |
| |
[] final-grade - attribute of COURSE-ENROLLMENT, range: |
| ----- ----- ------ ----------
0..100 (m:1) |
| - --- - - |
| |
| into: |
| |
| |
[] new-final-grade - attribute of COURSE-ENROLLMENT, |
| --- ----- ----- ------ ----------
range: -1..100 (m:1) |
| - --- - - |
| |
| so that: |
| |
| |
x. NEW-FINAL-GRADE = -1 iff |
| --- ----- ----- |
| |
x FINAL-GRADE null |
| ----- -----
| |
Such ``cheating'' will, however, cause inconvenience and
misinterpretation of queries of naive users.
| |
| Example 3-20. |
| ------- - -- |
| The query |
| |
| get enrl.STUDENT-ID where |
| ------- -- |
| enrl is a COURSE-ENROLLMENT and |
| |
| enrl. new-FINAL-GRADE < 60 |
| --- ----- ----- |
| will also retrieve the students who have no grades at all. |
| The system will also believe that two students who really have |
| no grades have identical grades, and thus will mislead the |
| user. |
| |
| |
Some relational database management systems allow the analyst to
define default values of attributes. These would be the values
for the objects when the user has entered no value. The
definition of default values enhances the convenience of updates,
but it does not solve the problem of the misinterpretation of
queries. A default value is still a regular value as far as the
system is concerned.
| |
Example 3-21. |
| ------- - -- |
| |
| We can define "-1" to be the default grade. This will be the |
| grade of all the enrollments for which the user has not |
| provided a grade. |
| |
| |
In many cases the default value is defined as 0 or the empty
string.
| |
Example 3-22. |
| ------- - -- |
| |
| We should not define 0 as the default for the grade if our |
| intention is to use the default value as a substitute for |
null. This is because 0 might be a real grade of a poor |
| ----
| student. |
| |
| |
We can use the empty string `' as the default for last-name. |
| ---- ---- |
| |
3.4. Relational Database Design
The purpose of this section is to show how a high-quality relational
database can be designed once we have a semantic binary schema for the
application.
3.4.1. Design principles
Schema-conversion - replacement of a schema by another schema having
the same information content. This means that each of the two
schemas can be regarded as a userview of the other.
Schema-conversion is a means of database design: a schema is
first designed in a higher-level database model and then
translated into a lower-level model which is supported by the
available DBMS (when a DBMS for a higher-level model is
unavailable or inadequate).
Note:
(i) Schema conversion is usually done in order to impose
implementation restrictions needed because of the database
model or the database management system. Thus, the latter
schema is usually of lesser quality than the former.
| |
Example 3-23. |
| ------- - -- |
| |
| We shall see in this section how the semantic schema of the |
| university application can be converted into the relational |
schema of Figure . |
| --- |
| |
(ii) Although only the latter schema (or its descendants after
more conversions) will be used by the DBMS software, after
conversion the former schema must be kept and maintained as
a documentation of the application's real world.
(iii)After conversion, the former schema is called the conceptual
semantic schema of the latter physical schema or its
descendants.
(iv) When the concepts of the real world change, the conceptual
semantic schema must be changed first, and only then is the
physical schema regenerated from the conceptual semantic
binary schema by conversion.
This section presents a conversion algorithm of a semantic schema into
a relational schema whose quality is among the highest possible for
the Relational Model, provided the original semantic schema is of high
quality.
This algorithm can be performed manually by the database designer.
Alternatively, an automatic tool can be used to perform all the
busywork, while prompting the database designer for intelligent
decisions (and using defaults when the designer fails to provide such
a guidance). One such tool has been developed at the Florida
International University and the University of California by the
author and his students.
3.4.2. Composition and split of relations
Two auxiliary definitions of terminology that will be used in the
conversion algorithm follow.
Composition of relations
Let the range of Relation R be the domain of Relation R .
1 2
Relation R is the composition of R on R if
1 2
xRy iff exists z such that xR z and zR y.
1 2
[diagram skipped]
| |
Example 3-24. |
| ------- - -- |
| |
| Consider two relations: |
| |
| |
[] the-course - relation from COURSE-OFFERING to |
| --- ------ ------ --------
COURSE (m:1) |
| ------ - - |
| |
[] name - relation from COURSE to String (1:1) |
| ---- ------ ------ - - |
| |
The composition of THE-COURSE on NAME is: |
| --- ------ ---- |
| |
[] the-name-of-the-course - attribute of COURSE- |
| --- ---- -- --- ------ ------
OFFERING, range: String (m:1) |
| -------- ------ - - |
| |
Relation-split - conversion of a schema having a relation R into
another schema having, instead of R, a new abstract category C
and two total functional relations R , R , whose domain is C,
1 2
such that:
There is a fact xRy
if and only if
there exists an object z in C for which zR x and zR y.
1 2
[diagram skipped]
Figure 3-4. Relation R is split into a category C and two relations,
R and R . Every relationship x-y is broken into x-z and z-y.
1 2
| |
Example 3-25. |
| ------- - -- |
| |
| If due to an implementational restriction we may not have a |
m:m relation: |
| - - |
| |
[] DEPARTMENT - category |
| ---------- |
| |
[] INSTRUCTOR - category |
| ---------- |
| |
[] works-in - relation from INSTRUCTOR to DEPARTMENT |
| ----- -- ---------- ----------
(m:m) |
| - - |
| |
| |
| |
| then we can split it into: |
| |
| |
[] DEPARTMENT - category |
| ---------- |
| |
[] INSTRUCTOR - category |
| ---------- |
| |
[] WORK - category |
| ---- |
| |
[] the-instructor - relation from WORK to INSTRUCTOR |
| --- ---------- ---- ----------
(m:1) |
| - - |
| |
[] the-department - relation from WORK to DEPARTMENT |
| --- ---------- ---- ----------
(m:1) |
| - - |
| |
| |
| |
| This split necessitates additional integrity constraints: |
| |
| |
| o Both new relations are total. |
| |
| |
| o For any combination of an instructor and a department |
| there is at most one object in work. |
| |
| |
| The latter constraint is more rigorously formulated in |
| calculus, as follows. |
| |
| |
for every w in WORK: |
| ---- |
| |
for every v in WORK: |
| ---- |
| if w.the-instructor = v.the-instructor and |
| --- ---------- --- ----------
w.the-department = v.the-department |
| --- ---------- --- ---------- |
| |
then w=v. |
| - - |
| |
The composition of relations and relation-split can be regarded as
userviews.*
The following subsections present the conversion algorithm
3.4.3. Determination of keys
Step 1. Choose a key for every abstract category, excluding
subcategories of other categories, as follows.
a. (single-attribute key)
if the category has an attribute which is 1:1, time-
invariant, and total, then let that attribute be the key;
b. (``forced'' single-attribute key)
else if the category has an attribute which can be
implementationally restricted to be 1:1, time-invariant, and
total, without very harmful alteration of the real world,
then make that attribute into a key (declare the
implementational restriction);
| |
Example 3-26. |
| ------- - -- |
| |
name-key of COURSE |
| ---- --- ------ |
* The following is a formal definition of the composition in
Predicate Calculus for the inference rules of userviews. Here, the
category C is the range of R .
1
userview relation: x R y
where exists z in C:
(x R z and z R y)
1 2
The following is a formal definition of the relation-spilt in
Predicate Calculus for the inference rules of userviews.
userview category C (R : x, R : y) where x R y
1 2
| It is not a very far reaching alteration of the real world to |
| make this implementation restriction: "Every course has |
| exactly one name, and this name may never be changed." |
| |
| |
c. (multiattribute key)
else if the category has a collection of attributes which
are time-invariant and total, and jointly identify all the
objects in the category, then let a minimal such collection
be the key;
| |
Example 3-27. |
| ------- - -- |
| |
(season-in-key, year-in-key) of QUARTER |
| ------ -- --- ---- -- --- ------- |
| |
d. (``forced'' multiattribute key)
else if the category has a collection of attributes which
can be implementationally restricted to be time-invariant
and total, and to jointly identify all the objects of the
category, without very harmful alteration of the real world,
then make a minimal such collection of attributes into a
key;
e. (inferred key)
else if a collection of attributes can be inferred from the
information existing in the schema and from keys of
other categories, so that
o these attributes can be implementationally
restricted, without very harmful alteration of the
real world,
(i) to be time-invariant and total, and
(ii) to jointly identify all the objects of the
category,
then
(i) choose a minimal such collection of inferable
attributes;
(ii) add to the schema those attributes from the
collection which are not already in the schema;
(iii)make this collection of attributes into a key
(declare the implementational restrictions);
(iv) convert the inference rule of these attributes
into constraints. (Since these will now be new
attributes, their values will be updated by the
users with possible inconsistency relative to the
information from which these attributes are
inferable.)
| |
Example 3-28. |
| ------- - -- |
| |
To obtain a key for DEPARTMENT we alter the real world |
| ----------
| slightly: we require every department to have at least one |
| name; we shall call the first name ever given to a department |
the main-name, and we require that the main-name of a |
| ---- ---- ---- ----
| department may never be changed. We add the new attribute |
| |
| |
[] main-name-key - attribute of DEPARTMENT, range: |
| ---- ---- --- ----------
String (m:1) |
| ------ - - |
| |
| and the constraint |
| |
| |
for every d in DEPARTMENT: |
| ---------- |
| |
d NAME d.MAIN-NAME-key |
| ---- ---- ---- --- |
| |
Note: In conjunction with the implicit constraint -key, the |
| ---
| above constraint means that the main-name is the first name |
| ever given to the department, and that it will remain the |
| department's name forever. |
| |
| |
| |
Example 3-29. |
| ------- - -- |
| |
| More characteristic examples of inferred keys are for the |
categories COURSE-OFFERING and COURSE-ENROLLMENT. These will |
| ------ -------- ------ ----------
be given and generalized after we have a key for PERSON. |
| ------ |
| |
f. (enumerator ID key)
else create a new external enumeration for the objects in
the category (thus altering the real world) and add it as an
attribute, which will be the chosen key.
| |
Example 3-30. |
| ------- - -- |
| |
The key of PERSON will be a new attribute id-key. |
| ------ -- --- |
| |
Pragmatically, a program should be written to generate new
values of an enumerator id key. These numbers will be
assigned by the user to the new objects of the category. The
numbers may not be reused when an object is removed. The
numbers themselves should bear no correlation to the other
information in the database, since the other information may
change in time, while the key is time-invariant.
It is also advisable not to assign the numbers sequentially
but rather in an arbitrary sequence. Otherwise, the
irrelevant information on the ``seniority'' of objects will
be hidden in the ID. Any hidden information will be abused
by the application programmers. Since it is not always
possible to update such hidden information correctly, the
programs will not produce the expected results in some
special cases.
Note: The step of finding keys is performed simultaneously for
all the categories, since we might need to know the key of
one category in order to find a key of a related category.
| |
Example 3-31. |
| ------- - -- |
| |
An inferred key of COURSE-OFFERING can be obtained when keys |
| -------- --- ------ --------
for QUARTER, COURSE, and PERSON have been chosen. The inferred |
| ------- ------ ------
key of COURSE-OFFERING will be |
| ------ -------- |
| |
| {the name of the course, the year of the quarter, |
| the season of the quarter, ID of the instructor} |
| |
| |
Hence, we add four new attributes to COURSE-OFFERING. The |
| ------ --------
inferred key of COURSE-ENROLLMENT will be five new attributes |
| -------- ------ ---------- |
| |
| {ID of the student, the key of the offering}. |
| |
| |
| (The "key of the offering" consists of four attributes. Thus, |
there is a total of five attributes in the key of COURSE- |
| ------
ENROLLMENT.) |
| ---------- |
| |
| |
| Example 3-32. |
| ------- - -- |
| The category COURSE-OFFERING is now: |
| ------ -------- |
| [] COURSE-OFFERING - category |
| ------ -------- |
| [] instructor-id-in-key - attribute of COURSE-OFFERING, |
| range: Integer (m:1) |
| ------- - - |
| [] course-name-in-key - attribute of COURSE-OFFERING, |
| range: String (m:1) |
| ------ - - |
| |
[] year-in-key - attribute of COURSE-OFFERING, range: |
| ---- -- --- ------ --------
1980..1995 (m:1) |
| ---- ---- - - |
| |
[] season-in-key - attribute of COURSE-OFFERING, range: |
| ------ -- --- ------ --------
String (m:1) |
| ------ - - |
| |
| |
| |
The above is an example of the prevalent case of an inferred key. The
following is a generalization of this example.
Assume that a category C is the domain of total functional relations
f ,..., f which jointly identify all the objects of the category.
1 n
| |
Example 3-33. |
| ------- - -- |
| |
| Every course offering is uniquely identified by its |
| instructor, course, and quarter. Thus, the total functional |
| relations |
| |
| |
THE-INSTRUCTOR, THE-COURSE, THE-QUARTER |
| --- ---------- --- ------ --- ------- |
| |
| jointly identify all the objects of their domain, the category |
COURSE-OFFERING. |
| ------ -------- |
| |
The above assumption means that there is an integrity constraint
for every x in C:
for every y in C:
if x.f = y.f and ... and x.f =y.f
1 1 n n
then x = y
| |
| Example 3-34. |
| ------- - -- |
| |
| for every x in COURSE-OFFERING: |
| ------ -------- |
| for every y in COURSE-OFFERING: |
| ------ -------- |
| if |
| |
| x.THE-INSTRUCTOR=y.THE-INSTRUCTOR and |
| --- ---------- --- ---------- |
| x.THE-COURSE=y.THE-COURSE and |
| x.THE-QUARTER=y.THE-QUARTER |
| --- ------- --- ------- |
| |
| then x=y |
| |
| |
In this case, once the keys of the ranges of the functional relations
f ,..., f are known, a key of C can be inferred from them. Let the
1 n -
keys of the ranges be k ,..., k . Let k -of-f be the set of inferred
1 n i i
attributes obtained by the composition of the attributes comprising
the key k and the relation f .
i i
| |
Example 3-35. |
| ------- - -- |
| |
| There are three such sets of inferred attributes for the |
category COURSE-OFFERING: |
| ------ -------- |
| |
| o id-of-the-instructor |
| |
| |
| o the-name-of-the-course |
| |
| |
| o the-year-of-the-quarter, the-season-of-the-quarter |
| |
| |
The key of C is contained in the union of compositions of the
relations f onto the keys of their ranges, that is,
i
{(k of f ),..., (k of f )}
1 1 n n
Notice that the key of C is contained in the above union of
compositions. Usually the key of C is equal to that union of
compositions but sometimes it is properly contained.
| |
| Example 3-36. |
| ------- - -- |
| Let us change the meaning of COURSE-OFFERING. Now, it does |
| not have to occur in one particular quarter but can last |
| several quarters, as long as the quarters are within one |
| academic year (Figure 3-5). There are two relations between |
| offerings and quarters: |
| |
| [] beginning-quarter - relation from COURSE-OFFERING to |
| QUARTER (m:1,total) |
| ------- - - ----- |
| [] ending-quarter - relation from COURSE-OFFERING to |
| QUARTER (m:1,total) |
| ------- - - ----- |
| The key of COURSE-OFFERING is properly contained in |
| ------ -------- |
| {the name of the course; |
| the year and season of the beginning quarter; |
| the year and season of the ending quarter; |
| ID of the instructor} |
| |
| |
The attribute THE-YEAR-OF-THE-ENDING-QUARTER is not a part of |
| --- ---- -- --- ------ -------
| the key, since this attribute is not needed for identification |
| of the offerings. For a given beginning quarter and the |
| season of the ending quarter, we can deduce the year of the |
| ending quarter, since we know that the offering is within one |
| academic year. |
| |
| |
[] QUARTER - category
[] COURSE-OFFERING - category
[] ending-quarter - relation from COURSE-OFFERING to QUARTER
(m:1,total)
[] beginning-quarter - relation from COURSE-OFFERING to
QUARTER (m:1,total)
[] year-in-key - attribute of QUARTER, range: 1980..1995 (m:1)
[] season-in-key - attribute of QUARTER, range: String (m:1)
[] instructor-id-in-key - attribute of COURSE-OFFERING, range:
Integer (m:1)
[] course-name-in-key - attribute of COURSE-OFFERING, range:
String (m:1)
[] year-of-the-beginning-quarter-in-key - attribute of COURSE-
OFFERING, range: 1980..1995 (m:1)
[] season-of-the-beginning-quarter-in-key - attribute of
COURSE-OFFERING, range: String (m:1)
[] season-of-the-ending-quarter-in-key - attribute of COURSE-
OFFERING, range: String (m:1)
Figure 3-5. The case of multiquarter course offerings limited to one
academic year. The key is shown.
| |
Example 3-37. |
| ------- - -- |
| |
| The semantic schema of the university application has been |
| converted so far into the schema on the following page. |
| |
| |
[] DEPARTMENT - category
[] PERSON - category
[] STUDENT - subcategory of PERSON
[] INSTRUCTOR - subcategory of PERSON
[] QUARTER - category
[] COURSE - category
[] COURSE-OFFERING - category
[] COURSE-ENROLLMENT - category
[] works-in - relation from INSTRUCTOR to DEPARTMENT (m:m)
[] name - attribute of DEPARTMENT, range: String (1:m) (A
department may have several names, but every name is
unique.)
[] last-name - attribute of PERSON, range: String (m:1)
[] first-name - attribute of PERSON, range: String (m:1)
[] birth-year - attribute of PERSON, range: 1870..1990 (m:1)
[] address - attribute of PERSON, range: String (m:1)
[] major - relation from STUDENT to DEPARTMENT (m:1)
[] minor - relation from STUDENT to DEPARTMENT (m:1)
[] the-instructor - relation from COURSE-OFFERING to
INSTRUCTOR (m:1)
[] - relation from COURSE-OFFERING to COURSE (m:1)
[] the-quarter - relation from COURSE-OFFERING to QUARTER
(m:1)
[] final-grade - attribute of COURSE-ENROLLMENT, range: 0..100
(m:1)
[] - relation from COURSE-ENROLLMENT to COURSE-OFFERING
(m:1)
[] the-student - relation from COURSE-ENROLLMENT to STUDENT
(m:1)
[] main-name-key - attribute of DEPARTMENT, range: String
(m:1)
[] id-key - attribute of PERSON, range: Integer (m:1)
[] year-in-key - attribute of QUARTER, range: 1980..1995 (m:1)
[] season-in-key - attribute of QUARTER, range: String (m:1)
[] name-key - attribute of COURSE, range: String (m:1)
[] instructor-id-in-key - attribute of COURSE-OFFERING, range:
Integer (m:1)
[] course-name-in-key - attribute of COURSE-OFFERING, range:
String (m:1)
[] year-in-key - attribute of COURSE-OFFERING, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-OFFERING, range: String
(m:1)
[] instructor-id-in-key - attribute of COURSE-ENROLLMENT,
range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-ENROLLMENT, range:
String (m:1)
[] year-in-key - attribute of COURSE-ENROLLMENT, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-ENROLLMENT, range:
String (m:1)
[] student-id-in-key - attribute of COURSE-ENROLLMENT, range:
Integer (m:1)
Figure 3-6. The university schema with keys.
3.4.4. Disjointness of categories
Step 2. Convert the intersecting abstract categories into disjoint
categories by the following procedure for every group of
intersecting categories.
| |
Example 3-38. |
| ------- - -- |
| |
| Here is a group of intersecting categories: STUDENT, |
| INSTRUCTOR, PERSON. |
| |
| |
A. Consider a complete group of categories so that every category
outside the group is disjoint from every category in the group.
Let C denote the union of all the categories in the group. If
such a category C does not already exist in the schema, then add
it.
Let S ,S ,...,S be the other categories in the group. (All of
1 2 n
them are direct or indirect subcategories of C.)
| |
Example 3-39. |
| ------- - -- |
| |
C=PERSON, S =INSTRUCTOR, S =STUDENT. |
| 1 2 |
| |
Let
n
S = C - U S
0 i
i=1
S is the hypothetical category consisting of the objects of C
0
which do not belong to any of the subcategories. The category S
0
is considered in order to ensure that no information is lost
during the conversion. It is not added to the schema at this
time. It may or may not be added to the schema at a later step,
depending on decisions made at that step.
| |
Example 3-40. |
| ------- - -- |
| |
| If there may be other persons in addition to instructors and |
| students, then |
| |
| |
S =OTHER-PERSON |
| 0 |
| |
Otherwise, S =empty, and it would not have to be added to the |
| 0
| schema at any step. |
| |
| |
| In the continuation of this case study in the examples we will |
| assume the latter case: no "other persons." |
| |
| |
B. Estimate the intersection factors J and K.
In order to chose the best way of conversion, we shall need to
estimate the following quantities.
| |
Example 3-41. |
| ------- - -- |
| |
| For the above group of intersecting categories, the choice of |
| the method to eliminate the intersection of the categories |
| will depend on the correlation of two parameters: |
| |
| o The percentage of people who are both students and |
| instructors, J |
| |
| |
| o The percentage of relations specific to students or |
| instructors among all the relations which can be relevant |
| to persons, K |
| |
| |
number of relations whose domain or range is S or ... or S
1 n
K = ---------------------------------------------------------------
number of relations whose domain or range is C or S or...or S
1 n
| |
Example 3-42. |
| ------- - -- |
| |
| For the group of the previous examples, K=5/10. |
| |
| |
expected total number of objects in the intersections
J = -----------------------------------------------------
expected total number of objects in C
The above formula is rather informal.*
| |
Example 3-43. |
| ------- - -- |
| |
| To estimate J, we have to predict the future of our database. |
| It is reasonable to assume that about 5 percent of all persons |
| would be simultaneously students and instructors, so J=0.05. |
| |
| |
| [diagram skipped] |
| |
| |
* More formally:
expected cardinality of U (S S )
i j
i=j
J = ------------------------------------
expected cardinality of C
| |
Example 3-44. |
| ------- - --
| |
| If we had several intersecting categories, we would count all |
| the intersections: |
| |
| |
| |
| |
| [diagram skipped] |
| |
| |
| |
| |
C. Select the best conversion into disjoint categories.
| |
Example 3-45. |
| ------- - -- |
| |
| Consider the nondisjoint categories INSTRUCTOR and STUDENT, |
which are subcategories of the category PERSON=INSTRUCTOR U |
| ------ ----------
STUDENT. The following are several possibilities of |
| -------
| conversion. We will later select the best of the |
| possibilities, depending on the circumstances. |
| |
| |
a. Conversion into one category (Union)
| |
| Example 3-46. |
| Substitute the whole group of categories by their union, the |
| category PERSON. This category will serve as the domain or the |
| range for all the relations whose domain or range was one of |
| the original categories. In addition, this category will have |
| two Boolean attributes, IS-AN-INSTRUCTOR and IS-A-STUDENT, |
| associating the value TRUE with objects representing |
| instructors and students respectively. |
| |
| [] PERSON - category |
| ------ |
| [] id-key - attribute of PERSON, range: Integer (m:1) |
| -- --- ------ ------- - - |
| [] last-name - attribute of PERSON, range: String |
| (m:1) |
| - - |
| [] first-name - attribute of PERSON, range: String |
| (m:1) |
| - - |
| [] birth-year - attribute of PERSON, range: 1870..1990 |
| (m:1) |
| [] address - attribute of PERSON, range: String (m:1) |
| ------- ------ ------ - - |
| |
[] is-a-student - attribute of PERSON, range: Boolean |
| -- - ------- ------ -------
(m:1) |
| - - |
| |
[] is-an-instructor - attribute of PERSON, range: |
| -- -- ---------- ------
Boolean (m:1) |
| ------- - - |
| |
| |
| |
[] major - relation from PERSON to DEPARTMENT (m:1) |
| ----- ------ ---------- - - |
| |
[] minor - relation from PERSON to DEPARTMENT (m:1) |
| ----- ------ ---------- - - |
| |
[] the-instructor - relation from COURSE-OFFERING to |
| --- ---------- ------ --------
PERSON (m:1) |
| ------ - - |
| |
[] the-student - relation from COURSE-ENROLLMENT to |
| --- ------- ------ ----------
PERSON (m:1) |
| ------ - - |
| |
[] works-in - relation from PERSON to DEPARTMENT (m:m) |
| ----- -- ------ ---------- - - |
| |
b. Conversion into artificially disjoint categories of Events
| |
| Example 3-47. |
| Substitute these categories by two disjoint categories of |
| events: Event-of-being-a-STUDENT and Event-of-being-an- |
| INSTRUCTOR (usually abbreviated just STUDENT and INSTRUCTOR, |
| but the meaning of the full names is intended). |
| |
| An instructor who is also a student will be represented by two |
| distinct objects of the aforementioned categories. |
| |
| The objects of the new categories are not persons but rather |
| their "hats" -- a person may have two "hats": one as an |
| instructor and one as a student. The two categories of "hats" |
| are disjoint. |
| |
| The relations whose domain or range is the category PERSON, |
| for example, the relation ADDRESS, will be replaced by two |
| relations having the new categories as their domains or |
| ranges, such as the relations STUDENT'S-ADDRESS and |
| INSTRUCTOR'S-ADDRESS. |
| ---------- - ------- |
| [] STUDENT - category |
| ------- |
| [] INSTRUCTOR - category |
| ---------- |
| [] id-key - attribute of STUDENT, range: Integer (m:1) |
| [] last-name - attribute of STUDENT, range: String |
| ---- ---- ------- ------
(m:1) |
| - - |
| |
[] first-name - attribute of STUDENT, range: String |
| ----- ---- ------- ------
(m:1) |
| - - |
| |
[] birth-year - attribute of STUDENT, range: |
| ----- ---- -------
1870..1990 (m:1) |
| ---- ---- - - |
| |
[] address - attribute of STUDENT, range: String (m:1) |
| ------- ------- ------ - - |
| |
[] id-key - attribute of INSTRUCTOR, range: Integer |
| -- --- ---------- -------
(m:1) |
| - - |
| |
[] last-name - attribute of INSTRUCTOR, range: String |
| ---- ---- ---------- ------
(m:1) |
| - - |
| |
[] first-name - attribute of INSTRUCTOR, range: String |
| ----- ---- ---------- ------
(m:1) |
| - - |
| |
[] birth-year - attribute of INSTRUCTOR, range: |
| ----- ---- ----------
1870..1990 (m:1) |
| ---- ---- - - |
| |
[] address - attribute of INSTRUCTOR, range: String |
| ------- ---------- ------
(m:1) |
| - -
| |
| |
| |
It may appear that by introducing the categories of "hats"
we have succeeded in fooling the system. Actually, we have
fooled ourselves. Without understanding the relationship
between two hats of one person, the system will not be able
to correctly interpret some queries of naive users and may
cause inconsistency in the stored information and other
problems:
o When the address of a person is updated, it may
get updated in one category but not in the other.
The database will become inconsistent.
o A naive query like "How many people are there?"
will involve double count of persons who are
instructors and students simultaneously.
c. Conversion into Union+Events
| |
| Example 3-48. |
| ------- - -- |
| As Figure 3-7 shows, we can retain the category PERSON with |
| all its relationships and define two categories of events |
| which will inherit all the relationships of STUDENT and |
| INSTRUCTOR, and additionally will have keys and special 1:1 |
| relationships with the category PERSON: |
| ------ |
| |
[] PERSON - category (retains its relations from the |
| ------
| binary schema) |
| |
| |
[] Event-of-being-a-STUDENT - category (inherits all |
| ----- -- ----- - -------
the relations of STUDENT) |
| ------- |
| |
[] Event-of-being-an-INSTRUCTOR - category (inherits |
| ----- -- ----- -- ----------
all the relations of INSTRUCTOR) |
| ---------- |
| |
[] student-person - relation from Event-of-being-a- |
| ------- ------ ----- -- ----- -
STUDENT to PERSON (1:1,total) |
| ------- ------ - - ----- |
| |
[] instructor-person - relation from Event-of-being- |
| ---------- ------ ----- -- -----
an-INSTRUCTOR to PERSON (1:1,total) |
| -- ---------- ------ - - ----- |
| |
[diagram skipped]
Figure 3-7. Union+events
| |
| Example 3-49. |
| ------- - -- |
| To further explore the differences between the three |
| approaches, consider the formulation of the query ``Print the |
| names of all the students.'' |
| |
| Events: |
| |
| get s. LAST-NAME |
| ---- ---- |
| where s is a STUDENT |
| ------- |
| Union: |
| |
| get s. LAST-NAME |
| ---- ---- |
| where |
| |
| s is a PERSON and |
| ------ |
| s. IS-A-STUDENT |
| -- - ------- |
| Union+Events: |
| |
| get p. LAST-NAME |
| where |
| p is a PERSON and |
| ------ |
| |
exists s in STUDENT: |
| ------- |
| |
s. ID-key = p. ID-key |
| -- --- -- --- |
| |
Relative disadvantages of each approach
The principal disadvantage of Events is the redundancy. For example,
the birth-year of an instructor who is also a student has to be
logically represented twice in the database, which can cause
inconsistency and other problems.
The principal disadvantages of Union are the unnaturalness of the
schema and the under-coverage of integrity constraints. For example,
an additional integrity constraint has to be defined to prevent
association of a nonstudent instructor with a major department of
studies. Another important deficiency is the null-values, causing
significant problems in formulation of queries. (We say that
``p.MAJOR is null'' if the person p is not related to any department
by the relation MAJOR.)
The principal disadvantages of Union+Events are the unnaturalness of
the schema and significant difficulties in the formulation of queries
and other operations. These difficulties, however, can be overcome by
the use of userviews which would conveniently redefine the concepts of
the schema. This requires that the DBMS provide a high-level support
for userviews, including the capability to specify updates through
userviews. Most relational DBMS, however, do not provide sufficient
support of userviews.
Conclusion
Unless the DBMS provides sufficient support for userviews as
discussed above, we have to exclude the Union+Events approach.
Both other approaches, Union and Events, would result in low-quality
schemas, but the relational database designer has to choose the better
of the two.
| |
Example 3-50. |
| ------- - -- |
| |
| The choice should usually depend on the correlation of two |
| parameters: the percentage of people who are both students and |
| instructors, J, and the percentage of relations specific to |
| students or instructors among all the relations which can be |
| relevant to persons, K. |
| |
| |
The relative redundancy in Events increases when J increases and when
K decreases. The unnaturalness and the undercoverage of constraints
in Union increase when J decreases and when K increases.
The following provides a decision criteria for an arbitrary group of
categories. The decision is made according to the J:K ratio. A
comparison quotient of 0.6 is suggested, which is quite often
reasonable, as has been shown by analysis of a class of databases. If
J/K>0.6 then the Union approach would usually be preferable. In some
special cases, however, the database designer should consider a
different comparison quotient. The number 0.6 is only a "rule of
thumb."
When there is chain of sub-sub-categories, the approach Events becomes
too complicated and is not recommended. It is, however, the most
natural approach in the majority of situations, because in the
majority of cases J is small, the subcategory hierarchy is rather flat
(no sub-sub-categories), and the DBMS does not provide a sufficient
support for userviews.
D. Convert the group of categories into disjoint categories.
a. if the DBMS provides a high level support for userviews,
including specification of updates, then (Union+Events):
(i) Substitute every direct or indirect subcategory S of C
in the schema being converted by the category Event-
of-being-a[n]-S. Each object in this new category is
an event of membership in the category S; that is, if x
is an S then "x is an S" is one element in Event-of-
being-an-S. (The categories of events are disjoint.
For simplicity, the former names S may be kept but the
new meaning is assumed.)
| |
Example 3-51. |
| ------- - -- |
| |
| |
| |
[] [Event-of-being-a-]STUDENT - category |
| ----- -- ----- - ------- |
| |
[] [Event-of-being-an-]INSTRUCTOR - category |
| ----- -- ----- -- ---------- |
| |
| |
| Example 3-52. |
| ------- - -- |
| If we also had |
| |
| [] TENURED-FACULTY - subcategory of INSTRUCTOR |
| ------- ------- ---------- |
| then we would convert it into |
| [] Event-of-being-TENURED-FACULTY - category |
| ----- -- ----- ------- ------- |
| |
(ii) Retain the category C.
| |
Example 3-53. |
| ------- - -- |
| |
| |
[] PERSON - category |
| ------ |
| |
(iii)Connect every new category of events S to each
immediate supercategory of S by a new relation.
Specify integrity constraints that these new relations
are one-to-one and total.
| |
Example 3-54. |
| ------- - -- |
| |
| |
| |
[] student-person - relation from Event-of-being-a- |
| ------- ------ ----- -- ----- -
STUDENT to PERSON (1:1,total) |
| ------- ------ - - ----- |
| |
(iv) Let every new category of events S have all the
relations that the former category S had.
| |
Example 3-55. |
| ------- - -- |
| |
| |
| |
[] major - relation from Event-of-being-a-STUDENT to |
| ----- ----- -- ----- - -------
DEPARTMENT (m:1) |
| ---------- - - |
| |
(v) Specify and add a key for every category of events S.
The simplest way to do this is to inherit the key of C.
| |
Example 3-56. |
| ------- - -- |
| |
| |
| |
[] id-key - attribute of Event-of-being-a-STUDENT, |
| -- --- ----- -- ----- - -------
range: Integer (1:1,total) |
| ------- - - ----- |
| |
b. else if J/K > 0.6 or there is a chain of sub-sub-categories
then (Union):
(i) Replace the whole group of categories by one category
C.
| |
Example 3-57. |
| ------- - --
| |
| |
| |
[] PERSON - category |
| ------ |
| |
(ii) Bring all the relations exiting or entering the former
subcategories to C.
| |
Example 3-58. |
| ------- - -- |
| |
| |
| |
[] the-student - relation from COURSE-ENROLLMENT to |
| --- ------- ------ ----------
PERSON (m:1) |
| ------ - - |
| |
(iii)Add to C total Boolean attributes named is-a[n]-S for
every direct and indirect subcategory S of C in the
schema being converted.
| |
Example 3-59. |
| ------- - -- |
| |
| |
| |
[] is-a-student - attribute of PERSON, range: Boolean |
| -- - ------- ------ -------
(m:1,total) |
| - - ----- |
| |
[] is-an-instructor - attribute of PERSON, range: |
| -- -- ---------- ------
Boolean (m:1,total) |
| ------- - - -----
| |
(iv) Add an integrity constraint stating that any object of
C may participate in a former S's corresponding
relation only if the respective function is-an-S gives
TRUE.
| |
| Example 3-60. |
| ------- - -- |
| |
| for every p in PERSON: |
| ------ |
| if exists d in DEPARTMENT: p WORKS-IN d |
| then p.IS-AN-INSTRUCTOR |
| -- -- ----------
| |
| |
Example 3-61. |
| ------- - -- |
| |
| |
| |
for every p in PERSON: |
| ------ |
| |
if not (p MAJOR null) |
| ----- |
| |
then p.IS-A-STUDENT |
| -- - ------- |
| |
(v) Whenever there are attributes is-an-S and is-an-S ,
1 2
where S is a subcategory of S in the original schema,
1 2
add a constraint enforcing that in terms of the new
attributes.
| |
Example 3-62. |
| ------- - -- |
| |
| If we had: |
| |
| |
[] UNDERGRADUATE - subcategory of STUDENT |
| ------------- ------- |
| |
| then we would add a constraint: |
| |
| |
for every s in PERSON: |
| ------
| |
if s. IS-AN-UNDERGRADUATE |
| -- -- ------------- |
| |
then s. IS-A-STUDENT |
| -- - ------- |
| |
(vi) Whenever there are attributes is-an-S and is-an-S ,
1 2
and S are whereinS in the original schema,
1 2
add a constraint enforcing that in terms of the new
attributes.
| |
| Example 3-63. |
| ------- - -- |
| If the category UNDERGRADUATE was disjoint from the category |
| INSTRUCTOR, then we would add a constraint: |
| ---------- |
| for every s in PERSON: |
| ------ |
| if s. IS-AN-UNDERGRADUATE |
| then not (s. IS-AN-INSTRUCTOR) |
| -- -- ---------- |
| |
c. else (Events):
(i) Substitute the categories S ,..., S by the
1 n
corresponding n categories Event-of-being-a-S ,...,
1
Event-of-being-a-S of the events of membership in
n
categories; that is, if x is an S then "x is an S " is
i i
one element in the category Event-of-being-an-S . (The
i
categories S are disjoint. For simplicity, former
i
names S may be kept but the new meaning is assumed.)
i
| |
Example 3-64. |
| ------- - -- |
| |
| |
| |
[] Event-of-being-a-STUDENT - category |
| ----- -- ----- - ------- |
| |
| disjoint from |
| |
| |
[] Event-of-being-an-Instructor - category |
| ----- -- ----- -- ---------- |
| |
(ii) If there are, or may be in the future, objects in C
that do not belong to any of the subcategories S ,...,
1
S , then add a new category S to the schema. This will
n 0
be the category of the objects that do not belong to
any of the subcategories. This category is usually
called other-C.
| |
Example 3-65. |
| ------- - -- |
| |
| |
| |
[] OTHER-PERSON - category |
| ----- ------ |
| |
(iii)Replace every relation R whose domain or range is C, by
new relations of the same name as R but having the
categories S as their domains or ranges. (The
i
relation R is partitioned into several relations
according to the restricted domains or ranges S .)
i
| |
Example 3-66. |
| ------- - -- |
| |
| [] birth-year - attribute of Event-of-being-a-STUDENT, |
| ----- ---- ----- -- ----- - -------
range: 1870..1990 (m:1) |
| ---- ---- - - |
| |
(iv) Eliminate the category C.
(v) Specify integrity constraints to prevent inconsistency
of the redundant information:
``If an object x of the category Event-of-
being-an-S has the same key values as an
i
object y of the category Event-of-being-an-
S , then the other relations of C (inherited
j
by the categories of events) must be equal
for x and y.''
| |
Example 3-67. |
| ------- - -- |
| |
| We choose this alternative (Events) for the intersecting group |
of the subcategories of PERSON in the case-study database. |
| ------ |
| |
| The schema now has redundancy, which should be controlled by |
| an integrity constraint, if possible. The integrity |
| constraint is |
| |
| |
for every s in Event-of-being-a-STUDENT: |
| ----- -- ----- - ------- |
| |
for every i in Event-of-being-an-INSTRUCTOR: |
| ----- -- ----- -- ---------- |
| |
| if |
| |
| |
(s.ADDRESS = i.ADDRESS or |
| ------- ------- |
| |
s.LAST-NAME = i.LAST-NAME or |
| ---- ---- ---- ---- |
| |
s.FIRST-NAME = i.FIRST-NAME or |
| ----- ---- ----- ---- |
| |
s.BIRTH-YEAR = i.BIRTH-YEAR) |
| ----- ---- ----- ---- |
| |
then s.ID-key = i.ID-key |
| -- --- -- --- |
| |
| (Note: The constraint could have been written without |
| negations, ``in a positive spirit,'' but then the meaning of |
| the absent values could be misinterpreted.) |
| |
| |
| |
| Example 3-68. |
| ------- - -- |
| The semantic schema of the university application has been |
| converted so far (by Events) into the schema on the following |
| page. |
| |
| |
[] DEPARTMENT - category
[] Event-of-being-a-STUDENT - category
[] Event-of-being-an-INSTRUCTOR - category
[] QUARTER - category
[] COURSE - category
[] COURSE-OFFERING - category
[] COURSE-ENROLLMENT - category
[] works-in - relation from Event-of-being-an-
INSTRUCTOR to DEPARTMENT (m:m)
[] name - attribute of DEPARTMENT, range: String
(1:m) (A department may have several names, but
every name is unique.)
[] last-name - attribute of Event-of-being-a-STUDENT,
range: String (m:1)
[] first-name - attribute of Event-of-being-a-
STUDENT, range: String (m:1)
[] birth-year - attribute of Event-of-being-a-
STUDENT, range: 1870..1990 (m:1)
[] address - attribute of Event-of-being-a-STUDENT,
range: String (m:1)
[] last-name - attribute of Event-of-being-an-
INSTRUCTOR, range: String (m:1)
[] first-name - attribute of Event-of-being-an-
INSTRUCTOR, range: String (m:1)
[] birth-year - attribute of Event-of-being-an-
INSTRUCTOR, range: 1870..1990 (m:1)
[] address - attribute of Event-of-being-an-
INSTRUCTOR, range: String (m:1)
[] major - relation from Event-of-being-a-STUDENT to
DEPARTMENT (m:1)
[] minor - relation from Event-of-being-a-STUDENT to
DEPARTMENT (m:1)
[] the-instructor - relation from COURSE-OFFERING to
Event-of-being-an-INSTRUCTOR (m:1)
[] the-course - relation from COURSE-OFFERING to
COURSE (m:1)
[] the-quarter - relation from COURSE-OFFERING to
QUARTER (m:1)
[] final-grade - attribute of COURSE-ENROLLMENT,
range: 0..100 (m:1)
[] the-offer - relation from COURSE-ENROLLMENT to
COURSE-OFFERING (m:1)
[] the-student - relation from COURSE-ENROLLMENT to
Event-of-being-a-STUDENT (m:1)
[] main-name-key - attribute of DEPARTMENT, range:
String (m:1)
[] id-key - attribute of Event-of-being-a-STUDENT,
range: Integer (m:1)
[] id-key - attribute of Event-of-being-an-
INSTRUCTOR, range: Integer (m:1)
[] year-in-key - attribute of QUARTER, range:
1980..1995 (m:1)
[] season-in-key - attribute of QUARTER, range:
String (m:1)
[] name-key - attribute of COURSE, range: String
(m:1)
[] instructor-id-in-key - attribute of COURSE-
OFFERING, range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-OFFERING,
range: String (m:1)
[] year-in-key - attribute of COURSE-OFFERING, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-OFFERING,
range: String (m:1)
[] instructor-id-in-key - attribute of COURSE-
ENROLLMENT, range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-
ENROLLMENT, range: String (m:1)
[] year-in-key - attribute of COURSE-ENROLLMENT,
range: 1980..1995 (m:1)
[] season-in-key - attribute of COURSE-ENROLLMENT,
range: String (m:1)
[] student-id-in-key - attribute of COURSE-
ENROLLMENT, range: Integer (m:1)
Figure 3-8. The university schema with the categories made
artificially disjoint.
3.4.5. Removal of relations
Step 3. Convert every proper 1:m or m:m relation whose range is a
concrete category into a new abstract category with its two
functional relations through a relation-split.
| |
Example 3-69. |
| ------- - -- |
| |
| Instead of the relation |
| |
| |
[] name - relation from DEPARTMENT to String (1:m) |
| ---- ---------- ------ - - |
| |
| we shall have |
| |
| |
[] DEPARTMENT-NAMING - category |
| ---------- ------ |
| |
[] the-department - relation from DEPARTMENT-NAMING to |
| --- ---------- ---------- ------
DEPARTMENT (m:1,total) |
| ---------- - - ----- |
| |
[] the-name - relation from DEPARTMENT-NAMING to |
| --- ---- ---------- ------
String (1:1,total) |
| ------ - - ----- |
| |
Step 4. Convert every 1:m relation into an m:1 relation by changing
its direction and its name.
| |
Example 3-70. |
| ------- - -- |
| |
| We do not have such relations in the university schema. If we |
| assume we have the relation |
| |
| |
[] provides - relation from DEPARTMENT to COURSE (1:m) |
| -------- ---------- ------ - - |
| then we would change it into |
| |
| |
[] the-department-providing-the-course - relation from |
| --- ---------- --------- --- ------
COURSE to DEPARTMENT (m:1) |
| ------ ---------- - - |
| |
Step 5. Convert every proper many-to-many relation into a category
and two functional relations through a relation-split.
| |
Example 3-71. |
| ------- - -- |
| |
We split the relation WORKS-IN into a new category WORK and |
| ----- -- ----
its two functional relations THE-DEPARTMENT and THE- |
| --- ---------- ---
INSTRUCTOR. |
| ---------- |
| |
| |
Example 3-72. |
| ------- - -- |
| |
| If we had the following m:m relation |
| |
| |
[] COURSE - category |
| ------ |
| |
[] INSTRUCTOR - category |
| ---------- |
| |
[] is-able-to-teach - relation from INSTRUCTOR to |
| -- ---- -- ----- ----------
COURSE (m:m) |
| ------ - - |
| |
| |
| |
| then we would split it into |
| |
| |
[] COURSE - category |
| ------ |
| |
[] INSTRUCTOR - category |
| ----------
| |
[] ABILITY-TO-TEACH - category |
| ------- -- ----- |
| |
[] the-instructor - relation from ABILITY-TO-TEACH to |
| --- ---------- ------- -- -----
INSTRUCTOR (m:1) |
| ---------- - - |
| |
[] the-course - relation from ABILITY-TO-TEACH to |
| --- ------ ------- -- -----
COURSE (m:1) |
| ------ - - |
| |
| |
| |
[] DEPARTMENT - category
[] STUDENT - category
[] INSTRUCTOR - category
[] QUARTER - category
[] COURSE - category
[] COURSE-OFFERING - category
[] COURSE-ENROLLMENT - category
[] WORK - category
[] the-instructor - relation from WORK to INSTRUCTOR (m:1)
[] the-department - relation from WORK to DEPARTMENT (m:1)
[] DEPARTMENT-NAMING - category
[] the-department - relation from DEPARTMENT-NAMING to
DEPARTMENT (m:1)
[] the-name - attribute of DEPARTMENT-NAMING, range: String
(m:1)
[] last-name - attribute of STUDENT, range: String (m:1)
[] first-name - attribute of STUDENT, range: String (m:1)
[] birth-year - attribute of STUDENT, range: 1870..1990 (m:1)
[] address - attribute of STUDENT, range: String (m:1)
[] last-name - attribute of INSTRUCTOR, range: String (m:1)
[] first-name - attribute of INSTRUCTOR, range: String (m:1)
[] birth-year - attribute of INSTRUCTOR, range: 1870..1990
(m:1)
[] address - attribute of INSTRUCTOR, range: String (m:1)
[] major - relation from STUDENT to DEPARTMENT (m:1)
[] minor - relation from STUDENT to DEPARTMENT (m:1)
[] the-instructor - relation from COURSE-OFFERING to
INSTRUCTOR (m:1)
[] the-course - relation from COURSE-OFFERING to COURSE (m:1)
[] the-quarter - relation from COURSE-OFFERING to QUARTER
(m:1)
[] final-grade - attribute of COURSE-ENROLLMENT, range: 0..100
(m:1)
[] the-offer - relation from COURSE-ENROLLMENT to COURSE-
OFFERING (m:1)
[] the-student - relation from COURSE-ENROLLMENT to STUDENT
(m:1)
[] main-name-key - attribute of DEPARTMENT, range: String
(m:1)
[] id-key - attribute of STUDENT, range: Integer (m:1)
[] id-key - attribute of INSTRUCTOR, range: Integer (m:1)
[] year-in-key - attribute of QUARTER, range: 1980..1995 (m:1)
[] season-in-key - attribute of QUARTER, range: String (m:1)
[] name-key - attribute of COURSE, range: String (m:1)
[] instructor-id-in-key - attribute of COURSE-OFFERING, range:
Integer (m:1)
[] course-name-in-key - attribute of COURSE-OFFERING, range:
String (m:1)
[] year-in-key - attribute of COURSE-OFFERING, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-OFFERING, range: String
(m:1)
[] instructor-id-in-key - attribute of COURSE-ENROLLMENT,
range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-ENROLLMENT, range:
String (m:1)
[] year-in-key - attribute of COURSE-ENROLLMENT, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-ENROLLMENT, range:
String (m:1)
[] student-id-in-key - attribute of COURSE-ENROLLMENT, range:
Integer (m:1)
Figure 3-9. The university schema after all the relation splits have
been performed.
| |
Example 3-73. |
| ------- - -- |
| |
| The semantic binary schema of the university application has |
| been converted so far into the schema in Figure 3-9. |
| |
| |
Step 6. Choose a key for every category produced through a relation-
split as follows.
For every category which was obtained through a relation-split, a
key is contained in the union of the compositions of its two
functional relations on the keys of their ranges.
| |
Example 3-74. |
| ------- - -- |
| |
The key of WORK is two new attributes of this category: |
| ---- |
| |
| {main-name of the department, instructor-id of the instructor} |
| |
| |
| |
Example 3-75. |
| ------- - -- |
| |
The key of DEPARTMENT-NAMING is contained in |
| ---------- ------ |
| |
| {the-name, main-name of the department} |
| |
| |
Since the-name is 1:1, the key of DEPARTMENT-NAMING is {the- |
| --- ---- ---------- ------ ---
name}. |
| ---- |
| |
| |
Example 3-76. |
| ------- - -- |
| |
| The semantic binary schema of the university application has |
| been converted so far into the schema in Figure 3-10. |
| |
| |
[] DEPARTMENT - category
[] STUDENT - category
[] INSTRUCTOR - category
[] QUARTER - category
[] COURSE - category
[] COURSE-OFFERING - category
[] COURSE-ENROLLMENT - category
[] WORK - category
[] instructor-id-in-key - attribute of WORK, range:
Integer (m:1)
[] department-main-name-in-key - attribute of WORK, range:
String (m:1)
[] the-instructor - relation from WORK to INSTRUCTOR
(m:1)
[] the-department - relation from WORK to DEPARTMENT
(m:1)
[] DEPARTMENT-NAMING - category
[] the-department - relation from DEPARTMENT-NAMING to
DEPARTMENT (m:1)
[] main-name-key - attribute of DEPARTMENT, range: String
(m:1)
[] name-key - attribute of DEPARTMENT-NAMING, range:
String (m:1)
[] id-key - attribute of STUDENT, range: Integer (m:1)
[] id-key - attribute of INSTRUCTOR, range: Integer (m:1)
[] last-name - attribute of STUDENT, range: String (m:1)
[] first-name - attribute of STUDENT, range: String (m:1)
[] birth-year - attribute of STUDENT, range: 1870..1990
(m:1)
[] address - attribute of STUDENT, range: String (m:1)
[] last-name - attribute of INSTRUCTOR, range: String
(m:1)
[] first-name - attribute of INSTRUCTOR, range: String
(m:1)
[] birth-year - attribute of INSTRUCTOR, range:
1870..1990 (m:1)
[] address - attribute of INSTRUCTOR, range: String (m:1)
[] major - relation from STUDENT to DEPARTMENT (m:1)
[] minor - relation from STUDENT to DEPARTMENT (m:1)
[] year-in-key - attribute of QUARTER, range: 1980..1995
(m:1)
[] season-in-key - attribute of QUARTER, range: String
(m:1)
[] name-key - attribute of COURSE, range: String (m:1)
[] the-instructor - relation from COURSE-OFFERING to
INSTRUCTOR (m:1)
[] the-course - relation from COURSE-OFFERING to COURSE
(m:1)
[] the-quarter - relation from COURSE-OFFERING to QUARTER
(m:1)
[] instructor-id-in-key - attribute of COURSE-OFFERING,
range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-OFFERING,
range: String (m:1)
[] year-in-key - attribute of COURSE-OFFERING, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-OFFERING, range:
String (m:1)
[] the-offer - relation from COURSE-ENROLLMENT to COURSE-
OFFERING (m:1)
[] instructor-id-in-key - attribute of COURSE-ENROLLMENT,
range: Integer (m:1)
[] course-name-in-key - attribute of COURSE-ENROLLMENT,
range: String (m:1)
[] year-in-key - attribute of COURSE-ENROLLMENT, range:
1980..1995 (m:1)
[] season-in-key - attribute of COURSE-ENROLLMENT, range:
String (m:1)
[] student-id-in-key - attribute of COURSE-ENROLLMENT,
range: Integer (m:1)
[] final-grade - attribute of COURSE-ENROLLMENT, range:
0..100 (m:1)
[] the-student - relation from COURSE-ENROLLMENT to
STUDENT (m:1)
Figure 3-10. The university schema after the relation splits have
been performed and keys have been chosen for every category.
Step 7. Replace every m:1 relation f whose range is an abstract
category by the composition of f on the chosen key of its range,
that is, by attributes b ,...,b , where x.b = (x.f).a , and
1 n i i
a ,..., a is the chosen key of f's range.
1 n
| |
Example 3-77. |
| ------- - -- |
| |
| Instead of |
| |
| |
[] major - relation from STUDENT to DEPARTMENT (m:1) |
| ----- ------- ---------- - - |
| |
| we shall have |
| |
| |
[] major-dept-main-name - attribute of STUDENT, range: |
| ----- ---- ---- ---- -------
String (m:1) |
| ------ - - |
| |
Step 8. Remove redundant nonkey attributes.
From every category remove attributes which are not in the key
but are inferable from other attributes of the same category.
These attributes would usually have resulted from a ``blind''
application of this algorithm, particularly
a. A nonkey attribute which is always equal to an attribute in
the key.
Note: It is possible that step (7) brought to a category C
an attribute b which is always equal to an attribute in the
key of C.
b. A redundant Boolean attribute brought in step (1):
Suppose:
o We were converting intersecting categories C, S ,...,
1
S into disjoint categories.
n
o We replaced them by one Union category C.
o one of the categories S was disjoint from all of the
i
rest S 's.
j
Then the new attribute is-an-S is inferable from the rest
i
of the attributes is-an-S .
j
This attribute should be removed from the schema. (It may be
present in a userview, where it would be an inferred
attribute.)
| |
Example 3-78. |
| ------- - --
| |
| If we had |
| |
| |
[] ILLITERATE - subcategory of PERSON (disjoint from |
| ---------- ------
| INSTRUCTOR and from STUDENT) |
| |
| |
| and furthermore, there were no other persons but students, |
instructors, and illiterate persons, then the attribute is- |
| --
illiterate would be derivable: |
| ---------- |
| |
| for every p in PERSON: |
| |
| |
p. IS-ILLITERATE = |
| -- ---------- |
| |
(not p. IS-A-STUDENT and not p. IS-AN-INSTRUCTOR) |
| -- - ------- -- -- ---------- |
| |
Note: The removal of several attributes should not be
performed simultaneously. Otherwise, two
attributes mutually inferable, but not inferable
from the rest, might be removed.
3.4.6. Integrity constraints
Step 9. Translate the integrity constraints into the terms of the new
schema:
a. The constraints of the original schema.
b. The additional constraints accumulated during the conversion
process.
| |
Example 3-79. |
| ------- - -- |
| |
| The semantic schema of the university application has been |
converted into the relational schema of Figure . |
| --- |
| |


Figure Ref-2. A relational schema for the university
application.