Chapter 9
*Aspects of DBMS Implementation

This chapter discusses aspects  of  DBMS  implementation.   Section  1
describes  an  efficient  algorithm for the implementation of semantic
databases.  Section 2 addresses  questions  of  transaction  handling,
including   the  enforcement  of  integrity  constraints,  backup  and
recovery, and concurrency control.  Section 3 addresses issues of data
definition languages and data dictionaries.

Chapter  10  addresses  object-oriented   databases.    This   chapter
discusses  the  similarities  and  the  minor  difference  between the
semantic and  object-oriented  databases  and  augments  the  Semantic
Binary  Model  (SBM) with object-oriented features related to modeling
database behavior.

9.1.   On the Implementation of Semantic Databases

               A File Structure for Semantic Databases

                            Naphtali Rishe

                      School of Computer Science
                  Florida International University -
               The State University of Florida at Miami
                   University Park, Miami, FL 33199
                              This paper
    presents a highly efficient file structure for the storage of
 semantic databases.  A low-level access language is presented, such
      that an arbitrary query can be performed as one or several
    elementary queries of the language.  Most elementary queries,
  including such nontrivial queries as range queries and others, can
         be performed in just one single access to the disk.

    Keywords: Semantic databases, implementation, Semantic Binary
                                Model,
   file structure, query languages, transactions, indices, B-tree,
   efficiency, database access primitives, facts, inverted storage.

This research has been supported in part by grants from
the  U.S.  Department  of the Interior and Florida High
Technology and Industry Council.

9.1.1.   Introduction

Since [Abrial-74], many semantic data models have been studied in  the

Computer  Science  literature.   Although  somewhat different in their

terminology  and  their  selection  of  tools  used  to  describe  the

semantics of the real world, they have several common principles:

 o   The entities of the real world are represented in the database in

     a  manner invisible to the user.  (Unlike that, in the relational

     model the entities are represented by the values of keys of  some

     tables;  in  the  network  model  the entities are represented by

     record    occurrences.)    Hereinafter,    the     user-invisible

     representations   of  real-world  entities  are  referred  to  as

     "abstract  objects".  The  "concrete  objects",   or   "printable
     values",  are  numbers,  character  strings,  etc.  The  concrete
     objects have conventional representations on  paper  and  in  the

     computer.

 o   The entities are classified into types, or categories, which need

     not be disjoint.  Meta-relations of inclusion are defined between

     the categories.

 o   Logically-explicit relationships  are  specified  among  abstract

     objects  (e.g.,  "person  p   is  the  mother  of person p2") and
               - -              1
     between abstract objects and concrete objects (e.g.,  "person  p
                                                    - -              1
     has first name `Jack'").  There are no direct relationships among

     concrete objects.  In most semantic models, only binary relations

     are allowed, since higher order relations do not add any power of

     semantic   expressiveness    ([Bracchi&al.-76],    [Rishe-87-RM],
     [Rishe-88-DDF])  but  do decrease the flexibility of the database

     and representability of partially-unknown  information,  and  add

     complexity and potential for logical redundancy ([Rishe-88-DDF]).

The advantages of the semantic models versus the Relational and  older

models  with  respect  to  database design, database maintenance, data

integrity, conciseness of languages, and ease of DML programming  have

been  discussed  in  many  works,  e.g. in [Rishe-88-DDF].  This paper
advocates  the  potential  of  an  efficient  implementation  for  the

semantic models.

Several semantic data models have been implemented  as  interfaces  to

database   managements   systems  in  other  data  models,  e.g.,  the

relational or the network model [Tsur&Zaniolo-84].   (There  are  also

less  typical,  direct implementations, e.g. [Lien&al.-81], [Chan&al.-
82],   [Benneworth&al.-81].)    The   efficiency   of   an   interface
implementation  is  limited  to  that of the conventional DBMS, and is

normally much  worse  due  to  the  interface  overhead.   The  direct

implementations  are  commonly  believed  to  have  to  be  less time-

efficient than the conventional systems, as a trade-off for the  extra

services  that  semantic  databases  provide.   However,  this  author

contends that the semantic models have the potential for a  much  more

efficient  implementation  than the conventional data models.  This is

due to two reasons:

 o   All the physical aspects of representation of information by data

     are invisible to the user in the semantic models.  This creates a

     greater potential for optimization: more things  may  be  changed

     for   efficiency   considerations   without  affecting  the  user

     programs.  The Relational Model has more data  independence  than

     the  older  models.  For example, the order of rows in the tables

     (relations) is invisible to the user.  The semantic  models  have

     even  more data independence.  For example, the representation of

     real-world entities by printable values is invisible to the user.

     One  may  recall  that  not  long  ago  the  Relational Model was

     criticized  as  being  less  efficient  than  the   Network   and

     Hierarchical  Models.   However,  it is clear now that optimizing

     relational database systems have the potential to  be  much  more

     efficient  than  the  network and hierarchical systems due to the
     data independence of the relational model.

 o   In the semantic models, the system knows more about  the  meaning

     of  the  user's data and about the meaningful connections between

     such data.  This knowledge can be utilized to organize  the  data

     so  that  meaningful  operations  can  be performed faster at the

     expense of less meaningful or meaningless operations.

In this paper,  the  author  uses  the  Semantic  Binary  Model  (SBM)

[Rishe-87-RM],  [Rishe-88-DDF],  [Rishe-89-SD]),  a  descendant of the

model proposed in [Abrial-74]. This model does not  have  as  rich  an

arsenal  of  tools  for  semantic  description as can be found in some

other semantic models, e.g.  the IFO  model  [Abiteboul&Hull-84],  SDM

[Hammer&McLeod-81]    (implementation    [Jagannathan&al.-88]),    the
Functional Model [Shipman-81] (implementation [Chan&al.-82]),  SEMBASE
[King-84],      NIAM      ([Nijssen-81],     [Verheijen&VanBekkum-82],

[Leung&Nijssen-87]), GEM [Tsur&Zaniolo-84], TAXIS  [Nixon&al.-87],  or
the  semi-semantic Entity-Relationship Model [Chen-76].  Nevertheless,

the SBM has a small set of sufficient simple tools by  which  all  the

semantic  descriptors  of  the  other models can be constructed.  This

makes SBM easier to use for  the  novice,  easier  to  implement,  and

usable  for  delineation  of  the  common  properties  of the semantic

models.  The results of this paper are practically independent of  the

choice  of  a  particular  semantic model, and therefore they apply to

almost all of the other semantic models.

The  semantic  binary  model  represents   the   information   of   an

application's  world as a collection of elementary facts of two types:

unary facts categorizing objects of the real world  and  binary  facts

establishing  relationships of various kinds between pairs of objects.

The graphical database schema and the integrity constraints  determine

what  sets of facts are meaningful, i.e. can comprise an instantaneous

database (the database as may be seen at some instance of time.)

Example 9-1.

Consider a database of which the following is a subschema:

      []  COMPANY - category

      []  PRODUCT - category

      []  company-name - attribute of COMPANY, range: String
          (1:1)

      []  description - attribute of PRODUCT, range: String
          (1:1)

      []  manufactures - relation from COMPANY to PRODUCT
          (m:m)

            Figure 9-1.  A subschema of a database

The following set  of  facts  can  be  a  part  of  a  logical

instantaneous database:

1.   object1  COMPANY

2.   object1  COMPANY-NAME  `IBM'

3.   object1  MANUFACTURED  object2

4.   object1  MANUFACTURED  object3

5.   object2  PRODUCT

6.   object2  DESCRIPTION  `IBM/SYSTEM-2'

7.   object3  PRODUCT

8.   object3  DESCRIPTION  `MONOCHROMATIC-MONITOR'

The formal semantics of  the  semantic  binary  model  is  defined  in

[Rishe-87-DS]  using  the  methodology  proposed in [Rishe-86-OD]. The

syntax and informal semantics of the model  and  its  languages  (data

definition  languages,  4-th  generation  data manipulation languages,

non-procedural  languages  for  queries,  updates,  specification   of

constraints,  userviews,   etc.)  are given in [Rishe-88-DDF].  A non-
procedural  semantic  database  language  of  maximal   theoretically-

possible  expressive  power  is  given  in  [Rishe-86-PS].   (In  this

language,  one  can  specify  every  computable  query,   transaction,

constraint, etc.)

The following section proposes an efficient storage structure for  the

Semantic Binary Model.

9.1.2.   Storage structure

9.1.2.1.   Abstracted level

Every abstract object in the  database  is  represented  by  a  unique

integer  identifier.   The  categories and relations of the schema are

also treated as abstract objects and  hence  have  unique  identifiers

associated  with  them.   Information  in  the  database  can  then be

represented using two kinds of facts, denoted xC and xRy, where  x  is

the  identifier  associated  with  an abstract object, C and R are the

identifiers associated with a category or a relation respectively, and
y  is  either  an  identifier corresponding to an abstract object or a

concrete object (a number or a text string);  xC  indicates  that  the

object x belongs to the category C; xRy indicates that the object x is

associated with the object  y  by  the  relation  R.   Logically,  the

instantaneous database is a set of such facts.

9.1.2.2.   Goals

9.1.2.2.1.   Efficiency of retrieval requests

At the intermediate level of processing queries and program  retrieval

requests,  the queries are decomposed into atomic retrieval operations
of the types listed below.  The primary  goal  of  the  physical  file

structure  is  to  allow  a very efficient performance for each of the

atomic requests.   Namely,  each  atomic  retrieval  request  normally
requires  only  one  disk  access,  provided the output information is
small enough to fit into one block.  When the  output  is  large,  the

number  of  blocks  retrieved is close to the minimal number of blocks

needed to store the output information.

1.   aC        Verify the fact aC. (For a given abstract object a  and

               category  C,  verify  whether  the  object  a is in the

               category C.)

2.   aRy       Verify the fact aRy.

3.   a?        For a given abstract object a, find all the  categories

               to which a belongs.

4.   ?C        For a given category, find its objects.

5.   aR?       For a given abstract object a and relation R,  retrieve

               all  y such that aRy. (The objects y may be abstract or

               concrete.)

6.   ?Ra       For a given abstract object a and relation R,  retrieve

               all abstract objects x such that xRa.

7.   a?+a??+??aRetrieve  all  the  immediate  information   about   an

               abstract object.  (That is, for a given abstract object

               a,  retrieve   all    of   its   direct   and   inverse

               relationships,  that  is, the relations R and objects y

               such that aRy or yRa, and the  categories  to  which  a

               belongs.)

               (Although this request can be decomposed into a  series

               of  requests  of the previous types, we wish to be able

               to treat it separately in  order  to  ensure  that  the

               whole  request  will  normally be performed in a single

               disk access.  This  will  also  allow  a  single-access

               performance  of requests which require several, but not

               all, of the facts about an object,  e.g.,  a  query  to

               find  the  first  name, the last name, and the age of a

               given person.)

8.   ?Rv       For a given relation (attribute) R and a given concrete

               object (value) v, find all abstract objects x such that

               xRv.

9.   ?R[v1,v2] For a given relation (attribute) R and a given range of

               concrete  objects  [v ,  v ], find all objects  x and v
                                    1    2
               such that xRv and v =670  and

p.COST<=680

(This query prints the names  and  addresses  of  the  limited

companies   founded  before  1989  that  manufacture  products

costing between $670 and $680.  It is assumed that LTD-COMPANY

is a subcategory of COMPANY.)

A query processor/optimizer can perform this as follows:

1.   Perform (? COSTS [670,680]) (resulting, say,  in  objects

     p , p , and p ).
      1   2       3

2.   For each of i in 1..3 perform (? MANUFACTURES p ) (let us
                                                    i
     assume  that  the  union   of  the  results  of the three

     queries is c , c , c , and c ).
                 1   2   3       4

3.   For   each   j   in   1..4   perform    the    elementary

     (c ?+c ??+??c ),   obtaining  the  immediate  information
       j   j      j
     about the company  c .   This  includes  the  information
                         j
     necessary  to  check  (c.YEAR-FOUNDED<1989  and  c  is an

     LTD-COMPANY) as well as the values of NAME and ADDRESS to

     be printed if the result of the latter is positive.

The total number of elementary queries here was eight.

9.1.2.2.2.   Efficiency of update transactions

Efficient performance of update  transactions  is  required,  although

more than one disk access per transaction is allowed.

A transaction is a set of interrelated update requests to be performed

as   one   unit.   Transactions  are  generated  by  programs  and  by

interactive users.  A  transaction  can  be  generated  by  a  program

fragment  containing  numerous update commands, interleaved with other

computations.  However, until the last command within a transaction is

completed,  the  updates  are  not physically performed but rather are

accumulated by the DBMS.  Upon completion of the transaction, the DBMS

checks  its  integrity  and  then physically performs the update.  The

partial effects of the transaction may be inconsistent.  Every program

and   user  sees  the  database  in  a  consistent  state:  until  the

transaction is committed, its effects are invisible.

A completed transaction is composed of a set of facts  to  be  deleted

from  the  database,  a set of facts to be inserted into the database,

and  additional  information  needed  to  verify  that  there  is   no

interference  between  transactions  of  concurrent  programs.  If the

verification produces a positive result, then  the  new  instantaneous

database  is:  ((the-old-instantaneous-database)  - (the-set-of-facts-

to-be-deleted)) U (the-set-of-facts-to-be-inserted).

Example 9-3.

Consider the database of Example 9-1.

The following is  a   transaction  to  rename  Burroughs  into

Unisys,  transfer  all  business  from  Sperry to  Unisys, and

delete Sperry.

transaction

  for b where b COMPANY-NAME `Burroughs' do
   for s where s COMPANY-NAME `Sperry' do
     begin

       b.COMPANY-NAME := `Unisys';
       for p where s MANUFACTURES p do

                      relate b MANUFACTURES p;

       decategorize s from COMPANY

       end

Let us assume that before the transaction  the  two  companies

are  objects  b   and  s  respectively and Sperry manufactures
               0        0
products p  and p . The following queries were performed  from
          1      2

within this transaction:

1.   ? COMPANY-NAME `Burroughs' (results in b )
       ------- ----                          0

2.   ? COMPANY-NAME `Sperry' (results in s )
       ------- ----                       0

3.   s  MANUFACTURES ? (results in { p  , p  })
      0                               1    2

At the end of the programmatic  transaction,  the  accumulated

transaction   will  be  (V,D,I),  where  V,  the  verification

specification, is the above three  queries  with  their  time-

stamps;  D  is  the following specification of the facts to be

deleted:

   s  COMPANY
    0
   s  COMPANY-NAME *
    0 ------- ----
   s  MANUFACTURES *
    0
   b  NAME *
    0

and I is the following set of facts to be inserted:

   b  NAME `Unisys'
    0
   b  MANUFACTURES p
    0               1
   b  MANUFACTURES p
    0               2

9.1.2.3.   Solution: a file structure achieving the goals

The following file structure supports  the  above  requirements.   The

entire  database  is  stored in a single file.  This file contains all

the facts of the database (xC and  xRy)  and  additional  information,

called  inverted  facts,  which  are  described  below.   The  file is
maintained as a B-tree.  The variation of the B-tree used here  allows

both  sequential  access  according  to the lexicographic order of the

items comprising the facts and the inverted facts, as well  as  random

access by arbitrary prefixes of the facts and inverted facts.

The inverted facts do introduce some physical redundancy  (no  logical

redundancy  since  they are invisible to the user), which results in a

storage overhead and update-time overhead.  However, as  it  is  shown

below,  this  overhead  is  not  greater than if index structures were

used.  Of course, it is impossible to achieve any reasonable retrieval

efficiency  without  physical  redundancy,  such  as  the  indices  in
conventional implementations or the inverted facts  proposed  in  this

paper.

The facts which are close to each other  in  the  lexicographic  order

reside close together in the file.  (Notice, that although technically

the B-tree-key is the entire fact, it is of varying length and on  the

average  is  only several bytes long, which is the average size of the

encoded fact xRy. The total size of the data stored in the index-level

blocks  of  the  B-tree  is  less  than  1  percent of the size of the

database: e.g., each 10,000-byte data block may be represented in  the

index  level by its first fact - 5 bytes - and block address - 3 bytes

- which would amount to 0.08 percent of the data block.  Thus, all the

index blocks will fit into even a relatively small main memory.)

The file contains the original facts and  additionally  the  following

"inverted facts":

1.   In addition to xC, we store its inverse Cx.  (C  is  the  system-

     chosen  identifier to represent the inverse information about the
     category C. For example, it can be defined  as  C = 0-C.)  (If  a

     category  C  is a subcategory of category C , an object a belongs
                1                               2
     to C  and, thus, also to  C ,   then  we  choose  to  store  both
         1           --      -- 2
     inverted  facts C a and C a.  When the user requests the deletion
                      1       2
     of the fact aC , it triggers automatic deletion of the facts aC ,
     --         -- 2                                                1
     C a,  and  C a  in  order  to  guarantee consistency.)  Thus, the
      1          2
     elementary query ?C to find all the objects of  the  category  C,

     can be answered by examining the (inverted) facts whose prefix is
     C.  The latter inverted  facts  are  clustered  together  in  the

     lexicographic order of the physical database.

2.   In addition  to xRv, where v is a concrete object  (a  number,  a
     string,  or  a  value  of another type), we store Rvx.  Thus, the

     range query "?R[v1,v2]" is satisfied by all and only the inverted
     facts   which   are  positioned  in  the  file  between  Rv   and
     -                                                          1
     Rv HighSuffix. (HighSuffix is a suffix which is lexicographically
       2
     greater  than any other possible suffix.)   Thus, the result will

     most probably appear in one physical block, if it  can  fit  into

     one block.

3.   In addition to xRy, where both x and y are abstract  objects,  we
     store   yRx.    Thus,   for   any  abstract  object  x,  all  its

     relationships xRy, xRv, zRx, and xC can be found in one place  in

     the  file:  the  regular  and inverted facts which begin with the

     prefix x. (The infixes are:  categories for xC, relations for xRy
     and   xRv,  and  inverse relations xRz from which we find  z such

     that zRx.)

Example 9-4.

Consider the  instantaneous  database  of  Example  9-1.   The

additional inverted  facts stored in the database are:

            inv
1.   COMPANY     object1

                 inv
2.   COMPANY-NAME    `IBM' object1

                         inv
3.   object2 MANUFACTURED    object1

                         inv
4.   object3 MANUFACTURED    object1

            inv
5.   PRODUCT    object2

                inv
6.   DESCRIPTION    `IBM/SYSTEM-2' object2

            inv
7.   PRODUCT    object3

                inv
8.   DESCRIPTION    `MONOCHROMATIC-MONITOR' object3

Notice that facts xRa and xRv (x and a are abstract objects,  v  is  a

value)  are  inverted dissimilarly.  This is because we have different

types of atomic retrieval requests concerning  abstract  and  concrete

objects:

 o   Concrete objects can be used to form range queries  (e.g.,  "Find

     all persons with salaries between $40,000 and $50,000").  In such

     queries we know  the  identifier  of  the  relation  and  partial

     information  about  the  value.   Therefore  we  need  to use the

     inverted facts with the inverse of  R   as  the  prefix.   Unlike

     concrete  objects,  ranges  of  abstract  objects  cannot  form a

     meaningful range query.

 o   On the other hand, we  have  multiple-fact  retrievals  about  an

     abstract  object,  that  is,  "Find all the immediate information

     about a given person p" (while such a request  about  a  concrete

     object  would be meaningless: "Find all the information about the

     number 5" makes no sense, as opposed to a meaningful query  "Find

     information about item(s) whose price is $5").   Here we know the

     object but do not know the identifiers of the inverted relations.

     We  need  to  cluster  together all the inverted relations of one

     object.  Therefore, the inverted relation should  appear  in  the

     infix.

Example 9-5.

When  the  set  of   original   facts   is   interleaved   and

lexicographically  sorted  with  the  inverted  facts  of  the

previous example, we obtain:

1.   object1  COMPANY

2.   object1  COMPANY-NAME  `IBM'

3.   object1  MANUFACTURED  object2

4.   object1  MANUFACTURED  object3

5.   object2  DESCRIPTION  `IBM/SYSTEM-2'

6.   object2  PRODUCT

                         inv
7.   object2 MANUFACTURED    object1

8.   object3  DESCRIPTION  `MONOCHROMATIC-MONITOR'

9.   object3  PRODUCT

                         inv
10.  object3 MANUFACTURED    object1

            inv
11.  COMPANY     object1

                inv
12.  DESCRIPTION    `IBM/SYSTEM-2' object2

                inv
13.  DESCRIPTION    `MONOCHROMATIC-MONITOR' object3

                 inv
14.  COMPANY-NAME    `IBM' object1

            inv
15.  PRODUCT    object2

            inv
16.  PRODUCT    object3

Example 9-6.

To answer the elementary query to  find  all  the  information

about object3, including its direct and inverse relationships,

we find all  the  entries  whose  prefix  is  object3.   These

entries are clustered together in the sorted order.

The  elementary  query  "Find  all  objects  manufactured   by

object1"   we   find   all   the   facts   whose   prefix   is

object1 MANUFACTURED.   (` '  denotes  concatenation.)   These
entries are clustered together in the sorted order.

The  query  to  print  the   descriptions   of   the   objects

manufactured  by  the companies whose names are between `IATA'

and `K-mart', we solve several elementary subqueries:

1.   Find the companies whose names are in the above range

     strings.  (We search for inverted facts which are
                                            inv
     lexicographically between  COMPANY-NAME    `IATA' and
                 inv
     COMPANY-NAME    `K-mart' HighSuffix.
     For the instantaneous database given in the previous

     example we find only one inverted fact
                 inv
     COMPANY-NAME    `IBM' object1.  The suffix object1 is the
     company we are looking for.)

2.   Find the products manufactured by object1.   (From  facts

     with   prefix   object1 MANUFACTURED   we  find  suffixes
     {object2, object3}.)

3.   Find    the    description    of    object2.      (Prefix

     object2 DESCRIPTION);

4.   Find    the    description    of    object3.      (Prefix

     object3 DESCRIPTION);

The sorted file is maintained in a structure similar to a B-tree.  The

"records"  of  the  B-tree  are  the  regular and inverted facts.  The

records are of varying length.  The B-tree-keys of the  "records"  are

normally   the   entire   B-tree-records  (i.e.,  facts,  regular  and

inverted).  (An exception to this is when the  record  happens  to  be

very  long.   The  only   potentially long records represent facts xRv

where v is a very long character string.  We employ a special handling

algorithm  for  very  long character strings.)   Access to this B-tree

does not require knowledge of the entire key: any prefix will do.  All

the index blocks of the B-tree can normally be held in cache.

At the most physical level, the data in the  facts  is  compressed  to

minimal  space.   Also,  since  many  consecutive facts share a prefix

(e.g., an abstract object identifier), the prefix need not be repeated

for  each  fact.   In  this way the facts are compressed further.  The

duplication in the number of facts due to the inverses is 100 percent,

since  there  is  only one inverse per each original fact (with a rare

exception of the storage of redundant  inverses  of  supercategories).

The  B-tree  causes  additional  30  percent overhead.  (This overhead

occurs because in a B-tree the data blocks are only 75 percent full on

the  average,  though this can be improved by periodic reorganization.

The overhead for the index blocks of the B-tree is no more than 1 to 2

percent  since they contain only one short fact per every data block.)

The total space used for the database  is  therefore  only  about  160

percent more than the amount of information in the database (i.e., the

space minimally required to store the database in the most  compressed

form  with  no  regard to the efficiency of data retrieval or update).

Thus, the data structure described herein is more efficient  in  space

and  time than the conventional approach with separate secondary index

files for numerous fields.

No separate index files are needed for the file structure proposed  in

this  paper.   The  duplication  of  data  (i.e.,  inverted relations)

together with the primary sparse index which is a part of  the  B-tree

effectively   eliminate   the  need  for  secondary  (dense)  indices.

Furthermore, it eliminates the horrendous  I/O  operations  caused  by

sequentially retrieving along a secondary index, since the sequence of

information represented by our primary sparse index is also stored  in

consecutive  physical  locations.   These  claims  are  proven  in the

following section.

9.1.2.4.   Proof of time-efficiency of the file structure.

Lemma 1.  Let the file be logically perceived as  a  lexicographically
ordered  sequence  of  facts.  Let req be an atomic retrieval request.

Then there is a contiguous segment in the sequence, so that:

      o   All the facts in the segment satisfy the request req.

      o   No fact outside the segment satisfies the req.

      o   The boundaries fact      and fact    of the segment  can  be
                             start         end
          derived  from the syntax of the request req.  (Thus, all the

          output facts are  lexicographically  between  fact       and
                                                            start
          fact   .  The boundaries may be inclusive or exclusive.)
              end

Proof of Lemma 1.

The following are the ranges for the segments for each of  the  atomic

requests    (the    symbol   HighSuffix   denotes   the   bit   string

"11111111111111..."  which  is  lexicographically  greater  than   any

possible suffix in a fact):

Request        Segment

1.   aC        aC < fact < aC
               (Verify the fact aC.)

2.   aRy       aRy < fact < aRy

               (Verify the fact aRy.)

3.   a?        aC    < fact < aC    (Here, it is assumed that all  the
                 min -      -   max
               categories  of the schema are enumerated by identifiers

               between C    and C   .)
                        min      max
               (For a given abstract object a,  find  what  categories

               the object belongs to.)

4.   ?C        C < fact < CHighSuffix
               (For a given category, find its objects.)

5.   aR?       aR < fact < aRHighSuffix
               (For a given abstract object a and relation R, retrieve

               all y such that aRy.)

6.   ?Ra       aR < fact < aRHighSuffix
               (For a given abstract object a and relation R, retrieve

               all abstract objects x such that xRa.)

7.   a?+a??+??aa < tuple < aHighSuffix
               (Retrieve  all  the  immediate  information  about   an

               abstract object.)

8.   ?Rv       Rv < fact < RvHighSuffix
               (For  a  given  relation  (attribute)  R  and  a  given

               concrete  object  v,  find  all abstract objects x such

               that xRv.)

9.   ?R[v1,v2] Rv  < fact < Rv HighSuffix
                 1        -   2
               (For a given relation R and a given range  of  concrete

               objects  [ v , v ], find all objects x and v such that:
                           1   2
               xRv and v 
Reference Schemas
The following are the semantic and relational schemas for the university case study application. These schemas are referred to in most of the examples in this text.


Figure Ref-1. A semantic schema for a university application.


Figure Ref-2. A relational schema for the university application.