ISO TC184/SC4/WG10 Architecture N134
Date: 1997-12-19
Source: Author

Ontological Foundations of an Architecture for Industrial Data

Matthew West
Shell Information Services Limited


Date sent:        Fri, 19 Dec 1997 04:20:09 -0500
From:             Matthew West 
Subject:          Ontological Foundations of an Architecture for Industrial Data
To:               "'EPISTLE'", "'WG10'", "'WG11'"

Dear Colleagues,

I have been doing some work in recent weeks towards an
Architecture for Industrial Data, part of the PWI in WG10.
Whilst others have been looking at business and technical
requirements, and the overall shape of the architecture, I have
been looking at the ontological foundations. This means looking
at what things are. In particular, I have been trying to
identify the dependence of concepts, to find those on which
others are dependent, so that we can know and understand what
things we have to "Accept as given" (or perhaps have done
without realising it).

The good news is that the list is quite small: Things, Classes,
Individuals, Relationships, and Classification. Below is (an
attempt at) an EXPRESS model that gives definitions of these
concepts, and some relationships between them.

(*************************************
class

A thing that is a number or collection of things.

A number of individuals (persons or things) possessing common
attributes, and grouped together under a general or 'class'
name; a kind, sort, division. 

A class may have a basis (explicit or implied) for membership of
the class.


EXPRESS specification:

*)
ENTITY class
        SUBTYPE OF (thing);
END_ENTITY;

(*

classification

A relationship that indicates that a thing is a member of a
class.

Role_1 = member
Role_2 = class

EXPRESS specification:

*)
ENTITY classification
        SUBTYPE OF (relationship);
END_ENTITY;

(*

individual

A single object or thing, or a group of things forming a single
complex idea, and regarded as a unit; a single member of a
natural class, collective group, or number.

Not a class.

EXPRESS specification:

*)
ENTITY individual
        SUPERTYPE OF (relationship)
        SUBTYPE OF (thing);
END_ENTITY;

(*

relationship

An individual that is a link between one thing and another. The
related things each play a role in the relationship.

A relationship is asserted to exist or not exist.

EXPRESS specification:

*)
ENTITY relationship
        SUPERTYPE OF (ONEOF(specialisation,classification))
        SUBTYPE OF (individual);
        role_1 : thing;
        role_2 : thing;
        exists : BOOLEAN;
END_ENTITY;

(*

Attribute definitions:

role_1: The first role in a relationship played by a thing.

role_2: The second role in a relationship played by a thing.

exists: Indicates whether the relationship is asserted to exist
(TRUE) or not to exist (FALSE).

This enables negative statements to be made, e.g. "This thing is
NOT a pump".


specialisation

A relationship that indicates that the members of the sub-class
are also members of the super-class.

Role_1 = sub_class
Role_2 = super_class

EXPRESS specification:

*)
ENTITY specialisation
        SUBTYPE OF (relationship);
END_ENTITY;

(*

thing

Anything that exists, real or imagined.

EXPRESS specification:

*)
ENTITY thing
        SUPERTYPE OF (ONEOF(class,individual));
END_ENTITY;

(*

************************************)
It is worth noting that specialisation has been included in the
model primarily for convenience. Classification, on the other
hand, has to be declared as a fundamental concept: otherwise it
is not possible to declare a classification relationship without
creating an "exploding" model.
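
As an illustration only, the model above can be populated with a
few instances. The sketch below is in Python rather than EXPRESS,
because the aim is to show instance-level data. The function names
(superclasses, is_member) and the pump, equipment, P-101 and V-7
examples are assumptions made for this sketch, not part of the
model. It shows a negative statement made through the exists
attribute, and membership derived through specialisation.

```python
from dataclasses import dataclass


@dataclass(eq=False)          # eq=False keeps identity-based equality and hashing
class Thing:
    name: str


class Class(Thing):
    pass


class Individual(Thing):
    pass


@dataclass(eq=False)
class Relationship(Individual):
    role_1: Thing
    role_2: Thing
    exists: bool = True       # False records a negative statement


class Classification(Relationship):   # role_1 = member, role_2 = class
    pass


class Specialisation(Relationship):   # role_1 = sub_class, role_2 = super_class
    pass


def superclasses(cls, statements):
    """Return cls plus every class reachable through asserted specialisations."""
    found, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s in statements:
            if (isinstance(s, Specialisation) and s.exists
                    and s.role_1 is current and s.role_2 not in found):
                found.add(s.role_2)
                todo.append(s.role_2)
    return found


def is_member(thing, cls, statements):
    """True if membership is asserted or derivable, False if denied, else None."""
    mine = [s for s in statements
            if isinstance(s, Classification) and s.role_1 is thing]
    if any(s.exists and cls in superclasses(s.role_2, statements) for s in mine):
        return True
    if any(not s.exists and s.role_2 is cls for s in mine):
        return False
    return None


# Hypothetical example data.
equipment, pump = Class("equipment"), Class("pump")
p101, v7 = Individual("P-101"), Individual("V-7")
statements = [
    Specialisation("pump is a kind of equipment", pump, equipment),
    Classification("P-101 is a pump", p101, pump),
    Classification("V-7 is NOT a pump", v7, pump, exists=False),
]

print(is_member(p101, equipment, statements))  # derived via specialisation
print(is_member(v7, pump, statements))         # explicitly denied
print(is_member(v7, equipment, statements))    # nothing stated either way
```

Returning None where nothing has been stated keeps "asserted not to
exist" distinct from "not asserted", which is the distinction the
exists attribute is there to make.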

Those amongst you with a practical nature will have noticed
that this model is not actually capable of holding any
information that is human interpretable (text, numbers etc.).
This is deliberate. The choice of data types in EXPRESS is in
principle arbitrary, and does not form part of the fundamentals
of the world about us, only the fundamentals of EXPRESS, which I
am not trying to capture here.

Creating this model made me aware of some limitations of EXPRESS
for data integration (as opposed to data exchange, for which it
is well suited). For example, within even the model above, I
have no mechanism for showing that the entity types thing,
individual, relationship, etc. are themselves members of the
entity type class. Also, there was no "proper" way to
specialise the roles of the relationship entity type to the
subtype/supertype roles that would have been appropriate for
specialisation, for example.

This is by no means a limitation of EXPRESS alone, but of all
Entity Relationship based data definition languages. If you
want/need to model relationships between classes, you cannot
make those classes entity types, because there is no mechanism
to model relationships between classes (only between members of
classes). Realising this is almost a relief, since it explains
why there are so many arguments between people about what level
of abstraction some model should be at. It is in the end caused
by having to make a choice because of the limitations of the
language(s) we use.
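
One way to see the point is to hold the classes as data rather
than as entity types. The following is a rough Python sketch under
that assumption. The Statement tuple, the helper functions and the
pump, equipment and P-101 entries are hypothetical names chosen for
illustration. Once everything is a statement, an entity type such
as thing can itself be recorded as a member of class, and a class
can take part in relationships alongside its own members.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    kind: str            # "classification" or "specialisation"
    role_1: str          # member, or sub_class
    role_2: str          # class, or super_class
    exists: bool = True


statements = [
    # The model's own subtype structure, stated as data.
    Statement("specialisation", "class", "thing"),
    Statement("specialisation", "individual", "thing"),
    Statement("specialisation", "relationship", "individual"),
    # Cross-level statements: entity types that are themselves members of class.
    Statement("classification", "thing", "class"),
    Statement("classification", "individual", "class"),
    Statement("classification", "relationship", "class"),
    # Ordinary plant data sits in the same list (hypothetical examples).
    Statement("classification", "pump", "class"),
    Statement("specialisation", "pump", "equipment"),
    Statement("classification", "P-101", "pump"),
]


def members_of(cls):
    """Names asserted to be members of cls, in alphabetical order."""
    return sorted(s.role_1 for s in statements
                  if s.kind == "classification" and s.exists and s.role_2 == cls)


def levels_crossed(name):
    """True if name is both a member of some class and a class with members."""
    is_member = any(s.kind == "classification" and s.role_1 == name
                    for s in statements)
    has_members = any(s.kind == "classification" and s.role_2 == name
                      for s in statements)
    return is_member and has_members


print(members_of("class"))
print(levels_crossed("pump"))
```

Here pump is at once a member of class and a class with a member
of its own, so no single level of abstraction has to be chosen.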

In order to show that this is not necessary, I have created a
simple notation that allows models to be created that cross
levels (even including instances that would normally appear in a
part 21 file). The notation consists of:

Box (with name) - class
Ellipse (with name) - reference to an individual
line - a relationship
arrow - classification relationship (thing classified at
arrowhead end)
thick arrow - specialisation relationship (subtype at
arrowhead end)

Note that a relationship has three classification arrows pointing
to it: one for the class of relationship, and one for the class
of role played in the relationship by the object at each end.
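
The three arrows can be written out as data. This is only a sketch
in Python. The Link record and its field names are assumptions
introduced for illustration, and the centrifugal pump and P-101
entries are made-up examples. The role names member, class,
sub_class and super_class are those given in the model above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One line in a diagram: a relationship between two named things."""
    end_1: str
    end_2: str
    relationship_class: str
    role_1_class: str
    role_2_class: str

    def classification_arrows(self):
        """Three (class, classified item) pairs; the arrowhead is at the second item."""
        line = f"{self.end_1} -- {self.end_2}"
        return [
            (self.relationship_class, line),
            (self.role_1_class, f"{line} (end at {self.end_1})"),
            (self.role_2_class, f"{line} (end at {self.end_2})"),
        ]


diagram = [
    # Class level: drawn as a thick arrow in the notation.
    Link("centrifugal pump", "pump", "specialisation", "sub_class", "super_class"),
    # Instance level: the kind of statement that would normally sit in a part 21 file.
    Link("P-101", "centrifugal pump", "classification", "member", "class"),
]

for link in diagram:
    for cls, target in link.classification_arrows():
        print(f"{cls} --> {target}")
```

Both links live in the same list, one between two classes and one
between an individual and a class, which is the crossing of levels
that the notation is meant to allow.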

Try it and see what you think. You have to be careful, because
relationships in this scheme of things are different from the
attributes we normally find in EXPRESS. Attributes are classes of
relationship, i.e. it is their instances that appear in part 21
files.

My experience so far is that you get in an awful mess if you try
to put a complete model on one piece of paper. On the other
hand, if you limit what you try to show to what you can get on
an overhead slide (and read at the back of the room) then it is
quite manageable.

The next question that arises is what to do about it. As far as
I can see there are at least three alternatives:

1. Extend EXPRESS to be able to show relationships between
entity type objects.

2. Create a "new" language specifically for the purposes of data
integration.

3. Settle for this as nearly all the data model you need, and
handle everything else as data.

4. ???

I don't have a special preference, and can see advantages and
disadvantages to each approach. Answers on a Christmas Card
please :-)

Merry Christmas


Regards  
      Matthew
=========================
Asset Information Management
Matthew West, ISCM, Shell Centre, London, SE1 7NA.
Tel: +44 171 934 4490 Fax: 6649
Exchange e-mail: Matthew.M.R.West@is.simis.com
Shell Information Services Limited.
=========================