Integration of Industrial Data for Exchange, Access and Sharing (IIDEAS)

Matthew West

ISO TC184/SC4/WG10 N71
August 2, 1996


Dear Colleagues,

Please find below the result of an action item placed on me by WG10 at Kobe. That is to write up some ideas on how we could move forward on issues like data (instance) integration, data sharing, AP interoperability etc.

Comments welcomed (would it make any difference? :-))

Regards,
Matthew West
Technology Consultant - Information Services
Shell International Limited

HTML Markup and the Table of Contents added by Julian Fowler, September 15, 1996


Contents

  • Introduction
  • Objectives
  • Data integration
  • Industrial data models
  • Conceptual data models
  • Formal mappings between data models
  • Ontologies
  • Identification
  • Implementation
  • Principles
  • Relationship to other standards
  • Examples of use

    Introduction

    This paper has been developed following discussion with the PPC and WG10, and was specifically requested by WG10 so that the suitability of a more formal proposal can be assessed.

    It gives an outline for a possible standard to be developed by ISO TC184/SC4 to complement and extend the capabilities of the existing standards developed by SC4 in the area of the integration and sharing of industrial data.

    Industrial data is taken to be any data that is useful in an industrial context and includes:

    The proposed standard could be an extension to STEP, a new version of STEP, or a new standard. Which of these is chosen would be primarily a political decision.


    Objectives

    This standard has the following objectives:

    1. to support the integration and storage of any industrial data from any source, and access to that data
    2. to provide a mechanism for the standardisation of data models that are found useful to industry
    3. to develop standardised data models that are conceptual in nature to support the data requirements of a number of explicit data models
    4. to standardise a formal two way mapping between data models, to support either:
    5. to determine a satisfactory basis for identification that two pieces of data are about the same object
    6. to define an implementation form for a data store which supports a multi-user, multi-application API that takes account of e.g. access control requirements, and a batch interface that can be used for e.g. the transfer of data from (at least) one conformant data store to another.

    Data integration

    The prime objective of this standard is to support the integration of data instances from different data sets, according to different data models, into a single data set - and a single data model. The data in this conceptual model might then need to be viewed from a perspective defined by yet another data model. (Source and destination models are external models in ANSI/SPARC terms.)

    To achieve this, a data model conceptual to the source data models needs to be developed, together with a formal two way mapping between the source and conceptual data models, and between the destination and conceptual data models.
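
    As a minimal sketch of this arrangement, assuming Python and entirely hypothetical field names (none are taken from any AP), a two way mapping can be held as a table of field names and applied in either direction. Real mappings would be far richer than renaming, so this only illustrates the route from source model to conceptual model to destination model.

```python
# Hypothetical field-name mappings; none of these names come from a real AP.
SOURCE_TO_CONCEPTUAL = {"tag_no": "identifier", "desc": "description"}
DEST_TO_CONCEPTUAL = {"equipment_id": "identifier", "text": "description"}


def invert(mapping):
    """Return the reverse mapping, checking that it is genuinely two way."""
    inverse = {v: k for k, v in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not one-to-one, so it cannot be inverted")
    return inverse


def translate(record, mapping):
    """Rename the fields of a record according to a mapping."""
    return {mapping[name]: value for name, value in record.items()}


source_record = {"tag_no": "P-101", "desc": "feed pump"}

# Source model -> conceptual model -> destination model.
conceptual_record = translate(source_record, SOURCE_TO_CONCEPTUAL)
destination_view = translate(conceptual_record, invert(DEST_TO_CONCEPTUAL))

# The mapping is two way: going back recovers the original record.
round_trip = translate(conceptual_record, invert(SOURCE_TO_CONCEPTUAL))
```

    Holding each external model's mapping against the conceptual model, rather than against every other external model, keeps the number of mappings proportional to the number of models.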

    In addition an environment needs to be defined in which data sets can be integrated, including identification and consolidation of information about the same thing.

    Another form of integration is partial integration of two data models where they overlap. Here a single data set is not achieved, but the overlap data is mapped/viewed in both data models.


    Industrial data models

    The emphasis here is on documenting data models clearly and unambiguously. This would be based on a data dictionary approach, with graphical and language based representations. It would not be concerned with matters of style or, for existing data models, even with correctness, although fitness for purpose would be considered for data models not yet in use. Constraints and behaviour appropriate to the purpose of the model are supported.

    It is expected that data models would be accompanied by an activity model which identified the activities the data model was intended to support.

    Data models already standardised need not be restandardised.


    Conceptual data models

    Here a conceptual data model is defined to mean a data model that fully satisfies the data requirements of two or more data models, though not necessarily the constraint or behaviour requirements. It defines the relationship between two data models, rather than an absolute state. The external data models may be standardised either in a standard (not necessarily ISO) or within this standard. Development of a data model that is conceptual to all others is expected to be a technical objective.

    The mapping process will probably identify missing elements in the existing conceptual models and ontologies that are being mapped to. It will therefore be important to have a standardisation process that supports incremental additions to the conceptual data models and ontologies. Fortunately, principles exist which, if applied, enable the development of conceptual data models that are stable, yet flexible to different usages of data, and extensible for additional data requirements.
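
    The way mapping exposes missing elements can be sketched as follows. This is an illustration only, in Python, with hypothetical model and element names: each external model's mapping is checked against the elements the conceptual model currently offers, and anything without a target is reported as a candidate for an incremental addition.

```python
# Hypothetical element names for a conceptual model and two external models.
CONCEPTUAL_ELEMENTS = {"identifier", "description", "design_pressure"}

MAPPINGS = {
    "model_a": {"tag_no": "identifier", "desc": "description"},
    "model_b": {"equipment_id": "identifier", "max_pressure": "design_pressure",
                "supplier": "manufacturer"},
}


def missing_elements(mappings, conceptual):
    """For each external model, list mapping targets the conceptual model lacks."""
    gaps = {}
    for model, mapping in mappings.items():
        absent = sorted(set(mapping.values()) - conceptual)
        if absent:
            gaps[model] = absent
    return gaps


def is_conceptual_to(mappings, conceptual):
    """True if the conceptual model covers every external model's data."""
    return not missing_elements(mappings, conceptual)


gaps = missing_elements(MAPPINGS, CONCEPTUAL_ELEMENTS)
extended = CONCEPTUAL_ELEMENTS | {"manufacturer"}   # an incremental addition
```

    Here `gaps` names the one element that `model_b` needs and the conceptual model lacks; after the addition, the extended model is conceptual to both external models in the sense defined above.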


    Formal mappings between data models

    Mapping involves two things:

    1. it takes the (implicit) context of an external data model, and maps it to/from explicit elements in the conceptual data model
    2. it takes the explicit elements in the external data model and maps them to/from equivalent elements in the conceptual data model.

    This gives the basis for mapping data elements according to the external data model to and from the conceptual data model.
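
    The two parts of a mapping can be illustrated with a small Python sketch. Everything in it is assumed for the purpose of illustration: the external model is imagined to record design pressures as bare numbers for a single site, so the unit and the site are implicit context, while the tag and the pressure value are explicit elements.

```python
# Context that the (hypothetical) external model leaves implicit.
IMPLICIT_CONTEXT = {"pressure_unit": "bar", "site": "Plant A"}


def to_conceptual(external):
    """Map explicit elements across and make the implicit context explicit."""
    return {
        "identifier": external["tag"],                     # explicit element
        "design_pressure": {
            "value": external["design_pressure"],          # explicit element
            "unit": IMPLICIT_CONTEXT["pressure_unit"],     # implicit context
        },
        "site": IMPLICIT_CONTEXT["site"],                  # implicit context
    }


def to_external(conceptual):
    """Reverse mapping; refuses data that falls outside the implicit context."""
    pressure = conceptual["design_pressure"]
    if pressure["unit"] != IMPLICIT_CONTEXT["pressure_unit"]:
        raise ValueError("unit conversion needed before mapping back")
    if conceptual["site"] != IMPLICIT_CONTEXT["site"]:
        raise ValueError("record belongs to a different site")
    return {"tag": conceptual["identifier"], "design_pressure": pressure["value"]}


external_record = {"tag": "V-200", "design_pressure": 16.0}
conceptual_record = to_conceptual(external_record)
```

    The reverse direction shows why the context has to be explicit in the conceptual model: only data that matches the external model's assumptions can be viewed through it without conversion.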

    This process can also be adopted privately for mapping application data structures to standardised data models.


    Ontologies

    An ontology here is taken to be a classification scheme. Two types of ontologies are recognised:

    1. Pure ontologies: here the basis for classification is the same throughout the classification hierarchy. Such ontologies can be expected to be orthogonal, meaning that the classes at a level are mutually exclusive. On the other hand, an object can be a member of a class in more than one ontology.
    2. Mixed ontologies: these are generally of more practical use, but can easily overlap with each other. The overlaps can be managed by building the mixed ontologies from elements that come from pure ontologies.
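
    The distinction can be sketched in Python under some simplifying assumptions: each ontology is shown at a single level, classes are represented as sets of member objects, and the class and object names are invented. Treating a mixed class as an intersection of pure classes is one possible reading of the approach described, not the only one.

```python
from itertools import combinations

# Two hypothetical pure ontologies, each shown at a single level.
# Each classifies on one basis only: function, or material of construction.
BY_FUNCTION = {"pump": {"P-101", "P-102"}, "vessel": {"V-200"}}
BY_MATERIAL = {"carbon steel": {"P-101", "V-200"}, "stainless steel": {"P-102"}}


def is_orthogonal(level):
    """True if no object belongs to two classes at this level."""
    return all(a.isdisjoint(b) for a, b in combinations(level.values(), 2))


def classes_of(obj, *ontologies):
    """List every class, across the given ontologies, that contains obj."""
    return sorted(
        name for o in ontologies for name, members in o.items() if obj in members
    )


# A mixed class built from elements of the pure ontologies.
carbon_steel_pumps = BY_FUNCTION["pump"] & BY_MATERIAL["carbon steel"]
```

    An object such as `P-101` belongs to exactly one class in each pure ontology, yet to classes in both, and the mixed class is defined entirely in terms of those pure classes.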

    Identification

    One of the biggest problems when combining data from different data sets is identifying which data is duplicated, and needs to be consolidated.

    This is not a matter of "yet another global identifier". There are enough of these already. It is about determining, for each entity type, what counts as sufficient certainty that data is duplicated.
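
    One way to picture such a basis, purely as a sketch in Python, is a rule per entity type naming the attributes that must all agree before two records are treated as describing the same object. The entity types, attributes and records here are hypothetical, and exact matching is a simplification of whatever level of certainty would actually be agreed.

```python
# Hypothetical rules: for each entity type, the attributes that must all
# agree before two records are treated as describing the same object.
IDENTITY_RULES = {
    "pump": ("site", "tag"),
    "person": ("family_name", "given_name", "date_of_birth"),
}


def same_object(entity_type, a, b):
    """Apply the rule for the entity type; missing values never match."""
    keys = IDENTITY_RULES[entity_type]
    return all(a.get(k) is not None and a.get(k) == b.get(k) for k in keys)


def consolidate(a, b):
    """Merge two records about one object, rejecting conflicting values."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            raise ValueError(f"conflicting values for {key!r}")
        merged[key] = value
    return merged


design = {"site": "Plant A", "tag": "P-101", "duty": "feed"}
maintenance = {"site": "Plant A", "tag": "P-101", "last_overhaul": "1995-11"}
other_site = {"site": "Plant B", "tag": "P-101"}
```

    The same tag on a different site is not treated as a duplicate, which is the kind of judgement a global identifier alone would not capture.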


    Implementation

    The implementation requirements for IIDEAS are for a data access interface, and for an exchange file format. Whilst this can (and should) build on the work done in STEP, the requirements are more demanding.

    The data access interface will need to support multi-user, multi-application access. In addition, access may be through one data model to data held under another. Data integration services will also need to be provided, as well as the usual access control services.

    The file exchange format will need to be more of a batch command interface, so that operations as well as data can be exchanged in a file. Source and destination information may also be necessary.
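
    To show what "operations as well as data" might mean, the following Python sketch applies a batch of commands to an in-memory store. The file layout (one JSON object per line, with a header naming source and destination) and the operation names are assumptions made for illustration; this is not the STEP exchange file format or any proposed syntax.

```python
import json

# A hypothetical batch file: a header naming source and destination,
# followed by operations that carry their own data.
BATCH_TEXT = """
{"source": "design-store", "destination": "operations-store"}
{"op": "create", "id": "P-101", "data": {"type": "pump"}}
{"op": "update", "id": "P-101", "data": {"duty": "feed"}}
{"op": "create", "id": "V-200", "data": {"type": "vessel"}}
{"op": "delete", "id": "V-200"}
"""


def apply_batch(text, store):
    """Apply each operation in the batch to store; return the header."""
    lines = [json.loads(line) for line in text.splitlines() if line.strip()]
    header, commands = lines[0], lines[1:]
    for command in commands:
        op, key = command["op"], command["id"]
        if op == "create":
            if key in store:
                raise KeyError(f"{key} already exists")
            store[key] = dict(command["data"])
        elif op == "update":
            store[key].update(command["data"])
        elif op == "delete":
            del store[key]
        else:
            raise ValueError(f"unknown operation {op!r}")
    return header


store = {}
header = apply_batch(BATCH_TEXT, store)
```

    A pure data file could only express the final state; carrying the operations lets the receiving store apply updates and deletions to data it already holds.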

    There is much existing work that may contribute to meeting these requirements, including CORBA and SQL3.


    Principles

    1. The primary principle of the structure of this proposal is not to assume we know all the answers, and thus to propose a structure that will enable the standard to progress gracefully.
    2. Management of data independent of application, rather than exchange of data between applications.
    3. Data about a product needs to be managed and maintained throughout the life of the product.

    Relationship to other standards

    STEP

    IIDEAS differs from, and extends, STEP in having a broader scope than Product Data and an emphasis on sharing and integration, rather than exchange in the STEP sense. STEP is a potential source of data models whose data might need integrating, and that might want to use ontologies standardised in IIDEAS.

    IIDEAS would use much of what STEP has developed, including at least EXPRESS as a way of defining data models, and the proposed EXPRESS-X for defining mappings between data models.

    Parts Library

    Like STEP, Parts Library has a limited scope and intent. Thus IIDEAS would be able to support the data requirements of a Parts Library, and an IIDEAS implementation might choose to provide a Parts Library conformant implementation.

    CDIF

    CDIF provides a meta model for the exchange of meta data. This could be used, together with the EXPRESS meta-model as a basis for the data dictionary for this standard, although account would need to be taken of the need for history, configuration control, and version control. This standard treats meta data in the same way as other industrial data.

    IRDS

    IRDS might identify some of the services required by an implementation.

    CSMF

    Theoretical but useful stuff to reference or build on.

    Basic Semantic Repository

    This seems to have some ideas similar to those presented here for ontologies. It would need investigating to see how it could contribute.


    Examples of use

    Examples of the use of IIDEAS are given from the process industry. They are currently being pursued in industry.

    Example 1: Standardising the integration of several APs

    The design of a process plant involves a number of disciplines, and the handover of the complete design data for a plant would involve the use of a number of APs. In particular: 221, 227, 231, 212, 225, 230. However, a plant owner does not want a number of separate files of data, which would have significant overlap in data content. What is desired is a single integrated database to act as a reference database to support operations and maintenance activities and data.

    Using IIDEAS, a conceptual data model and set of ontologies can be developed from the APs involved. The APs can then be explicitly mapped to the conceptual model and ontologies. This then enables the batch implementation form to integrate the data into a single database. Operations and maintenance applications can access this database through the data access interface to obtain the necessary reference data. They could also store their own data through the data access interface, using a suitable data model for their needs, integrated with the model for plant design.

    Example 2: Ensuring commonality between APs

    AP221 has developed a set of standard classes of equipment (about 2000 of them) that are commonly used in the process industry. AP227 wishes to use them as a way of achieving both utility and commonality. Rather than make a reference from one AP to another, it is preferred to standardise the standard classes (an ontology) externally and to refer to them from both APs, so that the content of the ontology can be managed independently of the data models that use it. IIDEAS provides a place to do this.
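
    The arrangement can be sketched in Python as a class library held outside both models, with each model referring to it by name. The class names, record layouts and field names are invented for illustration and are not taken from AP221 or AP227.

```python
# Hypothetical extract of an externally managed class library.
EQUIPMENT_CLASSES = {"centrifugal pump", "gate valve", "shell and tube heat exchanger"}

# Records held under two different (hypothetical) data models; both refer
# to the shared classes by name instead of defining their own.
functional_item = {"tag": "P-101", "equipment_class": "centrifugal pump"}
layout_item = {"component": "P-101", "class_ref": "centrifugal pump"}


def check_reference(class_name, library=EQUIPMENT_CLASSES):
    """Reject references to classes the shared library does not define."""
    if class_name not in library:
        raise LookupError(f"{class_name!r} is not a standard class")
    return class_name


shared = check_reference(functional_item["equipment_class"])
```

    Because neither model owns the library, classes can be added to it without reissuing either data model.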

    Example 3: Interoperability between S88 and STEP

    S88 is a standard for the control of batch processes. It includes data models, already implemented, that support the information requirements of batch control. Its developers would like S88 to be able to share/exchange data with STEP, in particular AP221 and AP231. To achieve this they wish to map their model (expressed in OMT) to the STEP APs so that (parts of) their model can be a standardised view on the relevant STEP APs.