Matthew West
ISO TC184/SC4/WG10 N71
August 2, 1996
Dear Colleagues,
Please find below the result of an action item placed on me by WG10 at Kobe. That is to write up some ideas on how we could move forward on issues like data (instance) integration, data sharing, AP interoperability etc.
Comments welcomed (would it make any difference? :-))
Regards,
Matthew West
Technology Consultant - Information Services
Shell International Limited
HTML Markup and the Table of Contents added by Julian Fowler, September 15, 1996
This paper has been developed following discussion with the PPC and WG10, and was specifically requested by WG10 so that the suitability of a more formal proposal can be assessed.
It gives an outline for a possible standard to be developed by ISO TC184/SC4 to complement and extend the capabilities of the existing standards developed by SC4 in the area of the integration and sharing of industrial data.
Industrial data is taken to be any data that is useful in an industrial context and includes:
It could either be an extension to STEP, a new version of STEP, or a new standard. Which of these is chosen would be primarily a political decision.
This standard has the following objectives:
The prime objective of this standard is to support the integration of data instances from different data sets, according to different data models, into a single data set - and a single data model. The data in this conceptual model might then need to be viewed from a perspective defined by yet another data model. (Source and destination models are external models in ANSI/SPARC terms.)
To achieve this, a data model that is conceptual to the source data models needs to be developed, together with a formal two-way mapping between the source and conceptual data models, and between the destination and conceptual data models.
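The two-way mapping described above can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the attribute names, the pump record, and the `FieldMapping`, `to_conceptual` and `to_external` helpers. A real mapping would be defined against actual data models, for example in a mapping language, rather than as simple attribute renaming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    """Pairs one attribute of an external model with one of the conceptual model."""
    external_name: str
    conceptual_name: str


# Hypothetical source (design) model and destination (maintenance) model,
# each mapped to the same hypothetical conceptual model.
DESIGN_MAPPING = [
    FieldMapping("tag_no", "identifier"),
    FieldMapping("duty_kw", "rated_power_kw"),
]
MAINTENANCE_MAPPING = [
    FieldMapping("equipment_id", "identifier"),
]


def to_conceptual(record, mapping):
    """Translate an external-model record into conceptual-model terms."""
    return {m.conceptual_name: record[m.external_name]
            for m in mapping if m.external_name in record}


def to_external(record, mapping):
    """Translate a conceptual-model record back into an external model."""
    return {m.external_name: record[m.conceptual_name]
            for m in mapping if m.conceptual_name in record}


source_record = {"tag_no": "P-101", "duty_kw": 75.0}
conceptual_record = to_conceptual(source_record, DESIGN_MAPPING)

# The same mapping run in reverse recovers the source record.
round_trip = to_external(conceptual_record, DESIGN_MAPPING)

# A different external model gives a different view of the same data.
maintenance_view = to_external(conceptual_record, MAINTENANCE_MAPPING)
```

The destination view carries only the attributes that its own model defines, which is why the maintenance view drops the rated power.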
In addition an environment needs to be defined in which data sets can be integrated, including identification and consolidation of information about the same thing.
Another form of integration is the partial integration of two data models where they overlap. Here a single data set is not achieved, but the overlapping data is mapped to, and can be viewed in, both data models.
The emphasis here is on documenting data models clearly and unambiguously. This would be based on a data dictionary approach, with graphical and language-based representations. It would not be concerned with matters of style or, for existing data models, even with correctness, although fitness for purpose would be considered for data models not yet in use. Constraints and behaviour appropriate to the purpose of the model are supported.
It is expected that data models would be accompanied by an activity model that identifies the activities the data model is intended to support.
Data models already standardised need not be restandardised.
Here a conceptual data model is defined to mean a data model that fully satisfies the data requirements of two or more data models, though not necessarily the constraint or behaviour requirements. It defines the relationship between two data models, rather than an absolute state. The external data models may be standardised either in a standard (not necessarily ISO) or within this standard. Development of a data model that is conceptual to all others is expected to be a technical objective.
The mapping process will probably identify elements that are missing from the existing conceptual models and ontologies being mapped to. It will therefore be important to have a standardisation process that supports incremental additions to the conceptual data models and ontologies. Fortunately, principles exist which, if applied, enable the development of conceptual data models that are stable, yet flexible to different usages of data and extensible for additional data requirements.
Mapping involves two things:
This gives the basis for mapping data elements according to the external data model to and from the conceptual data model.
This process can also be adopted privately for mapping application data structures to standardised data models.
An ontology here is taken to be a classification scheme. Two types of ontologies are recognised:
One of the biggest problems when combining data from different data sets is identifying which data is duplicated, and needs to be consolidated.
This is not a matter of "yet another global identifier". There are enough of these already. It is about determining, for each entity type, what is considered sufficient certainty that data is duplicated.
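One way to picture a per-entity-type duplicate criterion is sketched below. The rule table, the attribute names and the merge policy (keep existing values, add only attributes that are new) are all assumptions made for illustration; the paper does not prescribe any of them.

```python
# Hypothetical rule table: for each entity type, the attributes that must all
# agree before two records are treated as describing the same thing.
MATCH_RULES = {
    "pump": ("site", "tag_no"),
    "company": ("registered_name", "country"),
}


def is_duplicate(entity_type, a, b):
    """True when both records carry and agree on every matching attribute."""
    keys = MATCH_RULES[entity_type]
    return all(k in a and k in b and a[k] == b[k] for k in keys)


def consolidate(entity_type, records):
    """Merge duplicate records, keeping the first value seen for each attribute."""
    merged = []
    for record in records:
        for existing in merged:
            if is_duplicate(entity_type, existing, record):
                # Add only the attributes the existing record does not have.
                for key, value in record.items():
                    existing.setdefault(key, value)
                break
        else:
            # No match found, so this record describes a new thing.
            merged.append(dict(record))
    return merged


pumps = [
    {"site": "A", "tag_no": "P-101", "duty_kw": 75.0},
    {"site": "A", "tag_no": "P-101", "manufacturer": "Acme"},
    {"site": "B", "tag_no": "P-101"},
]
result = consolidate("pump", pumps)
```

Here the two records for site A are consolidated, while the record for site B stays separate even though it shares a tag number, because the hypothetical rule for pumps requires agreement on both site and tag.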
The implementation requirements for IIDEAS are for a data access interface and an exchange file format. Whilst these can (and should) build on the work done in STEP, the requirements are more demanding.
The data access interface will need to support multi-user, multi-application access. In addition, access may be through one data model to data held according to another. Data integration services will also need to be provided, as well as the usual access control services.
The file exchange format will need to be more of a batch command interface, so that operations as well as data can be exchanged in a file. Source and destination information may also be necessary.
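As a rough illustration of a file that carries operations as well as data, the sketch below applies a small batch of commands to an in-memory store. The line-per-command JSON layout, the operation names (`create`, `update`, `delete`) and the `source` field are assumptions invented for this example; they are not a proposed format.

```python
import json

# Hypothetical batch file: one command per line, each naming its source.
BATCH_TEXT = """
{"op": "create", "source": "design_system", "id": "P-101", "data": {"duty_kw": 75.0}}
{"op": "update", "source": "maintenance_system", "id": "P-101", "data": {"status": "in service"}}
{"op": "create", "source": "design_system", "id": "P-102", "data": {"duty_kw": 30.0}}
{"op": "delete", "source": "design_system", "id": "P-102"}
"""


def apply_batch(text, store):
    """Apply each command in order; return a log of (source, op, id) tuples."""
    log = []
    for line in text.strip().splitlines():
        command = json.loads(line)
        op, key = command["op"], command["id"]
        if op == "create":
            store[key] = dict(command["data"])
        elif op == "update":
            store[key].update(command["data"])
        elif op == "delete":
            del store[key]
        else:
            raise ValueError(f"unknown operation: {op}")
        log.append((command["source"], op, key))
    return log


store = {}
log = apply_batch(BATCH_TEXT, store)
```

Keeping the source with every command means the receiving system can record where each change came from, which a plain data file would not convey.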
There is much existing work that may contribute to meeting these requirements, including CORBA and SQL3.
IIDEAS differs from and extends STEP in having a broader scope than product data, and in emphasising sharing and integration rather than exchange in the STEP sense. STEP is a potential source of data models whose data might need integrating, and which might want to use ontologies standardised in IIDEAS.
IIDEAS would use much of what STEP has developed, including at least EXPRESS as a way of defining data models, and the proposed EXPRESS-X for defining mappings between data models.
CDIF provides a meta model for the exchange of meta data. This could be used, together with the EXPRESS meta-model, as a basis for the data dictionary for this standard, although account would need to be taken of the need for history, configuration control, and version control. This standard treats meta data in the same way as other industrial data.
This standard might identify some of the services required by an implementation.
Theoretical but useful stuff to reference or build on.
This seems to have some ideas similar to those presented here for ontologies. It would need investigating to see how it could contribute.
Examples of the use of IIDEAS are given from the process industry. They are currently being pursued in industry.
The design of a process plant involves a number of disciplines, and the handover of the complete design data for a plant would involve the use of a number of APs, in particular APs 221, 227, 231, 212, 225 and 230. However, a plant owner does not want a number of separate files of data with significant overlap in data content. What is desired is a single integrated database to act as a reference database supporting operations and maintenance activities and data.
Using IIDEAS, a conceptual data model and set of ontologies can be developed from the APs involved. The APs can then be explicitly mapped to the conceptual model and ontologies. This enables the batch implementation form to be used to integrate the data into a single database. Operations and maintenance applications can access this database through the data access interface to obtain the necessary reference data. They could also store their own data through the data access interface, using a data model suited to their needs and integrated with the model for plant design.
AP221 has developed a set of standard classes of equipment (about 2000 of them) that are commonly used in the process industry. AP227 wishes to use them as a way of achieving both utility and commonality. Rather than make a reference from one AP to another, it is preferred to standardise the standard classes (an ontology) externally and to reference it from both APs, so that the content of the ontology can be managed independently of the data models that use it. IIDEAS provides a place to do this.
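The idea of an ontology held outside the data models that reference it can be sketched as follows. The class names and the record layout are invented for illustration and are not taken from the AP221 class list; only the arrangement, two models pointing at one externally managed classification, reflects the text above.

```python
# Hypothetical externally managed ontology: each class names its superclass.
EQUIPMENT_CLASSES = {
    "equipment": None,
    "pump": "equipment",
    "centrifugal_pump": "pump",
}


def is_kind_of(class_id, ancestor):
    """Walk up the classification to test whether class_id falls under ancestor."""
    while class_id is not None:
        if class_id == ancestor:
            return True
        class_id = EQUIPMENT_CLASSES[class_id]
    return False


# Records held according to two different data models hold only a reference
# to a class in the shared ontology, not their own copy of the class list.
ap221_item = {"model": "AP221", "tag": "P-101", "class_id": "centrifugal_pump"}
ap227_item = {"model": "AP227", "tag": "P-101", "class_id": "pump"}

both_are_pumps = all(
    is_kind_of(item["class_id"], "pump") for item in (ap221_item, ap227_item)
)
```

Because neither record embeds the class definitions, a class can be added to the ontology without changing either data model.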
S88 is a standard for the control of batch processes. It includes data models, already implemented, that support the information requirements of batch control. The S88 developers would like S88 to be able to share and exchange data with STEP, in particular AP221 and AP231. To achieve this they wish to map their model (defined using OMT) to the STEP APs, so that (parts of) their model can be a standardised view on the relevant STEP APs.