Consider the following application scenario: a multimedia expert is searching for jobs located in the USA or Canada which are related to her domain of expertise.
Figure 2: Fragments of the export schemas of example information sources
Assume that the relevant information sources that are currently available include a Recruitment Agencies HTML document, a Job Listing database at Autodesk, a Job Postings text file, Job Articles from Usenet News, a Placement Agency Database, a Business Directory Database, a Companies on the Web HTML document, and a Resume Resources BibTeX file. We refer to these information sources as the component information sources of the given enterprise system. Figure 2 contains fragments of the export schemas for these information sources. In the figure, relations or classes are drawn as boxes, plain links indicate primary key and foreign key relationships or object reference relationships, and arrow links depict superclass and subclass relationships.
In order to find most of the relevant job opportunities, the job seeker (information consumer) would have to search, and possibly revisit, many different data sources. This task may involve browsing and navigating through each of the above-mentioned repositories separately. What is required is a more efficient and scalable system that interconnects a consumer to all available job repositories and makes querying over these heterogeneous repositories easier.
Queries to multiple information sources are expressed in the DIOM interface query language (DIOM IQL) [21], an object-oriented extension of SQL. IQL queries use the naming conventions and terminology defined in the information consumer's domain model, so query writers need not be aware of the many different naming conventions and terminology used in the underlying information sources. Given an IQL query expressed in terms of the consumer's domain model, the first step in processing it is to select the information sources relevant to answering it. The query is then decomposed into a collection of subqueries, each expressed in terms of an information producer's source model. These reformulated subqueries are finally forwarded to the wrappers of the corresponding sources for subquery translation and execution.
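The three processing steps just described (source selection, decomposition into per-source subqueries, and forwarding to wrappers) can be illustrated with a minimal Python sketch. This is not DIOM code: the `Source` class, the function names, and all source-side attribute names such as `position` or `summary` are hypothetical, and the sketch makes the simplifying assumption that a source is relevant only if it can supply every requested attribute, including the one used in the filter.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    # Consumer-domain attribute -> this source's own attribute name.
    # All source-side names here are hypothetical.
    term_map: dict


def select_sources(sources, attrs):
    # Step 1: keep only sources that can supply every requested attribute
    # (a simplifying assumption of this sketch).
    return [s for s in sources if all(a in s.term_map for a in attrs)]


def decompose(selected, attrs, filter_attr, keyword):
    # Step 2: build one subquery per selected source, rewritten into the
    # source's own naming conventions. filter_attr must be one of attrs.
    return {
        s.name: {
            "select": [s.term_map[a] for a in attrs],
            "where": (s.term_map[filter_attr], "contains", keyword),
        }
        for s in selected
    }


def dispatch(subqueries, wrappers):
    # Step 3: hand each subquery to the wrapper registered for its source.
    return {name: wrappers[name](sq) for name, sq in subqueries.items()}


sources = [
    Source("Job Listing database", {"Job.title": "position", "Job.descrip": "summary"}),
    Source("Placement Agency Database", {"Job.title": "job_name", "Job.descrip": "details"}),
    Source("Business Directory Database", {"Company.name": "firm"}),
]
attrs = ["Job.title", "Job.descrip"]

selected = select_sources(sources, attrs)
subqueries = decompose(selected, attrs, "Job.descrip", "multimedia")

# Stand-in wrappers that only echo the subquery instead of executing it.
wrappers = {s.name: (lambda sq: f"would run {sq}") for s in selected}
results = dispatch(subqueries, wrappers)
```

The query writer only ever names `Job.title` and `Job.descrip`; the per-source names appear solely inside the generated subqueries, which is the separation between the consumer's domain model and the producers' source models that the text describes.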
For example, the query ``find all jobs relating to multimedia and located in North America'' can be expressed as follows.
SELECT Job->title, Job->pay, Job->descrip,
Company->name, Company->address, Company->city,
Company->prov/state, Company->country, Company->mail_code,
Company->descrip, Company->URL
FROM Job, Company
WHERE Job->descrip contains 'multimedia'
AND (Company->country contains 'Canada' || 'USA')
In representing this query, the query writer does not need to be concerned with the different naming conventions and different structural designs used by various information sources. Instead, she can write a query using the terminology that focuses on how she would like the result of the query to be represented and displayed. The IQL preprocessor will first check with the DIOM interface repository where all the interface definitions are stored. If the IDL specification of the target interfaces given in the FROM clause does not exist, an appropriate IDL specification will be generated [18].
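The repository check performed by the IQL preprocessor amounts to a lookup-or-generate pattern, sketched below in Python. The class and function names are hypothetical, and `generate_stub` only returns a placeholder string; the actual procedure for generating IDL specifications is the one described in [18] and is not reproduced here.

```python
class InterfaceRepository:
    """Hypothetical stand-in for the DIOM interface repository."""

    def __init__(self):
        self._definitions = {}

    def lookup_or_generate(self, interface, generate):
        # Reuse a stored definition if present; otherwise generate and store one.
        if interface not in self._definitions:
            self._definitions[interface] = generate(interface)
        return self._definitions[interface]


generated = []  # records which interfaces had to be generated


def generate_stub(interface):
    # Placeholder text only; not an actual DIOM IDL specification.
    generated.append(interface)
    return f"interface {interface} {{ }};"


def preprocess(from_clause, repo):
    # Resolve every target interface named in the FROM clause.
    return {name: repo.lookup_or_generate(name, generate_stub) for name in from_clause}


repo = InterfaceRepository()
first = preprocess(["Job", "Company"], repo)   # both interfaces get generated
second = preprocess(["Job", "Company"], repo)  # both are now found in the repository
```

On the second call nothing is regenerated, which mirrors the text: generation happens only when no specification for a target interface exists yet.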
Assuming that the IDL definitions of the Job and Company interfaces have been generated, the query can simply be expressed as shown in Figure 3: