The query scope description specifies the synonyms
(alternative descriptions) of each atom name used in the FROM
clause of Q, together with the alternation constraints, i.e., how
these synonyms relate to the term as alternatives. Four types of
alternation constraints are considered here: total&disjoint,
total&overlap, partial&disjoint, and partial&overlap.
We model the query scope, denoted as
, using
scope records of the form (VI, SY, AL),
where VI is the name of a virtual interface, SY
denotes a set of synonyms of VI, and AL denotes the alternation
constraint over SY.
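As a minimal sketch, a scope record (VI, SY, AL) could be represented as follows. The class and field names are illustrative assumptions, not the paper's implementation; the sample record uses the supplier term with book store and publisher as its synonyms.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Alternation(Enum):
    """The four alternation constraints named in the text."""
    TOTAL_DISJOINT = "total&disjoint"
    TOTAL_OVERLAP = "total&overlap"
    PARTIAL_DISJOINT = "partial&disjoint"
    PARTIAL_OVERLAP = "partial&overlap"


@dataclass(frozen=True)
class ScopeRecord:
    """One scope record (VI, SY, AL); names are assumptions for illustration."""
    vi: str              # name of the virtual interface
    sy: FrozenSet[str]   # set of synonyms of VI
    al: Alternation      # alternation constraint over SY


# Sample record: supplier is categorized by two synonyms.
supplier_scope = ScopeRecord(
    vi="supplier",
    sy=frozenset({"book store", "publisher"}),
    al=Alternation.TOTAL_DISJOINT,
)
```

A query scope is then simply a collection of such records, one per atom name in the FROM clause.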
The query capacity description specifies the qualification
requirements for the arguments of the query.
We model the query capacity, denoted as
,
using capacity records of the form
(Attr, SY, AL, Dtype, Mattr, IOtype, Bopt), where
Attr denotes an attribute argument, SY
denotes a set of synonyms of Attr, AL is the alternation constraint
over the set of synonyms SY, Dtype denotes the data type and
Mattr denotes the metadata properties of the attribute Attr,
IOtype indicates if the attribute is used as
an input argument or output argument or both, and Bopt denotes
the binding option (mandatory or optional) for the argument Attr.
Note that all the input arguments of a query need to be
satisfied, and thus their binding options are mandatory, but
sometimes we may allow certain outputs to be absent [7]
from the answer by setting their Bopt to optional.
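The capacity record and the rule on binding options can be sketched as below. This is only an illustration under stated assumptions: the field types, the string encodings of IOtype and Bopt, the treatment of "both" as an input, and the attribute names `title` and `price` are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class CapacityRecord:
    """One capacity record (Attr, SY, AL, Dtype, Mattr, IOtype, Bopt)."""
    attr: str                # attribute argument
    sy: FrozenSet[str]       # synonyms of Attr
    al: str                  # alternation constraint over SY
    dtype: str               # data type of Attr
    mattr: Tuple[str, ...]   # metadata properties (encoding is an assumption)
    iotype: str              # "input", "output", or "both"
    bopt: str                # "mandatory" or "optional"

    def __post_init__(self) -> None:
        if self.iotype not in {"input", "output", "both"}:
            raise ValueError(f"unknown IOtype: {self.iotype!r}")
        if self.bopt not in {"mandatory", "optional"}:
            raise ValueError(f"unknown Bopt: {self.bopt!r}")
        # Input arguments must be satisfied, so their binding is mandatory.
        # Assumption: an attribute used as both input and output counts as input.
        if self.iotype in {"input", "both"} and self.bopt != "mandatory":
            raise ValueError(f"input argument {self.attr!r} must be mandatory")


# Hypothetical records: a mandatory input and an output that may be absent.
title = CapacityRecord("title", frozenset(), "partial&overlap",
                       "String", (), "input", "mandatory")
price = CapacityRecord("price", frozenset({"cost"}), "partial&overlap",
                       "float", (), "output", "optional")
```

Rejecting an optional input at construction time keeps an invalid profile from reaching query processing; only outputs may carry the optional binding.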
In our initial implementation, user query profiles are created using an interactive dialog program. A dialog screen pops up whenever the user poses a query on the fly. For instance, when the query Q in Example 1 is posed, the user may annotate each virtual interface class in the FROM clause using a query scope description record, as shown in Figure 2(a).
Figure 2: Metadata Profiles
The user who posed query Q may annotate each input or output attribute of Q by defining its synonyms, their alternation constraint, the data type, the metadata properties, the input/output type, and the binding option. These semantic definitions are captured in the user query capacity records, as shown in Figure 2(b). Data that are not in this font are entered by the system using default settings derived from the query statement of Q or inferred using an available online ontology.
Since we assume that the term supplier is restricted to a semantic context in which only two types of suppliers are of interest, book stores and publishers, we describe this context by declaring book store and publisher as synonyms of supplier with the total&disjoint alternation constraint. The total&disjoint alternation constraint means that the given set of synonyms of supplier (book store and publisher in this case) forms a total and disjoint categorization of all supplier objects. The default alternation constraint is partial&overlap. Similarly, the default data type for all non-numeric attributes is String, and the default for each numeric attribute is the data type corresponding to its numeric values, such as integer, float, or double. The default binding option of an output argument is optional.
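To make the meaning of the alternation constraints concrete, the following sketch checks whether a set of objects respects a given constraint. It assumes one possible reading: total requires every object to fall under at least one synonym, disjoint forbids an object from falling under more than one, and partial and overlap impose no requirement. The function name and the object identifiers `s1` to `s4` are hypothetical.

```python
from typing import Dict, Set


def satisfies(constraint: str, synonyms: Set[str],
              membership: Dict[str, Set[str]]) -> bool:
    """Check objects against a constraint such as "total&disjoint".

    membership maps each object to the categories it belongs to.
    """
    coverage, exclusivity = constraint.split("&")
    for categories in membership.values():
        hits = categories & synonyms
        if coverage == "total" and not hits:
            return False  # object covered by no synonym
        if exclusivity == "disjoint" and len(hits) > 1:
            return False  # object falls under several synonyms
    return True


synonyms = {"book store", "publisher"}

# Every supplier is exactly one of the two kinds.
clean = {"s1": {"book store"}, "s2": {"publisher"}}
# s3 is both kinds; s4 is neither.
mixed = {"s3": {"book store", "publisher"}, "s4": set()}

print(satisfies("total&disjoint", synonyms, clean))   # True
print(satisfies("total&disjoint", synonyms, mixed))   # False
print(satisfies("partial&overlap", synonyms, mixed))  # True
```

Under this reading partial&overlap is the weakest of the four constraints, which is consistent with its role as the default when the user supplies no annotation.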
It is important to note that, for each user query posed to the global information system, we create a virtual interface schema and a user query profile independently of the structure, number, and connectivity of the data sources, and of the existence of the data being requested. Since the collection of available information sources is large and changes frequently, this logical data independence allows us to add new data sources or incorporate changes seamlessly into the query processing system at any time without affecting how queries are posed or how answers are delivered, thereby achieving higher scalability.