DeXIN is a loosely coupled data integration web service that integrates multiple heterogeneous, highly distributed and rapidly changing web data sources in different formats, e.g. XML, RDF and relational data, including data services (DaaS, data as a service). At the heart of DeXIN is an XQuery extension that allows users and applications to execute a single query against distributed, heterogeneous web data sources or data services.
DeXIN Architecture
XMLDOC SPARQLQuery(String sparqlQuery,URI sourceURI)
The function SPARQLQuery() returns a value of type XMLDOC and takes two parameters. The first parameter is of type String and contains the SPARQL query to be executed. The second parameter is of type URI and contains either the URI or just the name of the data source on which the SPARQL query is to be executed. A name refers to an entry in the database of known data sources maintained by the Metadata Manager. If the indicated data source is reachable and the SPARQL query executes successfully, the result is wrapped into XML according to the W3C Proposed Recommendation. A query in extended XQuery format that retrieves the desired information is shown below:
for $a in doc("Peer1/License.xml")/agreement,
    $b in SPARQLQuery("SELECT ?title ?ExecutionTime
        WHERE { ?x <http://www.w3c.org/2001/sub#title> ?title .
                ?x <http://www.w3c.org/2001/sub#time> ?ExecutionTime }", "Peer2/Qos.rdf")/result
where $a/Servicetitle = $b/title and $a/peruse > 1
return <Result> ………. </Result>
An example extended XQuery for DeXIN
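The XML wrapping performed by SPARQLQuery() can be illustrated with a minimal Python sketch. The element layout used here (one result element per solution, with one child element per SPARQL variable) is an assumption inferred from the example query's use of /result and $b/title; the source does not spell out the exact format. The function name wrap_bindings, the root element name results and the sample rows are hypothetical.

```python
import xml.etree.ElementTree as ET


def wrap_bindings(rows, root_tag="results"):
    """Wrap SPARQL SELECT solutions (dicts of variable -> value) into XML.

    Assumed layout: <results><result><var>value</var>...</result>...</results>
    """
    root = ET.Element(root_tag)
    for row in rows:
        result = ET.SubElement(root, "result")
        for var, value in row.items():
            # One child element per variable, named after the variable.
            ET.SubElement(result, var).text = str(value)
    return root


# Made-up sample solutions for ?title and ?ExecutionTime.
rows = [
    {"title": "WeatherService", "ExecutionTime": "120"},
    {"title": "StockQuote", "ExecutionTime": "45"},
]
doc = wrap_bindings(rows)

# The XQuery path .../result followed by $b/title corresponds to:
titles = [r.findtext("title") for r in doc.findall("result")]
xml_text = ET.tostring(doc, encoding="unicode")
```

With this layout, the where clause of the example query can compare $a/Servicetitle directly against $b/title, because every result element exposes its variables as child elements.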
The query tree returned by the Parser is traversed to find all calls of the SPARQLQuery() function. Suppose that there are n such calls. For each call, the Query Decomposer retrieves the SPARQL query qi and the data source di on which qi is to be executed. The result of this process is a list {(q1, d1), . . . , (qn, dn)} of pairs, each consisting of a query and a source. The Executor then poses each query qi against the data source di. The order of execution and any parallelization have to take the dependencies between these queries into account. When a query qi has executed successfully, its result is transferred to the site where DeXIN is located and converted into XML format. The resulting XML document ri is stored temporarily, and in the query tree received from the Parser the call of SPARQLQuery() with query qi and data source di is replaced by a reference to ri. Once all calls have been replaced, the query tree is a tree of pure XQuery without any extensions. It can thus be executed locally by the XQuery engine used by DeXIN.
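The decomposition and rewriting steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Node class, the node kinds, and the execute callback are hypothetical stand-ins for DeXIN's Parser output and Executor, and the sketch runs the queries sequentially, ignoring dependencies and parallelization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical query-tree node (not DeXIN's actual data structure)."""
    kind: str                  # e.g. "flwor", "call", "doc", "literal"
    value: str = ""            # function name, document name, or literal text
    children: List["Node"] = field(default_factory=list)


def find_sparql_calls(node):
    """Depth-first traversal collecting every SPARQLQuery() call node."""
    found = []
    if node.kind == "call" and node.value == "SPARQLQuery":
        found.append(node)
    for child in node.children:
        found.extend(find_sparql_calls(child))
    return found


def decompose_and_rewrite(tree, execute):
    """Build the list of (q_i, d_i) pairs, run each query, and replace
    each call by a reference to the temporary XML document r_i.

    execute(q, d) is assumed to return the name of that temporary document.
    """
    pairs = []
    for call in find_sparql_calls(tree):
        q, d = call.children[0].value, call.children[1].value
        pairs.append((q, d))
        ref = execute(q, d)
        # Rewrite the node in place: the call becomes a plain doc reference.
        call.kind, call.value, call.children = "doc", ref, []
    return pairs


# Usage with a stub executor that only invents temporary file names.
tree = Node("flwor", children=[
    Node("doc", "Peer1/License.xml"),
    Node("call", "SPARQLQuery", [
        Node("literal", "SELECT ?title WHERE { ?x ?p ?title }"),
        Node("literal", "Peer2/Qos.rdf"),
    ]),
])
counter = iter(range(1, 1000))


def fake_execute(q, d):
    return f"tmp/r{next(counter)}.xml"


pairs = decompose_and_rewrite(tree, fake_execute)
```

After the call, the tree contains only doc nodes in place of the former SPARQLQuery() calls, which mirrors the point that the rewritten tree is pure XQuery and can be handed to a local XQuery engine.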