RDFTranslator - XSLT for RDF

RDF Translator

RDF Transformations Made Easy

Goals

We needed something as powerful and as simple as XSLT for the MarcOnt Mediation Service. We tried and failed to use:

  • TRIPLE (it was not working),
  • Sesame Inferencing (it lacked features such as regular expressions and string-processing functions),
  • FLORA-2 (integrating FLORA into our project was harder than writing our own tool).

So I decided to spend less than 10 hours on it, and this implementation is the result :)

Architecture

Overview

The architecture is based on existing XSLT tools and is kept as simple as possible. A transformation object is generated from an XML-based transformation definition. It takes an RDF model in N-TRIPLES format from a file and produces a Jena Model that can be serialized to RDF/XML, N3 and other formats.

XML Schema definition

[[Image:Xmlschema.png]]

How Does It Work?

A translation definition consists of rules. Each rule consists of a list of premises (triple-matching rules) and consequents (triple-generating templates). A premise consists of a subject, predicate and object, each defined as the URI of a resource or the text value of a literal to be matched, or as a regular expression over either.
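The matching of a single premise against a single triple can be sketched as follows. This is only an illustration under simplifying assumptions: the class `PremiseSketch` and its methods are hypothetical names, not classes from rdft.jar, and triples are reduced to three plain strings.

```java
import java.util.regex.Pattern;

// Illustrative only: hypothetical type, not part of the RDFT library.
final class PremiseSketch {

    // One pattern per triple position; a fixed URI or literal is a pattern too.
    private final Pattern subject;
    private final Pattern predicate;
    private final Pattern object;

    PremiseSketch(String subjectRegex, String predicateRegex, String objectRegex) {
        this.subject = Pattern.compile(subjectRegex);
        this.predicate = Pattern.compile(predicateRegex);
        this.object = Pattern.compile(objectRegex);
    }

    /** Turns a fixed URI or literal text into a regex that matches only itself. */
    static String exact(String value) {
        return Pattern.quote(value);
    }

    /** A triple matches when all three positions match their patterns in full. */
    boolean matches(String s, String p, String o) {
        return subject.matcher(s).matches()
            && predicate.matcher(p).matches()
            && object.matcher(o).matches();
    }
}
```

For instance, `new PremiseSketch("http://example\\.org/book/.*", PremiseSketch.exact("http://purl.org/dc/elements/1.1/title"), ".*")` would accept any title triple whose subject lies under the (made-up) `http://example.org/book/` prefix.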

Literal values can be typed with the rdf:datatype attribute and internationalized with the xml:lang attribute. It is assumed that xml:lang is used only when rdf:datatype is empty or equal to http://www.w3.org/2001/XMLSchema#string.

The string value defining URI, text value or regular expression can be enriched with variables with

{$variableName}

syntax or function calls with

{marcont:functionName(arg1,arg2)}

syntax.
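As a rough illustration of the `{$variableName}` syntax, the sketch below expands variable references in a template string. It is not the RDFT implementation: the class name, the set of characters allowed in a variable name, and the choice to fail on unbound variables are all assumptions, and function calls are not handled.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: hypothetical helper, not part of the RDFT library.
final class VariableExpansionSketch {

    // Matches {$name}; which characters a name may contain is an assumption.
    private static final Pattern VARIABLE = Pattern.compile("\\{\\$(\\w+)\\}");

    /** Replaces every {$name} in the template with the value bound to name. */
    static String expand(String template, Map<String, String> bindings) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = bindings.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unbound variable: $" + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With `PS1` bound to `http://example.org/book/1`, the template `{$PS1}/title` expands to `http://example.org/book/1/title`.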

Each premise that has been matched in the rule that is being processed generates 3 variables:

$PSn $PPn $POn

where n is the position number of the premise in the rule definition. It is then possible to use these variables in subsequent premises and consequents.

So far the list of supported functions consists of:

  • marcont:generateId('namespace:'), which generates a resource URI within the given namespace;
  • marcont:clone($resource, 'namespace:'), which generates a resource URI within the given namespace, but with the ID extracted from the resource represented by the $resource variable.

Each time all premises are matched, a bunch of triples is generated from the consequent templates. Each template is constructed the same way as a premise, except that regular expressions are not allowed.

Each consequent that has been generated in the rule that is being processed adds 3 variables for the next processing:

$CSn $CPn $COn

If the premises matched successfully and rule calls are defined, the rule calls are invoked one by one until the last one returns. So far there is no validation of rule calls, so be careful not to create loops in these recursive calls. It is possible to pass variables to the rule being called. The values of these variables can be built from variables and functions, just like premise and consequent values.

After the triples are generated, the last premise is matched again. If it succeeds, the next run of triple generation is performed. If it fails, the previous premise is matched again, and so on, until all possible combinations of premise matches are exhausted.
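The retry order described above amounts to a depth-first search over premise matches. The following sketch shows one way to express it with recursion. It rests on several assumptions: premises and triples are plain three-element string arrays, premises are numbered from 1, and every name in it (`BacktrackingSketch`, `matchAll`, `expand`) is hypothetical rather than taken from RDFT.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: hypothetical class, not part of the RDFT library.
final class BacktrackingSketch {

    /** Replaces {$name} with the regex-quoted value currently bound to name. */
    static String expand(String template, Map<String, String> vars) {
        String result = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            result = result.replace("{$" + e.getKey() + "}", Pattern.quote(e.getValue()));
        }
        return result;
    }

    /** Returns the variable bindings of every combination of triples that satisfies all premises. */
    static List<Map<String, String>> matchAll(List<String[]> premises, List<String[]> triples) {
        List<Map<String, String>> results = new ArrayList<>();
        match(premises, triples, 0, new HashMap<String, String>(), results);
        return results;
    }

    private static void match(List<String[]> premises, List<String[]> triples, int index,
                              Map<String, String> vars, List<Map<String, String>> results) {
        if (index == premises.size()) {
            // All premises matched: this is where consequents would be generated.
            results.add(new HashMap<>(vars));
            return;
        }
        String[] premise = premises.get(index);
        int n = index + 1; // assumption: premises are numbered from 1
        for (String[] t : triples) {
            if (Pattern.matches(expand(premise[0], vars), t[0])
                    && Pattern.matches(expand(premise[1], vars), t[1])
                    && Pattern.matches(expand(premise[2], vars), t[2])) {
                vars.put("PS" + n, t[0]);
                vars.put("PP" + n, t[1]);
                vars.put("PO" + n, t[2]);
                // Try the later premises first; returning here retries this premise.
                match(premises, triples, index + 1, vars, results);
                vars.remove("PS" + n);
                vars.remove("PP" + n);
                vars.remove("PO" + n);
            }
        }
    }
}
```

Because the recursion exhausts the last premise before returning to the one before it, the combinations come out in the order the text describes.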

Rule processing continues until a rule with the terminate attribute is reached. Then the process is completed.

Transformation Processing Flow

[[Image:Transformationflow.png]]

Installation

In general you will need the RDF Translator (RDFT) library and the libraries this package depends on:

  • antlr.jar
  • concurrent.jar
  • icu4j.jar
  • jena.jar
  • log4j-1.2.7.jar
  • rdft.jar
  • xml-apis.jar
  • commons-logging.jar
  • jakarta-oro-2.0.5.jar
  • junit.jar
  • rdf-api-2001-01-19.jar
  • xercesImpl.jar
  • xpp3.jar

You can find all of them in the tar-gzipped archive of RDF Translator. The example below shows how to use this tool.

Example

transformation definition

For a sample transformation from the MARC-RDF ontology to the MarcOnt ontology see: Sample Translation Rules or RDF Rules (when using rules in RDF format, make sure that the <rdf:RDF> tag is within the first 1 kB of the file).

input data

For a sample MARC-RDF based source model see: MARC-RDF Source

execution

    /*
     * Code snippet from org.marcont.rdftranslator.Translate
     */

    // --- load transformation description from file
    FileInputStream fis = new FileInputStream(new File(args[0]));
    // --- create transformation
    Translation t = TranslationFactory.createTranslation(fis);
    // --- execute transformation on the source model loaded from file
    t.execute(new FileInputStream(new File(args[1])));

    // --- print the source model and the resulting model as N-TRIPLES
    System.out.println(" ---- source -----");
    t.getSrcModel().write(System.out, "N-TRIPLE");
    System.out.println(" ---- source -----");
    System.out.println(" ---- results -----");
    t.getDestModel().write(System.out, "N-TRIPLE");
    System.out.println(" ---- results -----");

result of transformation

For a sample MarcOnt based result model see: MarcOnt Result

RDF Translator Predefined Functions

FN_CLONE

clone($var, 'namespace:') - clones the resource with URI = $var into the new 'namespace:'. The namespace of $var must be registered in Translation.namespaces.
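One possible reading of this behaviour is sketched below: the local ID is whatever follows a registered namespace prefix, and it is re-attached to the target namespace. How RDFT actually stores Translation.namespaces and extracts the ID is not documented here, so the list of full URI prefixes and the names `CloneSketch` and `cloneUri` are assumptions made for illustration.

```java
import java.util.List;

// Illustrative only: hypothetical helper, not part of the RDFT library.
final class CloneSketch {

    /**
     * Moves the local ID of uri into targetNamespace.
     * Assumption: namespaces are given as full URI prefixes.
     */
    static String cloneUri(String uri, List<String> registeredNamespaces, String targetNamespace) {
        for (String ns : registeredNamespaces) {
            if (uri.startsWith(ns)) {
                // Keep only the part after the namespace prefix, i.e. the local ID.
                return targetNamespace + uri.substring(ns.length());
            }
        }
        // Mirrors the requirement that the source namespace must be registered.
        throw new IllegalArgumentException("Namespace of " + uri + " is not registered");
    }
}
```

Under these assumptions, cloning `http://example.org/marc/rec42` into `http://example.org/marcont/` gives `http://example.org/marcont/rec42`.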

FN_GENERATEID

generateId() - generates a number-based random URI

generateId('foaf:') - generates a URI based on the foaf: namespace

generateId('namespace:', Object ... args) - generates a random URI in the given namespace based on sha1sum(args).

Args can be variables, but not regexps.
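The sha1sum-based variant could look roughly like the sketch below, which appends the hexadecimal SHA-1 digest of the arguments to the namespace. The way the arguments are combined before hashing (string form, UTF-8, a zero byte as separator) is an assumption, as is the class name `GenerateIdSketch`; RDFT may do this differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: hypothetical helper, not part of the RDFT library.
final class GenerateIdSketch {

    /** Builds namespace + hex(SHA-1(args)), so equal arguments give equal URIs. */
    static String generateId(String namespace, Object... args) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            for (Object arg : args) {
                sha1.update(String.valueOf(arg).getBytes(StandardCharsets.UTF_8));
                sha1.update((byte) 0); // separator, so ("ab", "c") differs from ("a", "bc")
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest()) {
                hex.append(String.format("%02x", b));
            }
            return namespace + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

A useful property of hashing the arguments is that the same input values always map to the same URI, so a resource generated in one rule can be referred to again from another.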

FN_ITERATOR

iterator('rdf:', $seq_ID) - generates numbered nodes with an increasing number for each such Seq collection

output: e.g. 'rdf:_1'
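A minimal sketch of such a counter is shown below, assuming one independent counter per sequence ID that starts at 1. The class `IteratorSketch` and its `next` method are hypothetical names and do not come from RDFT.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical helper, not part of the RDFT library.
final class IteratorSketch {

    // One counter per sequence ID, so each Seq is numbered independently.
    private final Map<String, Integer> counters = new HashMap<>();

    /** Returns namespace + "_" + k, where k counts the calls made for this sequence ID. */
    String next(String namespace, String seqId) {
        int k = counters.merge(seqId, 1, Integer::sum);
        return namespace + "_" + k;
    }
}
```

Calling `next("rdf:", id)` twice for the same sequence ID gives `rdf:_1` and then `rdf:_2`, while a different sequence ID starts again at `rdf:_1`.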

FN_BNODE

output: generates a bnode

Open Issues

Mapping Tool Use Case Diagram

[[Image:Mapptooluc.jpg]]

Related pages

If you are interested in examples of data used by RDFTranslator you can visit these useful sites:

  • RDF Schema - used in RDFTranslator
  • XML Schema - for RDFTranslator
  • RDF Source - example of source RDF for RDFTranslator
  • Sample RDF Rules - for RDFTranslator
  • Sample RDF Rules