Martin Ledvinka edited this page Jan 20, 2019 · 11 revisions

OntoDriver is a data access layer used by JOPA. Splitting storage access and the object-triple/ontology mapping allows storage-specific OntoDriver implementations to be added easily. In addition, the application can then switch the underlying storage by merely changing a few configuration parameters in the persistence setup.

This page gives an overview of setting up the persistence and some basic insight into the configuration of the particular OntoDriver implementations.

Persistence Setup

Initializing persistence is done by calling the Persistence.createEntityManagerFactory() method with appropriate properties. There are several required parameters:

| Parameter | Explanation |
|---|---|
| cz.cvut.jopa.scanPackage | The package in which entity declarations reside. |
| javax.persistence.provider | This is mostly a legacy of JOPA's resemblance to JPA. Use cz.cvut.kbss.jopa.model.JOPAPersistenceProvider for JOPA. |
| cz.cvut.jopa.dataSource.class | DataSource implementation to use. This is how the OntoDriver implementation is specified. Each of the drivers has a dedicated DataSource implementation class. |
| cz.cvut.jopa.ontology.physicalURI | Physical URI of the storage. This can be a remote RDF4J repository URL, a folder for Jena TDB or an OWL file location. A physical URI is required also for in-memory storage. However, its value is not important in this case. |

JOPA no longer requires a logical ontology IRI: RDF4J and Jena do not need one for RDF access, and OWL API supports anonymous ontologies.
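A minimal sketch of assembling the required parameters, assuming the property keys exactly as listed in the table above. The `requiredProperties` helper, the `com.example.model` package, the file URI and the persistence unit name are hypothetical, and the data source class name is an assumption about the OWL API driver that should be checked against the driver actually used. The JOPA call appears only as a comment so that the snippet runs without JOPA on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

class PersistenceSetup {

    /** Builds the four required parameters; the keys come from the table above. */
    static Map<String, String> requiredProperties(String scanPackage,
                                                  String dataSourceClass,
                                                  String physicalUri) {
        Map<String, String> props = new HashMap<>();
        props.put("cz.cvut.jopa.scanPackage", scanPackage);
        props.put("javax.persistence.provider",
                "cz.cvut.kbss.jopa.model.JOPAPersistenceProvider");
        props.put("cz.cvut.jopa.dataSource.class", dataSourceClass);
        props.put("cz.cvut.jopa.ontology.physicalURI", physicalUri);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = requiredProperties(
                "com.example.model",                                // hypothetical entity package
                "cz.cvut.kbss.ontodriver.owlapi.OwlapiDataSource",  // assumed class name
                "file:/tmp/example.owl");                           // hypothetical location
        props.forEach((key, value) -> System.out.println(key + " = " + value));
        // With JOPA available, the map would then be handed over, roughly:
        // Persistence.createEntityManagerFactory("examplePU", props);
    }
}
```

Switching to a different storage would then mean changing only the data source class and the physical URI arguments.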

OntoDriver Configuration

Common OntoDriver configuration parameters can be found, together with their explanation, in class cz.cvut.kbss.ontodriver.config.OntoDriverProperties.

Jena OntoDriver

The Jena OntoDriver is the latest implementation of OntoDriver. It can use the following underlying storages:

  • In-memory - a transactional Jena Dataset is created.
  • File - Jena RDFDataMgr is used to load a model from a file.
  • TDB

Support for Jena SDB was also considered, but since development of the SDB project has stopped, it is unlikely to be implemented in the Jena OntoDriver.

Jena OntoDriver configuration parameters and their possible values can be found in cz.cvut.kbss.ontodriver.jena.config.JenaOntoDriverProperties.

Note that the cz.cvut.jopa.reasonerFactoryClass parameter also applies to the Jena driver; it specifies a Jena-compatible reasoner implementation.

The driver supports two transaction isolation strategies:

Read-committed

Each transaction keeps a local model consisting of added and removed statements. Operations that find statements run against a shared model, and their results are then adjusted by the transaction-local changes. Consequently, when the shared model changes (e.g. another transaction commits to it), the same find operation may return different results in subsequent calls.

The local model changes do not apply to SPARQL query results, which are run against the shared model only.
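The behaviour described above can be modelled with plain sets. This is an illustrative sketch only, not the driver's code: statements are reduced to strings, the class and method names are hypothetical, and no thread safety is attempted.

```java
import java.util.HashSet;
import java.util.Set;

class ReadCommittedTransaction {

    private final Set<String> shared;                     // shared model, seen by all transactions
    private final Set<String> added = new HashSet<>();    // transaction-local additions
    private final Set<String> removed = new HashSet<>();  // transaction-local removals

    ReadCommittedTransaction(Set<String> shared) {
        this.shared = shared;
    }

    void add(String statement) {
        removed.remove(statement);
        added.add(statement);
    }

    void remove(String statement) {
        added.remove(statement);
        removed.add(statement);
    }

    /** Find: shared model contents adjusted by the local changes. */
    Set<String> find() {
        Set<String> result = new HashSet<>(shared);
        result.addAll(added);
        result.removeAll(removed);
        return result;
    }

    /** Stand-in for a SPARQL query: sees the shared model only. */
    Set<String> query() {
        return new HashSet<>(shared);
    }

    /** Applies the local changes to the shared model. */
    void commit() {
        shared.removeAll(removed);
        shared.addAll(added);
        added.clear();
        removed.clear();
    }
}
```

With two transactions over the same shared set, a statement added in the first is visible to its own `find()` but not to its `query()`, and it only shows up in the second transaction's `find()` after the first commits. That second effect is the non-repeatable read mentioned above.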

Snapshot-based

When a transaction begins, it creates a complete snapshot of the dataset and operates on that snapshot. On commit, the changes made to the snapshot are merged into the main dataset. This strategy provides better isolation, but is more demanding in terms of memory.

This strategy is also used when a reasoner is specified for the driver, because the reasoner has to operate on a single model (i.e. it is not possible to use a reasoner and the read-committed strategy).
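A comparable sketch of the snapshot-based strategy, again with statements reduced to strings and hypothetical names. How the driver actually merges changes on commit is not described here; replaying recorded additions and removals is an assumption made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class SnapshotTransaction {

    private final Set<String> main;                       // main dataset
    private final Set<String> snapshot;                   // private copy taken on begin
    private final Set<String> added = new HashSet<>();
    private final Set<String> removed = new HashSet<>();

    SnapshotTransaction(Set<String> main) {
        this.main = main;
        this.snapshot = new HashSet<>(main);              // full copy: the memory cost
    }

    void add(String statement) {
        snapshot.add(statement);
        removed.remove(statement);
        added.add(statement);
    }

    void remove(String statement) {
        snapshot.remove(statement);
        added.remove(statement);
        removed.add(statement);
    }

    /** All reads see the snapshot, never the main dataset. */
    Set<String> find() {
        return new HashSet<>(snapshot);
    }

    /** Assumed merge: replay the recorded changes onto the main dataset. */
    void commit() {
        main.removeAll(removed);
        main.addAll(added);
        added.clear();
        removed.clear();
    }
}
```

Because every read goes through the snapshot, a commit by another transaction does not change what a running transaction sees, at the price of one full copy per transaction.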

OWL API OntoDriver

The OWL API driver supports access to proper OWL (2) ontologies via OWL API.

The driver supports some specific features, namely module extraction, in which signature-based submodules of the main ontology are extracted to improve performance. OWL API also allows a mapping file to be specified, which is used when resolving logical IRIs of ontologies, e.g. in imports.

The parameters are explained in cz.cvut.kbss.ontodriver.owlapi.config.OwlapiOntoDriverProperties.

The OWL API driver can also make use of the cz.cvut.jopa.reasonerFactoryClass parameter; an OWLReasoner-compatible implementation class has to be specified.

The driver supports SPARQL-DL queries by relying on OWL2Query. However, the range of operators implemented is limited.

The OWL API driver uses the snapshot-based transaction isolation strategy, described above.

RDF4J OntoDriver

The RDF4J driver is still referred to as the Sesame driver in the code, because it was written against the original Sesame API. It has since been modified to use the more recent RDF4J API, but it is still possible to connect to Sesame servers.

Driver configuration parameters with explanations can be found in cz.cvut.kbss.ontodriver.sesame.config.SesameOntoDriverProperties. They mainly concern whether an RDFS forward-chaining rule-based reasoner should be used.

Supported storages are:

  • In-memory - has to be configured via the cz.cvut.kbss.ontodriver.sesame.use-volatile-storage property.
  • RDF4J native store - specify a valid folder on file system and RDF4J will create the necessary binary files (or load the data if they already exist).
  • RDF4J server repository - if a URL of remote RDF4J repository is specified, the driver will connect to it.

Repository Configuration

Since version 0.11.0, it is also possible to pass a path to a repository configuration file (usually TTL) to the driver. The content of this file is then used to configure the embedded storage (in-memory or local native). This way, a SPIN storage with custom rules can be created in memory or as a local native repository without starting a full-blown RDF4J server. To use this feature, pass the path to the repository configuration file to the driver using the cz.cvut.kbss.ontodriver.sesame.repository-config property. The path may be relative or absolute (details are in the Javadoc of the SesameOntoDriverProperties class). Examples of various repository configuration files can be found in the RDF4J GitHub repository - https://github.com/eclipse/rdf4j/tree/master/repository/api/src/main/resources/org/eclipse/rdf4j/repository/config.
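As a rough illustration, the two driver properties named in this section could be added to the persistence properties as follows. The helper names and the `config/repository.ttl` path are hypothetical, and treating the volatile-storage property as a `"true"`/`"false"` flag is an assumption. Whether that flag is still needed once a configuration file is supplied is not stated here, so the two helpers are kept separate.

```java
import java.util.HashMap;
import java.util.Map;

class Rdf4jEmbeddedConfig {

    /** Requests the in-memory store via the property named in the storage list. */
    static Map<String, String> inMemory() {
        Map<String, String> props = new HashMap<>();
        // Assumption: the property is a boolean flag and "true" enables it.
        props.put("cz.cvut.kbss.ontodriver.sesame.use-volatile-storage", "true");
        return props;
    }

    /** Returns a copy of base with the repository configuration file path added. */
    static Map<String, String> withRepositoryConfig(Map<String, String> base,
                                                    String configPath) {
        Map<String, String> props = new HashMap<>(base);
        props.put("cz.cvut.kbss.ontodriver.sesame.repository-config", configPath);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical relative path to a TTL repository configuration file.
        Map<String, String> props =
                withRepositoryConfig(new HashMap<>(), "config/repository.ttl");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

These entries would be merged with the required persistence parameters before the entity manager factory is created.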

The RDF4J driver uses the read-committed transaction isolation strategy described above.
