OntoDriver
OntoDriver is a data access layer used by JOPA. Splitting storage access and the object-triple/ontology mapping allows storage-specific OntoDriver implementations to be added easily. In addition, the application can then switch the underlying storage by merely changing a few configuration parameters in the persistence setup.
This page gives an overview of setting up the persistence and some basic insight into the configuration of the particular OntoDriver implementations.
Persistence is initialized by calling the `Persistence.createEntityManagerFactory()` method with appropriate properties. There are several required parameters:
| Parameter | Explanation |
|---|---|
| `cz.cvut.jopa.scanPackage` | The package in which entity declarations reside. |
| `javax.persistence.provider` | This is mostly a legacy of JOPA's resemblance to JPA. Use `cz.cvut.kbss.jopa.model.JOPAPersistenceProvider` for JOPA. |
| `cz.cvut.jopa.dataSource.class` | `DataSource` implementation to use. This is how the OntoDriver implementation is specified. Each of the drivers has a dedicated `DataSource` implementation class. |
| `cz.cvut.jopa.ontology.physicalURI` | Physical URI of the storage. This can be a remote RDF4J repository URL, a folder for Jena TDB or an OWL file location. A physical URI is also required for in-memory storage, although its value is not important in that case. |
JOPA no longer requires a logical ontology IRI: RDF4J and Jena do not need it for RDF access, and OWL API supports anonymous ontologies, so a logical IRI does not have to be specified.
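The required parameters listed above can be assembled into a plain `Map<String, String>`. The following is a minimal sketch using only the JDK; the property keys and the provider class come from the table, while the entity package, the `DataSource` class name (`cz.cvut.kbss.ontodriver.sesame.SesameDataSource`), the repository URL and the persistence unit name are assumptions made for illustration. Check them against the driver you actually use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: assembling the four required JOPA persistence properties. */
final class PersistenceSetupSketch {

    /**
     * Builds the properties map intended for Persistence.createEntityManagerFactory().
     * All three arguments are supplied by the caller; nothing is validated here.
     */
    static Map<String, String> requiredProperties(String entityPackage,
                                                  String dataSourceClass,
                                                  String physicalUri) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("cz.cvut.jopa.scanPackage", entityPackage);
        props.put("javax.persistence.provider",
                "cz.cvut.kbss.jopa.model.JOPAPersistenceProvider");
        props.put("cz.cvut.jopa.dataSource.class", dataSourceClass);
        props.put("cz.cvut.jopa.ontology.physicalURI", physicalUri);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = requiredProperties(
                "com.example.model",                                // hypothetical entity package
                "cz.cvut.kbss.ontodriver.sesame.SesameDataSource",  // assumed DataSource class name
                "http://localhost:8080/rdf4j-server/repositories/example"); // hypothetical URL
        props.forEach((key, value) -> System.out.println(key + " = " + value));
        // With JOPA on the classpath, the map would then be passed on, for example:
        // EntityManagerFactory emf =
        //         Persistence.createEntityManagerFactory("examplePU", props); // "examplePU" is hypothetical
    }
}
```

Switching the underlying storage then amounts to changing the `DataSource` class and the physical URI passed to `requiredProperties`.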
Common OntoDriver configuration parameters can be found, together with their explanation, in the class `cz.cvut.kbss.ontodriver.config.OntoDriverProperties`.
The Jena OntoDriver is the latest implementation of OntoDriver. It can use the following underlying storages:
- In-memory - a transactional Jena `Dataset` is created.
- File - Jena `RDFDataMgr` is used to load a model from a file.
- TDB
Support for Jena SDB was also considered, but since development of the SDB project has stopped, it is unlikely to be implemented in the Jena OntoDriver.
Jena OntoDriver configuration parameters and their possible values can be found in `cz.cvut.kbss.ontodriver.jena.config.JenaOntoDriverProperties`.
Note also that the `cz.cvut.jopa.reasonerFactoryClass` parameter applies to Jena; it allows a Jena-compatible reasoner implementation to be specified.
The driver supports two transaction isolation strategies:
- Read committed - Each transaction keeps a local model consisting of added and removed statements. Find operations are run against a shared model and their results are combined with the transaction-local changes. However, this means that when the shared model changes, e.g. another transaction commits changes to it, a find operation may produce different results in subsequent calls. The local model changes do not apply to SPARQL query results, because queries are run against the shared model only.
- Snapshot - Each transaction creates a complete snapshot of the dataset when it begins and operates on it. On commit, the changes made to the snapshot are merged into the main dataset. This strategy provides better isolation, but is more demanding in terms of memory. It is also used when a reasoner is specified for the driver, because the reasoner has to operate on a single model (i.e. it is not possible to combine a reasoner with the read-committed strategy).
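The difference between the two strategies can be illustrated with a small JDK-only sketch. It is not the driver's code: statements are modelled as plain strings, the shared model as a `Set<String>`, and the class names `ReadCommittedTx` and `SnapshotTx` are hypothetical. It only shows why a read-committed transaction can observe a concurrent commit while a snapshot transaction cannot.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative model of the two isolation strategies; not the Jena driver's implementation. */
final class IsolationSketch {

    /** Read committed: local additions/removals overlay the shared model at lookup time. */
    static final class ReadCommittedTx {
        private final Set<String> shared;
        private final Set<String> added = new HashSet<>();
        private final Set<String> removed = new HashSet<>();

        ReadCommittedTx(Set<String> shared) { this.shared = shared; }

        void add(String stmt) { removed.remove(stmt); added.add(stmt); }

        void remove(String stmt) { added.remove(stmt); removed.add(stmt); }

        /** Looks into the live shared model, then applies the transaction-local changes. */
        boolean contains(String stmt) {
            if (removed.contains(stmt)) {
                return false;
            }
            return added.contains(stmt) || shared.contains(stmt);
        }

        void commit() {
            shared.removeAll(removed);
            shared.addAll(added);
            added.clear();
            removed.clear();
        }
    }

    /** Snapshot: a full copy is taken on begin; recorded changes are merged on commit. */
    static final class SnapshotTx {
        private final Set<String> shared;
        private final Set<String> snapshot;
        private final Set<String> added = new HashSet<>();
        private final Set<String> removed = new HashSet<>();

        SnapshotTx(Set<String> shared) {
            this.shared = shared;
            this.snapshot = new HashSet<>(shared); // complete copy, hence the memory cost
        }

        void add(String stmt) { snapshot.add(stmt); removed.remove(stmt); added.add(stmt); }

        void remove(String stmt) { snapshot.remove(stmt); added.remove(stmt); removed.add(stmt); }

        /** Only the private snapshot is consulted, so later commits by others are invisible. */
        boolean contains(String stmt) { return snapshot.contains(stmt); }

        void commit() {
            shared.removeAll(removed);
            shared.addAll(added);
        }
    }

    public static void main(String[] args) {
        Set<String> shared = new HashSet<>(Set.of("a"));
        ReadCommittedTx readCommitted = new ReadCommittedTx(shared);
        SnapshotTx snapshot = new SnapshotTx(shared);

        shared.add("b"); // simulates another transaction committing statement "b"

        System.out.println("read-committed sees b: " + readCommitted.contains("b")); // true
        System.out.println("snapshot sees b: " + snapshot.contains("b"));           // false
    }
}
```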
The OWL API driver supports access to proper OWL (2) ontologies via OWL API.
The driver supports some specific features, namely module extraction where signature-based submodules of the main ontology can be extracted to improve performance. Also, OWL API allows a mapping file to be specified which is used when resolving logical IRIs of ontologies, e.g. in imports.
The parameters are explained in `cz.cvut.kbss.ontodriver.owlapi.config.OwlapiOntoDriverProperties`.
OWL API can also make use of the `cz.cvut.jopa.reasonerFactoryClass` parameter; an `OWLReasoner`-compatible implementation class has to be specified.
The driver supports SPARQL-DL queries by relying on OWL2Query. However, the range of implemented operators is limited.
The OWL API driver uses the snapshot-based transaction isolation strategy, described above.
The RDF4J driver is still referred to as the Sesame driver in the code because it was written against the original Sesame API. It has since been modified to use the more recent RDF4J API, but it is still possible to connect to Sesame servers.
Driver configuration parameters with explanations can be found in `cz.cvut.kbss.ontodriver.sesame.config.SesameOntoDriverProperties`. They mainly determine whether an RDFS forward-chaining rule-based reasoner should be used.
Supported storages are:
- In-memory - has to be configured via the `cz.cvut.kbss.ontodriver.sesame.use-volatile-storage` property.
- RDF4J native store - specify a valid folder on the file system and RDF4J will create the necessary binary files (or load the data if they already exist).
- RDF4J server repository - if a URL of a remote RDF4J repository is specified, the driver will connect to it.
Since version 0.11.0 it is also possible to pass a path to a repository configuration file (usually TTL) to the driver. The content of this file is then used to configure the embedded storage (in-memory or local native). This way, a SPIN storage with custom rules can be created in memory or as a local native repository without starting a full-blown RDF4J server.
To use this feature, pass the path to the repository configuration file to the driver using the `cz.cvut.kbss.ontodriver.sesame.repository-config` property. The path may be relative or absolute (see the Javadoc of the `SesameOntoDriverProperties` class for details). Examples of various repository configuration files can be found in the RDF4J GitHub repository - https://github.com/eclipse/rdf4j/tree/master/repository/api/src/main/resources/org/eclipse/rdf4j/repository/config.
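As a rough sketch, the two RDF4J-specific properties mentioned above could be added to the persistence properties as follows. The property keys are taken from this page; the value `"true"` for the volatile-storage flag and the file path `config/repository.ttl` are assumptions, so consult `SesameOntoDriverProperties` for the accepted values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: driver-specific properties for an embedded RDF4J storage. */
final class Rdf4jDriverConfigSketch {

    static final String USE_VOLATILE_STORAGE =
            "cz.cvut.kbss.ontodriver.sesame.use-volatile-storage";
    static final String REPOSITORY_CONFIG =
            "cz.cvut.kbss.ontodriver.sesame.repository-config";

    /**
     * Returns the extra properties for an embedded (in-memory or native) storage.
     *
     * @param inMemory             whether in-memory storage is requested
     * @param repositoryConfigPath path to a repository configuration file, or null for none
     */
    static Map<String, String> embeddedStorage(boolean inMemory, String repositoryConfigPath) {
        Map<String, String> props = new LinkedHashMap<>();
        if (inMemory) {
            props.put(USE_VOLATILE_STORAGE, "true"); // assumed boolean-like value
        }
        if (repositoryConfigPath != null) {
            props.put(REPOSITORY_CONFIG, repositoryConfigPath);
        }
        return props;
    }

    public static void main(String[] args) {
        // "config/repository.ttl" is a hypothetical relative path.
        Map<String, String> props = embeddedStorage(true, "config/repository.ttl");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The resulting entries would be merged into the same map that carries the required persistence parameters.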
The RDF4J driver uses the read-committed transaction isolation strategy described above.