Land Portal Linked Open Data: generation process
Offering all the Land Portal data as Linked Open Data is not a trivial task. Several processes are run for this purpose. Let's have a look at them (see the numbers in the circles):
- The first process generates RDF from the statistical data that is not available as Linked Open Data in the Land Book LOD data model. The statistical data, which comes from a variety of datasets and in a diversity of formats (Excel files, CSV, APIs, JSON, XML...), is passed through one of the landbook-importers (available in the GitHub repository). After the import process, a set of RDF files is generated and ready to be uploaded to the triple store.
- The generated RDF files (which contain the statistical information) are uploaded to Virtuoso, a triple store (also known as an RDF store), where the information can be queried using the SPARQL protocol.
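As a rough illustration of the two steps above, the following Python sketch turns a few CSV rows into RDF (serialized as N-Triples) and builds the URL of a SPARQL protocol query. It is not the code of the landbook-importers: the `http://example.org/landbook/` namespace, the property names, the column names and the endpoint address are all assumptions made for this example, and the real Land Book LOD data model defines its own vocabulary.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical namespace, for illustration only; the real Land Book
# LOD data model defines its own URIs and vocabulary.
EX = "http://example.org/landbook/"


def rows_to_ntriples(csv_text):
    """Turn CSV rows (country, indicator, year, value) into N-Triples lines.

    Assumes the cell values are simple tokens that need no escaping.
    """
    lines = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for i, row in enumerate(reader, start=1):
        # One observation resource per CSV row.
        obs = f"<{EX}observation/{i}>"
        lines.append(f"{obs} <{EX}country> <{EX}country/{row['country']}> .")
        lines.append(f"{obs} <{EX}indicator> <{EX}indicator/{row['indicator']}> .")
        lines.append(f'{obs} <{EX}year> "{row["year"]}" .')
        lines.append(f'{obs} <{EX}value> "{row["value"]}" .')
    return lines


def sparql_query_url(endpoint, query):
    """Build a SPARQL protocol GET URL (the query goes in the 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    sample = "country,indicator,year,value\nKEN,land-area,2012,569140\n"
    for line in rows_to_ntriples(sample):
        print(line)
    # Hypothetical endpoint address; only the URL is built, nothing is sent.
    print(sparql_query_url("http://example.org/sparql",
                           "SELECT * WHERE { ?s ?p ?o } LIMIT 10"))
```

In practice each importer has to deal with its own source format, and the generated files would be loaded into the triple store before any query can return results.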
A further process uploads the data from landportal.org into the Virtuoso triple store, in the Land Portal LOD data model. landportal.org (this portal, which runs on a Drupal instance) hosts a lot of data stored in a MySQL database. To push all this data into the Land Portal Virtuoso, a process is run. This process, which uses the SPARQL Update protocol, combines a customized fork of the rdfx module (which shapes the RDF generated by Drupal) with a customized fork of the RDF Drupal Indexer module. The latter module uses the Drupal Search API to publish triples to a triple store (in a selected graph).
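To give an idea of what publishing triples to a selected graph looks like at the protocol level, here is a minimal Python sketch that builds a SPARQL Update `INSERT DATA` request. It is not the code of the Drupal modules (which are written in PHP); the graph URI, the endpoint address and the sample triple are assumptions for illustration, and a real Virtuoso endpoint usually requires authentication before it accepts updates.

```python
import urllib.request


def build_insert_data(graph_uri, triples):
    """Wrap N-Triples-style statements in an INSERT DATA update for one named graph."""
    body = "\n".join(f"    {t}" for t in triples)
    return f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"


def build_update_request(endpoint, update):
    """Build (but do not send) a SPARQL 1.1 Protocol update request via direct POST."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical graph URI, endpoint and triple, for illustration only.
    update = build_insert_data(
        "http://example.org/graph/landportal",
        ['<http://example.org/node/1> <http://purl.org/dc/terms/title> "Example" .'],
    )
    request = build_update_request("http://example.org/sparql-auth", update)
    print(update)
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request), once credentials are set up.
```

Building the request separately from sending it makes it easy to inspect the generated update before anything is written to the store.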