# Basic Lens
Basic lenses can be used on one base relation, over which we can apply a filter, an extended projection, and additional constraints.
# Projection
For this section, we first look at the table museums
from the DuckDB database. This table has the following schema:
column | type |
---|---|
museum_id | integer |
name | string |
address | string |
yearly_income | integer |
yearly_spendings | integer |
ratings | array of floats |
The column museum_id
is a primary key.
Now, we make two decisions:
- The
ratings
column is not important to us, so we want to remove it. - The columns
yearly_income
andyearly_spendings
are not interesting, but we would like to know the yearly profit (income - spendings)
Both of these goals can be achieved using a single basic lens. The basic lens has the following structure:
{
"name": [String],
"baseRelation": [String],
"columns": {
"added": [{
"name": String,
"expression": String
}],
"hidden": [String]
},
"filterExpression": String,
"type": "BasicLens"
}
We can prepare our lenses.json
file like this:
{
"relations": [
{
"name": ["lenses", "museum_projection"],
"baseRelation": ["museums"],
"columns": {
"added": ...,
"hidden": ...
},
"type": "BasicLens"
}
]
}
As we do not require a filter, we do not need the field filterExpression
.
The field columns
can be used to list existing columns that should be removed and new columns that should be added.
Above, we decided that the fields ratings
, yearly_income
and yearly_spendings
are not interesting to us. We can easily hide them by including their names in the hidden
list.
To add a new yearly_profit
field, we have to add one entry to the added
list, using yearly profit
as its name, and using the expression yearly_income - yearly_spendings
in its expression
field.
NOTE
Even though we have decided to hide the columns yearly_income
and yearly_spendings
, we can still use them for expressions within the same lens.
After making these changes, the full lenses.json
file should look like this:
{
"relations": [
{
"name": ["lenses", "museum_projection"],
"baseRelation": ["museums"],
"columns": {
"added": [
{
"name": "yearly_profit",
"expression": "CAST(yearly_income - yearly_spendings as INTEGER)"
}
],
"hidden": ["ratings", "yearly_income", "yearly_spendings"]
},
"type": "BasicLens"
}
]
}
NOTE
In the expression for yearly_profit
, we cast the result of the subtraction to an INTEGER
. This way, Ontop is guaranteed to know that the column will be of type INTEGER
.
# Mapping
Now that we have created a basic lens to re-format the input table, we can use the lens in a mapping. For this, we start with the mapping template provided in the tutorial files. This template already contains basic mappings that define individuals of the classes :Museum
and :Worker
.
We now want to set the :yearlyProfit
datatype property for all museums. We can achieve that by adding the following mapping:
mappingId MAPID-museum-profit
target data:museum/{museum_id} :yearlyProfit {yearly_profit} .
source SELECT museum_id, yearly_profit FROM lenses.museum_projection;
After doing that, we can copy the lenses.json
and mapping.obda
files into the Ontop endpoint input
directory, as described in the setup page and start the endpoint. Once the endpoint is started, we can open the SPARQL query editor at http://localhost:8080 (opens new window) and run the following query to test it:
PREFIX : <http://example.org/museum_kg/>
SELECT ?name ?profit WHERE {
?museum a :Museum .
?museum :name ?name .
?museum :yearlyProfit ?profit .
}
If the lenses and mappings were constructed correctly, this query should return a list of museum names, together with different values for their yearly profit.
NOTE
Notice how by changing the source query by adding ratings
, yearly_income
, or yearly_spendings
to it, the execution of the mapping fails, stating that the columns were not found. This is because they were hidden by the lens.
# Filter
Another use case of basic lenses is to filter out some specific rows from the input table. As an example, we will look at the table workers
. This table has the following schema:
column | type |
---|---|
worker_id | integer |
full_name | string |
role | string |
museum_id | integer |
titles | array of strings |
access_level | integer |
The column worker_id
is a primary key. The column museum_id
is a foreign key that references the table museums
.
The column role
is a string that indicates the name of the worker's role. It can take the following three values: "manager"
, "guide"
, and "guard"
. For our VKG, we decide that we want to designate all managers as individuals of the class :Manager
. One way to achieve this is to use the filter feature of basic lenses.
We can prepare our lenses.json
file like this:
{
"relations": [
{
"name": ["lenses", "managers_filter"],
"baseRelation": ["workers"],
"filterExpression": ...,
"type": "BasicLens"
}
]
}
The columns
field can be removed, as it is not required for this example. Now, we just need to define a value for the field filterExpression
that ignores all rows for which the column role
is not equal to "manager"
. The filterExpression
is defined as a SQL expression in the same style as SQL WHERE
clauses.
After making this change, the full lenses.json
file should look like this:
{
"relations": [
{
"name": ["lenses", "managers_filter"],
"baseRelation": ["workers"],
"filterExpression": "role = 'manager'",
"type": "BasicLens"
}
]
}
# Mapping
We can once again use the generated lens in our mapping file. For this, we will again extend the mapping template with one new mapping:
mappingId MAPID-worker-managers
target data:worker/{worker_id} a :Manager .
source SELECT worker_id FROM lenses.managers_filter;
Then, we start the Ontop endpoint and open the SPARQL editor to run the following query:
PREFIX : <http://example.org/museum_kg/>
SELECT ?name WHERE {
?worker a :Manager .
?worker :name ?name .
}
This should result in a list of 3 manager names.
WARNING
All examples in the lens sections are compatible with each other. However, all lenses referenced in the mapping file need to be included in lenses.json
. If you wish to continuously extend the mapping file throughout the tutorial,
you will also need to keep all previous lenses in the relations
list of the lenses file, otherwise, Ontop will throw an error. Alternatively, you can always start each of the sections from a new mapping template file. This way, only the lenses of the current exercise have to be included.
As an extension to this exercise, you can now try to define similar lenses and mappings for the "guide"
and "guard"
roles, assigning them to the classes :Guide
and :Guard
, respectively.
# Adding Constraints
Another valuable feature of lenses is adding further constraints to relations. This feature is supported for all types of lenses, but we will cover it in this section only.
Generally, Ontop can infer many constraints from the base relation used by a lens. For instance, the field museum_id
in the table museumS
is a primary key, so it is unique and not null. Our previously defined lens will be able to infer that the output relation is still unique and not null. On the other hand, for composite primary keys, if one part of the composite key is hidden by a lens, then Ontop knows that the remaining part is no longer guaranteed to be unique.
In many instances, however, expert knowledge can be used to define further constraints for lenses. A full list of all supported constraints and how they can be defined can be found in the documentation of lenses.
For this section, we want to use our expert knowledge of the table workers
to provide the following constraints:
- The name of a worker is unique and not null.
- The role of a worker is not null
- There is a functional dependency from role to access_level.
- The column
museum_id
is a foreign key that references the tablemuseums
(DuckDB does not have a notion of foreign keys, so it is useful to add it explicitly). - The
role
column is IRI-safe (all possible values of the column can be safely included in an IRI without further encoding).
We can create a basic lens over the table workers
to define these constraints, following the guidelines from the documentation of lenses. A possible solution could look like this:
{
"relations": [
{
"name": ["lenses", "workers_constraints"],
"baseRelation": ["workers"],
"uniqueConstraints": {
"added": [
{
"name": "uc",
"determinants": ["full_name"]
}
]
},
"nonNullConstraints": {
"added": [
"full_name",
"role"
]
},
"otherFunctionalDependencies": {
"added": [
{
"determinants": ["role"],
"dependents": ["access_level"]
}
]
},
"foreignKeys": {
"added": [
{
"name": "fk",
"from": ["museum_id"],
"to": {
"relation": ["museumS"],
"columns": ["museum_id"]
}
}
]
},
"iriSafeConstraints": {
"added": [
"role"
]
},
"type": "BasicLens"
}
]
}
Lenses can reference each other! As a further exercise, you can try using this newly created lens as the baseRelation
of the managers filter from before. Once that is done, you can query it again - the results should be the same as before.
NOTE
We could also have added all of these constraints to the same managers_filter
lens from the previous section to achieve the same results. This has the advantage of reducing the work required for the lens setup and reducing the total number of relations accessible by Ontop, but it is less flexible, as we would have to copy all the constraints to the guides_filter
and guard_filter
lenses as well, resulting in a lot of duplication.
Generally, the optimal solution depends on the specific scenario.