BI SQL Schema Mappings

I'm playing around with the BI Server and SQL schemas and I have a few questions/issues. I'm using Stardog 8.1.1.

Issue 1

I've imported a SHACL shapes graph, and a SQL schema is being generated from it (I can see the tables in a database tool). I have sql.schema.auto.source set to shacl. However, property shapes defined for a class show up as columns only in the table for that class, not in the tables for its subclasses. For example, if I load the following TriG file:

@prefix ex: <http://example.org/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:schema {
  org:Organization
    a owl:Class, sh:NodeShape ;
    sh:property [
      a sh:PropertyShape ;
      sh:path org:subOrganizationOf ;
      sh:maxCount 1 ;
      sh:class org:Organization ;
    ] ;
  .
  org:FormalOrganization
    a owl:Class, sh:NodeShape ;
    rdfs:subClassOf org:Organization ;
    sh:property [
      a sh:PropertyShape ;
      sh:path ex:officialName ;
      sh:maxCount 1 ;
      sh:datatype xsd:string;
    ] ;
  .
}

ex:data {
  ex:Test1
    a org:Organization ;
  .
  ex:Test2
    a org:FormalOrganization ;
    ex:officialName "Test Formal Organization" ;
  .
}

I get two tables, each with two columns:

  • Organization, with columns id and subOrganizationOf
  • FormalOrganization with columns id and officialName

Is that the expected behavior? I would have expected the FormalOrganization table to also have the subOrganizationOf column.

Issue 2

If I go to the BI Mapping tab in Studio after uploading the above TriG file I see the following:

@prefix : <http://api.stardog.com/> .
@prefix stardog: <tag:stardog:api:> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sql: <tag:stardog:api:sql:> .

<http://www.w3.org/ns/org#OrganizationTableMapping> a sql:TableMapping ;
    sql:class <http://www.w3.org/ns/org#Organization> ;
    sql:tableName "Organization" .

<http://www.w3.org/ns/shacl#PropertyShapeTableMapping> a sql:TableMapping ;
    sql:class <http://www.w3.org/ns/shacl#PropertyShape> ;
    sql:tableName "PropertyShape" .

<http://www.w3.org/ns/shacl#NodeShapeTableMapping> a sql:TableMapping ;
    sql:class <http://www.w3.org/ns/shacl#NodeShape> ;
    sql:tableName "NodeShape" .

<http://www.w3.org/ns/org#FormalOrganizationTableMapping> a sql:TableMapping ;
    sql:class <http://www.w3.org/ns/org#FormalOrganization> ;
    sql:extends <http://www.w3.org/ns/org#OrganizationTableMapping> ;
    sql:tableName "FormalOrganization" .

There are no fields defined for these tables, but based on the documentation and this example referenced in the documentation, I would have expected fields to have been autogenerated. I have tried variations of the TriG file that all provided the same autogenerated result, including sh:targetClass instead of implicit class targets and having the property shapes not be blank nodes.

I am able to see the expected behavior with the subOrganizationOf column appearing in both tables if I manually create the following mapping:

@prefix : <http://api.stardog.com/> .
@prefix stardog: <tag:stardog:api:> .
@prefix ex: <http://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sql: <tag:stardog:api:sql:> .

org:OrganizationTableMapping a sql:TableMapping ;
    sql:class org:Organization ;
    sql:tableName "Organization" ;
    sql:hasField [
        sql:property org:subOrganizationOf ;
        sql:refersTo org:OrganizationTableMapping ;
        sql:optional true ;
    ] ;
.
org:FormalOrganizationTableMapping a sql:TableMapping ;
    sql:class org:FormalOrganization ;
    sql:tableName "FormalOrganization" ;
    sql:extends org:OrganizationTableMapping ;
    sql:hasField [
        sql:property ex:officialName ;
        sql:optional true ;
    ] ;
.
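
To make the expectation concrete, here is a minimal Python sketch of how I read sql:extends: a table gets an id column, then the fields of every mapping it extends, then its own fields. This is only a toy model of the manual mapping above, not Stardog's implementation; the dictionary layout, the function name, and the column order are my own assumptions.

```python
# Toy model of the manual mapping above (hypothetical structure, not Stardog's API).
MAPPINGS = {
    "Organization": {"extends": None, "fields": ["subOrganizationOf"]},
    "FormalOrganization": {"extends": "Organization", "fields": ["officialName"]},
}


def columns(table, mappings=MAPPINGS):
    """Return 'id' plus inherited fields (parents first) plus the table's own fields."""
    chain = []
    seen = set()
    current = table
    # Walk up the extends chain, guarding against accidental cycles.
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in extends chain at {current}")
        seen.add(current)
        chain.append(current)
        current = mappings[current]["extends"]

    cols = ["id"]
    # Emit the most general mapping first so inherited columns come before own columns.
    for name in reversed(chain):
        for field in mappings[name]["fields"]:
            if field not in cols:
                cols.append(field)
    return cols


print(columns("Organization"))
print(columns("FormalOrganization"))
```

Under this reading, FormalOrganization ends up with both subOrganizationOf and officialName, which matches what I see with the manual mapping.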

Issue 3

After I use the manually created mapping, I see that the Organization table has 1 row and the FormalOrganization table has 1 row. Is this the expected behavior? I would have expected to see both example instances in the Organization table, i.e. I would have expected subclass inferencing.

Hi Matt,

I'll ask Evren to provide more detail on the first issue, but it looks like Stardog is getting confused by the mixed ontology and SHACL definition. We have a few advanced users who do this, but most separate the ontology and the SHACL shapes into their own files and ultimately their own named graphs.

Because the mappings are just RDF, this also leaves the door open for more advanced workflows, such as down-selecting an ontology (e.g. with a CONSTRUCT query) and using the result to autogenerate the mapping, or using custom properties in the ontology and a separate CONSTRUCT query to create the mapping. We hope to expand on how to take advantage of BI/SQL, as it will be getting a lot of attention this year.

Regarding the subclass inferencing, that should work if you run the SQL query with the reasoning parameter set. Otherwise, the structural relationship in the mapping isn't going to automatically perform the inference. I could see the argument that it should, and it would be a nice way to build these types of table views without having the full implications of inferencing turned on for all other fields. I'll take this feedback to product for consideration.
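
As a rough illustration of the difference, the following Python sketch models the two test instances and counts the rows each table would show with and without subclass inference. It is a toy model under stated assumptions: the names TYPES, SUBCLASS_OF, and rows are hypothetical, and nothing here calls Stardog or reflects how the reasoning parameter is actually passed.

```python
# Toy data mirroring ex:data and the rdfs:subClassOf triple from the example.
TYPES = {"ex:Test1": "org:Organization", "ex:Test2": "org:FormalOrganization"}
SUBCLASS_OF = {"org:FormalOrganization": "org:Organization"}


def superclasses(cls):
    """Return cls followed by its ancestors along SUBCLASS_OF."""
    chain = []
    while cls is not None and cls not in chain:
        chain.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return chain


def rows(table_class, reasoning=False):
    """Return the ids that would appear in the table mapped to table_class."""
    result = []
    for node, asserted in sorted(TYPES.items()):
        # Without reasoning only the asserted type counts; with it, ancestors count too.
        classes = superclasses(asserted) if reasoning else [asserted]
        if table_class in classes:
            result.append(node)
    return result


print(rows("org:Organization"))
print(rows("org:Organization", reasoning=True))
```

Without reasoning the Organization table holds only ex:Test1; with reasoning it also picks up ex:Test2, because every FormalOrganization is an Organization.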

We have a lot of ontologies, both developed in-house and third-party, that use implicit class targeting. And as I said, I saw the same results regardless of whether I split it out into a separate shapes graph.

I did discover something new, however: if I run the stardog data model command as mentioned here to generate mappings, it actually generates the fields appropriately (but without the sql:extends triples). So it seems that Stardog is doing the right thing under the hood, but Studio is not showing the right thing when the mappings are autogenerated as described above.

As for the inferencing option, I had missed that when I read the documentation. That does exactly what I want, thanks for pointing it out!

To clarify my point above: when I load the TriG file included under Issue 1 and let Stardog autogenerate a mapping, this is the resulting ERD:
[image: ERD of the autogenerated schema]

Note that there are two tables and the subOrganizationOf column appears in only the Organization table, i.e. sql:extends doesn't seem to be used.

If I open the BI Mapping tab in Studio, I see the mapping found under Issue 2 above. Clearly that mapping does not correspond to what I'm actually seeing. If it did, I would see 4 tables, each with only an id column.

However, if I run the stardog data model command as mentioned here to generate mappings, I get the following mapping:

@prefix : <http://api.stardog.com/> .
@prefix stardog: <tag:stardog:api:> .
@prefix ex: <http://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sql: <tag:stardog:api:sql:> .

org:OrganizationTableMapping a sql:TableMapping ;
    sql:class org:Organization ;
    sql:hasField [
        sql:fieldName "subOrganizationOf" ;
        sql:optional true ;
        sql:property org:subOrganizationOf ;
        sql:refersTo org:OrganizationTableMapping
    ] ;
    sql:tableName "Organization" .

org:FormalOrganizationTableMapping a sql:TableMapping ;
    sql:class org:FormalOrganization ;
    sql:hasField [
        sql:fieldName "officialName" ;
        sql:optional true ;
        sql:property ex:officialName ;
        sql:type xsd:string
    ] ;
    sql:tableName "FormalOrganization" .

This mapping actually matches the ERD I'm seeing, so Stardog is doing the correct thing under the hood.

If I add the following quad to this file:

org:FormalOrganization sh:node org:Organization ex:schema

and regenerate the mappings, then I get the sql:extends triple I expected.

If I add the manual mapping from the above post, then Studio shows the manual mapping in the BI Mapping tab.

Next, if I upload the following TriG file:

@prefix ex: <http://example.org/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:schema {
  org:Organization
    a owl:Class ;
  .
  org:FormalOrganization
    a owl:Class ;
    rdfs:subClassOf org:Organization ;
  .
  org:subOrganizationOf
    a owl:ObjectProperty ;
    rdfs:domain org:Organization ;
    rdfs:range org:Organization ;
  .
  ex:officialName
    a owl:DatatypeProperty ;
    rdfs:domain org:FormalOrganization ;
    rdfs:range xsd:string ;
  .
  org:OrgShape
    a sh:NodeShape ;
    sh:targetClass org:Organization ;
    sh:property ex:Org-suborg ;
  .
  ex:Org-suborg
    a sh:PropertyShape ;
    sh:path org:subOrganizationOf ;
    sh:maxCount 1 ;
    sh:class org:Organization ;
  .
  org:FormalOrgShape
    a sh:NodeShape ;
    sh:targetClass org:FormalOrganization ;
    sh:property ex:Formal-name ;
  .
  ex:Formal-name
    a sh:PropertyShape ;
    sh:path ex:officialName ;
    sh:maxCount 1 ;
    sh:datatype xsd:string;
  .
}

ex:data {
  ex:Test1
    a org:Organization ;
  .
  ex:Test2
    a org:FormalOrganization ;
    ex:officialName "Test Formal Organization" ;
  .
}

and I try to autogenerate the mapping from SHACL, I get exactly the same result without sql:extends. Again, adding the appropriate quad with sh:node as predicate causes the expected sql:extends triple to show up. Therefore, implicit vs. explicit class targets don't seem to make a difference.

Furthermore, if I try to autogenerate the mapping from OWL, I get the following:

@prefix : <http://api.stardog.com/> .
@prefix stardog: <tag:stardog:api:> .
@prefix ex: <http://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sql: <tag:stardog:api:sql:> .

org:OrganizationTableMapping a sql:TableMapping ;
    sql:class org:Organization ;
    sql:hasField [
        sql:fieldName "subOrganizationOf" ;
        sql:optional true ;
        sql:property org:subOrganizationOf ;
        sql:refersTo org:OrganizationTableMapping
    ] ;
    sql:tableName "Organization" .

org:FormalOrganizationTableMapping a sql:TableMapping ;
    sql:class org:FormalOrganization ;
    sql:extends org:OrganizationTableMapping ;
    sql:hasField [
        sql:fieldName "officialName" ;
        sql:optional true ;
        sql:property ex:officialName ;
        sql:type xsd:string
    ] ;
    sql:tableName "FormalOrganization" .

sh:PropertyShapeTableMapping a sql:TableMapping ;
    sql:class sh:PropertyShape ;
    sql:tableName "PropertyShape" .

sh:NodeShapeTableMapping a sql:TableMapping ;
    sql:class sh:NodeShape ;
    sql:tableName "NodeShape" .

Autogenerating from OWL does add sql:extends for classes/subclasses.

Therefore, I see two potential issues:

  1. Why do I see the wrong autogenerated mapping in Studio?
  2. Why do autogenerated mappings for OWL include sql:extends triples for rdfs:subClassOf but autogenerated mappings for SHACL only include them for sh:node instead of for both sh:node and rdfs:subClassOf? It's redundant to have to specify both sh:node and rdfs:subClassOf, regardless of whether the shapes are implicit class targets or not. It is as if there were something like the --simple-target option for the stardog icv report command that was always enabled.
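
To spell out the second question, here is a small Python sketch of the behavior I observed versus the behavior I expected. It treats sql:extends generation as a filter over schema triples by predicate. This is only my mental model, not Stardog's code: the function name and the three predicate sets are hypothetical labels for what the OWL source does, what the SHACL source appears to do, and what I would like the SHACL source to do.

```python
# Schema triple from the examples above, written as (subject, predicate, object).
SCHEMA = [
    ("org:FormalOrganization", "rdfs:subClassOf", "org:Organization"),
]

# Hypothetical labels for which predicates lead to a sql:extends link.
OWL_SOURCE = {"rdfs:subClassOf"}                # observed with the OWL source
SHACL_OBSERVED = {"sh:node"}                    # observed with the SHACL source
SHACL_EXPECTED = {"sh:node", "rdfs:subClassOf"}  # what I expected


def extends_links(triples, predicates):
    """Return sorted, de-duplicated (child, parent) pairs implied by the given predicates."""
    return sorted({(s, o) for s, p, o in triples if p in predicates})


# Adding the sh:node triple is the workaround described above.
WITH_SH_NODE = SCHEMA + [("org:FormalOrganization", "sh:node", "org:Organization")]

print(extends_links(SCHEMA, OWL_SOURCE))
print(extends_links(SCHEMA, SHACL_OBSERVED))
print(extends_links(WITH_SH_NODE, SHACL_OBSERVED))
print(extends_links(SCHEMA, SHACL_EXPECTED))
```

In this model the SHACL source yields no link until the redundant sh:node triple is added, while honoring both predicates would produce the link from rdfs:subClassOf alone.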

Any further thoughts based on the above two potential issues?