SHACL-SPARQL or SHACL + reasoner?

Hi folks,

I am wondering which approach is best for validating data where a value that is meant to be a unique identifier is not actually unique. Take the example data below, where Person_3 has more than one UID. This violation is easy to detect in SHACL, using sh:minCount and sh:maxCount on :hasUniqueID for the sh:targetClass :Person.

:Person_1
    a :Person ;
    :hasUniqueID :UID_A .

:Person_2
    a :Person ;
    :hasUniqueID :UID_B .

:Person_3
    a :Person ;
    :hasUniqueID :UID_C ;
    :hasUniqueID :UID_D .

:Person_4
    a :Person ;
    :hasUniqueID :UID_A .

The case for Person_1 and Person_4 sharing the same UID is more difficult.

Possible Solutions:

OPTION 1. SHACL-SPARQL
I am having no success translating this SPARQL query, which correctly detects UID_A as the offender, into SHACL-SPARQL. Where am I going wrong?

SELECT $this (COUNT($this) AS ?count) 
WHERE{
    ?personIRI a :Person ;
          :hasUniqueID $this .
  }  GROUP BY $this
     HAVING (?count >1)

OPTION 2. Use a reasoner?
Define :UIDAssignedTo as the owl:inverseOf of :hasUniqueID, turn on the reasoner, and apply sh:maxCount 1 to the sh:path :UIDAssignedTo, which would detect that:

:UID_A :UIDAssignedTo :Person_4 .
:UID_A :UIDAssignedTo :Person_1 .

Employing a reasoner for this solution seems like overkill (plus, is it even possible in Stardog?)
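
For reference, Option 2 might be set up roughly as follows. This is only a sketch: the example.org namespace and the shape name are assumptions, and it presumes the validator evaluates shapes with reasoning enabled so that the inferred :UIDAssignedTo triples are visible to it.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix :    <http://example.org/> .   # assumed namespace, not from the data above

# Schema: declare the inverse so a reasoner can infer :UIDAssignedTo triples.
:hasUniqueID   a owl:ObjectProperty .
:UIDAssignedTo a owl:ObjectProperty ;
    owl:inverseOf :hasUniqueID .

# Shape: every node used as a UID may be assigned to at most one person.
:UIDAssignedToShape a sh:NodeShape ;   # hypothetical shape name
    sh:targetObjectsOf :hasUniqueID ;
    sh:property [
        sh:path :UIDAssignedTo ;
        sh:maxCount 1
    ] .
```

Without reasoning, the data contains no :UIDAssignedTo triples, so this shape would report nothing.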

So, what to do, and how best to do it?

Thanks for your help.

Tim

Hi Tim,

I shouldn't take credit because it was @evren who pointed out the most elegant solution here: just define sh:maxCount 1 on the path :hasUniqueID/^:hasUniqueID for the target class :Person. This is basically equivalent to declaring :hasUniqueID an inverse functional object property and using that as an ICV constraint (prior to SHACL).

You can also express it in SPARQL, of course.
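
A minimal sketch of the SPARQL route as a SHACL-SPARQL constraint, assuming the example.org namespace, the ex: prefix declaration, and the shape name, none of which appear in the thread. In SHACL-SPARQL, $this is pre-bound to each focus node (here, each :Person), so the query looks for a different subject with the same UID rather than grouping by the UID itself.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix :    <http://example.org/> .   # assumed namespace

# Prefix declaration so the embedded query can use ex: names.
<http://example.org/> a owl:Ontology ;
    sh:declare [
        sh:prefix "ex" ;
        sh:namespace "http://example.org/"^^xsd:anyURI
    ] .

:SharedUIDShape a sh:NodeShape ;       # hypothetical shape name
    sh:targetClass :Person ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "Unique ID {?value} is also assigned to another person." ;
        sh:prefixes <http://example.org/> ;
        # One result row per (person, shared UID) pair is one violation.
        sh:select """
            SELECT $this ?value
            WHERE {
                $this  ex:hasUniqueID ?value .
                ?other ex:hasUniqueID ?value .
                FILTER (?other != $this)
            }
            """
    ] .
```

With the sample data, this should report :Person_1 and :Person_4, each with :UID_A as the offending value, and leave :Person_2 and :Person_3 alone.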

Cheers,
Pavel

I'm almost there? This shape correctly flags Person_3:

# Correctly flags :Person_3 (two UIDs)
:UIDShape a sh:NodeShape ;
 sh:targetClass :Person ;
 sh:path        :hasUniqueID  ;
 sh:maxCount    1
 .

I don't follow how to employ :hasUniqueID/^:hasUniqueID .

Stardog Studio 1.11
Stardog 6.1.0

Hi Tim,

You would use the equivalent SHACL paths as in:

:PersonShape a sh:NodeShape ;
    sh:targetClass :Person ;
    sh:property [
        sh:path (:hasUniqueID [sh:inversePath :hasUniqueID]) ;
        sh:maxCount 1
    ] .

For completeness, let me mention another possibility using only the inverse path:

:UniqueIDShape a sh:NodeShape ;
    sh:targetObjectsOf :hasUniqueID ;
    sh:property [
        sh:path [sh:inversePath :hasUniqueID] ;
        sh:maxCount 1
    ] .

Best,
Evren

Thanks to you both for the quick turn-around on these elegant answers. The approach will be very useful.

Cheers!

Tim

One final note: the performance of SHACL validation improved substantially in Stardog 6.2 and later. In 6.1, SHACL validation is still in beta. Depending on the size of your data and the number of constraints, you may want to consider an upgrade.

Best,
Pavel
