Validate SHACL Shapes with pystardog

Hi,

I want to validate SHACL shapes with the pystardog query module.

Let's say I have a query like this:

VALIDATE GRAPH <virtual://adv_db918144785464892329_vkg> USING SHAPES
{
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<https://MySQL_-_8.0.33/AdventureWorks2014/AdventureWorks2014#Password-rowguid_dp>
    a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:path <https://MySQL_-_8.0.33/AdventureWorks2014/AdventureWorks2014#rowguid> .

<https://MySQL_-_8.0.33/AdventureWorks2014/AdventureWorks2014#Shift-Name_dp>
    a sh:PropertyShape ;
    sh:datatype xsd:string ;
    sh:path <https://MySQL_-_8.0.33/AdventureWorks2014/AdventureWorks2014#Name> .

}

I'm not quite sure what question you're asking, but see ICV.report in the "Modules" section of the pystardog documentation.

For this query, I'm using:

query_result = conn.select(
    query=query,
    # offset=100,
    limit=1000,
    reasoning=True,
    default_graph_uri=["tag:stardog:api:context:all"],
)

But I'm getting an error like this:

An error occurred while processing the file: [406] : Unsupported Accept for query type: GRAPH

The second approach I tried was adding the TTL file through icv().add(), like this:

conn.begin()
conn.icv().add(stardog.content.Raw(ttl_file, content_type='text/turtle', name='data.ttl'))
conn.commit()
conn.close()

The ttl_file can be found in the uploaded file, advworks_generated_ttl.ttl (69.2 KB).

With this approach I'm getting the following error:

"detail": "An error occurred while processing the file: [400] 000IA2: Invalid UUID string: icv"

After that, I try to generate the report with:
validation_report = conn.icv().report()

Regards,
D

Hi Dipanjan,

It looks like you are calling select(), which is a convenience wrapper around the underlying __query() method that passes in sensible default parameters for that particular SPARQL query form. We do not currently have one of these wrappers for VALIDATE. The error stems from the select wrapper requesting SPARQL JSON results as the content type (i.e. the return type), which is invalid for the VALIDATE query form.

The VALIDATE query form behaves like CONSTRUCT and returns an RDF payload, so you could try using the construct function as a workaround.

I've created issue 171 to add a new validate function to pystardog.

Cheers,
Al

Hi Al,

Can you explain how to use the construct function to validate the SHACL shapes?
I tried:

CONSTRUCT GRAPH <virtual://adv_db918144785464892329_vkg> USING SHAPES { SHACL Shapes }

CONSTRUCT GRAPH is SPARQL, which doesn't have a USING SHAPES clause, unless there is some non-standard support in Stardog that I don't know about. Stardog supports SHACL through the VALIDATE keyword and the SHACL query service. See the Data Quality Constraints section of the Stardog documentation.

Hi Zachary,

We have tried both options, but as Al mentioned, this is not currently available in pystardog.

So, can we conclude that there is no way to validate SHACL shapes with pystardog?

You can try the following:

with cf.connection() as conn:
    # conn.graph() returns the validation report as serialized RDF (bytes)
    validation_report = conn.graph(
        "VALIDATE GRAPH <virtual://adv_db918144785464892329_vkg> USING SHAPES { SHACL Shapes }"
    ).decode("UTF-8")
    # parse the validation report into a graph; the Turtle format is
    # assumed here (pystardog's default content type for graph queries)
    g = Graph()
    g.parse(data=validation_report, format="turtle")

Can you tell me how you are importing Graph()?

The Stardog graph is conn.graph().

That comes from rdflib. We need a way to parse the RDF output from VALIDATE, just as we would for CONSTRUCT:

from rdflib import Graph

If you'd like to use a SELECT in combination with your validation, that can be done by using this feature as a SPARQL service. The benefit is that you'll get results out in a normal tabular form, as opposed to an RDF graph whose results you have to navigate in code.

This is covered here:

From a use case perspective, I may opt to use the VALIDATE query to extract out the validation report as RDF, archive it, and then re-insert it back into another database/graph where I keep all of my data quality results. On the other hand, if you'd like to run validations, and look for specific things or build an application to render results, I would use SELECT with the validation service, as this will typically be easier to deal with for integrating into other environments (e.g. putting the results on a web page).

Thank you Al, this solves the issue.

Thank you for your support.

Hi,

Actually, Graph().parse(validation_report) did not work for me.

This might help to parse the response:

import re

results = []
# Matches one validation result block in the serialized report;
# it expects the properties in exactly this order, one per line.
pattern = re.compile(
    r"<http://www.w3.org/ns/shacl#sourceShape> <(?P<sourceShape>[^>]+)> ;\n"
    r"\s*<http://www.w3.org/ns/shacl#sourceConstraintComponent> <(?P<sourceConstraintComponent>[^>]+)> ;\n"
    r"\s*<http://www.w3.org/ns/shacl#focusNode> <(?P<focusNode>[^>]+)> ;\n"
    r"\s*<http://www.w3.org/ns/shacl#resultPath> <(?P<resultPath>[^>]+)> ;\n"
    r"\s*<http://www.w3.org/ns/shacl#value> \"(?P<value>[^\"]+)\" ;\n"
    r"\s*<http://www.w3.org/ns/shacl#resultMessage> \"(?P<resultMessage>[^\"]+)\" \.\n"
)

# shacl_input is the decoded validation report text
matches = pattern.finditer(shacl_input)

for match in matches:
    result = {
        "sourceShape": match.group("sourceShape"),
        "sourceConstraintComponent": match.group("sourceConstraintComponent"),
        "focusNode": match.group("focusNode"),
        "resultPath": match.group("resultPath"),
        "value": match.group("value"),
        "resultMessage": match.group("resultMessage"),
    }
    results.append(result)