Hello!
I'm doing some custom machine learning work with Stardog, and as part of that, I need to run a series of hundreds of PATHS ALL queries. The individual queries all work well, returning results in ~1 second, but after running many of them in sequence, Stardog becomes unresponsive.
As the queries run, Stardog uses more and more memory. Once it reaches its memory allocation, the CPU starts churning at 100% and the server stops responding until it is restarted. (Seems to me like there may be a memory leak?)
Running the server with more memory (I have tried up to -Xmx16g -Xms16g, direct memory 32g) allows the system to process more queries before locking up, but does not solve the problem. I'm running Stardog through Docker, and the container does have a higher memory limit than Stardog is configured to use.
The database is really small too - only ~6K triples - so it really doesn't seem like it should need that much memory.
The specific query comes from the following Python format string, where {s} and {t} are two specific entities, and {max_length} is provided and is usually 5:
query = f"""
PATHS ALL
START ?s = {s}
END ?t = {t}
VIA {{
    GRAPH <###subgraph_name###> {{
        ?s a ?sc . ?t a ?tc .
    }}
    ?sc a <http://www.w3.org/ns/shacl#NodeShape> .
    ?tc a <http://www.w3.org/ns/shacl#NodeShape> .
    # In the case of triples of the form s --(r)-> t
    {{
        ?sc <http://www.w3.org/ns/shacl#property> ?_prop .
        ?_prop <http://www.w3.org/ns/shacl#class> ?tc ;
               <http://www.w3.org/ns/shacl#path> ?p .
        GRAPH <###subgraph_name###> {{ ?s ?p ?t . }}
        BIND('forward' AS ?direction)
    }}
    UNION
    # In the case of triples of the form t --(r)-> s
    {{
        ?tc <http://www.w3.org/ns/shacl#property> ?_prop .
        ?_prop <http://www.w3.org/ns/shacl#class> ?sc ;
               <http://www.w3.org/ns/shacl#path> ?p .
        GRAPH <###subgraph_name###> {{ ?t ?p ?s . }}
        BIND('inverse' AS ?direction)
    }}
}}
MAX LENGTH {max_length}
"""
Any guidance would be very much appreciated. Thank you!
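
For anyone trying to reproduce this, here is a minimal sketch of how the query strings could be generated for a batch of entity pairs. The names `build_paths_query` and `build_all_queries` and the example IRIs are hypothetical. The step that actually sends each query to Stardog is left out, since it depends on the client in use.

```python
from typing import Iterable, Iterator, Tuple

# Same query as above, written for str.format(); doubled braces are literal braces.
QUERY_TEMPLATE = """\
PATHS ALL
START ?s = {s}
END ?t = {t}
VIA {{
    GRAPH <{subgraph}> {{
        ?s a ?sc . ?t a ?tc .
    }}
    ?sc a <http://www.w3.org/ns/shacl#NodeShape> .
    ?tc a <http://www.w3.org/ns/shacl#NodeShape> .
    # In the case of triples of the form s --(r)-> t
    {{
        ?sc <http://www.w3.org/ns/shacl#property> ?_prop .
        ?_prop <http://www.w3.org/ns/shacl#class> ?tc ;
               <http://www.w3.org/ns/shacl#path> ?p .
        GRAPH <{subgraph}> {{ ?s ?p ?t . }}
        BIND('forward' AS ?direction)
    }}
    UNION
    # In the case of triples of the form t --(r)-> s
    {{
        ?tc <http://www.w3.org/ns/shacl#property> ?_prop .
        ?_prop <http://www.w3.org/ns/shacl#class> ?sc ;
               <http://www.w3.org/ns/shacl#path> ?p .
        GRAPH <{subgraph}> {{ ?t ?p ?s . }}
        BIND('inverse' AS ?direction)
    }}
}}
MAX LENGTH {max_length}
"""


def build_paths_query(s: str, t: str, max_length: int = 5,
                      subgraph: str = "###subgraph_name###") -> str:
    """Fill the template for one (start, end) pair (hypothetical helper).

    `s` and `t` are expected to already be valid SPARQL terms,
    e.g. "<urn:example:a>"; the subgraph name is a placeholder.
    """
    if not isinstance(max_length, int) or max_length < 1:
        raise ValueError("max_length must be a positive integer")
    return QUERY_TEMPLATE.format(s=s, t=t, max_length=max_length,
                                 subgraph=subgraph)


def build_all_queries(pairs: Iterable[Tuple[str, str]],
                      max_length: int = 5) -> Iterator[Tuple[str, str, str]]:
    """Lazily yield (s, t, query) so only one query string is held at a time."""
    for s, t in pairs:
        yield s, t, build_paths_query(s, t, max_length)


if __name__ == "__main__":
    # Example IRIs are made up for illustration.
    example_pairs = [("<urn:example:a>", "<urn:example:b>")]
    for s, t, query in build_all_queries(example_pairs):
        print(query)
```

Each generated string is what gets sent to the server, one at a time, and the memory growth shows up across the sequence of these calls.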