Hi all,
I just want to reopen the question asked in a previous related issue:
Since we suggest in pyrdf2vec to set up a local SPARQL endpoint using Stardog, several people might have to deal with this issue (cf. this blog post).
Currently, we make simple select requests to our local SPARQL endpoint:
SELECT ?p ?o ?dt WHERE { BIND(IRI("' + noi + '") AS ?s) ?s ?p ?o . BIND(datatype(?o) AS ?dt) }
As the node of interest (noi) varies a lot, we build the query with simple string concatenation.
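For reference, a minimal sketch of that query construction (function name is hypothetical; noi is assumed to already be a full IRI):

```python
def build_query(noi: str) -> str:
    # Plain string concatenation, as in our code: splice the node of
    # interest into the SELECT template shown above.
    return ('SELECT ?p ?o ?dt WHERE { BIND(IRI("' + noi + '") AS ?s) '
            '?s ?p ?o . BIND(datatype(?o) AS ?dt) }')

print(build_query("http://example.org/node1"))
```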
Together with our host description, we make requests using a Python HTTP client to:
self.host + '/' + self.db + '/query?query=' + query
headers=[("Accept", "application/sparql-results+json")]
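Put together, the request looks roughly like this (a hypothetical sketch with `urllib` standing in for our actual HTTP client; host and database names are placeholders):

```python
from urllib.parse import quote
from urllib.request import Request

def build_request(host: str, db: str, query: str) -> Request:
    # URL-encode the query and append it to the endpoint URL,
    # mirroring the self.host + '/' + self.db + '/query?query=' pattern.
    url = host + '/' + db + '/query?query=' + quote(query)
    return Request(url, headers={"Accept": "application/sparql-results+json"})

req = build_request("http://localhost:5820", "mydb",
                    "SELECT * WHERE { ?s ?p ?o }")
```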
As we have a long list of nodes of interest, all of our code can be run in a multiprocessing fashion.
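The parallel setup is roughly the following (a hypothetical sketch; `fetch_triples` is a stand-in for the actual HTTP call to the endpoint):

```python
from multiprocessing import Pool

def fetch_triples(noi: str) -> str:
    # Placeholder: in our code this sends the SELECT query for one
    # node of interest to the local Stardog endpoint.
    return "results for " + noi

if __name__ == "__main__":
    nodes = ["http://example.org/n1", "http://example.org/n2"]
    # Each worker process issues its own HTTP requests concurrently.
    with Pool(processes=4) as pool:
        results = pool.map(fetch_triples, nodes)
```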
When making a lot of these HTTP requests from multiple processes, we notice that after a while the response rate of the local SPARQL endpoint starts to decrease. Using the stardog-admin server status
command, I can see the rate/sec dropping over time, and the memory heap getting close to the predefined 8G limit.
As mentioned in the related issue, our script stops after a while with the following errors:
Unexpected error encountered: invalid http version, `-Error-Code: 000012`
Unexpected error encountered: invalid http version, `Stardog-Error-Message: GC overhead limit exceeded`
At that point, I can no longer stop the Stardog server using the stardog-admin server stop
command; I have to kill the Java process manually.
We did not notice these issues when using a single core (despite the fact that it would take a while to get all the results).