We ran into the issue that adding and/or removing large graphs takes a very long time.
Context
Stardog 6.1.1 with default options plus
versioning.enabled=false
preserve.bnode.ids=false
A test with adding and removing data to/from a graph:
Create 35 million triples test data using BSBM:
$ ./generate -s nt -pc 100000
...
34872182 triples generated.
Add data:
$ stardog data add -g http://example.org/ stardog dataset.nt
Adding data from file: dataset.nt
34,872,182 triples added in 00:05:10.341
Remove data:
$ stardog data remove -g http://example.org/ stardog
34,872,182 triples removed in 00:04:14.675
We see the same behavior with the update query DROP SILENT GRAPH <http://example.org/>.
Issue
We need a way to trigger a graph removal that doesn't block the caller until the graph has been removed. Is there any way to accomplish this, for example by triggering the removal in the background?
Note the “&” added to the end of your command, which sends the command/transaction into the background and lets your script continue in the meantime.
Okay, sorry. I forgot to mention that we connect to Stardog over the Java API and have our own browser-based interface. How can we make sure the user doesn't have to wait until the remove operation is done? Could we use the transaction API for that, e.g. trigger the removal and later get feedback on whether it succeeded? Does that make our problem clearer?
I’m not sure exactly what you’re looking to do but you can always use named graph security to restrict access to the named graph and then delete it lazily.
It is not about access to the graph being deleted, but about how we can delete a large graph without making the user wait for the deletion to finish. Yes, we can implement threading in our application; however, I was wondering whether Stardog itself already provides something for this.
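For reference, the client-side threading mentioned above could look roughly like the following. This is only a minimal sketch using the standard java.util.concurrent classes; GraphStore and removeGraph are hypothetical stand-ins for whatever blocking Stardog call the application already uses, not part of the Stardog API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncGraphRemoval {

    // Hypothetical stand-in for the blocking Stardog call (for example a
    // transaction that removes the named graph). Not a real Stardog interface.
    interface GraphStore {
        void removeGraph(String graphIri) throws Exception;
    }

    // Starts the removal on the given executor and returns immediately.
    // The future completes with true on success and false if the call threw.
    static CompletableFuture<Boolean> removeInBackground(
            GraphStore store, String graphIri, ExecutorService executor) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                store.removeGraph(graphIri);
                return true;
            } catch (Exception e) {
                return false;
            }
        }, executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Simulates a long-running removal; replace with the real call.
        GraphStore slowStore = iri -> Thread.sleep(200);

        CompletableFuture<Boolean> result =
                removeInBackground(slowStore, "http://example.org/", executor);
        System.out.println("Removal triggered, caller is not blocked");

        // Later (for example when the UI polls for status): collect the outcome.
        System.out.println("Removal succeeded: " + result.join());
        executor.shutdown();
    }
}
```

The returned future gives the "trigger now, get feedback later" behavior asked about above; the removal itself still takes just as long on the server.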
Okay, suppose that long-running graph deletion is encapsulated in a separate thread on the client side, so that the user does not have to wait for its completion. What about inserting data into this very same graph afterwards? Could this be done instantly, or does the user have to wait until the deletion has completed anyway?
Our workflows rely on dropping a graph entirely and then re-running the import. So it would be great to save the time needed for deleting the graphs (especially if it takes several minutes).
One would assume that, if these are two independent transactions, the Stardog server knows how to serialize and process them in the right order. It should work now, and it should get faster in the future. See: Home | Stardog Documentation Latest
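If you would rather not depend on how the server orders the two transactions, the drop-then-reimport workflow can also be sequenced on the client. The following is a minimal sketch under that assumption; the two Runnable arguments are hypothetical placeholders for the application's own drop and import calls, and nothing here is Stardog-specific.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DropThenReimport {

    // Runs the drop step in the background, then starts the import step
    // only after the drop has completed without throwing.
    static CompletableFuture<Void> dropThenImport(
            Runnable drop, Runnable reimport, ExecutorService executor) {
        return CompletableFuture.runAsync(drop, executor)
                .thenRunAsync(reimport, executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> log = new CopyOnWriteArrayList<>();

        // Placeholders: replace the lambdas with the real drop and import calls.
        CompletableFuture<Void> done = dropThenImport(
                () -> log.add("drop"),
                () -> log.add("import"),
                executor);

        done.join(); // only for the demo; a UI would attach a callback instead
        System.out.println(log);
        executor.shutdown();
    }
}
```

The caller gets control back right away, and the import is never submitted before the drop has finished. The total time is unchanged, so this hides the wait from the user rather than removing it.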