Best method of adding triples to a named graph via Pystardog?

My properties are as follows for my database:
spatial.enabled = true
edge.properties = true
query.all.graphs = false
query.timeout = 20m
virtual.transparency = false
preserve.bnode.ids = false

(1) Is there a significant runtime difference among the various content types when adding triples to Stardog? I'm currently using text/turtle, but I also see support in the Pystardog codebase for other content types.

(2) Is there a significant runtime difference between using stardog.content.Raw and stardog.content.File when adding triples?

(3) Should I add all my triples, at once, in bulk? Or, should I perform some sort of pagination when adding these triples?

Also, if it's better to add triples via a file, is it more performant to write them without syntactic sugar? For example, repeating the subject (:a :b :c . :a :d :c .) versus not repeating it (:a :b :c ; :d :c .)

Turtle without the syntactic sugar is basically N-Triples. There may be cases where you'd prefer that, but in general the syntactic sugar is there to make things easier and you should use it. It makes a big difference in file size unless you compress the N-Triples, in which case the two end up being similar. I don't think there would be an appreciable difference in load times based on the serialization you choose, but you'd have to test that to be sure.
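If you want to check the file-size point on your own data, here is a rough sketch using only the Python standard library. The `http://example.org/` namespace, the helper names and the 1000 made-up predicate/object pairs are all placeholders for illustration, so treat the numbers it prints as indicative only.

```python
import gzip

# Hypothetical namespace, used only for this illustration.
PREFIX = "http://example.org/"


def as_ntriples(subject, pairs):
    """One full triple per line, with the subject repeated every time."""
    return "".join(
        f"<{PREFIX}{subject}> <{PREFIX}{p}> <{PREFIX}{o}> .\n" for p, o in pairs
    )


def as_turtle(subject, pairs):
    """The same triples with a prefix declaration and ';' to share the subject."""
    body = " ;\n    ".join(f":{p} :{o}" for p, o in pairs)
    return f"@prefix : <{PREFIX}> .\n:{subject} {body} .\n"


# Made-up data: 1000 predicate/object pairs on a single subject.
pairs = [(f"p{i}", f"o{i}") for i in range(1000)]
nt = as_ntriples("a", pairs).encode("utf-8")
ttl = as_turtle("a", pairs).encode("utf-8")

sizes = {
    "ntriples": len(nt),
    "turtle": len(ttl),
    "ntriples.gz": len(gzip.compress(nt)),
    "turtle.gz": len(gzip.compress(ttl)),
}
print(sizes)
```

This only measures bytes on disk. It says nothing about load time, which you would still need to measure against your own server.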

Loading in bulk will be faster, but you can only do that once, at database creation. If you have a large amount of data that you'd like to load after the database has been created, it may help to load it in batches.
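A minimal sketch of batched loading into a named graph, assuming `conn` is an open `stardog.Connection` and your data is available as N-Triples lines. N-Triples is used here only because each line is a complete triple, so the input can be split anywhere; Turtle with prefixes and `;` cannot be cut at arbitrary lines. The helper names, the batch size and the one-transaction-per-batch choice are assumptions for illustration, not a recommendation from the Stardog docs.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items from `lines`."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def add_in_batches(conn, ntriples_lines, graph_uri, batch_size=100_000):
    """Add N-Triples lines to `graph_uri`, one transaction per batch.

    `conn` is assumed to be a stardog.Connection. Returns the number of
    lines sent.
    """
    import stardog  # imported lazily so batched() works without pystardog

    total = 0
    for chunk in batched(ntriples_lines, batch_size):
        payload = "\n".join(chunk) + "\n"
        conn.begin()
        # Content type is given as the standard N-Triples MIME type string.
        conn.add(
            stardog.content.Raw(payload, "application/n-triples"),
            graph_uri=graph_uri,
        )
        conn.commit()
        total += len(chunk)
    return total
```

Usage would look something like `add_in_batches(conn, open("data.nt"), "urn:example:graph")`, where the file name and graph URI are placeholders. Whether smaller or larger batches work better is something to measure on your own data.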

#2: they are just helpers for loading files (content.File) or strings and file-like objects (content.Raw), with an optional content type and encoding. Both end up being sent to the Stardog server in the same way, so there should not be a runtime difference.

https://pystardog.readthedocs.io/en/latest/source/stardog.html#module-stardog.content
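To make the File/Raw comparison concrete, here is a small sketch of both helpers adding Turtle to a named graph. It assumes `conn` is an open `stardog.Connection`; the function names and the graph URI in the usage note are made up for illustration. The content type is passed explicitly in both cases to keep them parallel, even though it is optional.

```python
def _add_in_transaction(conn, content, graph_uri):
    """Add `content` to `graph_uri` in one transaction, rolling back on error."""
    conn.begin()
    try:
        conn.add(content, graph_uri=graph_uri)
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def add_turtle_file(conn, path, graph_uri):
    """Send a Turtle file on disk using stardog.content.File."""
    import stardog  # lazy import: only needed when the function is called

    _add_in_transaction(conn, stardog.content.File(path, "text/turtle"), graph_uri)


def add_turtle_string(conn, turtle_text, graph_uri):
    """Send Turtle held in memory using stardog.content.Raw."""
    import stardog

    _add_in_transaction(conn, stardog.content.Raw(turtle_text, "text/turtle"), graph_uri)
```

For example, `add_turtle_file(conn, "data.ttl", "urn:example:graph")` and `add_turtle_string(conn, open("data.ttl").read(), "urn:example:graph")` should put the same triples into the same graph; only the way the payload is read differs.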
