Identifying triples for update queries

Hi, I tried to gain more clarity on UPDATE queries but documentation on it is a bit limited and I can't find an answer to the following question.

How can I confidently identify a triple to mutate it? As far as I'm aware the complete triple (subject_id + property + value) is the identifier itself, there are no other ids that can uniquely identify it, right?

So for example, if I have a blog post and I want to edit its title in my CMS I need to target it with something like:

:post_123 :title "Current blog title"

If that's the only way to target it, by holding on to the value, how are concurrent edits handled? Imagine two users are editing the same blog post and have it open in their interface. One of them changes the title and shortly after the other one wants to make an edit as well. In a regular db world that would not be a problem: subsequent edits would override the previous ones. But here, once one of them changes the title, the other one won't be able to target it without first obtaining the updated value, right? Is that really the case? Are there any strategies for handling concurrent edits like the above?

I'd appreciate any help, thank you!

Hi Arek,

Good question! Yes, you're correct that there's no statement ID other than the subject, predicate, object, graph combination. That has an important consequence: RDF statements are atomic in the sense that any "modification" always creates a new statement without any connection to the previous statement. Unless you track modifications in your data model, every modification would just be a new statement (with the previous one deleted).
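To illustrate the "delete the old statement, add a new one" idea without holding on to the current value, here is a minimal sketch that builds a standard SPARQL 1.1 DELETE/INSERT/WHERE update in Python, letting a variable match whatever title is currently stored. The function name build_replace_update and the example.org IRIs are made up for the example, and sending the string to the database is left out.

```python
def build_replace_update(subject_iri: str, predicate_iri: str, new_value: str) -> str:
    """Build a SPARQL 1.1 Update that swaps the current value(s) for a new string literal."""
    # Escape backslashes, double quotes and newlines so the literal stays well-formed.
    escaped = (
        new_value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    # ?old matches whatever value is stored, so the caller does not need to know it.
    # OPTIONAL keeps the INSERT working even when no value exists yet.
    return (
        f"DELETE {{ <{subject_iri}> <{predicate_iri}> ?old }}\n"
        f"INSERT {{ <{subject_iri}> <{predicate_iri}> \"{escaped}\" }}\n"
        f"WHERE  {{ OPTIONAL {{ <{subject_iri}> <{predicate_iri}> ?old }} }}"
    )


update = build_replace_update(
    "http://example.org/post_123",   # hypothetical subject IRI
    "http://example.org/title",      # hypothetical predicate IRI
    'New "blog" title',
)
print(update)
```

Because the WHERE clause binds ?old at execution time, the update removes every title the subject has at that moment, not just one specific value.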

Another thing to make clear is that there's no pessimistic concurrency control in Stardog. In simple terms, you cannot "lock" a statement in a transaction to prevent concurrent modifications. But there's some flexibility regarding optimistic concurrency control. By default Stardog implements the "last commit wins" semantics, i.e. if two transactions concurrently add or delete the same statement, the state of the database is determined by whichever commits last (note that it doesn't matter which has begun first). However, if you set the transaction.write.conflict.strategy database option to abort_on_conflict, Stardog will only let one of the concurrent transactions commit (amongst those which try to add or delete the same statement). All others will be rolled back and the error will be propagated to the client. This is similar to the standard Snapshot Isolation semantics.

Note that regardless of that option, any transaction reads data from the snapshot created at the time the transaction began. So if tx1 begins, then tx2 removes statement X and commits (possibly creating Y), tx1 will still be able to read that statement (repeatedly) for as long as it's active. But whether it will be able to commit after deleting it and creating a statement Z from it -- that will depend on transaction.write.conflict.strategy.
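The tx1/tx2 scenario above can be played through with a small in-memory toy model. This is only a sketch of the semantics as described in this post, not Stardog's implementation: ToyStore, ToyTx and ConflictError are invented names, and the strategy strings simply mirror the option values mentioned above.

```python
class ConflictError(Exception):
    """Hypothetical error raised when a commit is rejected."""


class ToyStore:
    def __init__(self, strategy="last_commit_wins"):
        self.strategy = strategy
        self.statements = set()
        self.version = 0      # bumped on every commit
        self.history = []     # (version, statements touched by that commit)

    def begin(self):
        return ToyTx(self)


class ToyTx:
    def __init__(self, store):
        self.store = store
        self.start_version = store.version
        self.snapshot = frozenset(store.statements)  # reads always use this snapshot
        self.added, self.removed = set(), set()

    def contains(self, stmt):
        return (stmt in self.snapshot or stmt in self.added) and stmt not in self.removed

    def add(self, stmt):
        self.added.add(stmt)
        self.removed.discard(stmt)

    def remove(self, stmt):
        self.removed.add(stmt)
        self.added.discard(stmt)

    def commit(self):
        touched = self.added | self.removed
        if self.store.strategy == "abort_on_conflict":
            # Reject if a transaction committed after we began touched the same statement.
            for version, other in self.store.history:
                if version > self.start_version and touched & other:
                    raise ConflictError("concurrent change to the same statement")
        self.store.statements -= self.removed
        self.store.statements |= self.added
        self.store.version += 1
        self.store.history.append((self.store.version, touched))


X = (":post_123", ":title", "Current blog title")
Y = (":post_123", ":title", "Title from tx2")
Z = (":post_123", ":title", "Title from tx1")


def run_scenario(strategy):
    store = ToyStore(strategy)
    store.statements.add(X)
    tx1 = store.begin()
    tx2 = store.begin()
    tx2.remove(X); tx2.add(Y); tx2.commit()
    tx1_still_sees_x = tx1.contains(X)   # snapshot read, unaffected by tx2
    tx1.remove(X); tx1.add(Z)
    try:
        tx1.commit()
        outcome = "committed"
    except ConflictError:
        outcome = "aborted"
    return tx1_still_sees_x, outcome, set(store.statements)


for name in ("last_commit_wins", "abort_on_conflict"):
    print(name, run_scenario(name))
```

In this toy model, both transactions commit under last_commit_wins and the post ends up with both Y and Z, because only the removal of X overlaps. Under abort_on_conflict the second commit is rejected and only Y remains.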

Does this help?
Pavel

More details: Snapshot isolation in Stardog

Thank you for the answer @pavel and confirming my assumptions.

The Snapshot Isolation section you linked to states the following for the default "Last Commit Wins" strategy:

If two concurrent transactions try to add or remove the same quad the change made by the transaction last committed will be accepted while the other change is silently ignored.

I don't understand what exactly it means that "the other change is silently ignored". I'd like to be notified if my intended change didn't have any effect.

In that case you need the Abort on Conflict strategy. The client will get an exception on commit and can decide what to do (re-try, display the error to the user, or something else). With LCW the commit will succeed but the change will be overridden by the other transaction.
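A client-side retry around such an exception could look roughly like the sketch below. WriteConflict and run_with_retry are hypothetical names, not part of any Stardog client library; the operation passed in is assumed to begin a fresh transaction on each attempt so that it reads the latest committed state.

```python
import time


class WriteConflict(Exception):
    """Hypothetical stand-in for the error a client receives when a commit is rejected."""


def run_with_retry(operation, max_attempts=3, delay_seconds=0.0):
    """Call operation(attempt) until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            # The operation should begin, modify and commit its own transaction.
            return operation(attempt)
        except WriteConflict:
            if attempt == max_attempts:
                raise  # give up and let the caller show the error to the user
            time.sleep(delay_seconds)


calls = []


def flaky_edit(attempt):
    # Simulated edit: the first two commits are rejected, the third succeeds.
    calls.append(attempt)
    if attempt < 3:
        raise WriteConflict("commit rejected")
    return "committed"


result = run_with_retry(flaky_edit)
print(result, calls)
```

Whether a blind retry is appropriate depends on the application; for a title edit it may be better to show the user the newer value before trying again.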

Cheers,
Pavel

Thank you again!

One more follow-up question. Could you shed more light on the comparison mechanism for identifying triples when it comes to storing dates, booleans, and also more complex types like arrays and objects (JSON, geo coordinates, etc.)? Does everything need to be converted to a string, so that it becomes a simple string comparison? Perhaps there are some guidelines in the docs that I missed and you could point me to?

This is defined in the RDF and SPARQL specs:

Stardog does not implement any special equality rules for any datatypes outside of the XSD spec, like geo, so the rules based on lexical comparison apply to them.
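As a rough illustration of the difference between lexical and value-based comparison, the sketch below models literals as a lexical form plus a datatype IRI. It is a simplification under stated assumptions: language tags are ignored, only xsd:integer gets value comparison, and the Literal class and both functions are invented for the example. It does not show how Stardog stores or normalizes literals internally.

```python
from dataclasses import dataclass

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


@dataclass(frozen=True)
class Literal:
    lexical: str    # the characters between the quotes
    datatype: str   # datatype IRI


def same_term(a: Literal, b: Literal) -> bool:
    """Lexical comparison: same characters and same datatype IRI."""
    return a.lexical == b.lexical and a.datatype == b.datatype


def same_value(a: Literal, b: Literal) -> bool:
    """Value comparison for xsd:integer; everything else falls back to lexical comparison."""
    if a.datatype == b.datatype == XSD_INTEGER:
        return int(a.lexical) == int(b.lexical)
    return same_term(a, b)


one = Literal("1", XSD_INTEGER)
padded_one = Literal("01", XSD_INTEGER)
point_a = Literal("POINT(1 2)", WKT_LITERAL)
point_b = Literal("POINT (1 2)", WKT_LITERAL)

print(same_term(one, padded_one), same_value(one, padded_one))
print(same_term(point_a, point_b), same_value(point_a, point_b))
```

The two integers differ lexically but denote the same value, while the two geo literals are treated as different in both comparisons because no datatype-specific rule is applied to them here.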

Best,
Pavel