Identifying triples for update queries

amytych · September 11, 2021, 5:32am

Hi, I tried to gain more clarity on UPDATE queries but documentation on it is a bit limited and I can't find an answer to the following question.

How can I confidently identify a triple to mutate it? As far as I'm aware the complete triple (subject_id + property + value) is the identifier itself, there are no other ids that can uniquely identify it, right?

So for example, if I have a blog post and I want to edit its title in my CMS I need to target it with something like:

:post_123 :title "Current blog title"

If that's the only way to target it, by holding on to the value, what's the approach to concurrent edits? Imagine two users are editing the same blog post and have it open in their interface. One of them changes the title, and shortly after the other one wants to make an edit as well. In a regular database world that would not be a problem: subsequent edits would override the previous ones. But here, once one of them changes the title, the other one won't be able to target it without first obtaining the updated value, right? Is that really the case? Are there any strategies for handling concurrent edits like the above?

I'd appreciate any help, thank you!

pavel · September 13, 2021, 4:59pm

Hi Arek,

Good question! Yes, you're correct that there's no statement ID other than the subject, predicate, object, graph combination. This has an important consequence: RDF statements are atomic in the sense that any "modification" always creates a new statement without any connection to the previous statement. Unless you track modifications in your data model, every modification is just a new statement (with the previous one deleted).
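
To make the delete-then-insert nature of a "modification" concrete, here is a minimal Python sketch that builds a SPARQL 1.1 Update string. Binding the old value to a variable in the WHERE clause means the client does not need to know the current title. The namespace `http://example.org/` and the helper names `escape_literal` and `build_title_update` are assumptions made for illustration; they are not part of Stardog's API.

```python
def escape_literal(value: str) -> str:
    """Escape a Python string for use inside a double-quoted SPARQL literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_title_update(post_local_name: str, new_title: str) -> str:
    """Build a DELETE/INSERT update that replaces whatever title is stored.

    post_local_name is assumed to be a valid prefixed-name local part;
    this sketch does not validate it.
    """
    subject = f":{post_local_name}"
    return (
        "PREFIX : <http://example.org/>\n"  # hypothetical namespace
        f"DELETE {{ {subject} :title ?old }}\n"
        f'INSERT {{ {subject} :title "{escape_literal(new_title)}" }}\n'
        # OPTIONAL keeps one solution even when no title exists yet,
        # so the INSERT still runs and the DELETE template is skipped.
        f"WHERE  {{ OPTIONAL {{ {subject} :title ?old }} }}"
    )


if __name__ == "__main__":
    print(build_title_update("post_123", "New blog title"))
```

Each match of `?old` is removed and the new statement is added in the same update, which is the "new statement, previous deleted" behaviour described above.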

Another thing to make clear is that there's no pessimistic concurrency control in Stardog. In simple terms, you cannot "lock" a statement in a transaction to prevent concurrent modifications. But there's some flexibility regarding optimistic concurrency control. By default Stardog implements the "last commit wins" semantics, i.e. if two transactions concurrently add or delete the same statement, the state of the database is determined by whichever commits last (note that it doesn't matter which has begun first). However, if you set the transaction.write.conflict.strategy database option to abort_on_conflict, Stardog will only let one of the concurrent transactions commit (amongst those which try to add or delete the same statement). All others will be rolled back and the error will be propagated to the client. This is similar to the standard Snapshot Isolation semantics.

Note that regardless of that option, any transaction reads data from the snapshot created at the time the transaction began. So if tx1 begins, then tx2 removes statement X and commits (possibly creating Y), tx1 will still be able to read that statement (repeatedly) for as long as it's active. But whether it will be able to commit after deleting it and creating a statement Z from it -- that will depend on transaction.write.conflict.strategy.
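
The tx1/tx2 scenario can be played through with a toy in-memory model. This is only a sketch of the semantics described above, not Stardog's implementation: `ToyStore`, `ToyTx`, `WriteConflict` and `run_scenario` are hypothetical names, and the strategy strings are just labels inside this model.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """Raised by the toy store when abort_on_conflict rejects a commit."""


@dataclass
class ToyStore:
    strategy: str = "last_commit_wins"  # or "abort_on_conflict"
    statements: set = field(default_factory=set)
    log: list = field(default_factory=list)  # statements touched per commit

    def begin(self):
        # Every transaction reads from a snapshot taken when it begins.
        return ToyTx(self, frozenset(self.statements), len(self.log))


@dataclass
class ToyTx:
    store: ToyStore
    snapshot: frozenset
    start: int  # number of commits that existed when this tx began
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def read(self, stmt):
        if stmt in self.removed:
            return False
        return stmt in self.snapshot or stmt in self.added

    def add(self, stmt):
        self.added.add(stmt)
        self.removed.discard(stmt)

    def remove(self, stmt):
        self.removed.add(stmt)
        self.added.discard(stmt)

    def commit(self):
        touched = self.added | self.removed
        if self.store.strategy == "abort_on_conflict":
            # Reject if a commit made after we began touched the same statement.
            for earlier in self.store.log[self.start:]:
                if earlier & touched:
                    raise WriteConflict(sorted(earlier & touched))
        self.store.statements -= self.removed
        self.store.statements |= self.added
        self.store.log.append(touched)


def run_scenario(strategy):
    x = (":post_123", ":title", "Current blog title")
    y = (":post_123", ":title", "Title from tx2")
    z = (":post_123", ":title", "Title from tx1")
    store = ToyStore(strategy=strategy, statements={x})
    tx1 = store.begin()
    tx2 = store.begin()
    tx2.remove(x)
    tx2.add(y)
    tx2.commit()
    assert tx1.read(x)  # tx1 still sees X in its snapshot
    tx1.remove(x)
    tx1.add(z)
    tx1.commit()  # raises WriteConflict under abort_on_conflict
    return store.statements
```

In this toy model, "last commit wins" lets both transactions commit and leaves both new titles stored, because Y and Z are distinct statements, while "abort on conflict" rejects tx1 since both transactions deleted X. Whether a real deployment ends up in exactly the same state is not something this sketch can confirm.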

Does this help?
Pavel

More details: Snapshot isolation in Stardog

amytych · September 13, 2021, 5:36pm

Thank you for the answer @pavel and confirming my assumptions.

The Snapshot Isolation section you linked to states the following for the default "Last Commit Wins" strategy:

If two concurrent transactions try to add or remove the same quad the change made by the transaction last committed will be accepted while the other change is silently ignored.

I don't understand what exactly it means that "the other change is silently ignored". I'd like to be notified if my intended change didn't have any effect.

pavel · September 13, 2021, 5:53pm

In that case you need the Abort on Conflict strategy. The client will get an exception on commit and can decide what to do (re-try, display the error to the user, or something else). With LCW the commit will succeed but the change will be overridden by the other transaction.
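
A client-side retry loop for that case might look like the sketch below. `CommitConflict` is a hypothetical stand-in for whatever exception your client library raises on a rejected commit, and `attempt` is an assumed callable that opens a fresh transaction, re-reads the current value, applies the change and commits.

```python
import time


class CommitConflict(Exception):
    """Hypothetical stand-in for the error raised when a commit is rejected."""


def commit_with_retry(attempt, max_tries=3, backoff_seconds=0.0):
    """Call attempt() until it commits or max_tries is exhausted.

    attempt must start a new transaction on every call so that it
    works from a fresh snapshot rather than the stale one.
    """
    for n in range(1, max_tries + 1):
        try:
            return attempt()
        except CommitConflict:
            if n == max_tries:
                raise  # give up and let the caller show the error
            time.sleep(backoff_seconds * n)  # simple linear backoff
```

Retrying blindly is only appropriate when the edit is still meaningful against the new state; for a title edit it may be better to show the conflicting value to the user instead.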

Cheers,
Pavel

amytych · September 14, 2021, 4:36am

Thank you again!

One more follow-up question. Could you shed more light on the comparison mechanism for identifying triples when it comes to storing dates, booleans, and more complex types like arrays and objects (JSON, geo coordinates, etc.)? Does everything need to be converted to a string, so that it becomes a simple string comparison? Perhaps there are some guidelines in the docs that I missed and you could point me to?

pavel · September 15, 2021, 2:02pm

This is defined in the RDF and SPARQL specs:

RDF: Resource Description Framework (RDF): Concepts and Abstract Syntax
SPARQL: see definitions of sameTerm and RDFterm-equal: SPARQL 1.1 Query Language

In a nutshell, yes, two terms are equal if their lexical forms are the same (including datatype URIs). However, there are additional rules about literal equality (see the table in SPARQL 1.1 Query Language). SPARQL refers to certain XPath functions, like numeric-equal or dateTime-equal, when it comes to comparing numbers or dates. That's why, for example, 1.10 is equal to 1.1 even though they are not the same RDF term.

Stardog does not implement any special equality rules for any datatypes outside of the XSD spec, like geo, so the rules based on lexical comparison apply to them.
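
The difference between term identity and value equality can be sketched in a few lines of Python. This is a simplified illustration covering only `xsd:decimal`; `Literal`, `same_term` and `value_equal` are hypothetical helpers, not functions from any RDF library.

```python
from decimal import Decimal
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    lexical: str
    datatype: str


def same_term(a: Literal, b: Literal) -> bool:
    """Term identity: lexical form and datatype must both match exactly."""
    return a == b


def value_equal(a: Literal, b: Literal) -> bool:
    """Rough analogue of SPARQL '=' for xsd:decimal only.

    For other datatypes this falls back to term identity. Real SPARQL
    raises a type error when two such literals differ; this sketch
    simply returns False in that case.
    """
    if a.datatype == b.datatype == XSD + "decimal":
        return Decimal(a.lexical) == Decimal(b.lexical)
    return same_term(a, b)


if __name__ == "__main__":
    a = Literal("1.10", XSD + "decimal")
    b = Literal("1.1", XSD + "decimal")
    print(same_term(a, b), value_equal(a, b))
```

For identifying a triple to delete, it is the term-level comparison that matters, so a client should send back the literal in the same lexical form and datatype it read.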

Best,
Pavel

system · September 29, 2021, 2:02pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
