Hi,
The documentation (High Availability Cluster | Stardog Documentation Latest) states that "Stardog Cluster uses a strong consistency model, meaning that all nodes in the cluster have all of the data. Therefore, when a client writes to Stardog it must be written to all cluster members before the cluster responds to the client. There is no sharding or eventual consistency in Stardog Cluster."
There is also the stardog-admin cluster metrics command, which retrieves metrics for each server in the cluster. I'm wondering whether any of these metrics shows the total time each server has been locked for consistency purposes during write transactions. If not, is there another way to retrieve this total lock time?
Thank you @pdmars. Let's imagine that the isolation level of a database is set to snapshot. In this case, if we're not mistaken, reads (such as SELECT queries) from the Stardog cluster are not transactional. Based on this, if we take the metric databases.YourDb.txns.latency.p999 and multiply it by databases.YourDb.txns.latency.count, we may get an upper-bound approximation of how much time each node spent processing transactions, or in some sense being locked. Does that make sense?
Write transaction latency is tracked by those metrics, and read latency is found under the metrics for queries. So in that sense, I suppose your calculation would give a rough upper bound on how long a write lock has been held since the node was last restarted (these metrics are reset on restart).
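As a rough sketch of that calculation, assuming the metrics have already been parsed into a flat dictionary keyed by metric name (the function name, the dictionary shape, and the sample values below are hypothetical, not actual Stardog output), the estimate is just the product of the two metrics. The result is in whatever time unit the latency metric is reported in, and because about 0.1% of transactions can exceed the p999 value, it is an approximation rather than a strict bound.

```python
def txn_time_upper_bound(metrics, db):
    """Approximate upper bound on total write-transaction time:
    p999 latency multiplied by the number of transactions."""
    prefix = f"databases.{db}.txns.latency"
    p999 = float(metrics[f"{prefix}.p999"])
    count = int(metrics[f"{prefix}.count"])
    return p999 * count


# Hypothetical sample values for illustration only, not real cluster output.
sample = {
    "databases.YourDb.txns.latency.p999": 0.25,
    "databases.YourDb.txns.latency.count": 1200,
}

print(txn_time_upper_bound(sample, "YourDb"))
```

Running this once per node, on that node's own metrics, would give a per-node figure covering the period since that node last restarted.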
However, the write lock does not prevent concurrent write transactions (or other write operations, such as various admin actions), nor does it block read queries. If a write lock is held by a transaction, it only prevents a node from joining the cluster, or the cluster from obtaining the join lock to enter cluster-wide read-only mode. The locks are fair, so a joining node allows any writes that started before it began waiting for the lock to complete. Any writes attempted after a joining node starts waiting for the lock must wait until the joining node either acquires and releases the lock or times out waiting for it.