Concurrent modification of shared data can be a problem in any distributed system, regardless of which data store you are using. With ACID-compliant relational databases, a common tactic is to use pessimistic locking at the table or row level. Most NoSQL data stores do not have a pessimistic lock operation, and even when they do, it is often considered a performance hazard. So most applications do not lock objects before writing them to a NoSQL data store (or they use an external lock of some sort). This can quickly become a problem when you have a distributed system with write contention, as shown in the figure below:
One of the nice features of Couchbase is its “CAS” operation. This provides the ability to do an atomic check-and-set operation. You can set the value of a key, providing the last known version identifier (called a “CAS value”). The write will succeed if the document has not been modified since you read it, or it will fail if it has been modified and now has a different CAS value.
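To make the check-and-set semantics concrete, here is a minimal in-memory sketch. `TinyCasStore` and `CasMismatch` are hypothetical names invented for this illustration; they are not part of the Couchbase client, and the real client's method signatures differ. The sketch only models the rule described above: a write that carries a stale CAS value is rejected.

```ruby
# Hypothetical in-memory stand-in for a CAS-capable store (not the Couchbase client).
class CasMismatch < StandardError; end

class TinyCasStore
  def initialize
    @docs = {}     # key => [value, cas]
    @next_cas = 0
  end

  # Returns [value, cas]; the value is a deep copy so callers cannot
  # mutate the stored document by accident.
  def get(key)
    value, cas = @docs.fetch(key)
    [Marshal.load(Marshal.dump(value)), cas]
  end

  # Writes the value. When a CAS value is supplied, the write succeeds
  # only if it matches the document's current CAS value.
  def set(key, value, cas: nil)
    current = @docs[key]
    if cas && current && current[1] != cas
      raise CasMismatch, "#{key} was modified since it was read"
    end
    @next_cas += 1
    @docs[key] = [Marshal.load(Marshal.dump(value)), @next_cas]
    @next_cas
  end
end

store = TinyCasStore.new
store.set("counter", { "count" => 0 })

# Two clients read the same version of the document.
doc_a, cas_a = store.get("counter")
doc_b, cas_b = store.get("counter")

# Client A writes first, so its CAS value still matches.
doc_a["count"] += 1
store.set("counter", doc_a, cas: cas_a)

# Client B's CAS value is now stale, so its write is rejected.
doc_b["count"] += 1
b_rejected =
  begin
    store.set("counter", doc_b, cas: cas_b)
    false
  rescue CasMismatch
    true
  end

final_doc, _cas = store.get("counter")
puts "client B rejected: #{b_rejected}, count: #{final_doc['count']}"
```

Without the CAS check, client B's write would have silently overwritten client A's increment.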
Using this operation, we can easily build a higher-level operation to provide optimistic locking on our documents, using a CAS retry loop. The idea is simple: get the latest version of the document, apply your update(s), and write it back to Couchbase. If there are no conflicts, then all is well, and you can move on. If there is a conflict, you re-get the latest version of the document, fully reapply your modifications, and try again to write the document back to Couchbase. Repeat until the write succeeds.
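The loop just described can be sketched in a few lines. This is an illustration only: it runs against a hypothetical in-memory store (`TinyCasStore`, raising a made-up `CasMismatch` error) rather than a real bucket, and the competing write is simulated from inside the block purely so the conflict happens at a predictable moment.

```ruby
# Hypothetical in-memory stand-in for a CAS-capable store (not the Couchbase client).
class CasMismatch < StandardError; end

class TinyCasStore
  def initialize
    @docs = {}     # key => [value, cas]
    @next_cas = 0
  end

  def get(key)
    value, cas = @docs.fetch(key)
    [Marshal.load(Marshal.dump(value)), cas]
  end

  def set(key, value, cas: nil)
    current = @docs[key]
    raise CasMismatch, key if cas && current && current[1] != cas
    @next_cas += 1
    @docs[key] = [Marshal.load(Marshal.dump(value)), @next_cas]
    @next_cas
  end
end

# CAS retry loop: read, modify, write back, and start over on conflict.
def update_with_retry(store, key)
  loop do
    doc, cas = store.get(key)          # 1. get the latest version
    yield doc                          # 2. apply the caller's modifications
    begin
      store.set(key, doc, cas: cas)    # 3. write back with the CAS we read
      return doc
    rescue CasMismatch
      # 4. someone else wrote first: fall through and loop again
    end
  end
end

store = TinyCasStore.new
store.set("counter", { "count" => 0 })

attempts = 0
update_with_retry(store, "counter") do |doc|
  attempts += 1
  if attempts == 1
    # Simulation only: another client sneaks in a write during our first attempt.
    other, other_cas = store.get("counter")
    other["count"] += 10
    store.set("counter", other, cas: other_cas)
  end
  doc["count"] += 1
end

final_doc, _cas = store.get("counter")
puts "attempts: #{attempts}, count: #{final_doc['count']}"
```

The first attempt loses the race, so the block runs a second time against the document that already contains the other client's change, and neither update is lost.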
With this, the figure above would look like this:
There are a few things that are important to note about this technique.
- You should have no unsaved modifications to the document before doing this. They will be lost.
- Your modification code must be re-runnable and not have undesired side-effects, because it may be run an unpredictable number of times.
- Your modification code should be commutative, since multiple clients may be operating at the same time, and we cannot guarantee order.
- If your modification is not commutative, you should be comfortable that this roughly amounts to a Last Writer Wins (LWW) strategy (although that is not strictly guaranteed without a real vector clock).
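To illustrate the commutativity point, here is a small sketch using plain hashes and no data store at all. The helper names `increment` and `reset` are made up for this example. Both helpers are pure functions, so they are also safe to run more than once.

```ruby
# Returns a new document with the counter increased; has no side effects,
# so it can be re-run safely if a write has to be retried.
def increment(doc, by)
  doc.merge("count" => doc["count"] + by)
end

# Returns a new document with the counter overwritten.
def reset(doc)
  doc.merge("count" => 0)
end

start = { "count" => 10 }

# Increments commute: either order yields the same document.
a_then_b = increment(increment(start, 1), 5)
b_then_a = increment(increment(start, 5), 1)
puts a_then_b == b_then_a

# An overwrite does not commute with an increment: whichever client
# happens to write last decides the final value.
reset_then_increment = increment(reset(start), 1)
increment_then_reset = reset(increment(start, 1))
puts reset_then_increment["count"]
puts increment_then_reset["count"]
```

With the two increments, it does not matter which client wins the race. With the reset, the outcome depends entirely on write order, which is the Last Writer Wins behavior mentioned above.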
I have created a GitHub repository that implements this technique by extending the Couchbase Ruby client’s Couchbase::Bucket class on which you normally call set methods. You can, of course, put this elsewhere so that you don’t need to monkey-patch someone else’s library. Here is a look at the code:
With this monkey-patch loaded, you can now do the following:
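As a rough illustration of what such a call can look like without patching the client, here is a sketch that wraps a bucket-like object instead of reopening its class. It is not the code from the repository. `RetryingBucket`, `FlakyBucket`, and `ConflictError` are hypothetical names, and the `get`/`set` signatures are simplified assumptions; the real client's signatures and its CAS-conflict error class would need to be substituted. The `max_attempts` cap is my own addition here, since an unbounded loop can spin for a long time under heavy contention.

```ruby
# Hypothetical error standing in for the client's CAS-conflict error.
class ConflictError < StandardError; end

# Wraps any bucket-like object instead of monkey-patching its class.
class RetryingBucket
  def initialize(bucket, conflict_error: ConflictError, max_attempts: 10)
    @bucket = bucket
    @conflict_error = conflict_error
    @max_attempts = max_attempts
  end

  # Yields the latest document, writes it back with the CAS value from the
  # read, and retries on conflict until max_attempts is reached.
  def update_with_retry(key)
    attempts = 0
    begin
      attempts += 1
      doc, cas = @bucket.get(key)
      yield doc
      @bucket.set(key, doc, cas: cas)
      doc
    rescue @conflict_error
      retry if attempts < @max_attempts
      raise
    end
  end
end

# Test double: holds one document and rejects the first `conflicts` writes,
# as if another client had modified the document in between.
class FlakyBucket
  def initialize(doc, conflicts:)
    @doc = doc
    @conflicts = conflicts
    @cas = 1
  end

  def get(_key)
    [Marshal.load(Marshal.dump(@doc)), @cas]
  end

  def set(_key, doc, cas:)
    if @conflicts > 0
      @conflicts -= 1
      @cas += 1
      raise ConflictError
    end
    raise ConflictError unless cas == @cas
    @doc = doc
    @cas += 1
  end
end

bucket = RetryingBucket.new(FlakyBucket.new({ "count" => 0 }, conflicts: 2))

runs = 0
result = bucket.update_with_retry("counter") do |doc|
  runs += 1
  doc["count"] += 1
end

puts "block ran #{runs} times, count: #{result['count']}"
```

Composition keeps the retry policy in your own code, so a client upgrade cannot silently collide with a patched method.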
It is important to note that if your changes are not commutative (unlike our simple increment example), the code in your modification block will probably need some kind of merge logic for conflict resolution. It must recognize that the state of the document before calling update_with_retry may not be the same state that the successful block operates on.
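Here is a small sketch of the difference, using plain hashes to stand in for the document as seen on each attempt. The helper names `merge_tags` and `overwrite_tags` are hypothetical. The first merges into whatever the document contains at the time the block runs; the second relies on a snapshot taken before the update started, which is the pitfall described above.

```ruby
# Merge-style modification: unions our tags into the document's current tags,
# whatever they happen to be on this attempt.
def merge_tags(doc, tags)
  doc.merge("tags" => (doc["tags"] | tags).sort)
end

# Pitfall: builds the new value from a snapshot taken before the update,
# ignoring what the document holds now.
def overwrite_tags(doc, snapshot, tags)
  doc.merge("tags" => (snapshot["tags"] | tags).sort)
end

my_tags = ["billing", "urgent"]

# What we saw before starting the update.
snapshot = { "tags" => ["new"] }

# What the document looks like on the retry, after another client added "vip".
latest = { "tags" => ["new", "vip"] }

merged = merge_tags(latest, my_tags)
clobbered = overwrite_tags(latest, snapshot, my_tags)

puts merged["tags"].inspect     # keeps the other client's "vip"
puts clobbered["tags"].inspect  # "vip" has been lost
```

Because the retry loop re-reads the document on every attempt, a block that works only from its argument picks up other clients' changes automatically.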
Test code for this method can be seen in the GitHub repository.
UPDATED Nov. 15, 2013: As Sergey Avseyev pointed out, a very similar method, Couchbase::Bucket#cas, already exists in the couchbase-ruby-client. The only thing I described above that it doesn’t do is retry upon collision. At his suggestion, I’ve extended that method to take a retry option. This is probably a better solution anyway, since it handles both synchronous and asynchronous modes. Look for it in an upcoming release of the couchbase-ruby-client gem.