Every Programmer Should Know #2: Optimistic Concurrency Control

Sunshine

In the world of programming, there are many concepts that every developer should understand in order to build efficient and consistent systems. Among these, optimistic concurrency (also known as optimistic locking) stands out as a key mechanism for keeping data consistent in the face of concurrent changes. As applications become increasingly high-traffic and interactive, understanding and implementing this concept correctly is more important than ever. This article, drawing on my experience on the ABP framework core team, aims to explain optimistic concurrency for both junior and senior developers.

Why should programmers care about it?

First of all, optimistic concurrency is just one method of concurrency control (also known as a concurrency check).

In the following sections, we will explore two commonly used concurrency control approaches, focusing mainly on optimistic concurrency, and their advantages and disadvantages relative to each other.

So we first need to understand concurrency control, and for that, we need to understand what concurrency means.

What is concurrency?

Rob Pike, co-creator of the Plan 9 operating system and the Go programming language, defined concurrency in his famous talk "Concurrency is not parallelism" as dealing with lots of things at once.

So, what does Rob Pike mean by dealing with lots of things at once?

Here is what it means: end-user applications use concurrency to respond to user input while writing to a database. Server applications use concurrency to respond to a second request while finishing the first one. You need concurrency any time an application has to do one thing while it is still working on something else. Many software applications nowadays make use of concurrency, but concurrency brings a few new problems that need to be solved.

Problem

Let's say you live with your family in a house with 10 rooms, and your mother keeps one of these rooms locked and always carries the key with her. One day, you need to take a small item from this room, but your mother is not at home; she's out shopping. You would have to wait for her to come back and open the room. If you also had a key, you could have gone inside and taken what you wanted without her being there. However, once inside, you might be tempted to rummage through the entire room out of curiosity and make a mess of everything, which your mother would not like at all.

Similarly, in an application, if only you can update a piece of data, there is no problem, but if others can update it as well, then you need to ensure that the data always stays consistent. For example, consider the following scenario:

Almost simultaneously, two people retrieve the same record from the server and make changes, and both think their update was successful. However, one person's changes were silently overwritten, and nobody is aware of it. This is a race condition, often called a lost update, in which the last write wins. If you have a small application where you always want the latest update request to succeed, you may not see this as a problem. However, if you need to make your application smarter and avoid this kind of race condition, you can solve the problem with one of two concurrency control approaches: pessimistic concurrency and optimistic concurrency. Let's start with pessimistic concurrency.
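The overwrite can be reproduced in a few lines. The sketch below is a minimal, single-process illustration: the in-memory `table`, the `CustomerRow` record, and the two users (Alice and Bob) are all hypothetical stand-ins for a real database and real clients.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type used only for this illustration.
public record CustomerRow(string Name, string Reputation);

public static class LostUpdateDemo
{
    public static CustomerRow Run()
    {
        // A dictionary stands in for the database table.
        var table = new Dictionary<int, CustomerRow>
        {
            [1] = new CustomerRow("Arthur", "Good")
        };

        // Both users read the same row at almost the same time.
        var aliceCopy = table[1];
        var bobCopy = table[1];

        // Alice renames the customer and saves her whole copy.
        table[1] = aliceCopy with { Name = "Bill" };

        // Bob changes the reputation and saves HIS whole copy,
        // which still carries the old name.
        table[1] = bobCopy with { Reputation = "Excellent" };

        // Alice's rename is gone, and neither user was told.
        return table[1];
    }
}
```

After `Run()`, the row holds the name "Arthur" and the reputation "Excellent": Bob's write replaced Alice's without any error being raised.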

Pessimistic concurrency control

Pessimistic concurrency control prevents simultaneous updates to a record by using a locking mechanism.


Returning to the analogy of your mother locking the room, we couldn't enter because there was only one key and it was with your mother, who was out shopping. If we had waited for your mother to come home and asked her to get the item from the room, the room would have remained undisturbed. Similarly, pessimistic concurrency control prevents conflicts between concurrent operations by allowing only one transaction to access the data at a time.
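As a rough in-process analogy, the sketch below uses C#'s `lock` statement so that only one thread at a time can perform a read-modify-write. The `LockedCounter` and `PessimisticDemo` names are hypothetical. A real database would achieve the same effect with row or table locks held inside a transaction, so treat this as an illustration of the idea, not of any database's implementation.

```csharp
using System.Threading.Tasks;

// Hypothetical shared resource guarded by a single "key".
public class LockedCounter
{
    private readonly object _key = new();
    private int _value;

    public int Value
    {
        get { lock (_key) return _value; }
    }

    public void Increment()
    {
        // Only the thread holding the key may read and write;
        // every other thread waits here until the key is released.
        lock (_key)
        {
            var current = _value;
            _value = current + 1;
        }
    }
}

public static class PessimisticDemo
{
    public static int Run()
    {
        var counter = new LockedCounter();

        // Many concurrent updates; the lock serializes them.
        Parallel.For(0, 10_000, _ => counter.Increment());

        return counter.Value;
    }
}
```

Because every increment waits its turn, no update is lost and `Run()` returns 10000. The price is that threads spend time waiting for the key.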

If your mother lost the key while shopping, or told us that she wouldn't be able to return home in the evening, we would be unable to enter the room for a long time, or ever again. In software, this is what happens when a lock is held for too long or is never released: everyone else who needs the resource is blocked. A closely related problem is the deadlock, in which two or more processes are unable to proceed because each is waiting for the other to release a resource, much as we would wait indefinitely for access to the room.

Additionally, there are several disadvantages to your mother constantly carrying the key.

  • The cost of carrying a key, even a light one
  • The time it takes to lock and unlock the room
  • Stress when someone wants to enter the room
  • ...

Similarly, in the programming world, locking has its disadvantages.

  • Performance suffers if locks are held for a long time
  • Application scalability is reduced
  • Locking and waiting consume resources
  • ...


Optimistic concurrency control

Your mother locked the room because she worried that someone would mess it up. If she were more optimistic and trusted us not to, she could leave it unlocked. The room might stay clean, or it might not, in which case you would be scolded in the evening.

Similarly, optimistic concurrency control assumes that conflicts are rare and can be resolved when they occur. It does not lock the data when it is read or updated; instead, it checks for changes before saving. If another user or process has modified the data since it was read, the save is rejected, typically by throwing an exception, and the current user has to decide how to handle it.
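The check-before-save idea can be sketched without any database. In the example below, `InMemoryStore`, `VersionedCustomer`, and `ConflictException` are hypothetical names invented for illustration, and the store is assumed to be used from a single thread. A real database performs the version check and the write atomically, usually in a single UPDATE statement with a WHERE clause on the version.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row that carries a version token next to its data.
public record VersionedCustomer(string Name, Guid Version);

public class ConflictException : Exception
{
    public ConflictException(string message) : base(message) { }
}

public class InMemoryStore
{
    private readonly Dictionary<int, VersionedCustomer> _rows = new();

    public void Insert(int id, string name) =>
        _rows[id] = new VersionedCustomer(name, Guid.NewGuid());

    // Reading takes no lock; the caller just remembers the version it saw.
    public VersionedCustomer Read(int id) => _rows[id];

    public void Update(int id, string newName, Guid expectedVersion)
    {
        var current = _rows[id];

        // Someone else saved since this caller read the row: reject.
        if (current.Version != expectedVersion)
            throw new ConflictException($"Row {id} was modified by someone else.");

        // No conflict: write the data and issue a fresh version token.
        _rows[id] = new VersionedCustomer(newName, Guid.NewGuid());
    }
}
```

If two callers read the same row and both try to update it, the first update succeeds and changes the version, so the second one fails with `ConflictException` instead of silently overwriting the first.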

Pessimistic vs. Optimistic Concurrency Control:

Pessimistic Concurrency Control: This approach is like having a single key for a room - it prevents simultaneous access, ensuring consistency but potentially leading to bottlenecks or deadlocks.

Optimistic Concurrency Control: Here, access is more open, and conflicts are detected and resolved when they occur. It's similar to trusting family members not to disturb the room, with a plan for dealing with the mess if it happens.

How to implement optimistic concurrency with EF Core?

Let's imagine that we have an entity like the one below:

public class Customer
{
    public Guid Id { get; set; }
    
    public string Name { get; set; }

    public string Reputation { get; set; }
}

Solution #1: Native database-generated concurrency tokens

If you are using SQL Server as your database, the simplest solution is to add a property like the one below:

public class Customer
{
    public Guid Id { get; set; }
    
    public string Name { get; set; }

    public string Reputation { get; set; }

    public ulong RowVersion { get; set; } // this line added
}

After adding the RowVersion property, map it to a rowversion column in SQL Server. A SQL Server rowversion column is configured as a concurrency token in one of the following two ways:

Data Annotations:

using System.ComponentModel.DataAnnotations; // needed for [Timestamp]

public class Customer
{
    public Guid Id { get; set; }
    
    public string Name { get; set; }

    public string Reputation { get; set; }

    [Timestamp]  // this line added
    public ulong RowVersion { get; set; }
}

Fluent API:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Customer>()
        .Property(p => p.RowVersion)
        .IsRowVersion();
}

In SQL Server, this configures a concurrency token that automatically changes in the database every time the row is changed.

Although this solution is quite simple to implement, it has a significant disadvantage: it depends on a database-specific feature.

Solution #2: Application-managed concurrency tokens

Instead of letting the database handle the concurrency token automatically, you can manage it in your application code. This lets you use optimistic concurrency on databases such as SQLite that have no native auto-updating column type. Even with SQL Server, managing the concurrency token in your application gives you precise control over which column changes regenerate the token.

The following configures a GUID property to be a concurrency token:

Data Annotations:

using System.ComponentModel.DataAnnotations; // needed for [ConcurrencyCheck]

public class Customer
{
    public Guid Id { get; set; }
    
    public string Name { get; set; }

    public string Reputation { get; set; }

    [ConcurrencyCheck]  // this line added
    public Guid Version { get; set; }
}

Fluent API:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Customer>()
        .Property(p => p.Version) // the GUID token property defined on Customer
        .IsConcurrencyToken();
}

Because this property is not updated automatically by the database, you need to assign a new value in the application every time you save changes, as shown below:

var customer = context.Customer.Single(b => b.Name == "Arthur");
customer.Name = "Bill";
customer.Version = Guid.NewGuid(); // set manually on every change
context.SaveChanges();

If you want a new GUID value to always be assigned, you can do this via a SaveChanges interceptor.
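A minimal sketch of such an interceptor follows. It assumes the `Customer` entity with a `Guid Version` token from this section and the `Microsoft.EntityFrameworkCore` package; the class name `ConcurrencyTokenInterceptor` is hypothetical. It relies on EF Core's `SaveChangesInterceptor` base class and sets the token through the change tracker, so the new value is written while the originally loaded value is still used for the conflict check.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;

// Minimal copy of the entity from this section, so the sketch compiles on its own.
public class Customer
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Reputation { get; set; }
    public Guid Version { get; set; }
}

// Hypothetical interceptor name; SaveChangesInterceptor is EF Core's base class.
public class ConcurrencyTokenInterceptor : SaveChangesInterceptor
{
    public override InterceptionResult<int> SavingChanges(
        DbContextEventData eventData, InterceptionResult<int> result)
    {
        RefreshTokens(eventData.Context);
        return base.SavingChanges(eventData, result);
    }

    public override ValueTask<InterceptionResult<int>> SavingChangesAsync(
        DbContextEventData eventData, InterceptionResult<int> result,
        CancellationToken cancellationToken = default)
    {
        RefreshTokens(eventData.Context);
        return base.SavingChangesAsync(eventData, result, cancellationToken);
    }

    private static void RefreshTokens(DbContext context)
    {
        if (context is null) return;

        // Entries() detects changes first, so edited customers show up as Modified.
        foreach (var entry in context.ChangeTracker.Entries<Customer>())
        {
            if (entry.State == EntityState.Modified)
            {
                // Assign a fresh token; the original value stays in place for the check.
                entry.Property(c => c.Version).CurrentValue = Guid.NewGuid();
            }
        }
    }
}

// Registration, for example inside your DbContext's OnConfiguring:
//     optionsBuilder.AddInterceptors(new ConcurrencyTokenInterceptor());
```

With the interceptor registered, application code no longer needs to assign the token by hand before each save. Whether to refresh it on every modification or only when certain columns change is a design choice this sketch does not make for you.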

If you expect a lot of data conflicts in your application, pessimistic concurrency might be more suitable, because you won't need to roll back or retry transactions later. However, if your application has few data conflicts and needs to scale, you can use optimistic concurrency. As an additional note, if you are developing an application with the ABP framework, it supports optimistic concurrency as its default concurrency control system, almost without requiring you to do anything.
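Whichever token style you choose, a failed check surfaces in EF Core as a `DbUpdateConcurrencyException` thrown from `SaveChanges`. The sketch below shows one possible way to react, assuming a "client wins" policy in which the current user's values are kept and the save is retried. The helper name `SaveWithRetry` and the retry limit are hypothetical, and other policies (such as reloading the row and asking the user) are just as valid.

```csharp
using Microsoft.EntityFrameworkCore;

public static class ConflictHandling
{
    // Hypothetical helper: retries the save a few times under a "client wins" policy.
    public static void SaveWithRetry(DbContext context, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                context.SaveChanges();
                return;
            }
            catch (DbUpdateConcurrencyException ex) when (attempt < maxAttempts)
            {
                foreach (var entry in ex.Entries)
                {
                    // Fetch what is in the database right now.
                    var databaseValues = entry.GetDatabaseValues();

                    // The row was deleted by someone else; nothing sensible to retry.
                    if (databaseValues is null)
                        throw;

                    // Keep the user's current values, but refresh the original
                    // values (including the token) so the next check can pass.
                    entry.OriginalValues.SetValues(databaseValues);
                }
            }
        }
    }
}
```

Note that "client wins" deliberately overwrites the other user's change, which brings back the last-write-wins behavior in a controlled way. If that is not acceptable, compare `databaseValues` with the current values and merge them, or surface the conflict to the user.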

Conclusion

In this article, we examined what optimistic concurrency is, and what problem it solves by comparing it with alternatives, and finally looked at how it is implemented in real-world applications.
