Using the Lock Statement to Synchronize Access to Data
Dec 3, 2018 • 8 Minute Read
The Problem
When access to data in your multithreaded application is not synchronized, race conditions can occur. Synchronizing access to data generally means that only one thread at a time should access such data. However, as explored in the first guide in this series, this can be trickier than it sounds. When it comes to multiple threads, true synchronization involves the order of execution. The lock statement in C# is one of the easiest ways to guarantee how your code is executed in a multithreaded context.
The Lock Statement in Practice
At a glance, usage of the lock statement looks like this:
lock (foo)
{
    ...
}
Let's consider an example to see the lock statement in context. Consider a multithreaded web crawler console application that scans HTML for URLs and adds those URLs to a file. With good reason, you want to ensure that only one thread is writing to the file at any given time. We previously discovered that the following approach does not provide the guarantees we need, as we were still bumping into the occasional IOException. The reason is that checking the fileIsInUse flag and setting it are two separate steps, so two threads can both see the flag as false and both go on to open the file.
static bool fileIsInUse;

static void WriteToFile()
{
    while (fileIsInUse)
    {
        System.Threading.Thread.Sleep(50);
    }
    try
    {
        fileIsInUse = true;
        using (var fileStream = new FileStream("links.txt", ..., FileShare.None))
        {
            // write to the file stream
        }
    }
    finally
    {
        fileIsInUse = false;
    }
}
Since the above does not work, how can we apply the lock statement to achieve the necessary synchronization? Using the lock statement is actually quite easy, and simpler than what we were attempting above. For this example, we can modify our code as follows:
static object linksLock = new object();

static void WriteToFile()
{
    lock (linksLock)
    {
        using (var fileStream = new FileStream("links.txt", ..., FileShare.None))
        {
            // write to the file stream
        }
    }
}
That's all there is to it! There is now a guarantee that only one thread at a time will attempt to create the FileStream specified here. As you can see, using the lock statement consists of nothing more than the lock keyword, followed by a variable in parentheses, and a { } block to go along with it. Any code you put inside the block is guaranteed to be run by only one thread at a time, as long as every thread locks on the same object. That is exactly what we needed for the web crawler to function without the danger of an IOException caused by two threads opening the file at once.
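To see the guarantee in action, here is a minimal runnable sketch of the same pattern. The FileMode.Append and FileAccess.Write arguments, the temporary file path, the example.com URLs, and the LinkWriterDemo and Run names are assumptions made for illustration and are not part of the crawler shown above.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class LinkWriterDemo
{
    // Dedicated object used only for locking.
    static readonly object linksLock = new object();

    // Appends one URL per call. The lock lets only one thread open the file at a time.
    static void WriteToFile(string path, string url)
    {
        lock (linksLock)
        {
            // FileMode/FileAccess are assumed values chosen for this sketch.
            using (var fileStream = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None))
            using (var writer = new StreamWriter(fileStream))
            {
                writer.WriteLine(url);
            }
        }
    }

    // Writes `count` made-up URLs from many threads and returns the number of lines in the file.
    public static int Run(int count)
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            Parallel.For(0, count, i => WriteToFile(path, "https://example.com/page" + i));
            return File.ReadAllLines(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine(Run(200)); // prints 200
    }
}
```

Removing the lock from WriteToFile in this sketch would likely bring back the intermittent IOException, since FileShare.None refuses a second open while the first stream is still in use.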
So, the lock statement can be very straightforward to use, but how does it work and what is the significance of the variable passed to the lock statement in parentheses?
What it Means to Use the Lock Statement
It's important to understand what happens when you have a lock statement in your code. As you might have intuited, the lock statement acquires a lock on behalf of a thread (giving that thread certain rights) and then releases the lock (relinquishing said rights) when the thread exits the block. A thread that manages to enter a lock statement's block has exclusive access to all the code in the block. That means that any other thread that reaches a lock statement on the same object while the lock is held has to wait until the owning thread has released it before it can proceed.
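One way to observe this waiting behavior is to count how many threads are inside the block at the same moment. The following is a small sketch under that idea; the ExclusiveBlockDemo class, the gate object, and the two counters are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ExclusiveBlockDemo
{
    static readonly object gate = new object();
    static int insideBlock;     // threads currently inside the lock block
    static int maxInsideBlock;  // highest value insideBlock ever reached

    static void DoWork()
    {
        lock (gate)
        {
            insideBlock++;
            maxInsideBlock = Math.Max(maxInsideBlock, insideBlock);
            Thread.Sleep(1); // hold the lock briefly so other threads have to wait
            insideBlock--;
        }
    }

    // Calls DoWork from many threads and returns the peak number of threads inside the block.
    public static int Run(int calls)
    {
        insideBlock = 0;
        maxInsideBlock = 0;
        Parallel.For(0, calls, _ => DoWork());
        return maxInsideBlock;
    }

    static void Main()
    {
        Console.WriteLine(Run(100)); // prints 1
    }
}
```

The counters are plain int fields with no extra protection because they are only touched while the lock is held.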
What about the variable, such as linksLock in our web crawler example? The lock statement requires a variable in parentheses because it needs to acquire what's called a mutual-exclusion lock on an object in order to function. This lock is the object's monitor and should not be confused with the System.Threading.Mutex class, so the variable must refer to an instance of a reference type. Specifying an object also has the benefit of being able to use the same lock for multiple blocks. For example, imagine that in your web crawler application you had two methods that modified the links.txt file: one method that added a line of text and one that removed a line of text. Both methods need exclusive access, and they need it to the same data. You couldn't, therefore, allow AddLine and RemoveLine to execute simultaneously. For those cases, locking on the same variable, as follows, is not only acceptable but necessary.
static object linksLock = new object();

static void AddLine()
{
    lock (linksLock)
    {
        using (var fileStream = new FileStream("links.txt"...))
        {
            // add a line
        }
    }
}

static void RemoveLine()
{
    lock (linksLock)
    {
        using (var fileStream = new FileStream("links.txt"...))
        {
            // remove a line
        }
    }
}
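The same idea can be tried without touching the file system. In the sketch below, an in-memory List<string> stands in for links.txt; that substitution, along with the SharedLockDemo class and the "old"/"new" line values, is an assumption made purely so the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class SharedLockDemo
{
    static readonly object linksLock = new object();
    static readonly List<string> lines = new List<string>(); // in-memory stand-in for links.txt

    static void AddLine(string line)
    {
        lock (linksLock)
        {
            lines.Add(line);
        }
    }

    static bool RemoveLine(string line)
    {
        lock (linksLock)
        {
            return lines.Remove(line);
        }
    }

    // Pre-fills n lines, then adds n new lines while removing the n old ones concurrently.
    public static int Run(int n)
    {
        lock (linksLock) { lines.Clear(); }
        for (int i = 0; i < n; i++) AddLine("old" + i);

        Parallel.Invoke(
            () => { for (int i = 0; i < n; i++) AddLine("new" + i); },
            () => { for (int i = 0; i < n; i++) RemoveLine("old" + i); });

        lock (linksLock) { return lines.Count; }
    }

    static void Main()
    {
        Console.WriteLine(Run(1000)); // prints 1000
    }
}
```

List<T> is not safe for concurrent modification, so if AddLine and RemoveLine used two different lock objects here, the list could be corrupted even though each method is locked individually.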
Conversely, you may have unrelated pieces of data that each need synchronized access, but not with each other. In that case, you will want to use separate objects to avoid making threads wait unnecessarily. For example, imagine that you wanted to maintain two files in the web crawler application, one for link URLs (<a href="...) and one for image URLs (<img src="...). There is no need to make access to the links file wait while another thread is writing to the images file, or vice versa. You would therefore use two objects for locking, one for links and one for images. As an example:
static object linksLock = new object();

static void WriteToLinksFile()
{
    lock (linksLock)
    {
        using (var fileStream = new FileStream("links.txt"...))
        {
            ...
        }
    }
}

static object imagesLock = new object();

static void WriteToImagesFile()
{
    lock (imagesLock)
    {
        using (var fileStream = new FileStream("images.txt"...))
        {
            ...
        }
    }
}
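The independence of the two locks can be checked with a short experiment: hold linksLock on one thread and ask a second thread whether each lock is currently available. This is only a sketch; the SeparateLocksDemo class and the IsFreeForAnotherThread helper are hypothetical names, and Monitor.TryEnter is used here simply as a non-blocking way to probe a lock.

```csharp
using System;
using System.Threading;

static class SeparateLocksDemo
{
    static readonly object linksLock = new object();
    static readonly object imagesLock = new object();

    // Starts a second thread that tries to take the lock without waiting,
    // and reports whether that attempt succeeded.
    static bool IsFreeForAnotherThread(object lockObject)
    {
        bool taken = false;
        var probe = new Thread(() =>
        {
            Monitor.TryEnter(lockObject, ref taken);
            if (taken) Monitor.Exit(lockObject);
        });
        probe.Start();
        probe.Join();
        return taken;
    }

    // Holds linksLock and checks both locks from another thread.
    public static (bool linksFree, bool imagesFree) Run()
    {
        lock (linksLock)
        {
            bool linksFree = IsFreeForAnotherThread(linksLock);
            bool imagesFree = IsFreeForAnotherThread(imagesLock);
            return (linksFree, imagesFree);
        }
    }

    static void Main()
    {
        var result = Run();
        Console.WriteLine(result.linksFree);  // prints False
        Console.WriteLine(result.imagesFree); // prints True
    }
}
```

The probe runs on a separate thread on purpose: the thread that already holds a lock is allowed to enter it again, so probing from the owning thread would always report the lock as free.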
The Lock Statement Under the Hood
While it is not necessary to understand the internals of the lock statement for basic usage, it can be helpful in advanced scenarios to know more precisely what the lock statement does. For example, it might surprise you to know that there is a way—albeit seldom used—to release the lock while still inside a lock statement's block. The key to understanding how that is possible is knowing that the following:
lock (myLockObject)
{
    // your code
}
is translated by the C# compiler to the equivalent of
bool lockAcquired = false;
try
{
    Monitor.Enter(myLockObject, ref lockAcquired);
    // your code
}
finally
{
    if (lockAcquired)
    {
        Monitor.Exit(myLockObject);
    }
}
The Monitor class lives in the System.Threading namespace and is also available to you. In fact, if you prefer, you can use the Monitor.Enter and Monitor.Exit methods directly, without using lock at all. The lock statement is there simply for convenience, much like C#'s using statement.
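As a rough illustration of using Monitor directly, the sketch below writes out the expansion by hand, throws from inside the protected region, and then checks from another thread that the lock is available again. The MonitorDirectDemo class, its method names, and the simulated exception are assumptions made for this example.

```csharp
using System;
using System.Threading;

static class MonitorDirectDemo
{
    static readonly object myLockObject = new object();

    // Hand-written equivalent of: lock (myLockObject) { throw ...; }
    static void ThrowWhileHoldingLock()
    {
        bool lockAcquired = false;
        try
        {
            Monitor.Enter(myLockObject, ref lockAcquired);
            throw new InvalidOperationException("simulated failure while holding the lock");
        }
        finally
        {
            if (lockAcquired)
            {
                Monitor.Exit(myLockObject);
            }
        }
    }

    // Returns true if another thread can take the lock after the exception has propagated.
    public static bool LockReleasedAfterException()
    {
        try
        {
            ThrowWhileHoldingLock();
        }
        catch (InvalidOperationException)
        {
            // Expected: the exception left the protected region.
        }

        bool taken = false;
        var probe = new Thread(() =>
        {
            Monitor.TryEnter(myLockObject, ref taken);
            if (taken) Monitor.Exit(myLockObject);
        });
        probe.Start();
        probe.Join();
        return taken;
    }

    static void Main()
    {
        Console.WriteLine(LockReleasedAfterException()); // prints True
    }
}
```

Without the finally block, the lock would stay held by the thread that threw, and the probe thread would report False.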
So, what do we learn from knowing that the Monitor class is used under the hood by the lock statement? First of all, we can see that try...finally is used, which means that the lock will be released even if an exception is thrown by the code in your lock statement's block and propagates out of it. Second, the Monitor class has other methods that can be used in conjunction with Enter and Exit to achieve more complex coordination between threads. For example, Monitor.Wait releases the lock and blocks the calling thread until another thread signals it and the lock has been reacquired, while Monitor.Pulse signals one such waiting thread so that it can resume once the lock is available. Such methods can be useful for implementing a producer/consumer queue, for example.
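A producer/consumer queue along those lines might look like the following minimal sketch. The BlockingQueue<T> and ProducerConsumerDemo types are hypothetical and written only for this illustration; production code would more likely reach for an existing collection type than hand-roll one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical minimal blocking queue built on Monitor.Wait and Monitor.Pulse.
sealed class BlockingQueue<T>
{
    private readonly object sync = new object();
    private readonly Queue<T> items = new Queue<T>();

    public void Enqueue(T item)
    {
        lock (sync)
        {
            items.Enqueue(item);
            Monitor.Pulse(sync); // signal one thread waiting in Dequeue, if any
        }
    }

    public T Dequeue()
    {
        lock (sync)
        {
            // Wait releases the lock while blocked and reacquires it before returning.
            // The loop re-checks the condition once this thread holds the lock again.
            while (items.Count == 0)
            {
                Monitor.Wait(sync);
            }
            return items.Dequeue();
        }
    }
}

static class ProducerConsumerDemo
{
    // Produces 1..count on the calling thread and sums them on a consumer thread.
    public static int Run(int count)
    {
        var queue = new BlockingQueue<int>();
        int sum = 0;
        var consumer = new Thread(() =>
        {
            for (int i = 0; i < count; i++) sum += queue.Dequeue();
        });
        consumer.Start();
        for (int i = 1; i <= count; i++) queue.Enqueue(i);
        consumer.Join();
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(Run(100)); // prints 5050
    }
}
```

Note that Dequeue calls Monitor.Wait from inside a lock block, which is the seldom-used way of releasing the lock while still inside the block.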
In Conclusion
The lock statement is one of the simplest and most common tools for C# developers writing multithreaded applications. It can be used to synchronize access to blocks of code, achieving thread safety by allowing only one thread at a time to execute the code in that block. This is critical when dealing with variables, files, and other data shared between multiple threads since unsynchronized access to data can lead to race conditions. Finally, since the lock statement is syntactic sugar for the Enter/Exit methods in the Monitor class, lock can also be used in conjunction with other Monitor methods for more advanced forms of thread coordination.
You have everything you need to get started with the lock statement, but there are a number of gotchas that can arise when using it. The next guide in this series will take a look at best practices and avoiding common mistakes when working with the lock statement.