Understanding String Immutability in C#
Dec 18, 2019 • 7 Minute Read
Introduction
In this guide, we will discuss strings and the immutability of strings. We will use the C# programming language to explain strings and give examples of them. If you have trouble understanding the concept of strings or you would like to be refreshed on strings and immutability, then this guide is for you.
By the end of this guide, you will know:
- What are strings and what does immutable mean
- Why are strings immutable
- The pros and cons of string immutability
Strings and Immutability
In the programming world, a string is a sequence of characters that together represent text. In C#, a string is a System.String object that holds a sequential collection of System.Char values. You can declare a string and print out its value as follows:
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
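When you create a string, it is immutable, which means it is read-only: the characters it contains cannot be changed after the object is created. Note that this applies to the string object, not to the variable. A string variable can still be reassigned to refer to a different string.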
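A minimal sketch of what read-only means in practice. The class name ImmutabilityDemo and the helper Shout are illustrative assumptions, not part of the original guide. Methods such as ToUpperInvariant and Replace return a new string and leave the original untouched, and the indexer can read a character but cannot assign one.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: returns an upper-cased copy of the input.
    // The input string itself is never changed.
    public static string Shout(string input)
    {
        return input.ToUpperInvariant();
    }

    public static void Main()
    {
        string str = "Hello World";

        string upper = Shout(str);
        string replaced = str.Replace("World", "There");

        Console.WriteLine(str);      // Hello World (unchanged)
        Console.WriteLine(upper);    // HELLO WORLD
        Console.WriteLine(replaced); // Hello There

        char first = str[0];         // reading a character is allowed
        Console.WriteLine(first);    // H

        // str[0] = 'J';             // does not compile: the string indexer is read-only
    }
}
```

Every "modifying" method on string works this way: the result is a separate object, so you have to keep the return value if you want the changed text.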
Why Strings are Immutable
In C#, the CLR (Common Language Runtime) is responsible for determining where to store strings. As noted in the previous section, a string is a sequence of characters, and the CLR stores those characters in a fixed-size block of memory, much like an array. Arrays are a fixed-size data structure: once an array is created, its size cannot be increased or decreased. To make an array larger, its data must be copied into a new, larger array, which the CLR places in a new block of memory. Strings behave the same way. If you edit a string, you are not really modifying it; rather, the CLR allocates a new string object for the modified text, and the original string is removed from memory by the garbage collector once nothing references it.
Let's Look Under the Hood
While it is not strictly required, it is very useful to know what is going on in memory while you write your code, whether you are working with strings, data structures, or something else. That knowledge helps when fixing software bugs and when diagnosing memory leaks or performance issues. I find the best comparison is knowing what is under the hood of your vehicle: you can drive without it, but knowing the parts and what purpose they serve helps when something goes wrong.
Let's look closer at the example from earlier in this guide. We will create a new string, then modify it, and explain what is happening in memory at each step.
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
In the program, we declare a variable named str and assign it the value "Hello World". In memory, the CLR sets aside space to store the string object. For simplicity, let's say the CLR uses memory location 1000. Because string is a reference type, the object itself lives on the managed heap, and the variable str holds only a reference to it. Now, let's modify this string.
str += " edited";
Console.WriteLine(str);
//This will print out "Hello World edited" to the console.
When you run this code, you will see "Hello World edited". Since the string is immutable, the CLR does not change the characters stored at location 1000. Instead, it allocates a new block of memory, let's say at location 1500, stores the combined text there, and updates str to refer to it. Once nothing references the original string at location 1000, the garbage collector can eventually reclaim it. (String literals like this one are interned by the runtime and may stay in memory longer, but the principle is the same for strings created at run time.)
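The memory locations above are only an illustration, but you can observe the same effect from code. The sketch below keeps a second reference to the original string and uses object.ReferenceEquals to check whether two variables point to the same object. The class name ReferenceDemo and the helper SameObjectAfterAppend are hypothetical names chosen for this example.

```csharp
using System;

public static class ReferenceDemo
{
    // Hypothetical helper: appends a suffix and reports whether the result
    // is still the same object as the original string.
    public static bool SameObjectAfterAppend(string original, string suffix)
    {
        string before = original;   // second reference to the same string object
        original += suffix;         // for a non-empty suffix, this creates a new string
        return object.ReferenceEquals(before, original);
    }

    public static void Main()
    {
        string str = "Hello World";
        string before = str;        // both variables refer to one object

        str += " edited";           // str now refers to a new object

        Console.WriteLine(before);                              // Hello World
        Console.WriteLine(str);                                 // Hello World edited
        Console.WriteLine(object.ReferenceEquals(before, str)); // False
    }
}
```

The variable before still prints "Hello World", which shows that the original object was never modified; only str was pointed at a different object.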
Pros and Cons
As with almost everything else, string immutability has advantages and disadvantages. One advantage is thread safety. In a multithreaded system, a string can be shared between threads without locking, because no thread can change its contents; when you "modify" a string, you are really just creating a new object in memory. (Reassigning a variable that several threads share still needs the usual synchronization.) Another advantage is that you do not have to worry about a string being changed accidentally. You do not need the additional safety measures (e.g., a defensive object copy) that you may need to take with a mutable object.
What is the downside of immutable strings? The main issue is that repeatedly changing a string can lead to performance issues. The code snippets in the previous section modified the string only one time. Suppose instead we have a scenario like this:
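To illustrate why no defensive copy is needed, here is a small sketch that contrasts a string with a mutable char array. The class SharingDemo and both helper methods are hypothetical names made up for this example. A method that receives a string can only reassign its own local parameter, while a method that receives an array can change the caller's data.

```csharp
using System;

public static class SharingDemo
{
    // Hypothetical helper: tries to "change" the text it receives.
    // Only the local parameter is reassigned; the caller's string is untouched.
    public static void AppendSuffix(string text)
    {
        text += " (changed)";
    }

    // Hypothetical helper: overwrites the first element of the caller's array.
    // Arrays are mutable, so the caller sees this change.
    public static void OverwriteFirst(char[] chars)
    {
        chars[0] = 'J';
    }

    public static void Main()
    {
        string greeting = "Hello";
        AppendSuffix(greeting);
        Console.WriteLine(greeting);            // Hello

        char[] letters = { 'H', 'e', 'l', 'l', 'o' };
        OverwriteFirst(letters);
        Console.WriteLine(new string(letters)); // Jello
    }
}
```

With the array, a cautious caller would pass a copy to protect its data. With the string, handing out the original reference is safe.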
string str = "Hello World";
Console.WriteLine(str);
//This will print out "Hello World" to the console.
for (int i = 0; i < 10; i++)
{
    str += " again";
    Console.WriteLine(str);
}
This code appends " again" to the string on each of the ten iterations and prints the result each time, so the output grows from "Hello World again" to "Hello World" followed by " again" ten times. However, because the string is immutable, each iteration makes the CLR allocate a new, larger string object, copy the existing characters into it, and point str at the new object.
What if we did this one thousand, ten thousand, or a million times? Every iteration copies all of the characters accumulated so far, so the total work grows roughly with the square of the number of iterations, and each discarded intermediate string becomes work for the garbage collector. With all of that going on, it is highly likely you will see a slowdown in performance. If you find yourself in a scenario like this, a mutable object might be what you need. One suggestion is the StringBuilder class in the System.Text namespace.
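As a rough sketch of that suggestion, the loop above could be rewritten with StringBuilder as follows. The class name BuilderDemo and the helper Repeat are illustrative assumptions, not part of the original guide. StringBuilder keeps a mutable internal buffer, so Append adds to that buffer instead of allocating a new string for every step, and only the final ToString call produces a string object.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Hypothetical helper: appends 'suffix' to 'start' 'count' times
    // using a single mutable StringBuilder.
    public static string Repeat(string start, string suffix, int count)
    {
        var builder = new StringBuilder(start);
        for (int i = 0; i < count; i++)
        {
            builder.Append(suffix);   // modifies the builder's buffer in place
        }
        return builder.ToString();    // one final immutable string
    }

    public static void Main()
    {
        string result = Repeat("Hello World", " again", 10);
        Console.WriteLine(result);
        Console.WriteLine(result.Length); // 71 (11 + 10 * 6)
    }
}
```

For a handful of concatenations the plain += operator is perfectly fine and easier to read; StringBuilder tends to pay off when the number of appends is large or not known in advance.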
Conclusion
This guide has addressed why strings are immutable in C# and how they are handled in memory. We looked at examples of creating and modifying strings and explained what was going on in memory during each step. We then went over why immutability of strings is a good thing, how it can also be a bad thing, and what alternative exists (StringBuilder) in case you find yourself needing one. Strings are a very important concept in the software world, and in most technical job interviews you should expect to receive questions related to strings.
Possible interview questions to know:
- Should we use mutable or immutable strings?
- Are strings mutable or immutable in C#?
- Why are strings immutable?
I hope you have enjoyed reading this guide and that it will help you understand one of the most important concepts of the computer science field! If you are interested in reading some of my other work, check out By Value vs. by Reference: Return Values for a Function.