Types and Their Boxing and Unboxing
Aug 23, 2019 • 9 Minute Read
Introduction
In C#, data types are categorized based on how they store their values in memory. This gives us three main categories: value types, reference types, and pointer types. Pointers are disabled by default because they can only be used in code marked with the unsafe keyword, which steps outside the safety guarantees that the CLR (Common Language Runtime) normally enforces. This guide will not cover pointer types in detail.
A value type holds its data directly within its own memory allocation, whereas a reference type holds a pointer to another memory location that contains the actual data.
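The difference is easiest to see on assignment. The following is a minimal sketch rather than code from the original guide; the names PointStruct, PointClass, and CopyDemo are hypothetical and exist only for illustration.

```csharp
using System;

namespace CopySemantics
{
    // Hypothetical value type: assignment copies the data itself.
    public struct PointStruct
    {
        public int X;
    }

    // Hypothetical reference type: assignment copies only the reference.
    public class PointClass
    {
        public int X;
    }

    public class CopyDemo
    {
        // Returns the original struct's X after a copy of it was modified.
        public static int ModifyStructCopy()
        {
            PointStruct original = new PointStruct { X = 1 };
            PointStruct copy = original; // independent copy of the data
            copy.X = 42;
            return original.X;           // the original is untouched
        }

        // Returns the original object's X after it was modified through a second variable.
        public static int ModifyClassAlias()
        {
            PointClass original = new PointClass { X = 1 };
            PointClass alias = original; // both variables refer to the same object
            alias.X = 42;
            return original.X;           // the change is visible through both variables
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"Struct original after its copy changed: {ModifyStructCopy()}");
            Console.WriteLine($"Class original after its alias changed: {ModifyClassAlias()}");
        }
    }
}
```

With the struct, the original keeps its value of 1; with the class, the original reports 42 because there was only ever one object.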
To get a better understanding, we will take a detour and look at why the stack and the heap matter.
Detour
The stack is a special region of memory used for static memory allocation. It typically holds local variables of value types. The heap is its counterpart: it is used for dynamic memory allocation and serves reference type data structures. A value type that is a field of a class lives on the heap together with the object that contains it. Applications can and do use both, depending on their variables. As a developer, it is important to be aware of the difference, because it can affect the application's performance and cause headaches in the long run if neglected. Both the stack and the heap are stored in the computer's RAM.
Value Type
A value type holds its value or content in memory allocated on the stack. Let's take a look at a simple example:
int i = 99;
The above statement works like this behind the scenes. When the i variable is created with the value 99, a single slot of memory is allocated directly to store the value. If we later assign another value to the variable, and that value is compatible with or can be converted to int, the new value is copied directly into the originally reserved memory location. Predefined data types, enums, and structs are also value types. Their size is known at compile time, and local variables of these types are usually stored in stack memory. This means the GC (Garbage Collector) does not have to manage them; the memory is released automatically when the method returns.
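To make the reassignment rule concrete, here is a small sketch, assuming a hypothetical AssignDemo class that is not part of the original guide. It shows one implicit and one explicit conversion writing into the same int variable.

```csharp
using System;

namespace ValueAssignment
{
    public class AssignDemo
    {
        // Reassigns the same int variable from a short and then from a double.
        public static int Reassign()
        {
            int i = 99;

            short small = 7;
            i = small;       // implicit conversion: every short fits in an int

            double d = 12.9;
            i = (int)d;      // explicit conversion: the fractional part is discarded

            return i;
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"Final value of i: {Reassign()}");
        }
    }
}
```

The variable keeps its single memory slot throughout; only the content of that slot changes, ending at 12.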
Access to this type of variable is very fast because the layout of each stack frame is determined at compile time, so allocating it at runtime is little more than moving the stack pointer. The stack is a LIFO (Last In, First Out) data structure: the most recently reserved block is always the next block to be freed. This makes it ideal for nested function calls and recursive functions. Naturally, there is a limit to how deep this recursion or nesting can go. Nowadays, though, the stack is usually exhausted only when a logic error throws the application into runaway recursion.
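The LIFO behavior of nested calls can be observed with a short sketch. The StackDemo class and its Trace method are hypothetical names chosen for this illustration; each call records when its frame is entered and when it is left.

```csharp
using System;
using System.Collections.Generic;

namespace CallStackOrder
{
    public class StackDemo
    {
        // Records the order in which recursive calls are entered and left.
        public static void Trace(int depth, List<string> log)
        {
            log.Add($"enter {depth}");
            if (depth > 1)
            {
                Trace(depth - 1, log); // a new frame is pushed on top of this one
            }
            log.Add($"leave {depth}"); // runs only after all deeper frames are gone
        }

        static void Main(string[] args)
        {
            List<string> log = new List<string>();
            Trace(3, log);
            Console.WriteLine(string.Join(", ", log));
        }
    }
}
```

The call that was entered last (depth 1) is the first one to leave, and the outermost call (depth 3) leaves last.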
The following data types all fall under the value type:
- int, byte, short
- float, decimal, struct
- long, double, uint
- char, enum, ulong
- bool, sbyte, ushort
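If you are unsure about a particular type, the Type.IsValueType property reports which category it belongs to. The sketch below is only an illustration; the Level enum, the Pair struct, and the TypeCheckDemo class are hypothetical names.

```csharp
using System;

namespace ValueTypeCheck
{
    public enum Level { Low, High }                    // hypothetical enum
    public struct Pair { public int A; public int B; } // hypothetical struct

    public class TypeCheckDemo
    {
        // Thin wrapper around the built-in reflection property.
        public static bool IsValue(Type t)
        {
            return t.IsValueType;
        }

        static void Main(string[] args)
        {
            Type[] types =
            {
                typeof(int), typeof(decimal), typeof(bool), typeof(char),
                typeof(Level), typeof(Pair), typeof(string)
            };

            foreach (Type t in types)
            {
                Console.WriteLine($"{t.Name}: IsValueType = {IsValue(t)}");
            }
        }
    }
}
```

Every entry except string is reported as a value type; string is included as a contrast.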
Reference Type
This kind of data structure uses the heap to store its value in memory. The allocation takes place at runtime, and access to the data is slower because of the extra step of following the pointer to the actual value. Items stored on the heap behave differently from items stored on the stack: there is no ordering dependency between them, and they can be accessed in any order at any moment. Your application can allocate a block at runtime, and the block can be released again later; in C#, the garbage collector releases it once nothing references it anymore.
With this comes the complexity of keeping track of which parts of the heap are free or being used at any given time.
Developers usually rely on the stack when they know in advance, at compile time, how much data the application will process or store. Size also matters, because the available stack space depends on the target machine that will run the application, which is never easy to estimate. The heap is used when we cannot determine up front how much memory will be needed. It also lets the application reserve more memory on demand at runtime, based on the input data and requests it receives under heavy load.
Let's look at an example.
string writtenGuide = "Pluralsight";
The system stores the value "Pluralsight" in one location and the variable writtenGuide in another location, where it acts as a pointer to the value that was assigned.
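A short sketch can make the "variable as pointer" idea visible. It is an illustration only; the StringDemo class and the alias variable are hypothetical, while object.ReferenceEquals is the standard method for checking whether two variables point to the same object.

```csharp
using System;

namespace StringReference
{
    public class StringDemo
    {
        // Copying the variable copies the reference, not the characters.
        public static bool SameObject()
        {
            string writtenGuide = "Pluralsight";
            string alias = writtenGuide;
            return object.ReferenceEquals(writtenGuide, alias);
        }

        // Reassigning one variable only redirects that variable.
        public static string ReassignAlias()
        {
            string writtenGuide = "Pluralsight";
            string alias = writtenGuide;
            alias = "Another value"; // alias now points at a different string
            return writtenGuide;     // the first variable still points at the original
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"Same object: {SameObject()}");
            Console.WriteLine($"writtenGuide after reassigning alias: {ReassignAlias()}");
        }
    }
}
```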
The following data types fall under the reference type:
- all arrays (even if the elements are value types)
- delegates
- classes
- strings
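The first item in the list is the one that most often surprises people, so here is a minimal sketch of it. The ArrayDemo class is a hypothetical name; the example assumes nothing beyond a plain int array.

```csharp
using System;

namespace ReferenceTypeCheck
{
    public class ArrayDemo
    {
        // Changes an element through a second variable and reads it back through the first.
        public static int ChangeThroughAlias()
        {
            int[] numbers = { 1, 2, 3 };
            int[] alias = numbers; // copies the reference, not the three ints
            alias[0] = 99;
            return numbers[0];     // both variables see the same array
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"int[] is a value type: {typeof(int[]).IsValueType}");
            Console.WriteLine($"numbers[0] after change through alias: {ChangeThroughAlias()}");
        }
    }
}
```

Even though int is a value type, the array that holds the ints is a single heap object shared by every variable that references it.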
Passing Types
This section shows how the two kinds of types behave when you pass them around.
Value Type by Value
Let's take the following code as an example.
using System;

namespace PassingAround
{
    public class PVTBV
    {
        // Receives copies of the caller's values; changes stay local to this method.
        static void Magic(int a, int b, int c)
        {
            a = b * c;
            b = a * c;
            c = a * b;
            Console.WriteLine($"Inside magic value a: {a}, b: {b}, c: {c}");
        }

        static void Main(string[] args)
        {
            int a = 10;
            int b = 20;
            int c = 30;
            Console.WriteLine($"Original value a: {a}, b: {b}, c: {c}");
            Magic(a, b, c);
            Console.WriteLine($"After magic value a: {a}, b: {b}, c: {c}");
            Console.Read(); // wait for input so the console window stays open
        }
    }
}
The output is as follows.
Original value a: 10, b: 20, c: 30
Inside magic value a: 600, b: 18000, c: 10800000
After magic value a: 10, b: 20, c: 30
The point of passing these variables is to demonstrate how value types behave. Passing a value type variable and modifying it affects only the local copy in the scope that received the value. The application actually holds six variables: three in the Main function and three in the Magic function. When Magic was called, the values of a, b, and c were copied into its parameters. Then the calculations were performed and the WriteLine statement was executed. The original variables were not modified.
Reference Type by Value
Let's take the following code as an example.
using System;

namespace PassingAround
{
    // A class is a reference type: variables of type Reader hold a pointer to the object.
    class Reader
    {
        public int age;
        public string platform = "Pluralsight";
    }

    public class PRTBV
    {
        // Receives a copy of the reference, so it modifies the caller's object.
        static void oneYearLater(Reader a)
        {
            Console.WriteLine("Time flies...");
            a.age += 1;
        }

        static void Main(string[] args)
        {
            Reader r1 = new Reader();
            Reader r2 = new Reader();
            r1.age = 30;
            r2.age = 50;
            Console.WriteLine($"The age is {r1.age}, favorite platform: {r1.platform}");
            Console.WriteLine($"The age is {r2.age}, favorite platform: {r2.platform}");
            oneYearLater(r1);
            oneYearLater(r2);
            Console.WriteLine($"The age is {r1.age}, favorite platform: {r1.platform}");
            Console.WriteLine($"The age is {r2.age}, favorite platform: {r2.platform}");
            Console.Read(); // wait for input so the console window stays open
        }
    }
}
The output is as follows.
The age is 30, favorite platform: Pluralsight
The age is 50, favorite platform: Pluralsight
Time flies...
Time flies...
The age is 31, favorite platform: Pluralsight
The age is 51, favorite platform: Pluralsight
The age modification works because Reader is a class, and classes are reference types. When we called the oneYearLater function, a copy of the pointer was passed, not a copy of the object. Through that pointer, the age attribute of the original instance was increased by one. This should give you a good idea of the main difference between reference and value types.
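Because the pointer itself is passed by value, there is one thing the called function cannot do: make the caller's variable point at a different object. The sketch below illustrates this under stated assumptions; the namespace, the trimmed-down Reader class, and the Mutate, Replace, and Run methods are hypothetical names, not part of the example above.

```csharp
using System;

namespace PassingAroundSketch
{
    // Hypothetical, trimmed-down version of the Reader class.
    public class Reader
    {
        public int age;
    }

    public class ReassignDemo
    {
        // Modifies the object the caller's variable points to.
        public static void Mutate(Reader a)
        {
            a.age += 1;
        }

        // Redirects only the local copy of the pointer; the caller never sees the new object.
        public static void Replace(Reader a)
        {
            a = new Reader { age = 99 };
        }

        public static int Run()
        {
            Reader r = new Reader { age = 30 };
            Mutate(r);  // r.age becomes 31
            Replace(r); // r still points at the original object
            return r.age;
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"Age after Mutate and Replace: {Run()}");
        }
    }
}
```

The result is 31: the mutation reached the original object, while the replacement stayed local to the Replace method.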
Multi-threaded Apps
Nowadays, most applications are either multi-threaded or run in environments with multiple cores. When we develop a multi-threaded application, we need to be aware that each thread has its own stack, while the heap is shared among all of them. In this sense, the stack is thread-specific and the heap is application-specific. Threads should also be taken into consideration when we design our exception handling.
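The split between per-thread stacks and a shared heap can be sketched as follows. This is a minimal illustration, assuming hypothetical SharedCounter and ThreadDemo classes; Interlocked.Increment is used so that the two threads update the shared object safely.

```csharp
using System;
using System.Threading;

namespace ThreadMemory
{
    // Hypothetical heap object that both threads can reach.
    public class SharedCounter
    {
        public int Total;
    }

    public class ThreadDemo
    {
        public static int Run()
        {
            SharedCounter shared = new SharedCounter(); // one object on the shared heap

            ThreadStart work = () =>
            {
                int local = 0; // each thread gets its own copy of this local variable
                for (int i = 0; i < 1000; i++)
                {
                    local++;
                    Interlocked.Increment(ref shared.Total); // atomic update of shared data
                }
            };

            Thread t1 = new Thread(work);
            Thread t2 = new Thread(work);
            t1.Start();
            t2.Start();
            t1.Join(); // wait for both threads before reading the result
            t2.Join();

            return shared.Total;
        }

        static void Main(string[] args)
        {
            Console.WriteLine($"Shared total after both threads finished: {Run()}");
        }
    }
}
```

Each thread counts to 1000 in its own local variable without interfering with the other, while the shared object ends up at 2000 because both threads wrote to the same heap location.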
Conclusion
When I first learned C#, I was quickly introduced to reference types and value types, and they were not easy concepts to grasp. Once I learned about the heap and the stack, it all became clear and the gap those concepts had left was filled. You can get by without taking these concepts into consideration; however, if you want to get beyond entry level, you cannot ignore this crucial part of C# and what the heap and stack memory spaces mean.