Literals, Unicode, and Underscores in C#
Oct 25, 2019 • 8 Minute Read
Introduction
This guide will introduce you to the syntax for literals, help you get comfortable with Unicode escape sequences, and show how the underscore character is used.
A literal is a fixed value written directly in the source code, such as 28 or "hello"; it is typically assigned to a variable or a constant. An escape sequence is a special combination of characters, starting with a backslash, that represents a character which is hard to type or not printable, such as a new line or a tab. Escape sequences can also express a character by its code in Unicode, the standard for character encoding. Finally, we will learn about the underscore character, an almost religious topic among developers: no rule is written in stone, but developers are trusted to follow specific guidelines.
Literals
Literals in C# are nothing more than fixed values written directly in the source code. These values are typically assigned to variables or constants.
There are many types of literals:
- Integer
- Floating-point
- Character
- String
- Null
- Boolean
Integer Literals
An integer literal represents a whole number. C# supports three notations: decimal, hexadecimal, and, since C# 7.0, binary (with the 0b or 0B prefix). There is no octal notation.
Decimal numbers don't have prefixes.
int myDecimal = 28;
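C# has no octal literals. Unlike in C or Java, a leading 0 does not change the base, so the following value is simply decimal 34, despite the variable's name.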
int myOctal = 034;
Hexadecimal numbers start with either the 0x or 0X prefix:
int myHex = 0X1C;
This small program prints each variable together with its value.
using System;
namespace Literals
{
    class Literally
    {
        static void Main(string[] args)
        {
            int myDecimal = 28;
            int myOctal = 034;   // a leading zero does not make this octal in C#
            int myHex = 0X1C;
            Console.WriteLine($"The variable {nameof(myDecimal)} has value: {myDecimal}");
            Console.WriteLine($"The variable {nameof(myOctal)} has value: {myOctal}");
            Console.WriteLine($"The variable {nameof(myHex)} has value: {myHex}");
            Console.ReadKey();   // keep the console window open until a key is pressed
        }
    }
}
Upon execution we get the following output.
The variable myDecimal has value: 28
The variable myOctal has value: 34
The variable myHex has value: 28
Note how the nameof expression lets us refer to the name of a variable inside an interpolated string. You may notice that the hexadecimal value is printed without its prefix, and that myOctal prints as 34. A prefix only tells the compiler which base the literal is written in; the value is stored as an ordinary int and printed in decimal by default. Because C# has no octal notation, 034 is read as decimal 34, not as 28.
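To make the role of the prefixes concrete, here is a minimal sketch that writes the same value in decimal, hexadecimal, and binary, and then formats it back into each base. The class name IntegerLiteralBases and the helper Describe are made up for this illustration; the binary literal assumes C# 7.0 or later.

```csharp
using System;

public static class IntegerLiteralBases
{
    // Formats a value in decimal, hexadecimal, and binary notation.
    public static string Describe(int value)
    {
        string hex = value.ToString("X");            // hexadecimal digits, upper case
        string binary = Convert.ToString(value, 2);  // base-2 digits
        return $"{value} = 0x{hex} = 0b{binary}";
    }

    public static void Main()
    {
        int fromDecimal = 28;
        int fromHex = 0x1C;
        int fromBinary = 0b11100;  // binary literals need C# 7.0 or later
        int leadingZero = 034;     // not octal in C#: this is decimal 34

        Console.WriteLine(fromDecimal == fromHex && fromHex == fromBinary); // True
        Console.WriteLine(Describe(fromHex));  // 28 = 0x1C = 0b11100
        Console.WriteLine(leadingZero);        // 34
    }
}
```

All three spellings produce the same int; the base only matters when the number is written down or formatted for display.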
Floating-Point Literals
A floating-point literal is written with digits plus at least one of the following: a decimal point with a fractional part, an exponent, or a type suffix.
Valid notations:
double myFloat = 28.3848;
double myFloat = 28E-5;
double myFloat = 28f;
Invalid notations:
double myFloat = 28E;
double myFloat = .e28;
By default, a floating-point literal without a suffix is of type double, and a double cannot be implicitly converted to float.
The following expression...
float myFloat = 28.28;
...gives us a rather talkative compile-time error.
Literal of type double cannot be implicitly converted to type 'float';
use an 'F' suffix to create a literal of this type
So basically, if we want the above to work we need to suffix it with f.
float myFloat = 28.28f;
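The suffix decides the type of the literal. The sketch below prints the runtime type that each spelling produces; the class name RealLiteralSuffixes and the helper TypeName are made up for this illustration, and the d and m suffixes are shown only for comparison with f.

```csharp
using System;

public static class RealLiteralSuffixes
{
    // Returns the short .NET type name of whatever value is passed in.
    public static string TypeName(object value)
    {
        return value.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(TypeName(28.28));   // Double  (no suffix)
        Console.WriteLine(TypeName(28.28f));  // Single  (float)
        Console.WriteLine(TypeName(28.28d));  // Double  (explicit suffix)
        Console.WriteLine(TypeName(28.28m));  // Decimal

        // An explicit cast is the other way to store a double literal in a float.
        float narrowed = (float)28.28;
        Console.WriteLine(TypeName(narrowed)); // Single
    }
}
```

Single is the .NET name for the C# float type, which is why the f suffix reports that name.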
Character Literals
There are three ways to write a literal of type char.
- Single quote notation.
char c = 'a';
- Unicode notation. Prefix the character's four-digit hexadecimal code with \u.
char c = '\u0306';
- Escape sequences. These start with a backslash (\).
char c = '\n';
This topic will be covered in more detail later in this guide.
String Literals
String literals are enclosed in double quotes. A regular string literal uses "" and a verbatim string literal uses @"", which treats backslashes as ordinary characters.
string myString = "Welcome to Pluralsight!";
string myOtherString = @"Welcome to the written guides!";
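The difference between the two forms only shows up when the text contains backslashes or double quotes. Here is a small sketch of that; the class name StringLiteralForms and the file path are made up for this illustration.

```csharp
using System;

public static class StringLiteralForms
{
    public const string Regular = "C:\\temp\\notes.txt";   // each backslash must be escaped
    public const string Verbatim = @"C:\temp\notes.txt";   // backslashes are taken literally

    public const string QuotedRegular = "She said \"hi\"";    // quote escaped with a backslash
    public const string QuotedVerbatim = @"She said ""hi""";  // quote escaped by doubling it

    public static void Main()
    {
        Console.WriteLine(Regular == Verbatim);             // True
        Console.WriteLine(QuotedRegular == QuotedVerbatim); // True
        Console.WriteLine(Verbatim);                        // C:\temp\notes.txt
    }
}
```

Verbatim strings tend to be the more readable choice for file paths and regular expressions, where backslashes are frequent.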
Null Literals
A null literal is simply the null keyword. It represents a reference that does not point to any object; what that absence means depends on the logic or context of the application.
For example, we can initialize a string with this null value as follows:
string s = null;
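In this context, s does not refer to any string object at all. Note that this is not the same as an empty string (""), which is a real string object with zero characters.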
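It can help to see how null behaves next to the empty string literal "". The following is a minimal sketch; the class name NullVersusEmpty and the helper Describe are made up for this illustration.

```csharp
using System;

public static class NullVersusEmpty
{
    // Describes a string reference without ever dereferencing null.
    public static string Describe(string s)
    {
        if (s == null) return "null (no string object)";
        if (s.Length == 0) return "empty (a string object with zero characters)";
        return $"\"{s}\" ({s.Length} characters)";
    }

    public static void Main()
    {
        string missing = null;
        string empty = "";

        Console.WriteLine(Describe(missing)); // null (no string object)
        Console.WriteLine(Describe(empty));   // empty (a string object with zero characters)
        Console.WriteLine(string.IsNullOrEmpty(missing)); // True
        Console.WriteLine(string.IsNullOrEmpty(empty));   // True
        Console.WriteLine(missing == empty);              // False
    }
}
```

Calling missing.Length directly would throw a NullReferenceException, which is why the helper checks for null first.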
Boolean Literals
These are the simplest literals. They can have only two values: true or false. Unlike C or C++, C# does not treat 0 and 1 as Boolean values, and there is no implicit conversion between bool and the integer types.
bool myTrue = true;
bool myFalse = false;
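As a short sketch of how strictly C# separates Booleans from numbers, the example below shows the lines that would not compile (left as comments) and the explicit conversions that the Convert class offers. The class name BooleanLiterals and the variable names are made up for this illustration.

```csharp
using System;

public static class BooleanLiterals
{
    public static void Main()
    {
        bool isReady = true;
        bool isEmpty = false;

        // bool flag = 1;   // does not compile: no implicit int-to-bool conversion
        // if (1) { }       // does not compile either: a condition must be a bool

        Console.WriteLine(isReady && !isEmpty);       // True
        Console.WriteLine(Convert.ToInt32(isReady));  // 1
        Console.WriteLine(Convert.ToBoolean(0));      // False
    }
}
```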
Numerical Syntax Improvements
C# version 7.0 added flexibility into numerical literals, improving readability and maintainability of the source code.
Before C# version 7.0 long numbers looked like this:
int myLongNumber = 10000000;
At first glance it was hard to tell how large this number is. Now there are digit separators:
int myLongNumber = 10_000_000;
The underscores do not affect the value; the compiler ignores them. This is syntactic sugar that improves readability, and it works equally well with the binary literals (0b prefix) that C# 7.0 also introduced.
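Here is a brief sketch of digit separators in decimal, hexadecimal, binary, and floating-point literals. The class and constant names are made up for this illustration, and the code assumes C# 7.0 or later.

```csharp
using System;

public static class DigitSeparators
{
    public const int TenMillion = 10_000_000;
    public const int HexMask = 0xFF_FF;              // 65535
    public const int BitFlags = 0b1010_1010;         // 170
    public const double Avogadro = 6.022_140_76e23;  // separators work in real literals too

    public static void Main()
    {
        Console.WriteLine(TenMillion == 10000000);  // True
        Console.WriteLine(HexMask);                 // 65535
        Console.WriteLine(BitFlags);                // 170
    }
}
```

The grouping is free-form: thousands for decimal numbers, bytes or nibbles for hexadecimal and binary ones are common choices.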
Unicode and Escape Sequences
Escape sequences came up briefly under character literals; this section looks at them in more detail.
An escape sequence is a backslash followed by one or more characters that together stand for a single character, often one that cannot be typed or printed directly. This is fairly common in programming languages. In C#, escape sequences can appear in character literals (single quotes) and in regular string literals (double quotes).
For example, a new-line character looks like this:
char newLine = '\n';
This is a list of escape sequences.
- bell -> '\a'
- backspace -> '\b'
- formfeed -> '\f'
- carriage return -> '\r'
- new line -> '\n'
- horizontal tab -> '\t'
- vertical tab -> '\v'
- single quote -> '\''
- double quote -> '\"'
- backslash -> '\\'
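Unicode escape sequences are prefixed with either \u or \x, followed by the character's code in hexadecimal.
- variable length, one to four hex digits -> '\xh[h][h][h]'
- fixed length, exactly four hex digits -> '\uhhhh'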
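The two prefixes behave differently when more hexadecimal-looking characters follow the escape, which is easy to trip over inside strings. The sketch below illustrates this; the class and constant names are made up for this illustration.

```csharp
using System;

public static class UnicodeEscapes
{
    // \u always reads exactly four hex digits: 'A' (U+0041), then the letters "BC".
    public const string FixedLength = "\u0041BC";

    // \x reads up to four hex digits, so it swallows "41BC" as one character (U+41BC).
    public const string VariableLength = "\x41BC";

    public static void Main()
    {
        Console.WriteLine('\u0041' == 'A');        // True
        Console.WriteLine('\x41' == 'A');          // True
        Console.WriteLine(FixedLength);            // ABC
        Console.WriteLine(FixedLength.Length);     // 3
        Console.WriteLine(VariableLength.Length);  // 1
    }
}
```

Because of this, \u is usually the safer choice inside strings; \x is unambiguous only when the next character is not a hex digit.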
In C#, escape sequences can be used in string literals as well, either written directly or, as in the following code, inserted from char variables through string interpolation.
using System;
namespace Literals
{
    class Literally
    {
        static void Main(string[] args)
        {
            char newLine = '\n';
            char tab = '\t';
            // The interpolated chars insert real line breaks and a tab into the output.
            Console.WriteLine($"Although it may seem like this is one line only,{newLine}actually it is multiple lines.{newLine}{tab}We have tab-based indentation too.");
            Console.ReadKey();   // keep the console window open until a key is pressed
        }
    }
}
The output looks like this.
Although it may seem like this is one line only,
actually it is multiple lines.
	We have tab-based indentation too.
In my experience, escape characters are most useful in short messages of one or a few sentences, where you would rather embed the formatting in the string than build it from separate pieces held in variables. They also make it easy to customize the layout of log messages that the application emits at runtime.
Underscores
This topic is one of the most divisive among developers. Some are learning C# as their first language and trying to follow its best practices and style guides. Others come from C, C++, or other languages where a similar notation exists but the reasoning behind its conventional use is slightly different. Two widely used tools, StyleCop and ReSharper, help developers apply consistent naming conventions.
As a rule of thumb, I always follow the guidelines of the given language. C# code should look like C# code, as described in the language's design guidelines.
Let's take the following code.
public class Employee
{
    private string _firstName;
    public string FirstName
    {
        get { return _firstName; }
        set { _firstName = value; }
    }
}
Here in our Employee class we have a property called FirstName, which is publicly accessible but is just an accessor for the private field _firstName. When and how underscores are used mostly boils down to personal or team preference; the key is consistency. Note that the underscore itself hides nothing: it is the private modifier that restricts access, while the getter and setter act as a proxy for reading and modifying the value. The prefix simply makes a private backing field easy to tell apart from properties, parameters, and local variables.
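One practical payoff of the prefix appears in constructors, where a parameter often has the same name as the field it initializes. The following sketch extends the Employee class under that assumption; the constructor, the LastName property, the UnderscoreDemo class, and the sample names are additions made for this illustration and are not part of the original example.

```csharp
using System;

public class Employee
{
    private string _firstName;  // underscore marks a private backing field

    public Employee(string firstName)
    {
        // No "this." is needed: the prefix keeps the field and the parameter apart.
        _firstName = firstName;
    }

    public string FirstName
    {
        get { return _firstName; }
        set { _firstName = value; }
    }

    // Auto-implemented property: the compiler generates the hidden backing field.
    public string LastName { get; set; }
}

public static class UnderscoreDemo
{
    public static void Main()
    {
        var employee = new Employee("Ada") { LastName = "Lovelace" };
        employee.FirstName = "Augusta";
        Console.WriteLine($"{employee.FirstName} {employee.LastName}"); // Augusta Lovelace
    }
}
```

When a property needs no extra logic, the auto-implemented form sidesteps the naming question entirely, since no field is declared by hand.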
Conclusion
This guide introduced different types of literals before examining escape sequences and underscores. I hope this has been informative and you found what you were looking for. Thank you for reading.