Streamlining LINQ Code with Let
Nov 29, 2018 • 6 Minute Read
The Problem This Solves
Sometimes in our queries, we must apply the same function to a column value over and over again. As we compose the query, we feel the urge to copy and paste, an impulse we have been trained to distrust. LINQ offers us a way to name such an expression once and then refer to it elsewhere in the query without having to copy and paste.
Consider the following query, which selects the customers whose email address is not in the "email.com" domain and returns the count for each remaining domain:
from c in customers
where c.Email.Substring(c.Email.IndexOf("@") + 1) != "email.com"
group c by c.Email.Substring(c.Email.IndexOf("@") + 1) into g
select new { Total = g.Count(), Domain = g.Key };
We're using our domain string-parsing logic in two places: once to filter out the undesired domain and once to group our customers together. Duplicated code is vulnerable to defects and is best refactored into a common block. The let statement in LINQ allows us to do just that.
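As a minimal sketch of that refactoring, the query below names the parsed domain once with let and reuses it in both the filter and the grouping. The sample customers are hypothetical and exist only so the query can run; the sketch assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; only the Email property matters for the query.
var customers = new[]
{
    new { Name = "Ann",  Email = "ann@email.com" },
    new { Name = "Bob",  Email = "bob@example.org" },
    new { Name = "Cara", Email = "cara@example.org" },
    new { Name = "Dev",  Email = "dev@sample.net" }
};

var domainCounts =
    from c in customers
    // Parse the domain once per customer and give the result a name.
    let domain = c.Email.Substring(c.Email.IndexOf("@") + 1)
    where domain != "email.com"
    group c by domain into g
    select new { Total = g.Count(), Domain = g.Key };

foreach (var row in domainCounts)
{
    Console.WriteLine($"{row.Domain}: {row.Total}");
}
```

With this sample data the query yields two groups: example.org with 2 customers and sample.net with 1.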
Our Scenario
We've been tasked with writing a new query that returns the set of employees that belong to one of two Locations, ordered by Location. The resulting query looks like this:
from e in employees
where e.GroupCode.ToUpper().Substring(0, 2) == "AA"
||
e.GroupCode.ToUpper().Substring(0, 2) == "BB"
orderby e.GroupCode.ToUpper().Substring(0, 2)
select e;
This repeating clause:
e.GroupCode.ToUpper().Substring(0, 2)
represents Location, the building where the employee is physically located. It would be nice if this were stored as its own field but, unfortunately, we must extract it ourselves in code. Since the same expression appears three times in the query, it would be nice to streamline and centralize this code.
After a little research, we find that this is what the _let_ keyword in LINQ is for. Let allows you to define a variable which contains the expression you want and use it as if it were any other field. We can take our query from before and streamline it to look like this:
```csharp
from e in employees
let location = e.GroupCode.ToUpper().Substring(0, 2)
where location == "AA"
||
location == "BB"
orderby location
select e;
```
Now, the location variable holds the result of the GroupCode string manipulation for each employee and, when we refer to it on subsequent lines, the query behaves as if the expression were written there. This query returns the same results as the previous one.
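To make that behavior more concrete, here is a sketch of roughly what the let query looks like in method syntax: the let becomes a Select that pairs each employee with its computed location, so the expression runs once per element. The sample employees are hypothetical, and the compiler's real translation uses an unnamed intermediate object rather than the `x` shown here. The sketch assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; only GroupCode comes from the article.
var employees = new[]
{
    new { Name = "Pat", GroupCode = "bb-104" },
    new { Name = "Lee", GroupCode = "AA-221" },
    new { Name = "Sam", GroupCode = "cc-009" },
    new { Name = "Kim", GroupCode = "aa-310" }
};

// Query syntax with let.
var withLet =
    from e in employees
    let location = e.GroupCode.ToUpper().Substring(0, 2)
    where location == "AA" || location == "BB"
    orderby location
    select e;

// Roughly equivalent method syntax: each employee is projected together
// with its location, and later steps read the stored value.
var withSelect = employees
    .Select(e => new { e, location = e.GroupCode.ToUpper().Substring(0, 2) })
    .Where(x => x.location == "AA" || x.location == "BB")
    .OrderBy(x => x.location)
    .Select(x => x.e);

Console.WriteLine(string.Join(", ", withLet.Select(e => e.Name)));
Console.WriteLine(string.Join(", ", withSelect.Select(e => e.Name)));
```

Both lines print the same names in the same order, which is the sense in which the let version and the hand-written projection are interchangeable.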
Chained Lets
Let statements can function as miniature where statements. Consider this list of animals:
var animals = new List<Animal>
{
new Animal {Name = "Whale", Class = "Mammal", Location = "Ocean"},
new Animal {Name = "Bear", Class = "Mammal", Location = "Forest"},
new Animal {Name = "Hawk", Class = "Bird", Location = "Forest"},
new Animal {Name = "Tuna", Class = "Fish", Location = "Ocean"}
};
Let's say we want to return a list of animals that:
- Aren't scary
- Are among our favorite animals
- Live in the forest
The following query will return those results:
from a in animals
where !(new List<string> { "Bear", "Shark" }.Contains(a.Name))
&& new List<string> { "Hawk", "Shark" }.Contains(a.Name)
&& a.Location == "Forest"
select a;
This is a little unclear because the meaning of the two inline lists ("Bear", "Shark" and "Hawk", "Shark") is ambiguous. The developer may have known which list held the favorites and which held the scary animals (and what the relationship between those two lists was) at the time of writing but, looking at it a week later, it's difficult to know.
Put in plain English, the above query says:
Give me the animals from the animal list such that this list: "Bear", "Shark" doesn't contain the name, and this list: "Hawk", "Shark" does contain the name, and the location is "Forest".
There's no duplication here for the let statement to resolve, but we _can_ make everything much clearer by using it:
var results = from a in animals
let favoriteAnimals = new List<string> { "Hawk", "Shark" }
let scaryAnimals = new List<string> { "Bear", "Shark" }
let isFavorite = favoriteAnimals.Contains(a.Name)
let isScary = scaryAnimals.Contains(a.Name)
let livesInTheWoods = a.Location == "Forest"
where (!isScary && isFavorite && livesInTheWoods)
select a;
Here, we define our lists, favoriteAnimals and scaryAnimals, inline in the query. We then create two boolean variables that reflect whether a given animal's name is in each list. Then we add a livesInTheWoods variable that checks the Location, and we combine these three conditions in the where clause.
The result of both of these queries is Hawk, the only favorite animal of mine that I'm not scared of and that lives in the forest. Both of these queries are entirely correct, but the second reflects the meaning of the sets far more clearly. If, later on, I decide I'm not actually afraid of sharks anymore, it's perfectly clear that I need to remove that element from the scaryAnimals list. Compare that to the previous query:
from a in animals
where !(new List<string> { "Bear", "Shark" }.Contains(a.Name))
&& new List<string> { "Hawk", "Shark" }.Contains(a.Name)
&& a.Location == "Forest"
select a;
If I'm not scared of sharks anymore, which list does that need to be removed from? The first or the second?
It's important to consider that this let approach resulted in a much longer query. But shorter is not always better and less code is not always better than more. Sometimes it's worth a few extra lines of code if the trade-off is a significant increase in clarity.
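One practical note on the let version above: a let expression is evaluated for every element, so the two inline lists are rebuilt for each animal. For four animals that hardly matters, but as a sketch of one alternative, the lists can be declared once outside the query while the boolean lets keep their descriptive names. The anonymous objects below are a stand-in for the article's Animal class, and the sketch assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Anonymous objects stand in for the article's Animal class.
var animals = new[]
{
    new { Name = "Whale", Class = "Mammal", Location = "Ocean" },
    new { Name = "Bear",  Class = "Mammal", Location = "Forest" },
    new { Name = "Hawk",  Class = "Bird",   Location = "Forest" },
    new { Name = "Tuna",  Class = "Fish",   Location = "Ocean" }
};

// Declared once, outside the query, instead of once per animal.
var favoriteAnimals = new List<string> { "Hawk", "Shark" };
var scaryAnimals = new List<string> { "Bear", "Shark" };

var results =
    from a in animals
    let isFavorite = favoriteAnimals.Contains(a.Name)
    let isScary = scaryAnimals.Contains(a.Name)
    let livesInTheWoods = a.Location == "Forest"
    where !isScary && isFavorite && livesInTheWoods
    select a;

foreach (var animal in results)
{
    Console.WriteLine(animal.Name); // prints "Hawk", the only match
}
```

The named lists still make it obvious where "Shark" would need to be removed, so the clarity gained from let is preserved.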