Filtering Data Using LINQ
By filtering data, we can avoid having to search for the data we need and greatly reduce the amount of data that needs to be transferred across a database connection.
Sep 6, 2019 • 5 Minute Read
The Problem This Solves
In T-SQL, we can filter the data that we wish returned by using a WHERE statement:
SELECT * FROM Customers WHERE Id = 27
By filtering the data, we can avoid having to search our data set for the data we need and also greatly reduce the amount of data that needs to be transferred across a database connection. LINQ makes the same capability available to use in the form of a similar WHERE clause and, in the case of Query syntax, a .Where() method.
Our Scenario
We've gotten great feedback about our Location report which we created in Grouping and Aggregating Data with LINQ with this query:
from e in employees
group e by e.GroupCode.ToUpper().Substring(0, 2) into g
select new { Location, Total };
A user noted that when we had this display:
AA 85
AB 73
..
ZZ 145
- it would be handy to be able to click on the Location and see a list of who the employees are. Your stakeholders agreed, so this was broken down into a story for you to work on. The story reads like this:
As a user, I need to see a list of the users for a particular Location, so that I can make staffing plans.
As with any query, we begin by defining what we want the shape of the return value to be. For now, we'll say that we want a set of full Employee objects:
from e in employees
select e
The location for the query will be supplied by the code. For our sample query, we'll use AA. If you'll recall, Location doesn't exist as a direct field on the Employee object, but rather as the first two characters of the GroupCode property, e.g., AA123. This means that we'll have to perform some simple string manipulation with the substring function, just like we did in Selecting and Transforming Data with LINQ.
Our original data set looks like this:
Id = 1 FirstName = "Chris" ** LastName** = "Behrens" ** GroupCode** = "AA123" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
Id = 2 FirstName = "Sam" LastName = "Smith" GroupCode = "AA128" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
Id = 3 FirstName = "Gill" LastName = "Bates" GroupCode = "ZZ867" Addresses = "2810 Pisces Ct Albuequerque NM 87144"
We want to return only the first two results from the test set, the employees with the GroupCodes "AA123" and "AA128". To make this happen, we simply add a WHERE clause before the codeblock:
from e in employees
where e.GroupCode.ToUpper().Substring(0, 2) == "AA"
select e;
This will return the following resultset:
Id = 1 FirstName = "Chris" ** LastName** = "Behrens" ** GroupCode** = "AA123" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
Id = 2 FirstName = "Sam" LastName = "Smith" GroupCode = "AA128" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
Here it is in query syntax:
employees.Where(e => e.GroupCode.ToUpper().Substring(0, 2) == "AA");
To points to notice here – first, we're creating the where clause as a lambda expression:
e => e.GroupCode.ToUpper().Substring(0, 2) == "AA"
Secondly, where is the select statement? It would be perfectly fine to create a query statement like this:
employees.Where(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA").Select(e => e);
But the select statement isn't really adding anything. Without it, the query engine simply assumes that we want to return all the results untransformed as the original type.
Where-enabled Statements
LINQ gives us some other elements to limit the results, such as First() and Last(), which do exactly what they sound like – they'll give you the first and last elements in the set. On our resultset above, this query:
employees.Where(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA").First();
- would return our first employee:
Id = 1 FirstName = "Chris" ** LastName** = "Behrens" ** GroupCode** = "AA123" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
While this query, using the Last() method:
employees.Where(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA").Last();
- would return our last employee:
Id = 2 FirstName = "Sam" LastName = "Smith" GroupCode = "AA128" Addresses = "7814 Vandalia Avenue Apt 3b Los Angeles CA 91610"
Taking a set, filtering with a WHERE clause, and then selecting the first or last element is such a common pattern that LINQ offers us a lambda argument for First and Last that takes the place of our WHERE statement:
employees.First(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA");
employees.Last(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA");
Consider our earlier query with the explicit where clause:
employees.Where(x => x.GroupCode.ToUpper().Substring(0, 2) == "AA").Last();
It is a common practice, when reading a lambda out loud, to render it something like this:
"All of the employees such that the first two characters of the to-upper of the GroupCode equals AA"
You may find it helpful to think of LINQ lambdas in that way: "all of the elements of the set such that…"