Understanding the .NET Language Integrated Query (LINQ)
The Language Integrated Query (LINQ), which is pronounced as “link”, was introduced in the .NET Framework 3.5 to provide query capabilities by defining standardized query syntax in the .NET programming languages (such as C# and VB.NET). LINQ is provided via the
A query is an expression that retrieves data from a data source. Traditionally, queries are expressed as simple strings (e.g., SQL for relational databases) without type checking at compile time or IntelliSense support. In addition, developers had to learn a new query language for each type of data source (e.g., SQL databases, XML documents, ADO.NET DataSets, etc.).
LINQ provides a unified query syntax to query different data sources by working with objects. For example, we could retrieve and save data in different databases (MS SQL, MySQL, Oracle, etc.) with the same code. Using the same basic coding patterns, we can query and transform data in any source for which a LINQ provider is available. In addition, we can perform many operations, such as filtering, ordering, and grouping.
In this article, we will learn about the LINQ architecture and technologies, query syntaxes, execution types, and query operations. In addition, we will see some code examples to become familiar with the LINQ concepts.
We can design classes and methods that provide functionality for a general type (T) by using generics. The concrete type that replaces the type parameter is specified when the generic class is instantiated or the generic method is called. In this way, we can use the same generic class or method for different types without the cost of boxing operations or the risk of runtime casts.
A generic type is declared by specifying a type parameter in angle brackets after the class or method name, e.g.
T is a type parameter. The MyClassName class will provide generalized solutions for any T. The most common use of generics is to create collection classes.
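To make the idea concrete, a generic class could look like the following minimal sketch. The members of MyClassName<T> (the constructor and the GetValue method) are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical generic class: T is replaced by a concrete type
// when the class is instantiated.
public class MyClassName<T>
{
    private readonly T _value;

    public MyClassName(T value) => _value = value;

    // Returns the stored value with its static type preserved,
    // so the caller needs no cast.
    public T GetValue() => _value;
}

public static class GenericsDemo
{
    public static void Main()
    {
        var number = new MyClassName<int>(42);        // T is int, so no boxing occurs
        var text = new MyClassName<string>("hello");  // T is string

        int n = number.GetValue();
        string s = text.GetValue();
        Console.WriteLine($"{n}, {s}"); // prints "42, hello"
    }
}
```

The same class definition serves both int and string, and the compiler rejects a mismatched assignment such as `string s = number.GetValue();`.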
LINQ queries are based on generic types. So, when creating an instance of a generic collection class, such as Dictionary<TKey, TValue>, etc., we should replace the T parameter with the type of our objects. For example, we could keep a list of string values (List<string>), a list of custom User objects (List<User>), a dictionary of integer keys with string values (Dictionary<int, string>), etc.
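As a short sketch, the three collections mentioned above could be created as follows. The User class and all sample values are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model class used only for this illustration.
public class User
{
    public string Name { get; set; } = "";
}

public static class CollectionsDemo
{
    // A list of string values.
    public static List<string> Names() =>
        new List<string> { "Ann", "Bob" };

    // A list of custom User objects.
    public static List<User> Users() =>
        new List<User> { new User { Name = "Ann" } };

    // A dictionary of integer keys with string values.
    public static Dictionary<int, string> Labels() =>
        new Dictionary<int, string> { [1] = "one", [2] = "two" };

    public static void Main()
    {
        Console.WriteLine(Names().Count);   // 2
        Console.WriteLine(Users()[0].Name); // Ann
        Console.WriteLine(Labels()[2]);     // two
    }
}
```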
If you have already used LINQ, you have probably seen the IEnumerable<T> interface. The IEnumerable<T> interface enables the generic collection classes to be enumerated using the foreach statement. A generic collection is a collection with a general type (T). Non-generic collection classes, such as ArrayList, implement the non-generic IEnumerable interface so that they can be enumerated as well.
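The following sketch shows one way to enumerate generic and non-generic collections. The Sum helper is a hypothetical method written for this example. Cast<T>() is the standard System.Linq operator that exposes a non-generic IEnumerable as an IEnumerable<T>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableDemo
{
    // Accepts any generic sequence of ints and enumerates it with foreach.
    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        int[] array = { 4, 5 };
        Console.WriteLine(Sum(list));  // 6
        Console.WriteLine(Sum(array)); // 9

        // ArrayList implements only the non-generic IEnumerable, so we
        // convert it with Cast<int>() before passing it to Sum.
        var legacy = new ArrayList { 1, 2, 3 };
        Console.WriteLine(Sum(legacy.Cast<int>())); // 6
    }
}
```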
As we have already seen, we can write LINQ queries against any source for which a LINQ provider is available. These sources, such as in-memory data structures, XML documents, SQL databases, and DataSet objects, implement the IEnumerable interface. In this way, we always view the data as an IEnumerable collection, whether we query it, update it, etc.
In the following figure, we can see the LINQ architecture and the available LINQ technologies. As we can see, the LINQ technologies are the following:
- LINQ to Objects: Using LINQ queries with any IEnumerable<T> collection directly, without using an intermediate LINQ provider or API such as LINQ to SQL, LINQ to XML, etc. Practically, we query any enumerable collections such as
- LINQ to XML: LINQ to XML provides an in-memory XML programming interface that leverages the LINQ Framework to make querying XML easier, in a way similar to SQL.
- ADO.NET LINQ Technologies: ADO.NET provides consistent access to data sources (such as SQL Server, data sources exposed through OLE DB and ODBC, etc.) to separate the data access from data manipulation.
- LINQ to DataSet: Used to perform queries over data cached in a DataSet object. In this scenario, the retrieved data is stored in a DataSet object.
- LINQ to SQL: Use the LINQ programming model directly over the existing database schema and auto-generate the .NET model classes representing data. LINQ to SQL is used when we do not require mapping to conceptual models (i.e., when one-to-one mapping of the data to model classes is accepted).
- LINQ to Entities: We can use LINQ to Entities to support conceptual models (i.e., models that are not the same as the logical models of the database). The conceptual data models (mapped database models) are used to model the data and interact with it as objects. In this way, we can formulate database queries in the same programming language in which we build the business logic.
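As an illustration of one of these technologies, the following sketch queries a small XML fragment with LINQ to XML (the System.Xml.Linq namespace). The XML structure, the customer element, and its name and city attributes are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlDemo
{
    // Returns the names of the customers whose "city" attribute matches.
    public static List<string> CustomersIn(string xml, string city)
    {
        XElement root = XElement.Parse(xml);
        return root.Elements("customer")
                   .Where(c => (string)c.Attribute("city") == city)
                   .Select(c => (string)c.Attribute("name"))
                   .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml =
            "<customers>" +
            "<customer name='Ann' city='Athens'/>" +
            "<customer name='Bob' city='Paris'/>" +
            "</customers>";

        foreach (string name in CustomersIn(xml, "Athens"))
        {
            Console.WriteLine(name); // Ann
        }
    }
}
```

The query uses the same Where and Select operators that we would apply to an in-memory list; only the data source differs.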
LINQ provides two ways to write queries, the Query Syntax and the Method Syntax. In the following sections, we will see the syntax of both ways.
The LINQ Query Syntax has some similarities with the SQL query syntax, as we see in the following syntax statement. The result of a query expression is a query object (not the actual results), which is usually a collection of type
In Figure 2, we can see a simple LINQ query syntax example. The from clause specifies the data source (numbers) and the num range variable (i.e., the value in each iteration). The where clause applies the filter (e.g., when the num is an even number), and the select clause specifies the type of the returned elements (e.g., all even numbers).
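A query of this kind might look like the following sketch. The contents of the numbers array and the EvenNumbers helper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Builds a query object that selects the even numbers of the source.
    public static IEnumerable<int> EvenNumbers(int[] numbers)
    {
        return from num in numbers   // data source and range variable
               where num % 2 == 0    // filter: keep even numbers only
               select num;           // shape of each returned element
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        foreach (int even in EvenNumbers(numbers))
        {
            Console.WriteLine(even); // 2, 4, 6 (one per line)
        }
    }
}
```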
In general, the query specifies what information to retrieve from the data source or sources. Optionally, a query also determines how that information should be sorted, grouped, and shaped before it is returned.
Query syntax and Method syntax are semantically identical. However, many people find query syntax simpler and easier to read since it doesn’t use lambda expressions. In Figure 3, we can see the semantically equivalent LINQ Query syntax example written in Method syntax.
The query syntax is translated into method calls (method syntax) for the .NET common language runtime (CLR) at compile time. Thus, in terms of runtime performance, both LINQ syntaxes are the same.
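The equivalence of the two syntaxes can be checked with a small sketch. The EvenQuery and EvenMethod helpers and the sample numbers are hypothetical. Where and SequenceEqual are standard System.Linq operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MethodSyntaxDemo
{
    // Query syntax version.
    public static IEnumerable<int> EvenQuery(int[] numbers) =>
        from num in numbers
        where num % 2 == 0
        select num;

    // Method syntax version: the filter is passed as a lambda expression.
    public static IEnumerable<int> EvenMethod(int[] numbers) =>
        numbers.Where(num => num % 2 == 0);

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        // Both queries produce the same sequence.
        Console.WriteLine(EvenQuery(numbers).SequenceEqual(EvenMethod(numbers))); // True
    }
}
```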
In the previous sections, we saw how to use Query and Method syntax to create our query object. It is essential to notice that the query object doesn’t contain the results (i.e., the query result data). Instead, it includes the information required to produce the results when the query is executed. As we can understand, we can execute the query multiple times.
There are two ways to execute a LINQ query object, deferred execution and forced execution:
- Deferred execution is performed when we use the query object in a foreach statement, executing it and iterating the results.
- Forced execution is performed when we execute the query to retrieve its results in a single collection object using methods such as ToArray(). Another way to force the query execution is when we perform functions that need to iterate the results, such as
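The practical difference between the two execution types can be sketched as follows. In this hypothetical example, the source list is modified after the query object is created: the deferred query sees the change, while the collection produced earlier by ToList() (a standard System.Linq method that forces execution) does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExecutionDemo
{
    // Returns how many items the deferred query and the forced snapshot contain.
    public static (int Deferred, int Forced) Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Only the query object is created here; nothing is executed yet.
        IEnumerable<int> query = numbers.Where(n => n > 1);

        // Forced execution: the results (2, 3) are copied into a list now.
        List<int> snapshot = query.ToList();

        numbers.Add(4);

        // Deferred execution: the query runs during the foreach and
        // therefore also returns the newly added 4.
        int deferredCount = 0;
        foreach (int n in query)
        {
            deferredCount++;
        }

        return (deferredCount, snapshot.Count);
    }

    public static void Main()
    {
        var (deferred, forced) = Run();
        Console.WriteLine($"{deferred} vs {forced}"); // prints "3 vs 2"
    }
}
```

Forcing execution is useful when we want to cache the results and avoid running the same query again on every iteration.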
Let’s assume we have a customers array of Customer objects from a related service. We have created the following query object to retrieve the customers who live in Athens.
In the following example, we can see how to execute the query object using the two execution methods (Deferred and Forced).
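A minimal sketch of this scenario is shown below. The Customer class with its Name and City properties, the LivingIn helper, and the sample data are all assumptions, since the related service is not described here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model class; the real Customer type may differ.
public class Customer
{
    public string Name { get; set; } = "";
    public string City { get; set; } = "";
}

public static class CustomerDemo
{
    // Creates the query object; no customer is inspected yet.
    public static IEnumerable<Customer> LivingIn(Customer[] customers, string city) =>
        from customer in customers
        where customer.City == city
        select customer;

    public static void Main()
    {
        // Hypothetical sample data standing in for the service result.
        Customer[] customers =
        {
            new Customer { Name = "Maria", City = "Athens" },
            new Customer { Name = "John", City = "London" },
            new Customer { Name = "Nikos", City = "Athens" },
        };

        IEnumerable<Customer> query = LivingIn(customers, "Athens");

        // Deferred execution: the query runs while foreach iterates it.
        foreach (Customer customer in query)
        {
            Console.WriteLine(customer.Name); // Maria, Nikos (one per line)
        }

        // Forced execution: the results are materialized into an array.
        Customer[] athenians = query.ToArray();
        Console.WriteLine(athenians.Length); // 2
    }
}
```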
In the following table, we can see the majority of the LINQ Query Operations grouped in categories. For information regarding each query operator’s result type and execution type (Deferred or Forced), refer to the official .NET documentation on the standard query operators.
| LINQ Operator Category | LINQ Query Operators |
|---|---|
| Filtering Data | Where, OfType |
| Sorting Data | OrderBy, OrderByDescending, ThenBy, ThenByDescending, Reverse |
| Projection Operations | Select, SelectMany |
| Quantifier Operations | All, Any, Contains |
| Element Operations | ElementAt, ElementAtOrDefault, First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault |
| Partitioning Data | Skip, SkipWhile, Take, TakeWhile |
| Join Operations | Join, GroupJoin |
| Grouping Data | GroupBy, ToLookup |
| Aggregation Operations | Aggregate, Average, Count, LongCount, Max or MaxBy, Min or MinBy, Sum |
| Generation Operations | DefaultIfEmpty, Empty, Range, Repeat |
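To give a feel for how operators from different categories combine, the following sketch applies a few of them to a hypothetical array of integers. The helper method names and the sample values are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorsDemo
{
    // Filtering, sorting, and partitioning: the two largest even numbers.
    public static List<int> TopTwoEven(int[] numbers) =>
        numbers.Where(n => n % 2 == 0)
               .OrderByDescending(n => n)
               .Take(2)
               .ToList();

    // Grouping and aggregation: the sum of the numbers per parity.
    public static Dictionary<string, int> SumByParity(int[] numbers) =>
        numbers.GroupBy(n => n % 2 == 0 ? "even" : "odd")
               .ToDictionary(group => group.Key, group => group.Sum());

    public static void Main()
    {
        int[] numbers = { 5, 3, 8, 1, 4, 6 };

        Console.WriteLine(string.Join(", ", TopTwoEven(numbers))); // 8, 6
        Console.WriteLine(numbers.Any(n => n > 7));                // True (quantifier)
        Console.WriteLine(numbers.First());                        // 5 (element)

        Dictionary<string, int> sums = SumByParity(numbers);
        Console.WriteLine(sums["even"]); // 18
        Console.WriteLine(sums["odd"]);  // 9
    }
}
```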
The Language Integrated Query (LINQ) provides unified query syntax to query different data sources (e.g., SQL, XML, ADO.NET Datasets, Objects, etc.). In addition, it supports various query operations, such as filtering, ordering, grouping, etc.
LINQ queries are based on generic types, so in generic collections such as List<T>, we should replace the T parameter with the type of our objects. The LINQ sources implement the IEnumerable interface to be enumerated. The available LINQ technologies include LINQ to Objects, XML, DataSet, SQL, and Entities.
The main advantages of LINQ are the following:
- Provides a unified query syntax for different data sources.
- Type checking at compile-time and IntelliSense support.
- Queries can be reused easily.
- Easier debugging through the .NET debugger.
- Supports various query operations, such as filtering, ordering, grouping, etc.
The main disadvantages of LINQ are the following:
- The project must be recompiled and redeployed for every change in the queries.
- Complex SQL queries can be difficult to express in LINQ.
- We cannot take advantage of the execution caching provided in SQL stored procedures.
LINQ provides powerful query capabilities that any .NET developer should know.