
When I created my More Effective LINQ Pluralsight course (hopefully to be updated in the not too distant future!), I wanted to include a section discussing the performance implications of using LINQ. Sometimes developers hear that "LINQ is slower than using a for loop" and wonder whether that means they should avoid using LINQ for performance reasons.

Note: In this post I'm specifically talking about LINQ to objects. With LINQ queries against databases (such as when you're using Entity Framework) performance considerations are mostly related to whether the generated SQL is efficient or not.

It is slightly slower

Let's start off by acknowledging that using the LINQ operators (such as Select and Where) does typically result in very slightly slower code than if you wrote a for or foreach loop to do the same thing.

This is acknowledged in the Microsoft documentation:

LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code.

And if you'd like to measure the performance difference, you can use a tool like BenchmarkDotNet to do so. I created a very simple LINQ benchmark project a few years back and recently updated it to .NET 6.

The test basically uses a LINQ pipeline to perform a series of mathematical operations on 10 million numbers.

const int N = 10000000;
return Enumerable.Range(1, N)
    .Select(n => n * 2)
    .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
    .Select(n => Math.Pow(n, 2))
    .Sum();

But how does that compare with a regular for loop performing the same calculation?

double sum = 0;
for (int n = 1; n <= N; n++)
{
    var a = n * 2;
    var b = Math.Sin((2 * Math.PI * a) / 1000);
    var c = Math.Pow(b, 2);
    sum += c;
}
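As a quick sanity check (not part of the benchmark itself), we can verify that both versions compute the same total. Here's a minimal sketch using a much smaller N than the benchmark's 10 million, with both implementations as local functions:

```csharp
using System;
using System.Linq;

// Smaller N than the benchmark's 10 million, just for a quick correctness check
const int N = 10_000;

double SumLinq() =>
    Enumerable.Range(1, N)
        .Select(n => n * 2)
        .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
        .Select(n => Math.Pow(n, 2))
        .Sum();

double SumForLoop()
{
    double sum = 0;
    for (int n = 1; n <= N; n++)
    {
        var a = n * 2;
        var b = Math.Sin((2 * Math.PI * a) / 1000);
        var c = Math.Pow(b, 2);
        sum += c;
    }
    return sum;
}

// Both perform the same operations in the same order, so the totals should agree
Console.WriteLine(Math.Abs(SumLinq() - SumForLoop()) < 1e-6); // True
```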

With BenchmarkDotNet, running on .NET 6, we can see that the for loop was indeed significantly faster (around 20%) as we might expect:

Method      |      Mean |    Error |    StdDev
SumLinq     | 415.14 ms | 8.076 ms |  8.977 ms
SumForLoop  | 335.70 ms | 6.553 ms | 10.203 ms

It probably doesn't matter

Now before you go rushing off and refactoring all your LINQ code into for loops, it's worth remembering that in the vast majority of situations, LINQ is not a major contributing factor to the overall speed of your application.

The example I used was special in a couple of ways - first, we were iterating through millions of times (not a typical scenario for LINQ to objects), and second, the inside of the loop was extremely trivial (which is not always the case when you're using LINQ).

One of the most important rules of software optimization is to measure first. If your application is slow, take some time to instrument it and find out where the time is being taken. This will lead you to the places where it's worth investing effort in performance optimization. In most enterprise applications it is extremely unlikely that switching from LINQ to for loops will make any noticeable difference to the overall performance. It would just make the code harder to read and maintain for negligible benefit.

By the way, don't take my word for it - I found this Stack Overflow answer from Eric Lippert that states it well. He says "ensure that you're only spending valuable time and effort doing performance optimizations on things that are not fast enough."

There are often better ways to make your code faster

We saw from the benchmarks above that if we converted my simple LINQ example into a for loop it would have run faster. But that's not the only way we could speed up that code.

For example, if I run that same comparison on .NET 4.8 I get the following results:

Method      |     Mean |    Error |   StdDev
SumLinq     | 756.4 ms | 14.83 ms | 19.29 ms
SumForLoop  | 645.8 ms | 12.43 ms | 12.21 ms

As you can see, .NET 4.8 is significantly slower than .NET 6. That means that if my project is running on .NET 4.8, then upgrading to .NET 6 will actually give me a substantially bigger performance boost than converting my LINQ expressions to for loops. And it will make many other parts of my application faster as well.

Of course, the for loop on .NET 6 is still the fastest of the options we've seen so far. But that's not the only way we could go about making this code run faster. As it happens, the example I used can be parallelized using PLINQ. The change is trivial - just add an AsParallel() call and PLINQ will spread the work across multiple threads (and CPU cores) to calculate the values, then join the results back together with the Sum at the end.

return Enumerable.Range(1, N).AsParallel()
    .Select(n => n * 2)
    .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
    .Select(n => Math.Pow(n, 2))
    .Sum();

When we run this, we can see that the PLINQ solution greatly outperforms even the for loop conversion, whilst retaining the readability and composability benefits that typically come with LINQ pipelines.

Method       |      Mean |    Error |    StdDev
SumLinq      | 415.14 ms | 8.076 ms |  8.977 ms
SumForLoop   | 335.70 ms | 6.553 ms | 10.203 ms
SumParallel  |  93.55 ms | 1.824 ms |  2.371 ms

Of course, not every LINQ query is suitable for parallelizing with PLINQ, but it serves as an example of the benefits of considering alternative ways to optimize performance.
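For instance, a Sum is order-independent, but where the order of the results matters, PLINQ needs an explicit AsOrdered() call to preserve source ordering, which gives back some of the speedup. A small illustrative sketch:

```csharp
using System;
using System.Linq;

// A Sum doesn't care about ordering, but a projection whose output
// order matters needs AsOrdered() to preserve the source sequence
var doubled = Enumerable.Range(1, 10)
    .AsParallel()
    .AsOrdered()
    .Select(n => n * 2)
    .ToArray();

Console.WriteLine(string.Join(", ", doubled)); // 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
```

Without the AsOrdered(), the elements could come back in any order.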

Let me give one other very simple example of how a LINQ query can be performance optimized without the need to eliminate LINQ. In this example I'm generating a list of 1 million random books:

record Book(string Title, int Pages);
var random = new Random();
var books = Enumerable.Range(1, 1000000)
    .Select(n => new Book($"Book {n}", random.Next(50, 2000)))
    .ToList();

Suppose I want to find the longest book and I decide to be clever with LINQ and do something like this:

var longestBook = books.First(b => b.Pages == books.Max(x => x.Pages));

This works just fine and will find the longest book, but it's an O(n^2) algorithm: for every book in the list, the inner Max call scans the entire list again, recalculating the maximum number of pages.
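One cheap fix is simply to hoist the Max out of the predicate, so it's computed once rather than once per element - two O(n) passes instead of n of them. A small self-contained sketch (using tuples rather than the Book record to keep it standalone):

```csharp
using System;
using System.Linq;

var books = new[]
{
    (Title: "Book 1", Pages: 120),
    (Title: "Book 2", Pages: 1999),
    (Title: "Book 3", Pages: 640),
};

var maxPages = books.Max(x => x.Pages);                  // first O(n) pass
var longestBook = books.First(b => b.Pages == maxPages); // second O(n) pass

Console.WriteLine(longestBook.Title); // Book 2
```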

Of course, there are much more efficient ways to do this in LINQ. In this example, the MaxBy operator is perfect for our needs:

var longestBook = books.MaxBy(b => b.Pages);
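Note that MaxBy was only introduced in .NET 6. If you're targeting an earlier version, the same single O(n) pass can be sketched with Aggregate instead (again using tuples to keep the example standalone):

```csharp
using System;
using System.Linq;

var books = new[]
{
    (Title: "Book 1", Pages: 120),
    (Title: "Book 2", Pages: 1999),
    (Title: "Book 3", Pages: 640),
};

// Fold through the list once, keeping whichever book has more pages
var longestBook = books.Aggregate((best, b) => b.Pages > best.Pages ? b : best);

Console.WriteLine(longestBook.Title); // Book 2
```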

If we compare the speeds of the two approaches with a simple Stopwatch:

var sw = Stopwatch.StartNew();
var longestBook = books.First(b => b.Pages == books.Max(x => x.Pages));
Console.WriteLine($"{longestBook} is the longest book {sw.ElapsedMilliseconds}ms");
sw.Restart();
longestBook = books.MaxBy(b => b.Pages);
Console.WriteLine($"{longestBook} is the longest book {sw.ElapsedMilliseconds}ms");

We see that we get a roughly 1000 times speed increase from using a more appropriate algorithm that is only O(n).

Book { Title = Book 1023, Pages = 1999 } is the longest book 11845ms
Book { Title = Book 1023, Pages = 1999 } is the longest book 12ms

In short, if you are in a situation where you think that LINQ might be what's making your application slow, there are still likely many ways to greatly improve performance that don't involve eliminating LINQ.

When should you avoid LINQ?

You might think from this article that I'm advocating that you should always use LINQ, but that's not true. There are situations where performance is so critical that avoiding LINQ makes sense.

For example, in my own NAudio library, I use LINQ very sparingly. For a CPU intensive algorithm like a Fast Fourier Transform, it wouldn't make sense to use LINQ even if it did make the code read a bit more elegantly.

And the .NET runtime libraries likewise have been heavily performance optimized, because they are relied on by all kinds of performance critical applications. I'm sure there are many places where the nicer syntax of LINQ has been avoided to gain a small performance boost.

Another example I've heard of where LINQ was deliberately avoided was in the "game loop" for a computer game, where getting the fastest possible render speed was considered critical.

Summary

In conclusion, although it's true that there are some situations where it does make sense to avoid LINQ for performance reasons, they're not nearly as frequent as you might think, and there are often much better approaches to speeding up your code.

Always start by measuring which parts of your codebase are taking the longest to run; always consider multiple alternative strategies for optimizing the performance (or perceived performance) of a slow piece of code; and always weigh the business cost-benefit tradeoff: the developer time required to rewrite, and the possible loss of readability and maintainability that comes with moving away from LINQ.

Want to learn more about LINQ? Be sure to check out my Pluralsight course More Effective LINQ.