
When I created my More Effective LINQ Pluralsight course (hopefully to be updated in the not too distant future!), I wanted to include a section discussing the performance implications of using LINQ. Sometimes developers hear that "LINQ is slower than using a for loop" and wonder whether that means they should avoid using LINQ for performance reasons.

Note: In this post I'm specifically talking about LINQ to objects. With LINQ queries against databases (such as when you're using Entity Framework) performance considerations are mostly related to whether the generated SQL is efficient or not.

It is slightly slower

Let's start off by acknowledging that using the LINQ operators (such as Select and Where) does typically result in very slightly slower code than if you wrote a for or foreach loop to do the same thing.

This is acknowledged in the Microsoft documentation:

LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code.

And if you'd like to measure the performance difference, you can use a tool like BenchmarkDotNet to do so. I created a very simple LINQ benchmark project a few years back and recently updated it to .NET 6.

The test basically uses a LINQ pipeline to perform a series of mathematical operations on 10 million numbers.

const int N = 10000000;
return Enumerable.Range(1, N)
    .Select(n => n * 2)
    .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
    .Select(n => Math.Pow(n, 2))
    .Sum();

But how does that compare with a regular for loop performing the same calculation?

double sum = 0;
for (int n = 1; n <= N; n++)
{
    var a = n * 2;
    var b = Math.Sin((2 * Math.PI * a) / 1000);
    var c = Math.Pow(b, 2);
    sum += c;
}
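As a quick sanity check (not part of the benchmark itself), we can verify that both versions compute the same total. Here's a minimal sketch using a much smaller N than the benchmark's 10 million, with both implementations as local functions:

```csharp
using System;
using System.Linq;

// Smaller N than the benchmark's 10 million, just for a quick correctness check
const int N = 10_000;

double SumLinq() =>
    Enumerable.Range(1, N)
        .Select(n => n * 2)
        .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
        .Select(n => Math.Pow(n, 2))
        .Sum();

double SumForLoop()
{
    double sum = 0;
    for (int n = 1; n <= N; n++)
    {
        var a = n * 2;
        var b = Math.Sin((2 * Math.PI * a) / 1000);
        var c = Math.Pow(b, 2);
        sum += c;
    }
    return sum;
}

// Both perform the same operations in the same order, so the totals should agree
Console.WriteLine(Math.Abs(SumLinq() - SumForLoop()) < 1e-6); // True
```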

With BenchmarkDotNet, running on .NET 6, we can see that the for loop was indeed significantly faster (around 20%) as we might expect:

Method      |      Mean |    Error |    StdDev
SumLinq     | 415.14 ms | 8.076 ms |  8.977 ms
SumForLoop  | 335.70 ms | 6.553 ms | 10.203 ms

It probably doesn't matter

Now before you go rushing off and refactoring all your LINQ code into for loops, it's worth remembering that in the vast majority of situations, LINQ is not a major contributing factor to the overall speed of your application.

The example I used was special in a couple of ways - first, we were iterating through millions of times (not a typical scenario for LINQ to objects), and second, the inside of the loop was extremely trivial (which is not always the case when you're using LINQ).

One of the most important rules of software optimization is to measure first. If your application is slow, take some time to instrument it and find out where the time is being taken. This will lead you to the places where it's worth investing effort in performance optimization. In most enterprise applications it is extremely unlikely that switching from LINQ to for loops will make any noticeable difference to the overall performance. It would just make the code harder to read and maintain for negligible benefit.

By the way, don't take my word for it - I found this Stack Overflow answer from Eric Lippert that states it well. He says "ensure that you're only spending valuable time and effort doing performance optimizations on things that are not fast enough."

There are often better ways to make your code faster

We saw from the benchmarks above that if we converted my simple LINQ example into a for loop it would have run faster. But that's not the only way we could speed up that code.

For example, if I run that same comparison on .NET 4.8 I get the following results:

Method      |     Mean |    Error |   StdDev
SumLinq     | 756.4 ms | 14.83 ms | 19.29 ms
SumForLoop  | 645.8 ms | 12.43 ms | 12.21 ms

As you can see, .NET 4.8 is significantly slower than .NET 6. That means that if my project is running on .NET 4.8, then upgrading to .NET 6 will actually give me a substantially bigger performance boost than converting my LINQ expressions to for loops. And it will make many other parts of my application faster as well.

Of course, the for loop on .NET 6 is still the fastest of the options we've seen so far. But that's not the only way we could go about making this code run faster. As it happens, the example I used can be parallelized using PLINQ. The change is trivial - just add an AsParallel() call and PLINQ will spread the work across multiple threads (and CPU cores) to calculate the values, then join the results back together with the Sum at the end.

return Enumerable.Range(1, N).AsParallel()
    .Select(n => n * 2)
    .Select(n => Math.Sin((2 * Math.PI * n) / 1000))
    .Select(n => Math.Pow(n, 2))
    .Sum();

When we run this, we can see that the PLINQ solution greatly outperforms even the for loop conversion, whilst retaining the readability and composability benefits that typically come with LINQ pipelines.

Method       |      Mean |    Error |    StdDev
SumLinq      | 415.14 ms | 8.076 ms |  8.977 ms
SumForLoop   | 335.70 ms | 6.553 ms | 10.203 ms
SumParallel  |  93.55 ms | 1.824 ms |  2.371 ms

Of course, not every LINQ query is suitable for parallelizing with PLINQ, but it serves as an example of the benefits of considering alternative ways to optimize performance.
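For instance, a Sum is order-independent, but where the order of the results matters, PLINQ needs an explicit AsOrdered() call to preserve source ordering, which gives back some of the speedup. A small illustrative sketch:

```csharp
using System;
using System.Linq;

// A Sum doesn't care about ordering, but a projection whose output
// order matters needs AsOrdered() to preserve the source sequence
var doubled = Enumerable.Range(1, 10)
    .AsParallel()
    .AsOrdered()
    .Select(n => n * 2)
    .ToArray();

Console.WriteLine(string.Join(", ", doubled)); // 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
```

Without the AsOrdered(), the elements could come back in any order.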

Let me give one other very simple example of how a LINQ query can be performance optimized without the need to eliminate LINQ. In this example I'm generating a list of 1 million random books:

record Book(string Title, int Pages);
var random = new Random();
var books = Enumerable.Range(1, 1000000)
    .Select(n => new Book($"Book {n}", random.Next(50, 2000)))
    .ToList();

Suppose I want to find the longest book and I decide to be clever with LINQ and do something like this:

var longestBook = books.First(b => b.Pages == books.Max(x => x.Pages));

This works just fine and will find the longest book, but it's an O(n^2) algorithm: for every book in the list, the inner Max call scans the entire list again, recalculating the maximum number of pages.
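One cheap fix is simply to hoist the Max out of the predicate, so it's computed once rather than once per element - two O(n) passes instead of n of them. A small self-contained sketch (using tuples rather than the Book record to keep it standalone):

```csharp
using System;
using System.Linq;

var books = new[]
{
    (Title: "Book 1", Pages: 120),
    (Title: "Book 2", Pages: 1999),
    (Title: "Book 3", Pages: 640),
};

var maxPages = books.Max(x => x.Pages);                  // first O(n) pass
var longestBook = books.First(b => b.Pages == maxPages); // second O(n) pass

Console.WriteLine(longestBook.Title); // Book 2
```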

Of course, there are much more efficient ways to do this in LINQ. In this example, the MaxBy operator is perfect for our needs:

var longestBook = books.MaxBy(b => b.Pages);
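Note that MaxBy was only introduced in .NET 6. If you're targeting an earlier version, the same single O(n) pass can be sketched with Aggregate instead (again using tuples to keep the example standalone):

```csharp
using System;
using System.Linq;

var books = new[]
{
    (Title: "Book 1", Pages: 120),
    (Title: "Book 2", Pages: 1999),
    (Title: "Book 3", Pages: 640),
};

// Fold through the list once, keeping whichever book has more pages
var longestBook = books.Aggregate((best, b) => b.Pages > best.Pages ? b : best);

Console.WriteLine(longestBook.Title); // Book 2
```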

If we compare the speeds of the two approaches with a simple Stopwatch:

var sw = Stopwatch.StartNew();
var longestBook = books.First(b => b.Pages == books.Max(x => x.Pages));
Console.WriteLine($"{longestBook} is the longest book {sw.ElapsedMilliseconds}ms");
sw.Restart();
longestBook = books.MaxBy(b => b.Pages);
Console.WriteLine($"{longestBook} is the longest book {sw.ElapsedMilliseconds}ms");

We see that we get a roughly 1000 times speed increase from using a more appropriate algorithm that is only O(n).

Book { Title = Book 1023, Pages = 1999 } is the longest book 11845ms
Book { Title = Book 1023, Pages = 1999 } is the longest book 12ms

In short, if you are in a situation where you think that LINQ might be what's making your application slow, there are still likely many ways to greatly improve performance that don't involve eliminating LINQ.

When should you avoid LINQ?

You might think from this article that I'm advocating that you should always use LINQ, but that's not true. There are situations where performance is so critical that avoiding LINQ makes sense.

For example, in my own NAudio library, I use LINQ very sparingly. For a CPU intensive algorithm like a Fast Fourier Transform, it wouldn't make sense to use LINQ even if it did make the code read a bit more elegantly.

And the .NET runtime libraries likewise have been heavily performance optimized, because they are relied on by all kinds of performance critical applications. I'm sure there are many places where the nicer syntax of LINQ has been avoided to gain a small performance boost.

Another example I've heard of where LINQ was deliberately avoided was in the "game loop" for a computer game, where getting the fastest possible render speed was considered critical.

Summary

In conclusion, although it's true that there are some situations where it does make sense to avoid LINQ for performance reasons, they're not nearly as frequent as you might think, and there are often much better approaches to speeding up your code.

Always start by measuring which parts of your codebase are taking the longest to run; always consider multiple alternative strategies for optimizing the performance (or perceived performance) of a slow piece of code; and always weigh the business cost-benefit tradeoff: the developer time required to rewrite, and the possible loss of readability and maintainability that comes with moving away from LINQ.

Want to learn more about LINQ? Be sure to check out my Pluralsight course More Effective LINQ.