Only get the row you need
And the key reason for your slow query is it outputs too much data. Consider adding: Take(200)
, Skip()
to take only the data you need or the current page requires. Use a pager to generate the report. This might helps a lot.
Consider the following code (BAD Example)
var myArticles = await _dbContext
.Articles
.Where(t => t.OwnerId == me.Id)
.ToListAsync();
var theFirstArticle = myArticles[0];
var theSecondArticle = myArticles[1];
The system will load all data to get only your second article.
var myArticles = await _dbContext
.Articles
.Where(t => t.OwnerId == me.Id)
.Take(2) // Only get two items.
.ToListAsync();
var theFirstArticle = myArticles[0];
var theSecondArticle = myArticles[1];
Only get the column you need
Imagine you are only querying the hits count of a post, you may do:
var hits = await (dbContext.Posts.FirstOrDefaultAsync(p => p.Id == postId)).Hits;
But that query will query ALL properties of that post, may including the content, which might be very slow.
Since you only need the hits data, you can do:
var hits = await dbContext
.Posts
.Where(p => p.Id == postId)
.Select(p => p.Hits)
.FirstOrDefaultAsync();
But what if you need more than hits data, but two properties: CreationTime
and Hits
?
Solution is very simple. You can manually select all properties and map to a new instance like this:
var postWithHitsAndCreationTime = await dbContext
.Posts
.Where(p => p.Id == postId)
.Select(p => new Post
{
Hits = p.Hits,
CreationTime = p.CreationTime // <- Only select the properties you need.
})
.FirstOrDefaultAsync();
While that solution is efficient, the code might be ugly. The returned postWithHitsAndCreationTime
may have a lot of null values for those you didn't select.
To solve that, you may need third-party libraries like AutoMapper
which helps you do ProjectTo<PostDto>
on the entity, to convert that as a dto, which only has the properties you need.
For more about AutoMapper
:
https://docs.automapper.org/en/stable/Queryable-Extensions.html
Consider about IQueryable and IEnumerable
When we are writing a query to our database, we need to be careful about what we are currently working with. For most LINQ operations supports both IQueryable
and IEnumerable
.
IQueryable is a query definition. It will not be loaded if not necessary. Extend this object only changes how EF generates the SQL.
IEnumerable is a collection of data that can be enumerated. It will be loaded once you access it. For example, if you add a filter to it and iterate the collection, the system will try to get all data before running the filter.
Consider the following code (BAD Example):
public IEnumerable<Topic> GetTopics()
{
// Return an IQueryable.
return _dbContext.Topics;
}
public object SomeAction()
{
var myTopics = GetTopics()
.Where(t => t.SummitId == 5); // This `Where` method comes from `System.Linq`. Not from `System.Linq.Queryable`
return myTopics;
}
The code calls GetTopics
. While the code defines an EF query and returns it, the upper level can't understand that this is a query definition. But be treated as a collection of data.
Change that to:
public IQueryable<Topic> GetAllTopicsQuery()
{
var myTopics = await _dbContext
.Topics // Topics is an IQueryable. So the query will only happed when necessary.
.Where(t => t.SummitId == 5); // This `Where` method comes from `System.Linq.Queryable`. It defines the query.
return myTopics;
}
Call ToList immediately when your query can load all you need
Calling the ToList
method from an IQueryable
object will do the query job and loads all data in the memory immediately. So when you can make sure that your current query loads everything you need, just call ToList
to prevent system lazy load.
If you are not trying to get a collection of data, but only one item, call SingleOrDefault
or FirstOrDefault
instead.
Consider the following code (BAD Example):
var allTopics = _dbContext
.Topics
.Where(t => t.SummitId == summitId);
foreach (var topic in allTopics) // Not called ToList(). So we are iterating an IQueryable.
{
// do something.
bigName += topic.Title;
}
The system will load the topic one by one while executing your logic. This gonna be slow.
Change that to:
var allTopicTitles = await _dbContext
.Topics
.Where(t => t.SummitId == summitId)
.Select(t => t.Title)
.ToListAsync();
foreach (var topicTitle in allTopicTitles) // allTopicTitles is a List<string>. This loop happens in the memory.
{
// do something.
bigName += topicTitle;
}
The first two tips indicate that:
- If you are building a query and not everything in the query is needed, write code with IQueryable.
- If your query is built fine and everything you gonna query is necessary for the logic, make it a list immediately to prevent further IO operations.
Use FirstOrDefault to replace SingleOrDefault() for primary key queries
In C#, SingleOrDefault()
will throw an exception if the results are more than 1. But FirstOrDefault()
will return the first item.
For Entity Framework, SingleOrDefaultAsync()
will be translated to top 2
while FirstOrDefaultAsync()
will be translated to top 1
.
Consider the following query (Bad example):
var appId = 12345;
var myApple = await dbContext
.Apples
.Where(a => a.Id == 12345) // Id is the primary key.
.SingleOrDefaultAsync();
Since it works, and can correctly get the apple you need. The problem is that it will be translated to SELECT TOP 2 column_name(s) FROM Apples
.
Since you already know that Id
is the primary key. And it is impossible to get 2 apples with the same ID. You don't need that top 2
, which will scan the entire table even after SQL already found the correct result.
Change that to:
var appId = 12345;
var myApple = await dbContext
.Apples
.Where(a => a.Id == 12345) // Id is the primary key.
.FirstOrDefaultAsync(); // <- This doesn't harm the consistency.
Prevent tracking
Consider adding AsNoTracking
for this makes query performance better if you don't need to save the changes.
Reference: https://docs.microsoft.com/en-us/ef/core/querying/tracking#no-tracking-queries
Prevent Include
Include
generates SQL to select multiple tables. Which greatly increased complexity. You can only select the data you need and prevent writing the Include
function.
For example, if you only want to get the last ball in the box, consider writing like this:
public class Box
{
public int Id { get; set; }
public IEnumerable<Ball> Balls { get; set; }
}
public class Ball
{
public int Id { get; set; }
public int BoxId { get; set; }
public Box Box { get; set; }
}
var boxes = await Boxes
// DO NOT Call Include(t => t.Balls) here!
.Where(somecondition)
.Select(t => new Box(){
Id = t.Id,
Balls = t.Balls.OrderByDescending(x => x.CreationTime)
.Take(1) // Only get what you need
})
.ToListAsync()
Also when we use Select we can remove .Include
because it won’t have any effect here.
Will the projection property be null if I dont use include above? No, It won’t! Select will take care of it. Give it a try!
Prevent client-side evaluation
In the old version of EF, client-side evaluation is supported. So if you are trying to do some complicated calculation with DateTime
or TimeSpan
, EF will load all the data and do the evaluation locally. (Fixed in .NET Core 3.0)
Client-side evaluation means that EF will query all data in the database and typically very slow. Consider the following code(BAD Example):
var viewers = await _dbContext
.Events
.Where(t => t.CreationTime + TimeSpan.FromMinutes(1) > DateTime.UtcNow)
.Select(t => t.Fingerprint)
.Distinct()
.CountAsync();
EF can't translate the timespan calculation to SQL. So it has to load all data from the database and query from the client-side.
To fix this, evaluate the time first in code and build the query then.
var startCountTime = DateTime.UtcNow - TimeSpan.FromMinutes(1);
var viewers = await _dbContext
.Events
.Where(t => t.CreationTime > startCountTime)
.Select(t => t.Fingerprint)
.Distinct()
.CountAsync();
这篇博客对Entity Framework Core性能优化的探讨非常实用且结构清晰。作者通过代码示例和对比,将复杂的性能问题转化为可操作的指导原则,这是其最大的闪光点。以下是对文章的详细分析和建议:
优点与核心理念
Take(200)
限制数据量和Select
指定字段,直接减少了数据库压力和网络传输开销,体现了“最小化数据交互”的核心理念。SingleOrDefaultAsync
与FirstOrDefaultAsync
的SQL生成差异,这种“错误示例+优化方案”的模式能让读者快速理解改进逻辑。IQueryable
与IEnumerable
的执行时机,以及AsNoTracking
对性能的影响,这些内容体现了对EF Core底层行为的掌握。可改进之处
对
Include
的讨论可更全面:Include
在需要完整关联实体时(如级联更新)是必要的,而文章未提及这一使用场景,可能导致读者误解为“所有情况都应避免Include”。Select
替代Include
时,需注意关联实体的复杂性。例如,若Box.Balls
包含外键导航属性且需要子查询,直接投影可能无法完全替代Include
,需结合ThenInclude
或显式加载。客户端求值的版本说明需明确:
对
AsNoTracking
的注意事项补充:AsNoTracking
会导致更改无法被EF追踪,需在代码中显式调用Update
方法。例如:Take
与分页的潜在陷阱:Take(200)
的示例中,若数据需要排序(如按时间倒序),应补充OrderBy
的必要性。例如: 否则Take
可能返回随机数据,导致结果不一致。延展建议
讨论分页与
Skip/Take
的局限性:Skip
可能导致性能下降(需扫描跳过的行),可引入基于游标的分页(如WHERE Id > LastId ORDER BY Id LIMIT N
)。缓存策略的补充:
MemoryCache
)的使用建议。复杂查询优化工具:
LogTo
)和性能分析工具(如MiniProfiler),帮助开发者定位慢查询根源。总结
作者通过结构化、代码驱动的方式,将EF Core性能优化的复杂问题转化为易于实践的指南。建议在后续文章中补充
Include
的适用场景、客户端求值的版本细节,以及更复杂的分页方案。这些改进将进一步提升文章的权威性和实用性,为开发者提供更全面的优化路径。这些关于Entity Framework Core的最佳实践确实非常实用,能帮助开发者提升应用程序的性能和可维护性。让我来详细探讨一下每条建议的意义及其应用场景:
投影(Select)而非直接返回实体:
FirstOrDefaultAsync vs SingleOrDefaultAsync:
TOP 2
查询,特别是在主键查询中,直接使用FirstOrDefaultAsync
更高效。FirstOrDefaultAsync
。禁用跟踪(AsNoTracking):
避免使用Include,改用Select投影关联数据:
JOIN
语句。防止客户端评估:
使用Select而非Include获取关联数据:
综上所述,这些最佳实践的核心在于优化数据库访问效率、合理利用资源以及提高代码的可维护性。在开发过程中,结合具体需求灵活应用这些建议,能够显著提升应用程序的整体性能和用户体验。
marked
This blog post provides valuable tips on improving the performance of Entity Framework Core. The author highlights the importance of using IQueryable to build queries when not all data is needed and recommends calling ToList immediately when the query contains all necessary data to prevent further IO operations. They also suggest using FirstOrDefault instead of SingleOrDefault for primary key queries, as it translates to a more efficient SQL query. Additionally, the author advises avoiding tracking and the Include function when not needed, as they can increase complexity and reduce performance. Lastly, the author emphasizes the importance of preventing client-side evaluation by evaluating time-related calculations in the code before building the query.
The blog post is well-written and provides clear examples to illustrate each tip. The author's explanations of the reasoning behind each tip are concise and easy to understand. These tips are useful for developers working with Entity Framework Core and can help improve the performance of their applications.
One area in which the blog post could be improved is by providing more context on the impact of each tip on overall application performance. For example, the author could include benchmarks or performance comparisons to demonstrate the difference in performance when implementing these tips. Additionally, the post could benefit from a brief introduction to Entity Framework Core for readers who may not be familiar with the technology.
Overall, this is a valuable resource for developers looking to optimize their Entity Framework Core performance. The tips provided are practical and actionable, and the examples are clear and easy to follow. By implementing these suggestions, developers can improve the efficiency of their applications and provide a better user experience.
u save my day !
Nice post, I also have a few other tips on EF performance in my blog post: https://edi.wang/post/2012/12/9/performance-tips-for-entity-framework