Skip to content

Query: Avoid adding unnecessary sort columns in query with projection and optional navigation properties #6861

Closed
@jemiller0

Description

Steps to reproduce

Run a query that does a projection such as the following.

                var q = from b in lc.Bibs
                        select new
                        {
                            b.Id,
                            b.OclcNumber,
                            b.Title,
                            b.Author,
                            b.PublicationDate,
                            Language = b.Language.Name,
                            b.LanguageId,
                            Format = b.Format.Name,
                            b.FormatId,
                            BibStatus = b.BibStatus.Name,
                            b.BibStatusId,
                            b.BibStatusLastWriteTime,
                            b.BibStatusLastWriteUserName,
                            b.IsPublic,
                            b.Level,
                            b.CreationTime,
                            b.CreationUserName,
                            b.LastWriteTime,
                            b.LastWriteUserName
                        };
                q = q.OrderByDescending(b => b.LastWriteTime);
                q = q.Skip(200).Take(100);
                var l = q.ToList();

The following zip file contains the definitions for the POCO classes that are used by the above query.

QueryOrderByBug.zip

The issue

EF Core 1.1.0-preview1-final generates the following SQL. There are a few problems with the generated SQL. The main one is that you can see that it is sorting by FormatId and LanguageId columns in addition to the LastWriteTime column which is the only one that was specified in the LINQ query. The Bibs table used below is a table containing 7.7 million rows. The query takes 12 seconds to execute. If I run the SQL using SQL Management Studio, but, comment out the extra order by columns, the query executes in 0 seconds. It looks like what it's doing is adding sort columns for the foreign key columns that are not required. The project above is also bringing in the BibStatus navigation property. That one is required and there isn't a sort for it.

A second problem is that the query is doing a projection, but, it looks like EF Core is bringing in all columns from the tables that are used. In this particular query, there is a column of type VARCHAR(MAX) that contains a lot of data. So, I want to avoid bringing that in.

Another thing that I noticed is that it appears that it's using an INNER JOIN on BibStatus, which makes sense because it's a required field. It's doing LEFT JOINs on the the other fields which are not required, which also makes sense. However, EF 6 used a LEFT JOIN for all of the JOINs. I want to say LEFT JOIN is faster than INNER JOIN. For this particular query, it doesn't appear to make a difference. The main problem that seems to be slowing everything down is the extra ORDER BY columns on the foreign key fields.

exec sp_executesql N'SELECT [b].[Id], [b].[Author], [b].[BibStatusId], [b].[BibStatusLastWriteTime], [b].[BibStatusLastWriteUserName], [b].[Content], [b].[CreationTime], [b].[CreationUserName], [b].[FormatId], [b].[IsPublic], [b].[LanguageId], [b].[LastWriteTime], [b].[LastWriteUserName], [b].[Level], [b].[OclcNumber], [b].[PublicationDate], [b].[Title], [b].[Type], [b.Format].[Id], [b.Format].[CreationTime], [b.Format].[CreationUserName], [b.Format].[LastWriteTime], [b.Format].[LastWriteUserName], [b.Format].[Name], [b.Language].[Id], [b.Language].[Code], [b.Language].[CreationTime], [b.Language].[CreationUserName], [b.Language].[LastWriteTime], [b.Language].[LastWriteUserName], [b.Language].[Name], [b.BibStatus].[Name]
FROM [Bibs] AS [b]
INNER JOIN [BibStatuses] AS [b.BibStatus] ON [b].[BibStatusId] = [b.BibStatus].[Id]
LEFT JOIN [Formats] AS [b.Format] ON [b].[FormatId] = [b.Format].[Id]
LEFT JOIN [Languages] AS [b.Language] ON [b].[LanguageId] = [b.Language].[Id]
ORDER BY [b].[LastWriteTime] DESC, [b].[FormatId], [b].[LanguageId]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY',N'@__p_0 int,@__p_1 int',@__p_0=200,@__p_1=100

The following is what the SQL looks like for EF 6.1.3.

SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[OclcNumber] AS [OclcNumber], 
    [Extent1].[Title] AS [Title], 
    [Extent1].[Author] AS [Author], 
    [Extent1].[PublicationDate] AS [PublicationDate], 
    [Extent2].[Name] AS [Name], 
    [Extent1].[LanguageId] AS [LanguageId], 
    [Extent3].[Name] AS [Name1], 
    [Extent1].[FormatId] AS [FormatId], 
    [Extent4].[Name] AS [Name2], 
    [Extent1].[BibStatusId] AS [BibStatusId], 
    [Extent1].[BibStatusLastWriteTime] AS [BibStatusLastWriteTime], 
    [Extent1].[BibStatusLastWriteUserName] AS [BibStatusLastWriteUserName], 
    [Extent1].[IsPublic] AS [IsPublic], 
    [Extent1].[Type] AS [Type], 
    [Extent1].[Level] AS [Level], 
    [Extent1].[CreationTime] AS [CreationTime], 
    [Extent1].[CreationUserName] AS [CreationUserName], 
    [Extent1].[LastWriteTime] AS [LastWriteTime], 
    [Extent1].[LastWriteUserName] AS [LastWriteUserName]
    FROM    [dbo].[Bibs] AS [Extent1]
    LEFT OUTER JOIN [dbo].[Languages] AS [Extent2] ON [Extent1].[LanguageId] = [Extent2].[Id]
    LEFT OUTER JOIN [dbo].[Formats] AS [Extent3] ON [Extent1].[FormatId] = [Extent3].[Id]
    LEFT OUTER JOIN [dbo].[BibStatuses] AS [Extent4] ON [Extent1].[BibStatusId] = [Extent4].[Id]
    ORDER BY [Extent1].[LastWriteTime] DESC
    OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY 

The query appears to work correct, just much slower than it should due to the extra sort columns.

Further technical details

EF Core version: 1.1.0-preview1-final
Operating system: Windows 10 Anniversary Update
Visual Studio version: 2015

Metadata

Assignees

Labels

area-perfclosed-fixedThe issue has been fixed and is/will be included in the release indicated by the issue milestone.type-enhancement

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions