Essentially I am trying to produce the following SQL query via Linq:
SELECT *
FROM `CustomerMainTable` AS `t`
WHERE `t`.`CustomerMainTableId` IN (
SELECT MAX(`t0`.`CustomerMainTableId`)
FROM `CustomerMainTable` AS `t0`
GROUP BY `t0`.`CustomerMainTableCustomerId`
)
CustomerMainTableId
is an autoincrement id, and CustomerMainTableCustomerId
is the customer’s unique ID.
There can be multiple entries of the customer’s transaction in the CustomerMainTableId
, so the idea is to select the latest transaction for each customer by grouping and selecting the max id.
In entity framework core 3.1, this was achieved via:
var CustomerGroupQuery = _DB.CustomerMainTable.Where(p => _DB.CustomerMainTable
.GroupBy(l => l.CustomerMainTableCustomerId)
.Select(g => g.Max(c => c.CustomerMainTableId))
.Contains(p.CustomerMainTableId));
However, the same query, when executed in EF Core 6 or 7 produces a wrong query in SQL:
SELECT *
FROM `CustomerMainTable` AS `t`
WHERE EXISTS (
SELECT 1
FROM `CustomerMainTable` AS `t0`
GROUP BY `t0`.`CustomerMainTableCustomerId`
HAVING MAX(`t0`.`CustomerMainTableId`) = `t`.`CustomerMainTableId`)
Interestingly enough, if we split the Linq query like so:
List<int> idlist = _DB.CustomerMainTable
.GroupBy(l => l.CustomerMainTableCustomerId)
.Select(g => g.Max(c => c.CustomerMainTableId)).ToList();
var CustomerGroupQuery = _DB.CustomerMainTable.Where(p => idlist.Contains(p.CustomerMainTableId));
It produces the correct query – but requires a list of integers to be returned.. which is less than ideal.
Might anyone know of a workaround for this issue?
2
Answers
FYI for anyone else - EFCore confirms they will roll back the exists to a where "in": https://github.com/dotnet/efcore/issues/30955
For EF Core 6 and 7 you can use other technique how to get latest record of the group: