What is the correct way of repository implementation in EF Core?

public IAsyncEnumerable<Order> GetOrder(int orderId)
{
    return blablabla.AsAsyncEnumerable();
}

or

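public async Task<IEnumerable<Order>> GetOrder(int orderId)
{
    // async/await is needed here: a Task<List<Order>> cannot be returned directly as Task<IEnumerable<Order>>.
    return await blablabla.ToListAsync();
}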

Is it a good idea, performance-wise, to call AsAsyncEnumerable()? Is this approach safe?
On the one hand, it doesn’t create a List<T> object, so it should be slightly faster. On the other hand, the query is not materialized, so the SQL execution is deferred and the result can change in the meantime.

2 Answers


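  1. According to the source, ToListAsync() uses IAsyncEnumerable internally anyway, so there is not much of a performance difference between the two.
    One convenience of ToListAsync() and ToArrayAsync() is that they accept a CancellationToken directly; with AsAsyncEnumerable() the token has to be attached with WithCancellation(), which is what the source does: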

    public static async Task<List<TSource>> ToListAsync<TSource>(
        this IQueryable<TSource> source,
        CancellationToken cancellationToken = default)
    {
        var list = new List<TSource>();
        await foreach (var element in source.AsAsyncEnumerable().WithCancellation(cancellationToken))
        {
            list.Add(element);
        }
        return list;
    }
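
    Cancellation is therefore also available when streaming, as long as the token is attached to the enumeration. A minimal sketch of that, assuming a hypothetical in-memory FakeOrders() iterator in place of a real EF Core query (CancellationDemo, FakeOrders and ReadUntilCancelled are made-up names for illustration, not EF Core APIs):

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Runtime.CompilerServices;
    using System.Threading;
    using System.Threading.Tasks;

    public static class CancellationDemo
    {
        // Hypothetical stand-in for a streaming query: yields ids 1..count.
        // [EnumeratorCancellation] lets WithCancellation() supply the token.
        public static async IAsyncEnumerable<int> FakeOrders(
            int count,
            [EnumeratorCancellation] CancellationToken cancellationToken = default)
        {
            for (var id = 1; id <= count; id++)
            {
                cancellationToken.ThrowIfCancellationRequested();
                await Task.Yield(); // simulate asynchronous I/O per row
                yield return id;
            }
        }

        // Reads rows until the token is cancelled; returns how many were read.
        public static async Task<int> ReadUntilCancelled(int cancelAfter)
        {
            using var cts = new CancellationTokenSource();
            var read = 0;
            try
            {
                await foreach (var id in FakeOrders(100).WithCancellation(cts.Token))
                {
                    read++;
                    if (read == cancelAfter) cts.Cancel(); // caller gives up mid-stream
                }
            }
            catch (OperationCanceledException)
            {
                // The stream stopped early; the remaining rows were never produced.
            }
            return read;
        }
    }
    ```

    In this sketch, await CancellationDemo.ReadUntilCancelled(3) should return 3: the iterator notices the cancelled token before producing the fourth row.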
    

    A List holds every result in memory, but that is a serious performance concern only if the list is really big. In that case, consider paging the response:

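    public Task<List<Order>> GetOrders(int orderId, int offset, int limit)
    {
        // Return one page: skip `offset` rows, then take at most `limit`.
        // Assumes the query is already ordered; otherwise pages are not stable.
        return blablabla.Skip(offset).Take(limit).ToListAsync();
    }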
    
  2. The decision really comes down to whether you wish to buffer or stream.

    If you want to buffer the results, use ToList() or ToListAsync().
    If you want to stream the results, use AsEnumerable() or AsAsyncEnumerable().

    From the docs:

    Buffering refers to loading all your query results into memory, whereas streaming means that EF hands the application a single result each time, never containing the entire resultset in memory. In principle, the memory requirements of a streaming query are fixed – they are the same whether the query returns 1 row or 1000; a buffering query, on the other hand, requires more memory the more rows are returned. For queries that result large resultsets, this can be an important performance factor.

    In general, it’s best to stream, unless you need to buffer.

    When you stream, once the data is read, you can’t read it again without hitting the DB again. So if you need to read the same data more than once, you’ll need to buffer.

    If a repository streams an IEnumerable, the caller can still choose to buffer it by calling ToList() (or ToListAsync() on an IAsyncEnumerable). That flexibility is lost if the repository returns an IList, because the results have already been buffered.

    So to answer your question, you’re better off having the repository stream the result and letting the caller decide whether to buffer.


    If the team working on the project is not comfortable with stream semantics, or if most of the code already buffers, it might make sense to suffix the methods that stream with something like AsStream (e.g. GetOrdersAsStream()) so that callers know they shouldn’t enumerate the result more than once.

    So a repository could have:

    async Task<List<Order>> GetOrders() => await GetOrdersAsStream().ToListAsync();
    IAsyncEnumerable<Order> GetOrdersAsStream() => ...
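
    To make the difference concrete, here is a rough, self-contained sketch with no database involved. FakeOrderRepository, QueryExecutions and StreamVsBufferDemo are hypothetical names, and a counter stands in for "the SQL was executed". EF Core's ToListAsync() extends IQueryable<T>; for a plain IAsyncEnumerable<T> the operator has to come from elsewhere (for example the System.Linq.Async package), so this sketch buffers with an ordinary await foreach instead:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public sealed class FakeOrderRepository
    {
        // Counts how many times the (simulated) query has been executed.
        public int QueryExecutions { get; private set; }

        // Streams results; the body runs anew for every enumeration.
        public async IAsyncEnumerable<string> GetOrdersAsStream()
        {
            QueryExecutions++; // stands in for sending SQL to the database
            foreach (var order in new[] { "order-1", "order-2", "order-3" })
            {
                await Task.Yield(); // simulate asynchronous I/O per row
                yield return order;
            }
        }

        // Buffers the stream into a list with a plain await foreach.
        public async Task<List<string>> GetOrders()
        {
            var list = new List<string>();
            await foreach (var order in GetOrdersAsStream())
            {
                list.Add(order);
            }
            return list;
        }
    }

    public static class StreamVsBufferDemo
    {
        public static async Task<(int streamed, int buffered)> Run()
        {
            // Enumerating the stream twice executes the query twice.
            var streaming = new FakeOrderRepository();
            var stream = streaming.GetOrdersAsStream();
            await foreach (var _ in stream) { }
            await foreach (var _ in stream) { }

            // Buffering once allows any number of in-memory passes.
            var buffering = new FakeOrderRepository();
            var orders = await buffering.GetOrders();
            foreach (var _ in orders) { }
            foreach (var _ in orders) { }

            return (streaming.QueryExecutions, buffering.QueryExecutions);
        }
    }
    ```

    In this sketch, Run() should report 2 executions for the streamed repository and 1 for the buffered one, which is the trade-off described above: streaming keeps memory flat, but each pass over the data goes back to the source.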
    