I have an ASP.NET Web API project, and have implemented repository and unitOfWork patterns.
I currently have a big SQL query that I want to implement in code, it includes a lot of tables and joins.
My current solution is just fetching data from repositories one by one, returning an in-memory value or IEnumerable
(not an IQueryable
as it defeats the purpose of repositories) and just joining them later and projecting them to a DTO.
But this makes 20 round trips to the database for 20 tables I’m using, and it’s very inefficient.
How can I improve performance (mainly focused on lowering number of round trips to database) by following repository pattern?
2
Answers
I don’t think you can say "I need to change the design of this without changing the patterns I’m using".
I think you might have a boundary issue if you’re doing 20 queries that you can’t do in 1 Batch.
Solution 1: Write SQL
Execute this as one sproc or command and parse each result into your DTOs.
If you’re in MS SQL Server land you can use Linked Servers to cross boundaries. I don’t think this is a great idea, but pushes this logic into the DB side and makes it 1 round trip as far as the application is concerns.
Solution 2: Do them concurrently in a
Manager
orHelper
Your manager will control getting data from your Repositories.
Solution 3: You need a DataWarehouse or some sort of Aggregation
If you’re doing a bunch of queries all over the place that should be part of a single process, then create a single place for that.
Another SQL Server database that collects this information. Maybe events in that update it’s state. The structure of this should be suited to the business question you are answering.
You could use CDC to create the events. Or if you already have a system with an aggregate simply read those events and re-build something more suited to what you really want to query.
Your repositories are implemented at the wrong layer.
Then there should be a single repository method that returns the result of that query. Although in EF it would normally be shaped with multiple related objects, instead of a big flat query result.
A repository shouldn’t be ‘per table’, it should be ‘per aggregate root object’. If a table cannot exist without another one (e.g. a school class has a teacher and some students), then the repository should be able return a class and it’s students in one query. You should not have to query multiple times to retrieve related data (e.g. 1 query to get class, then another to retreive it’s teacher and another to retrieve the students)