I run an ASP.NET (C#) Web Forms ecommerce website hosted in Azure as an auto scale-out Web App, using an Azure SQL database. Each of the 300k product pages displays a significant amount of data that’s pulled from the database. Retrieving the necessary data requires running several relatively large, complex queries, each referencing several relational tables (i.e., lots of joins). I cannot show any less data on each page, as it’s a business requirement and beyond my control. I’ve added several SQL indexes, both manually and using recommendations made by Azure, but the queries still take too long. There are 15+ queries running for each page load. I’ve tried to reduce the number, but it’s challenging, and the resulting fewer but more complex queries take longer to run than a greater number of simpler queries. The product page load time is about 5 seconds, and my goal is to reduce it to under 1 second.
I’m looking for options on how I can re-architect how we’re storing/referencing our data.
Note: I would like to avoid solutions that dynamically load in content below the fold based on scrolling, since while this would improve load times, it does not address the underlying problem of the suboptimal way our data is being stored/referenced.
Some options I’ve considered:
- Run a background service that forms a complete data set for each product page, and saves the data set in:
  - A Redis cache (would be costly).
  - A CDN (might be costly + inappropriate use of CDN).
  - A CDN, but then having each dynamically created website instance download a copy locally if one does not exist, and then referencing the local data instead of the CDN data (seems kind of dumb + would be difficult to purge local files when data sets update).
- Create some sort of flattened "warehouse" table in our SQL database where all data needed for each product page would be in a single row, and then set up triggers or something similar to ensure it’s updated as needed (seems sloppy + error prone + inappropriate use of a relational database).
- Same thing as above, but in a new NoSQL database (possibly overkill/redundant to have two databases storing the same data).
I’m not convinced that any of my ideas are good enough to pursue, so that’s what brings me here – I’m hoping someone can suggest a better option. I’m open to any solution that’s compatible with ASP.NET Web Forms and Azure cloud services.
2 Answers
It seems to me the ‘waste’ is in composing the data to be displayed on your page.
You can solve it with a caching mechanism or, rather than using complex queries to grab data from multiple tables, with a document-store NoSQL database in which each document holds all the information the page needs to display. This way, instead of grabbing data from products, manufacturer, tags, etc., you just do one quick lookup in the product catalog by product ID.
e.g.
GET /products/123
I’m not sure about your budget constraints, but one option is Cosmos DB. I’ve read about successful implementations that use Cosmos DB as a document store and also as a replacement for Azure Cache for Redis.
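To make the document-store idea above concrete, here is a minimal sketch. It is written in Python only for brevity; the pattern is language-neutral and would carry over to C#. The sample data, the field names, the `build_product_document` helper, and the `DocumentStore` class are all hypothetical, and a plain in-memory dict stands in for the real document database.

```python
import json

# Hypothetical relational data, standing in for the results of the existing queries.
PRODUCTS = {123: {"name": "Widget", "manufacturer_id": 7, "price": 19.99}}
MANUFACTURERS = {7: {"name": "Acme"}}
TAGS = {123: ["tools", "garden"]}


def build_product_document(product_id):
    """Compose everything the product page needs into one document."""
    product = PRODUCTS[product_id]
    return {
        "id": str(product_id),
        "name": product["name"],
        "price": product["price"],
        "manufacturer": MANUFACTURERS[product["manufacturer_id"]],
        "tags": TAGS.get(product_id, []),
    }


class DocumentStore:
    """In-memory stand-in (an assumption) for a document database keyed by id."""

    def __init__(self):
        self._docs = {}

    def upsert(self, doc):
        # Store as JSON text to mimic a round trip through a real store.
        self._docs[doc["id"]] = json.dumps(doc)

    def get(self, doc_id):
        raw = self._docs.get(str(doc_id))
        return json.loads(raw) if raw is not None else None


store = DocumentStore()
# Write path: run when product data changes, not on every page view.
store.upsert(build_product_document(123))
# Read path: the page does one key lookup, the equivalent of GET /products/123.
page_data = store.get(123)
```

The design choice here is to pay the cost of the joins once, when the data changes, so that the page request itself is a single lookup by key.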
PS: You can also try the native ASP.NET output cache.
Caching hides a multitude of sins – some ideas:
Are the pages reasonably static? If so, use output caching: https://learn.microsoft.com/en-us/previous-versions/aspnet/sfw2210t(v=vs.100)
Can the query output be cached? See:
https://learn.microsoft.com/en-us/aspnet/web-forms/overview/data-access/caching-data/caching-data-in-the-architecture-cs
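As a rough illustration of caching query output, the sketch below shows the cache-aside pattern with a fixed time-to-live. It is in Python for brevity and is not the API from the linked article; `TtlCache`, `load_product_from_db`, the key format, and the 300-second lifetime are all assumptions chosen for the example.

```python
import time


class TtlCache:
    """Tiny in-process cache with absolute expiration (cache-aside pattern)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_add(self, key, loader):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the database entirely
        value = loader()  # cache miss or expired entry: run the query once
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this when the underlying product data changes.
        self._entries.pop(key, None)


query_log = []  # records each simulated database hit


def load_product_from_db(product_id):
    """Hypothetical stand-in for the queries behind one product page."""
    query_log.append(product_id)
    return {"id": product_id, "name": "Widget"}


cache = TtlCache(ttl_seconds=300)
first = cache.get_or_add("product:123", lambda: load_product_from_db(123))
second = cache.get_or_add("product:123", lambda: load_product_from_db(123))
```

Note that an in-process cache like this is per instance, so with a scale-out Web App each instance would warm its own copy unless a shared cache is used instead.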
Consider limiting the data you return using paging: https://medium.com/@ohadinho25/linq-query-performance-improvement-guidelines-183d569b0668
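A minimal sketch of paging follows, in Python with an in-memory SQLite database so it runs on its own. The `reviews` table and the `get_review_page` function are hypothetical. In C# the same idea is usually expressed with LINQ `Skip`/`Take`, and on SQL Server with `OFFSET ... FETCH NEXT` rather than the `LIMIT ... OFFSET` used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (id INTEGER PRIMARY KEY, product_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO reviews (product_id, body) VALUES (?, ?)",
    [(123, f"review {n}") for n in range(1, 26)],  # 25 sample rows
)


def get_review_page(product_id, page, page_size=10):
    """Return one page of rows (page is 1-based) instead of the full set."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT body FROM reviews WHERE product_id = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (product_id, page_size, offset),
    ).fetchall()
    return [body for (body,) in rows]


first_page = get_review_page(123, page=1)  # rows 1-10
last_page = get_review_page(123, page=3)   # the remaining 5 rows
```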