
I have a web app in Azure with roughly 100k visitors a month and fewer than 2 page views per session (purely SEO visitors).

I just studied our Azure bills and was shocked to find that last month we had 3.41 TB of data out.

Terabyte.

This makes absolutely no sense. Our average page size is less than 3 MB (a lot, but nowhere near the 23 MB the math below gives). Going by the bill, the data out per session works out to:

3431000 (MB) / 150000 (sessions) = 23 MB per session, which is absolutely bogus. A result from a service such as Pingdom says:

result from Pingdom

(Seems Stack.Imgur is down – temp link: http://prntscr.com/gvzoaz )
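As a sanity check, the arithmetic above can be put into a short script. This is only a sketch using the figures quoted in the question; the "expected" side uses the stated upper bounds (2 page views per session, 3 MB per page), and the conversion assumes 1 TB = 1,000,000 MB.

```python
# Figures quoted in the question.
sessions = 150_000
data_out_mb = 3_431_000

# Upper bounds stated in the question (assumed worst case per session).
max_pages_per_session = 2
max_page_size_mb = 3

actual_mb_per_session = data_out_mb / sessions
expected_mb_per_session = max_pages_per_session * max_page_size_mb

# Assumption: decimal units, 1 TB = 1,000,000 MB.
expected_total_tb = sessions * expected_mb_per_session / 1_000_000

print(f"actual:         {actual_mb_per_session:.1f} MB per session")
print(f"expected:       at most {expected_mb_per_session} MB per session")
print(f"expected total: at most {expected_total_tb:.1f} TB")
print(f"ratio:          {actual_mb_per_session / expected_mb_per_session:.1f}x")
```

Even with the worst-case page size and page count, human visitors alone would account for well under 1 TB, so most of the egress has to come from something other than the sessions counted by analytics.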

My graph looks like this, and it’s not something that just came up. I have not analyzed our bills for a while, so this could easily have been going on for a while:

Azure data out

(Seems Stack.Imgur is down – temp link: http://prntscr.com/gvzohm )

The pages with the most visits are autogenerated SEO pages that read from a database with 3+ million records, but they're quite optimized and our databases are not that expensive. The main challenge is the data out, which costs a lot.

However, how do I go about testing this? Where do I start?

My architecture:

I honestly believe that all my resources are in the same region. Here are screenshots of the main drivers of usage, my app and database:

App:

(screenshots of the app's usage)

Database:

(screenshot of the database's usage)

All my resources:

(screenshot of all resources)

2 Answers


  1. Chosen as BEST ANSWER

    After some very good help from a Ukrainian developer I found on Upwork, we've finally solved the issue.

    The problem was in our robots.txt.

    It turned out that crawlers were making SO many requests to our pages (and we have 3.6 million address pages) that it simply added up to a HUGE amount of traffic. That's why the data out was so big.

    We have now solved it by:

    • Adding a robots.txt which disallows all bots except Google and Bing
    • Adjusting the Google crawl rate in Webmaster Tools
    • Changing the changefreq in our sitemap from monthly to yearly for our address pages, to avoid re-crawling
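A minimal sketch of what such a robots.txt could look like, checked with Python's standard `urllib.robotparser`. The exact rules used on the site are not shown in this answer, so the file below is an assumption; `example.com`, the `/address/1` path and `ExampleScraper` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: allow Googlebot and Bingbot, disallow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URL of one of the address pages.
url = "https://example.com/address/1"

for agent in ("Googlebot", "Bingbot", "ExampleScraper"):
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent}: {verdict}")
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but a bot that ignores it would have to be blocked by other means.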

    I'm happy!


  2. Follow the guidance given in "Understand your bill for Microsoft Azure".
    Review billing from a subscription-level perspective.

    Find out whether the egress is being sent to or requested from Azure services in other regions, or is largely requested by website visitors. Also check the web app's backup panel and any other backups that run regularly.

    Review any performance monitoring or performance tests. Are tests run from other regions responsible for a large share of the egress?

    Find out whether egress follows site load during business hours. If not, dig deeper.
    Find out whether SEO visitors trigger any downloads; if so, adjust the links accordingly.
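One way to start digging is to total the response bytes per user agent from the web server logs. The sketch below assumes that web server logging is enabled and that the logs are in W3C extended format with a `#Fields:` header that includes `cs(User-Agent)` and `sc-bytes`; the function name `egress_by_user_agent` and the sample lines are made up for illustration.

```python
from collections import Counter

def egress_by_user_agent(log_lines):
    """Sum response bytes (sc-bytes) per user agent from W3C extended log lines."""
    fields = []
    totals = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the rows that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        try:
            totals[row.get("cs(User-Agent)", "-")] += int(row["sc-bytes"])
        except (KeyError, ValueError):
            # Skip rows without a usable byte count.
            continue
    return totals

# Made-up sample data, not taken from the site in question.
sample = [
    "#Fields: date time cs-uri-stem cs(User-Agent) sc-status sc-bytes",
    "2017-10-01 00:00:01 /address/1 ExampleBot/1.0 200 3000000",
    "2017-10-01 00:00:02 /address/2 ExampleBot/1.0 200 3000000",
    "2017-10-01 00:00:03 /address/1 Mozilla/5.0+(Windows+NT+10.0) 200 2500000",
]

totals = egress_by_user_agent(sample)
for agent, sent in totals.most_common():
    print(f"{sent:>12,d} bytes  {agent}")
```

If a handful of crawler user agents dominate the totals, the egress is coming from bots rather than from regular visitors.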
