
I currently have a problem with Varnish 7.0.3.

I'm using Varnish in front of Magento, and the Varnish Transient storage appears to be leaking memory.

From what I've read online, the problem is related to the packaged libjemalloc2 version ("libjemalloc2/jammy,now 5.2.1-4ubuntu1"), and 5.3.0 reportedly fixes it.

How can I upgrade libjemalloc2 to 5.3.0? I'm using Ubuntu 22.04 and everything was installed with the apt package manager.

2 Answers


  1. Chosen as BEST ANSWER

    Please check the related question "Varnish 6 LTS /w CentOS 8 not respecting memory limits?"

    The same is happening to me. I'm using Ubuntu 22.04 LTS, which ships jemalloc 5.2.1.

    The varnish.service now looks like this:

    [Service]
    Type=forking
    KillMode=process
    Environment="MALLOC_CONF=thp:never,narenas:2"
    
    # Maximum number of open files (for ulimit -n)
    LimitNOFILE=131072
    
    # Locked shared memory - should suffice to lock the shared memory log
    # (varnishd -l argument)
    # Default log size is 80MB vsl + 1M vsm + header -> 82MB
    # unit is bytes
    LimitMEMLOCK=85983232
    
    # Enable this to avoid "fork failed" on reload.
    TasksMax=infinity
    
    # Maximum size of the corefile.
    LimitCORE=infinity
    
    ExecStart=/usr/sbin/varnishd \
              -a :80 \
              -p feature=+http2 \
              -p feature=+esi_ignore_other_elements \
              -p feature=+esi_disable_xml_check \
              -p feature=+esi_ignore_https \
              -p http_req_size=256k \
              -p http_resp_hdr_len=128k \
              -p http_resp_size=512k \
              -p workspace_backend=512k \
              -p workspace_client=256k \
              -f /etc/varnish/default.vcl \
              -s Cache=malloc,2g \
              -s Transient=malloc,512m
    
    ExecReload=/usr/sbin/varnishreload
    
    [Install]
    WantedBy=multi-user.target
    

    My vcl looks like this:

    vcl 4.0;
    
    import std;
    # The minimal Varnish version is 6.0
    # For SSL offloading, pass the following header in your proxy server or load balancer: 'X-Forwarded-Proto: https'
    
    backend default {
        .host = "xxx.xx.xx.xx";
        .port = "80";
        .first_byte_timeout = 30s;
        .probe = {
            .url = "/health_check.php";
            .timeout = 2s;
            .interval = 2m;
            .window = 10;
            .threshold = 5;
       }
    }
    
    acl purge {
        "xx.xxx.xx.xx";
    }
    
    sub vcl_recv {
        if (req.restarts > 0) {
            set req.hash_always_miss = true;
        }
    
        if (req.method == "PURGE") {
            if (client.ip !~ purge) {
                return (synth(405, "Method not allowed"));
            }
            # To use the X-Pool header for purging varnish during automated deployments, make sure the X-Pool header
            # has been added to the response in your backend server config. This is used, for example, by the
            # capistrano-magento2 gem for purging old content from varnish during its deploy routine.
            if (!req.http.X-Magento-Tags-Pattern && !req.http.X-Pool) {
                return (synth(400, "X-Magento-Tags-Pattern or X-Pool header required"));
            }
            if (req.http.X-Magento-Tags-Pattern) {
              ban("obj.http.X-Magento-Tags ~ " + req.http.X-Magento-Tags-Pattern);
            }
            if (req.http.X-Pool) {
              ban("obj.http.X-Pool ~ " + req.http.X-Pool);
            }
            return (synth(200, "Purged"));
        }
    
        if (req.method != "GET" &&
            req.method != "HEAD" &&
            req.method != "PUT" &&
            req.method != "POST" &&
            req.method != "TRACE" &&
            req.method != "OPTIONS" &&
            req.method != "DELETE") {
              /* Non-RFC2616 or CONNECT which is weird. */
              return (pipe);
        }
    
        # We only deal with GET and HEAD by default
        if (req.method != "GET" && req.method != "HEAD") {
            return (pass);
        }
    
        # Bypass customer, shopping cart, checkout
        if (req.url ~ "/customer" || req.url ~ "/checkout") {
            return (pass);
        }
    
        # Bypass health check requests
        if (req.url ~ "^/(pub/)?(health_check.php)$") {
            return (pass);
        }
    
        # Set initial grace period usage status
        set req.http.grace = "none";
    
        # normalize url in case of leading HTTP scheme and domain
        set req.url = regsub(req.url, "^http[s]?://", "");
    
        # collect all cookies
        std.collect(req.http.Cookie);
    
        # Compression filter. See https://www.varnish-cache.org/trac/wiki/FAQ/Compression
        if (req.http.Accept-Encoding) {
            if (req.url ~ "\.(jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf|flv)$") {
                # No point in compressing these
                unset req.http.Accept-Encoding;
            } elsif (req.http.Accept-Encoding ~ "gzip") {
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "MSIE") {
                set req.http.Accept-Encoding = "deflate";
            } else {
                # unknown algorithm
                unset req.http.Accept-Encoding;
            }
        }
    
        # Remove all marketing get parameters to minimize the cache objects
        if (req.url ~ "(\?|&)(gclid|cx|ie|cof|siteurl|zanpid|origin|fbclid|mc_[a-z]+|utm_[a-z]+|_bta_[a-z]+)=") {
            set req.url = regsuball(req.url, "(gclid|cx|ie|cof|siteurl|zanpid|origin|fbclid|mc_[a-z]+|utm_[a-z]+|_bta_[a-z]+)=[-_A-z0-9+()%.]+&?", "");
            set req.url = regsub(req.url, "[?|&]+$", "");
        }
    
        # Static files caching
        if (req.url ~ "^/(pub/)?(media|static)/") {
            # Static files should not be cached by default
            return (pass);
    
            # But if you use a few locales and don't use CDN you can enable caching static files by commenting previous line (#return (pass);) and uncommenting next 3 lines
            #unset req.http.Https;
            #unset req.http.X-Forwarded-Proto;
            #unset req.http.Cookie;
        }
    
        # Bypass authenticated GraphQL requests without a X-Magento-Cache-Id
        if (req.url ~ "/graphql" && !req.http.X-Magento-Cache-Id && req.http.Authorization ~ "^Bearer") {
            return (pass);
        }
    
        return (hash);
    }
    
    sub vcl_hash {
        if ((req.url !~ "/graphql" || !req.http.X-Magento-Cache-Id) && req.http.cookie ~ "X-Magento-Vary=") {
            hash_data(regsub(req.http.cookie, "^.*?X-Magento-Vary=([^;]+);*.*$", "\1"));
        }
    
        # To make sure http users don't see ssl warning
        if (req.http.X-Forwarded-Proto) {
            hash_data(req.http.X-Forwarded-Proto);
        }
    
    
        if (req.url ~ "/graphql") {
            call process_graphql_headers;
        }
    }
    
    sub process_graphql_headers {
        if (req.http.X-Magento-Cache-Id) {
            hash_data(req.http.X-Magento-Cache-Id);
    
            # When the frontend stops sending the auth token, make sure users stop getting results cached for logged-in users
            if (req.http.Authorization ~ "^Bearer") {
                hash_data("Authorized");
            }
        }
    
        if (req.http.Store) {
            hash_data(req.http.Store);
        }
    
        if (req.http.Content-Currency) {
            hash_data(req.http.Content-Currency);
        }
    }
    
    sub vcl_backend_response {
    
        set beresp.grace = 3d;
    
        if (beresp.http.content-type ~ "text") {
           set beresp.do_esi = true;
        }
    
        if (bereq.url ~ "\.js$" || beresp.http.content-type ~ "text") {
            set beresp.do_gzip = true;
        }
    
        if (beresp.http.X-Magento-Debug) {
            set beresp.http.X-Magento-Cache-Control = beresp.http.Cache-Control;
        }
    
        # cache only successful responses and 404s that are not marked as private
        if (beresp.status != 200 &&
                beresp.status != 404 &&
                beresp.http.Cache-Control ~ "private") {
            set beresp.uncacheable = true;
            set beresp.ttl = 86400s;
            return (deliver);
        }
    
        # validate if we need to cache it and prevent from setting cookie
        if (beresp.ttl > 0s && (bereq.method == "GET" || bereq.method == "HEAD")) {
            unset beresp.http.set-cookie;
        }
    
       # If page is not cacheable then bypass varnish for 2 minutes as Hit-For-Pass
       if (beresp.ttl <= 0s ||
           beresp.http.Surrogate-control ~ "no-store" ||
           (!beresp.http.Surrogate-Control &&
           beresp.http.Cache-Control ~ "no-cache|no-store") ||
           beresp.http.Vary == "*") {
            # Mark as Hit-For-Pass for the next 2 minutes
            set beresp.ttl = 120s;
            set beresp.uncacheable = true;
       }
    
       # If the cache key in the Magento response doesn't match the one that was sent in the request, don't cache under the request's key
       if (bereq.url ~ "/graphql" && bereq.http.X-Magento-Cache-Id && bereq.http.X-Magento-Cache-Id != beresp.http.X-Magento-Cache-Id) {
          set beresp.ttl = 0s;
          set beresp.uncacheable = true;
       }
    
        return (deliver);
    }
    
    sub vcl_deliver {
        if (resp.http.x-varnish ~ " ") {
            set resp.http.X-Magento-Cache-Debug = "HIT";
            set resp.http.Grace = req.http.grace;
        } else {
            set resp.http.X-Magento-Cache-Debug = "MISS";
        }
    
        # Not letting browser to cache non-static files.
        if (resp.http.Cache-Control !~ "private" && req.url !~ "^/(pub/)?(media|static|repository)/") {
            set resp.http.Pragma = "no-cache";
            set resp.http.Expires = "-1";
            set resp.http.Cache-Control = "no-store, no-cache, must-revalidate, max-age=0";
        }
    
        if (!resp.http.X-Magento-Debug) {
            unset resp.http.Age;
        }
        unset resp.http.X-Magento-Debug;
        unset resp.http.X-Magento-Tags;
        unset resp.http.X-Powered-By;
        unset resp.http.Server;
        unset resp.http.X-Varnish;
        unset resp.http.Via;
        unset resp.http.Link;
    }
    
    sub vcl_hit {
        if (obj.ttl >= 0s) {
            # Hit within TTL period
            return (deliver);
        }
        if (std.healthy(req.backend_hint)) {
            if (obj.ttl + 300s > 0s) {
                # Hit after TTL expiration, but within grace period
                set req.http.grace = "normal (healthy server)";
                return (deliver);
            } else {
                # Hit after TTL and grace expiration
                return (restart);
            }
        } else {
            # server is not healthy, retrieve from cache
            set req.http.grace = "unlimited (unhealthy server)";
            return (deliver);
        }
    }
    

    My current storage usage according to varnishstat:

    SMA.Cache.c_req            560519      3.00     5.09      6.84      6.67      6.67
    SMA.Cache.c_bytes           5.41G    26.28K   51.53K    68.42K    66.72K    66.72K
    SMA.Cache.c_freed           5.11G    15.97K   48.68K    52.50K    74.28K    74.28K
    SMA.Cache.g_alloc           65665      2.00        .  65647.17  65661.34  65661.34
    SMA.Cache.g_bytes         307.05M    10.30K        .   306.92M   307.09M   307.09M
    SMA.Cache.g_space           1.70G   -10.30K        .     1.70G     1.70G     1.70G
    SMA.Transient.c_req        909978     14.98     8.26      8.48     11.18     11.18
    SMA.Transient.c_fail         3579      0.00     0.03      0.56      1.64      1.64
    SMA.Transient.c_bytes       4.32G   110.74K   41.16K    57.47K    68.57K    68.57K
    SMA.Transient.c_freed       3.82G   104.61K   36.40K    54.04K    70.60K    70.60K
    SMA.Transient.g_alloc       94503      1.00        .  94495.50  94490.31  94490.31
    SMA.Transient.g_bytes     511.89M     6.13K        .   511.84M   511.81M   511.81M
    SMA.Transient.g_space     116.70K    -6.13K        .   161.41K   191.47K   191.47K
    

    According to "top", the process is now consuming 1.3g.
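The varnishstat figures above can be cross-checked with a little arithmetic. The following is only a rough Python sketch, not anything Varnish ships. It assumes the K/M/G suffixes are binary multiples and that the 1.3g reported by top is in GiB; the helper names `parse_size` and `fill_ratio` are made up for illustration.

```python
# Suffix multipliers; assumed to be binary (1K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert a human-readable size such as '511.89M' to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)


def fill_ratio(g_bytes: str, g_space: str) -> float:
    """Fraction of a storage that is in use: g_bytes / (g_bytes + g_space)."""
    used = parse_size(g_bytes)
    free = parse_size(g_space)
    return used / (used + free)


# Values copied from the varnishstat output and the top reading in this post.
cache_fill = fill_ratio("307.05M", "1.70G")
transient_fill = fill_ratio("511.89M", "116.70K")
accounted = parse_size("307.05M") + parse_size("511.89M")
unaccounted = parse_size("1.3G") - accounted

print(f"Cache fill:        {cache_fill:.1%}")
print(f"Transient fill:    {transient_fill:.1%}")
print(f"Not in g_bytes:    {unaccounted / UNITS['M']:.0f} MiB")
```

Under those assumptions, Transient is effectively full while Cache is only around 15% used, and roughly half a gigabyte of the resident size is not covered by either `g_bytes` counter.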

    The problem is that I never see the Transient memory go down: according to varnishstat, it climbs until it hits the configured Transient limit. If I set a higher limit, it reaches that within a day and never goes down either. The total memory of the varnish process also ends up larger than the limits I configured.

    Several times it has consumed all of the server's memory and spilled into swap until that reached its limit too.

    At that point the OS kills the varnish process.
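To document this kind of growth before the process gets killed, it can help to log the resident set size alongside the varnishstat counters. The following is a minimal Python sketch, assuming a Linux host where `/proc/<pid>/status` contains a `VmRSS` line; the function names and the sample text are illustrative only.

```python
import re
from pathlib import Path


def rss_kib(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a running process."""
    return rss_kib(Path(f"/proc/{pid}/status").read_text())


# Made-up sample in the same layout as /proc/<pid>/status, for illustration.
sample = "Name:\tvarnishd\nVmPeak:\t 2000000 kB\nVmRSS:\t 1363148 kB\nThreads:\t220\n"
print(rss_kib(sample))
```

Calling `read_rss_kib()` with the PID of the varnishd child process at regular intervals (from cron, for example) gives a series that can be compared against `SMA.Transient.g_bytes` over the same period.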


  2. The Varnish Core Developer group is not aware of any real memory leak regarding jemalloc. Can you describe the symptoms of this leak?

    Transient storage is a very particular storage engine in Varnish. By default it’s unlimited and ensures that short-lived and uncacheable content is held while being processed by the client.
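As a rough illustration of which responses end up in Transient, here is a simplified Python model of the selection rule as I understand it: uncacheable responses, and responses whose ttl + grace + keep is below the `shortlived` parameter (10 seconds by default), go to Transient. This is a sketch, not Varnish's actual code; the `Response` class and `pick_storage` function are hypothetical names.

```python
from dataclasses import dataclass

SHORTLIVED_DEFAULT = 10.0  # seconds; default of the varnishd "shortlived" parameter


@dataclass
class Response:
    ttl: float                 # seconds
    grace: float               # seconds
    keep: float = 0.0          # seconds
    uncacheable: bool = False  # pass, hit-for-miss, hit-for-pass


def pick_storage(resp: Response, requested: str = "Cache",
                 shortlived: float = SHORTLIVED_DEFAULT) -> str:
    """Return the name of the storage a response body would be allocated from."""
    if resp.uncacheable:
        return "Transient"
    if resp.ttl + resp.grace + resp.keep < shortlived:
        return "Transient"
    return requested


THREE_DAYS = 3 * 24 * 3600  # matches "set beresp.grace = 3d" in the posted VCL

cacheable_page = Response(ttl=3600, grace=THREE_DAYS)
hit_for_miss = Response(ttl=120, grace=THREE_DAYS, uncacheable=True)
short_lived = Response(ttl=2, grace=1)

for name, resp in [("cacheable page", cacheable_page),
                   ("hit-for-miss", hit_for_miss),
                   ("short-lived", short_lived)]:
    print(f"{name}: {pick_storage(resp)}")
```

If this model is right, the VCL posted in the first answer never triggers the short-lived rule, because it sets `beresp.grace = 3d` on every response. Transient usage there would come from uncacheable traffic: everything that is passed (static and media files, customer and checkout URLs, non-GET requests) plus the responses marked `beresp.uncacheable`.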

    Transient storage memory grows a lot faster than it shrinks, and jemalloc tuning could potentially reclaim some of the wasted memory.

    Is there anything in particular you can share to support your problem? Because it’s entirely possible that it’s a feature and not a bug.
