
I’m having trouble stripping out cookies I don’t want using Varnish (v6.0.8). I thought this was working, but I think instead it was the URL-based filtering that was working and I just hadn’t noticed because it overlapped.

The relevant section of my configuration file is as follows:

sub vcl_recv {

    set req.http.X-Forwarded-For = regsub(req.http.X-Forwarded-For,"^([^,]+)(,[^,]+)*","\1");

    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "This IP is not allowed to send PURGE requests."));
        }
        if (req.http.X-Purge-Method == "regex") {
            ban("obj.http.x-url ~ " + req.url + " && obj.http.x-host ~ " + req.http.host);
            return (synth(200, "Banned"));
        }
        return (purge);
    }
    if (req.url ~ "(wp-admin|post.php|edit.php|wp-login|wp-json)") {
        return(pass);
    }
    if (req.http.Cookie ~ "wordpress_logged_in_|resetpass|wp-postpass_") {
        return(pass);
    }

    # Remove cookies
    set req.http.Cookie = regsuball(req.http.Cookie, "comment_author_[a-zA-Z0-9_]+", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "PHPSESSID=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp_woocommerce_session_[^;]+(;)", "");
    # Remove proxy header (see https://httpoxy.org/#mitigate-varnish)
    unset req.http.proxy;
    # Normalize query arguments (sort alphabetically)
    set req.url = std.querysort(req.url);
    # Strip trailing ? if it exists
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }
    # Limit requests to the following types
    if (req.method !~ "^GET|HEAD|PUT|POST|TRACE|OPTIONS|PATCH|DELETE$") {
        return (pipe);
    }
    # Only cache GET or HEAD requests to ensure that POST requests are always passed through, along with their cookies
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # Don't cache AJAX requests
    if (req.http.X-Requested-With == "XMLHttpRequest") {
        return(pass);
    }
    # Don't cache images and PDFs
    if (req.url ~ "\.(gif|jpg|jpeg|bmp|png|pdf)$") {
        return(pass);
    }
    # Don't cache large files (zip, audio, video, etc.)
    if (req.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|wmv|xz|zip)(\?.*)?$") {
        return (pipe);
    }
    # Add support for ESI
    if (req.http.Authorization) {
        return (pass);
    }
    # WordPress: don't cache WooCommerce pages
    if (req.url ~ "(cart|my-account|checkout|addons|checkouts|product)") {
        return (pass);
    }
    if ( req.url ~ "(\?|&)add-to-cart=" ) {
        return (pass);
    }
    if ( req.url ~ "(\?|&)wc-ajax=" ) {
        return (pass);
    }
    # WordPress: don't cache associated requests if the referer is a WooCommerce page
    if (req.http.referer ~ "(cart|my-account|checkout|addons|checkouts|product)") {
        return(pass);
    }
    # Are there cookies left with only spaces or that are empty?
    if (req.http.cookie ~ "^\s*$") {
        unset req.http.cookie;
    }
    # Remove all cookies to enable caching
    unset req.http.Cookie;
    return (hash);
}

As far as I understand it, the regexes in the # Remove cookies section should be stripping out those cookies. But they don’t. What am I doing wrong?

Thanks!

UPDATE: Here is the varnishlog output. I think the cookie removal code is not executing, but I can’t see why. I’ve tried moving it up to the top of the sub vcl_recv section, but it makes no difference:

*   << Request  >> 3007535
-   Begin          req 3007534 rxreq
-   Timestamp      Start: 1638408710.214334 0.000000 0.000000
-   Timestamp      Req: 1638408710.214334 0.000000 0.000000
-   ReqStart       <REDACTED IP> a1
-   ReqMethod      HEAD
-   ReqURL         /example-page/
-   ReqProtocol    HTTP/1.1
-   ReqHeader      host: example.com
-   ReqHeader      user-agent: curl/7.77.0
-   ReqHeader      accept: */*
-   ReqHeader      x-forwarded-port: 443
-   ReqHeader      x-forwarded-proto: https
-   ReqHeader      x-forwarded-for: <REDACTED IP>
-   ReqUnset       x-forwarded-for: <REDACTED IP>
-   ReqHeader      X-Forwarded-For: <REDACTED IP>, <REDACTED IP>
-   VCL_call       RECV
-   ReqUnset       X-Forwarded-For: <REDACTED IP>, <REDACTED IP>
-   ReqHeader      X-Forwarded-For: <REDACTED IP>
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqUnset       Cookie:
-   ReqHeader      Cookie:
-   ReqURL         /example-page/
-   ReqUnset       Cookie:
-   VCL_return     hash
-   VCL_call       HASH
-   ReqHeader      newUrl: /example-page/
-   VCL_return     lookup
-   HitMiss        3617921 98.252548
-   VCL_call       MISS
-   VCL_return     fetch
-   Link           bereq 3007536 fetch
-   Timestamp      Fetch: 1638408711.615163 1.400829 1.400829
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Server: nginx
-   RespHeader     Date: Thu, 02 Dec 2021 01:31:51 GMT
-   RespHeader     Content-Type: text/html; charset=UTF-8
-   RespHeader     Vary: Accept-Encoding
-   RespHeader     Set-Cookie: wp_woocommerce_session_7006ef9ea65e2d5e1a330b658a34667c=e005d432498b0dc461295d782d2bc59c%7C%7C1638581510%7C%7C1638577910%7C%7C13a8cdc105250d841e0fef56a1255d03; expires=Sat, 04-Dec-2021 01:31:50 GMT; Max-Age=172800; path=/; secure; HttpOnly
-   RespHeader     Set-Cookie: wl_current_page_id=20387; expires=Thu, 02-Dec-2021 02:31:50 GMT; Max-Age=3600; path=/
-   RespHeader     Link: <https://example.com/wp-json/>; rel="https://api.w.org/"
-   RespHeader     Link: <https://example.com/wp-json/wp/v2/posts/20387>; rel="alternate"; type="application/json"
-   RespHeader     Link: <https://example.com/?p=20387>; rel=shortlink
-   RespHeader     X-Content-Type-Options: nosniff
-   RespHeader     X-XSS-Protection: 1; mode=block
-   RespHeader     X-UA-Compatible: IE=Edge
-   RespHeader     Content-Security-Policy: frame-ancestors 'self'
-   RespHeader     X-Frame-Options: SAMEORIGIN
-   RespHeader     Content-Encoding: gzip
-   RespHeader     x-url: /example-page/
-   RespHeader     x-host: example.com
-   RespHeader     X-Varnish: 3007535
-   RespHeader     Age: 0
-   RespHeader     Via: 1.1 varnish (Varnish/6.0)
-   VCL_call       DELIVER
-   RespUnset      x-url: /example-page/
-   RespUnset      x-host: example.com
-   RespHeader     X-Cache: MISS
-   RespHeader     X-Cache-Hits: 0
-   RespUnset      X-Varnish: 3007535
-   RespUnset      Via: 1.1 varnish (Varnish/6.0)
-   RespUnset      Server: nginx
-   VCL_return     deliver
-   Timestamp      Process: 1638408711.615227 1.400893 0.000064
-   RespUnset      Content-Encoding: gzip
-   RespHeader     Accept-Ranges: bytes
-   RespHeader     Connection: keep-alive
-   Gzip           U D - 0 0 0 0 0
-   Timestamp      Resp: 1638408711.615536 1.401202 0.000309
-   ReqAcct        239 0 239 947 0 947
-   End
**  << BeReq    >> 3007536
--  Begin          bereq 3007535 fetch
--  VCL_use        reload_20211202_013110_5534
--  Timestamp      Start: 1638408710.214561 0.000000 0.000000
--  BereqMethod    HEAD
--  BereqURL       /example-page/
--  BereqProtocol  HTTP/1.1
--  BereqHeader    host: example.com
--  BereqHeader    user-agent: curl/7.77.0
--  BereqHeader    accept: */*
--  BereqHeader    x-forwarded-port: 443
--  BereqHeader    x-forwarded-proto: https
--  BereqHeader    X-Forwarded-For: <REDACTED IP>
--  BereqHeader    newUrl: /example-page/
--  BereqMethod    GET
--  BereqHeader    Accept-Encoding: gzip
--  BereqHeader    X-Varnish: 3007536
--  VCL_call       BACKEND_FETCH
--  VCL_return     fetch
--  BackendOpen    66 reload_20211202_013110_5534.planet-a 10.0.55.20 80 10.0.55.10 43206
--  BackendStart   <REDACTED> 80
--  Timestamp      Bereq: 1638408710.214709 0.000147 0.000147
--  Timestamp      Beresp: 1638408711.614969 1.400408 1.400261
--  BerespProtocol HTTP/1.1
--  BerespStatus   200
--  BerespReason   OK
--  BerespHeader   Server: nginx
--  BerespHeader   Date: Thu, 02 Dec 2021 01:31:51 GMT
--  BerespHeader   Content-Type: text/html; charset=UTF-8
--  BerespHeader   Transfer-Encoding: chunked
--  BerespHeader   Connection: keep-alive
--  BerespHeader   Vary: Accept-Encoding
--  BerespHeader   Set-Cookie: wp_woocommerce_session_7006ef9ea65e2d5e1a330b658a34667c=e005d432498b0dc461295d782d2bc59c%7C%7C1638581510%7C%7C1638577910%7C%7C13a8cdc105250d841e0fef56a1255d03; expires=Sat, 04-Dec-2021 01:31:50 GMT; Max-Age=172800; path=/; secure; HttpOnly
--  BerespHeader   Set-Cookie: wl_current_page_id=20387; expires=Thu, 02-Dec-2021 02:31:50 GMT; Max-Age=3600; path=/
--  BerespHeader   Link: <https://example.com/wp-json/>; rel="https://api.w.org/"
--  BerespHeader   Link: <https://example.com/wp-json/wp/v2/posts/20387>; rel="alternate"; type="application/json"
--  BerespHeader   Link: <https://example.com/?p=20387>; rel=shortlink
--  BerespHeader   X-Content-Type-Options: nosniff
--  BerespHeader   X-XSS-Protection: 1; mode=block
--  BerespHeader   X-UA-Compatible: IE=Edge
--  BerespHeader   Content-Security-Policy: frame-ancestors 'self'
--  BerespHeader   X-Frame-Options: SAMEORIGIN
--  BerespHeader   Content-Encoding: gzip
--  TTL            RFC 120 10 0 1638408712 1638408712 1638408711 0 0 cacheable
--  VCL_call       BACKEND_RESPONSE
--  BerespHeader   x-url: /example-page/
--  BerespHeader   x-host: example.com
--  TTL            VCL 2592000 10 0 1638408712 cacheable
--  TTL            VCL 2592000 86400 0 1638408712 cacheable
--  TTL            VCL 120 86400 0 1638408712 cacheable
--  TTL            VCL 120 86400 0 1638408712 uncacheable
--  VCL_return     deliver
--  Storage        malloc Transient
--  Fetch_Body     2 chunked stream
--  Debug          "Fetch: Pass delivery abandoned%00"
--  Gzip           u F - 13522 65516 80 80 0
--  BackendClose   66 reload_20211202_013110_5534.planet-a
--  Timestamp      BerespBody: 1638408711.616261 1.401700 0.001292
--  Length         13522
--  BereqAcct      347 0 347 952 13522 14474
--  End

2 Answers


  1. Does the cookie removal VCL code execute?

    Can you check whether you actually reach the point where these cookies are removed?

    The following code that is executed right before the cookie removal might impact this:

    if (req.http.Cookie ~ "wordpress_logged_in_|resetpass|wp-postpass_") {
        return(pass);
    }
    

    If, for example, a wordpress_logged_in_ cookie is set, the cache will be bypassed and none of the cookie removal code will be executed.

    You can test this by running the following varnishlog command:

    sudo varnishlog -g request -q "ReqUrl eq '/'"
    

    FYI: this varnishlog command only shows requests for the homepage. Change / to the appropriate URL if you want to inspect requests for other pages.

    The output will show what happens to the request and how Varnish behaves for this specific request. If you need help, just add the varnishlog output to your original question and I’ll help you investigate.

    Keep necessary cookies instead of removing unwanted cookies

    It’s a tedious job to keep track of all the cookies you want to remove.

    You can also remove all cookies and only keep the ones that matter. Here’s some example code that only keeps the wordpress_logged_in_[hash] cookie while removing all others:

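    sub vcl_recv {
        if (req.http.Cookie) {
            # Prefix with ";" so every cookie starts with the same delimiter
            set req.http.Cookie = ";" + req.http.Cookie;
            # Strip the spaces that follow each ";"
            set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
            # Mark the cookies to keep by re-inserting a space after their ";"
            set req.http.Cookie = regsuball(req.http.Cookie, ";(wordpress_logged_in_[A-Za-z0-9]+)=", "; \1=");
            # Remove every cookie that was not marked with a space
            set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
            # Trim leading and trailing separators
            set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

            if (req.http.cookie ~ "^\s*$") {
                unset req.http.cookie;
            }
        }
    }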
    

    The official Varnish WordPress VCL file

    If you’re interested in which VCL file for WordPress we recommend at Varnish Software, have a look at https://www.varnish-software.com/developers/tutorials/configuring-varnish-wordpress/

    Update: my reaction to your VSL output

    I noticed in your VSL output that no cookie was passed by the client for the request to /example-page/.

    The cookie find and replace logic kicked in, but had no impact, since there were no cookies to remove.

    You can see this in the VSL output in the repeated occurrences of the following lines:

    -   ReqHeader      Cookie:
    -   ReqUnset       Cookie:
    

    The request was considered cacheable, so vcl_recv returned hash and a cache lookup was performed.

    The object is found, but marked as Hit-For-Miss, which is indicated by the following log line:

    -   HitMiss        3617921 98.252548
    

    This means the content was marked as uncacheable in a previous request/response transaction.

    We then continue to fetch the content, which is again marked as uncacheable.

    If you’re familiar with Varnish’s built-in VCL, you’ll immediately spot the Set-Cookie headers in the response that cause this:

    -   RespHeader     Set-Cookie: wp_woocommerce_session_7006ef9ea65e2d5e1a330b658a34667c=e005d432498b0dc461295d782d2bc59c%7C%7C1638581510%7C%7C1638577910%7C%7C13a8cdc105250d841e0fef56a1255d03; expires=Sat, 04-Dec-2021 01:31:50 GMT; Max-Age=172800; path=/; secure; HttpOnly
    -   RespHeader     Set-Cookie: wl_current_page_id=20387; expires=Thu, 02-Dec-2021 02:31:50 GMT; Max-Age=3600; path=/
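
    For reference, the part of the built-in vcl_backend_response that triggers this looks roughly like the following in Varnish 6.0. This is reproduced from memory as a sketch, so compare it against the builtin.vcl shipped with your version. It runs after your own vcl_backend_response unless that subroutine returns explicitly, which matches the "TTL VCL 120 ... uncacheable" line in your log:

    ```vcl
    sub vcl_backend_response {
        if (bereq.uncacheable) {
            return (deliver);
        } else if (beresp.ttl <= 0s ||
          beresp.http.Set-Cookie ||
          beresp.http.Surrogate-control ~ "(?i)no-store" ||
          (!beresp.http.Surrogate-Control &&
            beresp.http.Cache-Control ~ "(?i:no-cache|no-store|private)") ||
          beresp.http.Vary == "*") {
            # Any response carrying Set-Cookie ends up here:
            # mark it as Hit-For-Miss for the next 2 minutes
            set beresp.ttl = 120s;
            set beresp.uncacheable = true;
        }
        return (deliver);
    }
    ```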
    

    The conclusion is that your WooCommerce/WordPress setup issues Set-Cookie headers upon the first request. If these session cookies aren’t passed by the client upon subsequent requests, WooCommerce/WordPress will continue to set them.

    This is the reason why the content cannot be cached.

    Please ensure that the session cookie is only set in situations where stateful data is required (e.g.: adding content to the shopping cart, performing the checkout).

    For regular browsing through the product pages, session cookies are not required. But if you believe you do require them for certain parts of the page, please ensure this logic is contained in separate HTTP calls via AJAX or ESI.
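
    A minimal sketch of the ESI route, assuming the backend marks responses that contain <esi:include> tags with a Surrogate-Control header. That is an assumption about your setup: WordPress does not send this header out of the box, so a plugin or theme change would be needed to emit both the header and the ESI tags.

    ```vcl
    sub vcl_recv {
        # Tell the backend that this Varnish instance can process ESI
        set req.http.Surrogate-Capability = "key=ESI/1.0";
    }

    sub vcl_backend_response {
        # Only parse ESI tags when the backend says the response contains them
        if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
            unset beresp.http.Surrogate-Control;
            set beresp.do_esi = true;
        }
    }
    ```

    With this in place the page itself can be cached without a session, while the stateful fragment is fetched through a separate ESI subrequest that can be passed to the backend.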

  2. I have not read the details of the question, but there is a very short answer: Do not modify the cookie header like this. For example, your regexen are not anchored on the left-hand side and thus will match any cookie whose name ends with the pattern.
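
    To illustrate the anchoring problem with a hypothetical cookie name: the pattern "_ga=[^;]+(; )?" from the question also matches the tail of a cookie called my_ga. If the header is edited with regular expressions at all, a left-anchored variant could look like this sketch:

    ```vcl
    sub vcl_recv {
        if (req.http.Cookie) {
            # Match _ga only at the start of the header or right after a ";"
            set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)_ga=[^;]*", "");
            # Drop a separator left behind when the first cookie was removed
            set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
        }
    }
    ```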

    Also, calling regsuball() repeatedly is inefficient, both computationally and with respect to the workspace memory used.

    There is a very simple answer: Use vmod_cookie.
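
    As a rough sketch of what that could look like, assuming vmod_cookie is installed: it is bundled with Varnish 6.4 and later, while on 6.0 it would typically come from the separate varnish-modules collection, where some function names may differ. The cookie names are taken from the question, and the selection of which ones to delete is only illustrative. This is a fragment to merge into an existing VCL file, not a complete configuration:

    ```vcl
    import cookie;

    sub vcl_recv {
        if (req.http.Cookie) {
            # Parse the Cookie header into vmod_cookie's internal store
            cookie.parse(req.http.Cookie);

            # Delete cookies by exact name (names taken from the question)
            cookie.delete("has_js");
            cookie.delete("wordpress_test_cookie");
            cookie.delete("PHPSESSID");
            cookie.delete("_ga");

            # Write the remaining cookies back to the request
            set req.http.Cookie = cookie.get_string();

            if (req.http.Cookie == "") {
                unset req.http.Cookie;
            }
        }
    }
    ```

    Because the VMOD works on parsed cookie names rather than on the raw header string, the anchoring and separator problems of the regsuball() approach do not arise.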
