I’m having trouble stripping out cookies I don’t want using Varnish (v6.0.8). I thought this was working, but I think instead it was the URL-based filtering that was working and I just hadn’t noticed because it overlapped.
The relevant section of my configuration file is as follows:
sub vcl_recv {
    set req.http.X-Forwarded-For = regsub(req.http.X-Forwarded-For,"^([^,]+)(,[^,]+)*","\1");
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "This IP is not allowed to send PURGE requests."));
        }
        if (req.http.X-Purge-Method == "regex") {
            ban("obj.http.x-url ~ " + req.url + " && obj.http.x-host ~ " + req.http.host);
            return (synth(200, "Banned"));
        }
        return (purge);
    }
    if (req.url ~ "(wp-admin|post.php|edit.php|wp-login|wp-json)") {
        return(pass);
    }
    if (req.http.Cookie ~ "wordpress_logged_in_|resetpass|wp-postpass_") {
        return(pass);
    }
    # Remove cookies
    set req.http.Cookie = regsuball(req.http.Cookie, "comment_author_[a-zA-Z0-9_]+", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "PHPSESSID=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "wp_woocommerce_session_[^;]+(;)", "");
    # Remove proxy header (see https://httpoxy.org/#mitigate-varnish)
    unset req.http.proxy;
    # Normalize query arguments (sort alphabetically)
    set req.url = std.querysort(req.url);
    # Strip trailing ? if it exists
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }
    # Limit requests to the following types
    if (req.method !~ "^GET|HEAD|PUT|POST|TRACE|OPTIONS|PATCH|DELETE$") {
        return (pipe);
    }
    # Only cache GET or HEAD requests to ensure that POST requests are always passed through, along with their cookies
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # Don't cache AJAX requests
    if (req.http.X-Requested-With == "XMLHttpRequest") {
        return(pass);
    }
    # Don't cache images and PDFs
    if (req.url ~ "\.(gif|jpg|jpeg|bmp|png|pdf)$") {
        return(pass);
    }
    # Don't cache large files (zip, audio, video, etc.)
    if (req.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|wmv|xz|zip)(\?.*)?$") {
        return (pipe);
    }
    # Add support for ESI
    if (req.http.Authorization) {
        return (pass);
    }
    # WordPress: don't cache WooCommerce pages
    if (req.url ~ "(cart|my-account|checkout|addons|checkouts|product)") {
        return (pass);
    }
    if ( req.url ~ "(\?|&)add-to-cart=" ) {
        return (pass);
    }
    if ( req.url ~ "(\?|&)wc-ajax=" ) {
        return (pass);
    }
    # WordPress: don't cache associated requests if the referer is a WooCommerce page
    if (req.http.referer ~ "(cart|my-account|checkout|addons|checkouts|product)") {
        return(pass);
    }
    # Are there cookies left with only spaces or that are empty?
    if (req.http.cookie ~ "^\s*$") {
        unset req.http.cookie;
    }
    # Remove all cookies to enable caching
    unset req.http.Cookie;
    return (hash);
}
As far as I understand it, the regexes in the # Remove cookies section should be stripping out those cookies, but they don’t. What am I doing wrong?
Thanks!
UPDATE: Here is the log output from Varnish. I think the cookie removal code is not executing, but I can’t see why. I’ve tried moving it up to the top of the sub vcl_recv section, but it makes no difference:
* << Request >> 3007535
- Begin req 3007534 rxreq
- Timestamp Start: 1638408710.214334 0.000000 0.000000
- Timestamp Req: 1638408710.214334 0.000000 0.000000
- ReqStart <REDACTED IP> a1
- ReqMethod HEAD
- ReqURL /example-page/
- ReqProtocol HTTP/1.1
- ReqHeader host: example.com
- ReqHeader user-agent: curl/7.77.0
- ReqHeader accept: */*
- ReqHeader x-forwarded-port: 443
- ReqHeader x-forwarded-proto: https
- ReqHeader x-forwarded-for: <REDACTED IP>
- ReqUnset x-forwarded-for: <REDACTED IP>
- ReqHeader X-Forwarded-For: <REDACTED IP>, <REDACTED IP>
- VCL_call RECV
- ReqUnset X-Forwarded-For: <REDACTED IP>, <REDACTED IP>
- ReqHeader X-Forwarded-For: <REDACTED IP>
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqUnset Cookie:
- ReqHeader Cookie:
- ReqURL /example-page/
- ReqUnset Cookie:
- VCL_return hash
- VCL_call HASH
- ReqHeader newUrl: /example-page/
- VCL_return lookup
- HitMiss 3617921 98.252548
- VCL_call MISS
- VCL_return fetch
- Link bereq 3007536 fetch
- Timestamp Fetch: 1638408711.615163 1.400829 1.400829
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Server: nginx
- RespHeader Date: Thu, 02 Dec 2021 01:31:51 GMT
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader Vary: Accept-Encoding
- RespHeader Set-Cookie: wp_woocommerce_session_7006ef9ea65e2d5e1a330b658a34667c=e005d432498b0dc461295d782d2bc59c%7C%7C1638581510%7C%7C1638577910%7C%7C13a8cdc105250d841e0fef56a1255d03; expires=Sat, 04-Dec-2021 01:31:50 GMT; Max-Age=172800; path=/; secure; HttpOnly
- RespHeader Set-Cookie: wl_current_page_id=20387; expires=Thu, 02-Dec-2021 02:31:50 GMT; Max-Age=3600; path=/
- RespHeader Link: <https://example.com/wp-json/>; rel="https://api.w.org/"
- RespHeader Link: <https://example.com/wp-json/wp/v2/posts/20387>; rel="alternate"; type="application/json"
- RespHeader Link: <https://example.com/?p=20387>; rel=shortlink
- RespHeader X-Content-Type-Options: nosniff
- RespHeader X-XSS-Protection: 1; mode=block
- RespHeader X-UA-Compatible: IE=Edge
- RespHeader Content-Security-Policy: frame-ancestors 'self'
- RespHeader X-Frame-Options: SAMEORIGIN
- RespHeader Content-Encoding: gzip
- RespHeader x-url: /example-page/
- RespHeader x-host: example.com
- RespHeader X-Varnish: 3007535
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish (Varnish/6.0)
- VCL_call DELIVER
- RespUnset x-url: /example-page/
- RespUnset x-host: example.com
- RespHeader X-Cache: MISS
- RespHeader X-Cache-Hits: 0
- RespUnset X-Varnish: 3007535
- RespUnset Via: 1.1 varnish (Varnish/6.0)
- RespUnset Server: nginx
- VCL_return deliver
- Timestamp Process: 1638408711.615227 1.400893 0.000064
- RespUnset Content-Encoding: gzip
- RespHeader Accept-Ranges: bytes
- RespHeader Connection: keep-alive
- Gzip U D - 0 0 0 0 0
- Timestamp Resp: 1638408711.615536 1.401202 0.000309
- ReqAcct 239 0 239 947 0 947
- End
** << BeReq >> 3007536
-- Begin bereq 3007535 fetch
-- VCL_use reload_20211202_013110_5534
-- Timestamp Start: 1638408710.214561 0.000000 0.000000
-- BereqMethod HEAD
-- BereqURL /example-page/
-- BereqProtocol HTTP/1.1
-- BereqHeader host: example.com
-- BereqHeader user-agent: curl/7.77.0
-- BereqHeader accept: */*
-- BereqHeader x-forwarded-port: 443
-- BereqHeader x-forwarded-proto: https
-- BereqHeader X-Forwarded-For: <REDACTED IP>
-- BereqHeader newUrl: /example-page/
-- BereqMethod GET
-- BereqHeader Accept-Encoding: gzip
-- BereqHeader X-Varnish: 3007536
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- BackendOpen 66 reload_20211202_013110_5534.planet-a 10.0.55.20 80 10.0.55.10 43206
-- BackendStart <REDACTED> 80
-- Timestamp Bereq: 1638408710.214709 0.000147 0.000147
-- Timestamp Beresp: 1638408711.614969 1.400408 1.400261
-- BerespProtocol HTTP/1.1
-- BerespStatus 200
-- BerespReason OK
-- BerespHeader Server: nginx
-- BerespHeader Date: Thu, 02 Dec 2021 01:31:51 GMT
-- BerespHeader Content-Type: text/html; charset=UTF-8
-- BerespHeader Transfer-Encoding: chunked
-- BerespHeader Connection: keep-alive
-- BerespHeader Vary: Accept-Encoding
-- BerespHeader Set-Cookie: wp_woocommerce_session_7006ef9ea65e2d5e1a330b658a34667c=e005d432498b0dc461295d782d2bc59c%7C%7C1638581510%7C%7C1638577910%7C%7C13a8cdc105250d841e0fef56a1255d03; expires=Sat, 04-Dec-2021 01:31:50 GMT; Max-Age=172800; path=/; secure; HttpOnly
-- BerespHeader Set-Cookie: wl_current_page_id=20387; expires=Thu, 02-Dec-2021 02:31:50 GMT; Max-Age=3600; path=/
-- BerespHeader Link: <https://example.com/wp-json/>; rel="https://api.w.org/"
-- BerespHeader Link: <https://example.com/wp-json/wp/v2/posts/20387>; rel="alternate"; type="application/json"
-- BerespHeader Link: <https://example.com/?p=20387>; rel=shortlink
-- BerespHeader X-Content-Type-Options: nosniff
-- BerespHeader X-XSS-Protection: 1; mode=block
-- BerespHeader X-UA-Compatible: IE=Edge
-- BerespHeader Content-Security-Policy: frame-ancestors 'self'
-- BerespHeader X-Frame-Options: SAMEORIGIN
-- BerespHeader Content-Encoding: gzip
-- TTL RFC 120 10 0 1638408712 1638408712 1638408711 0 0 cacheable
-- VCL_call BACKEND_RESPONSE
-- BerespHeader x-url: /example-page/
-- BerespHeader x-host: example.com
-- TTL VCL 2592000 10 0 1638408712 cacheable
-- TTL VCL 2592000 86400 0 1638408712 cacheable
-- TTL VCL 120 86400 0 1638408712 cacheable
-- TTL VCL 120 86400 0 1638408712 uncacheable
-- VCL_return deliver
-- Storage malloc Transient
-- Fetch_Body 2 chunked stream
-- Debug "Fetch: Pass delivery abandoned%00"
-- Gzip u F - 13522 65516 80 80 0
-- BackendClose 66 reload_20211202_013110_5534.planet-a
-- Timestamp BerespBody: 1638408711.616261 1.401700 0.001292
-- Length 13522
-- BereqAcct 347 0 347 952 13522 14474
-- End
2 Answers
Does the cookie removal VCL code execute?
Can you check whether you actually reach the point where these cookies are removed?
The following code that is executed right before the cookie removal might impact this:
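The snippet meant here is presumably the pair of early pass checks from the question's vcl_recv. They are reproduced below from the question, with comments added; nothing else is changed:

```vcl
# Excerpt from inside the question's vcl_recv.
# Both checks run BEFORE the "# Remove cookies" block.

# Any admin, login or REST API URL bypasses the cache immediately.
if (req.url ~ "(wp-admin|post.php|edit.php|wp-login|wp-json)") {
    return(pass);
}

# Any request carrying one of these cookies also bypasses the cache,
# so the regsuball() calls further down are never reached for it.
if (req.http.Cookie ~ "wordpress_logged_in_|resetpass|wp-postpass_") {
    return(pass);
}
```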
If, for example, a wordpress_logged_in_ cookie is set, the cache will be bypassed and none of the cookie removal code will be executed.
You can test this by running a varnishlog command filtered on the URL you are testing, for example something like: varnishlog -g request -q "ReqUrl eq '/example-page/'"
The output will show what happens to the request and how Varnish behaves for this specific request. If you need help, just paste the varnishlog output into your original question and I’ll help you investigate.
Keep necessary cookies instead of removing unwanted cookies
It’s a tedious job to keep track of all the cookies you want to remove.
You can also remove all cookies and only keep the ones that matter. The idea is to keep only the wordpress_logged_in_[hash] cookie while removing all others.
The official Varnish WordPress VCL file
If you’re interested in which VCL file for WordPress we recommend at Varnish Software, have a look at https://www.varnish-software.com/developers/tutorials/configuring-varnish-wordpress/
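Coming back to the keep-only approach described above (keep wordpress_logged_in_[hash], drop everything else), a minimal sketch using only the built-in regsuball() function might look like the following. It is an illustration based on the commonly used cookie-whitelisting pattern, not necessarily the code this answer originally contained, and the character class used for the hash is an assumption:

```vcl
# Fragment meant to be merged into an existing VCL file.
sub vcl_recv {
    if (req.http.Cookie) {
        # Prefix with ";" and remove spaces after separators, so every
        # cookie looks like ";name=value".
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");

        # Mark the cookies to keep by putting a space after their ";".
        # Assumption: the hash suffix is alphanumeric.
        set req.http.Cookie = regsuball(req.http.Cookie,
            ";(wordpress_logged_in_[a-zA-Z0-9]+)=", "; \1=");

        # Drop every cookie that was not marked (no space after ";").
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");

        # Trim leftover separators at both ends.
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

        # Nothing left: remove the header so the request can be cached.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

With this approach new tracking cookies never need to be added to a blocklist; only the short list of cookies the application actually needs has to be maintained.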
Update: my reaction to your VSL output
I noticed in your VSL output that no cookie was passed by the client for the request to /example-page/. The cookie find and replace logic kicked in, but had no impact, since there were no cookies to remove.
You can see this in the VSL output in the repeated pairs of empty ReqHeader Cookie: and ReqUnset Cookie: lines.
The content was considered cacheable and a cache lookup was made following the vcl_hash subroutine. The object was found, but marked as Hit-For-Miss, which is indicated by the HitMiss 3617921 98.252548 log line.
This means the content was marked as uncacheable in a previous request/response transaction.
We then continue to fetch the content, which is again marked as uncacheable.
If you’re familiar with Varnish’s built-in VCL, you’ll immediately spot the Set-Cookie headers in the response that cause this (wp_woocommerce_session_[hash] and wl_current_page_id).
The conclusion is that your WooCommerce/WordPress setup issues Set-Cookie headers upon the first request. If these session cookies aren’t passed by the client upon subsequent requests, WooCommerce/WordPress will continue to set them. This is the reason why the content cannot be cached.
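For readers who don't have the built-in VCL in mind, the behaviour described above can be sketched roughly as follows. This is a simplified paraphrase, not the literal built-in code: the real built-in vcl_backend_response also checks Cache-Control, Surrogate-Control and Vary, and the exact text differs between Varnish versions.

```vcl
# Simplified paraphrase of the built-in backend response logic.
sub vcl_backend_response {
    # A response with a non-positive TTL or a Set-Cookie header is
    # treated as uncacheable ...
    if (beresp.ttl <= 0s || beresp.http.Set-Cookie) {
        # ... and remembered as Hit-For-Miss for 2 minutes, which matches
        # the "TTL VCL 120 ... uncacheable" line in the log above.
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
```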
Please ensure that the session cookie is only set in situations where stateful data is required (e.g.: adding content to the shopping cart, performing the checkout).
For regular browsing through the product pages, session cookies are not required. But if you believe you do require them for certain parts of the page, please ensure this logic is contained in separate HTTP calls via AJAX or ESI.
I have not read the details of the question, but there is a very short answer: Do not modify the cookie header like this. For example, your regexen are not anchored on the left hand side and thus will match any cookie whose name ends with the pattern.
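To make the anchoring point concrete, here is a small sketch of a left-anchored removal of a single cookie. The cookie name _ga is taken from the question; the hypothetical my_ga cookie is only there to show what an unanchored pattern would damage:

```vcl
# Fragment meant to be merged into an existing VCL file.
sub vcl_recv {
    if (req.http.Cookie) {
        # Unanchored, "_ga=[^;]+(; )?" would also match inside a cookie
        # named "my_ga" (hypothetical), leaving a stray "my" behind.
        # Anchored: the name must start at the beginning of the header
        # or directly after a ";" separator.
        set req.http.Cookie = regsuball(req.http.Cookie,
            "(^|; *)_ga=[^;]*", "");

        # If the removed cookie was first, a leading "; " is left over.
        set req.http.Cookie = regsub(req.http.Cookie, "^; *", "");

        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```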
Also, calling regsuball() repeatedly is inefficient, both computationally and with respect to the workspace memory used.
There is a very simple answer: Use vmod_cookie.
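As a rough sketch of what that could look like, the following removes a few of the exact-name cookies from the question with vmod_cookie. It assumes the module is installed and loadable; as far as I know it is not bundled with Varnish 6.0 and has to come from the separate varnish-modules collection there. Only parse(), delete() and get_string() are used, and cookies with variable names (such as wp_woocommerce_session_[hash]) are not covered by this sketch:

```vcl
import cookie;

# Fragment meant to be merged into an existing VCL file.
sub vcl_recv {
    if (req.http.Cookie) {
        # Parse the header once instead of rewriting it per cookie.
        cookie.parse(req.http.Cookie);

        # Delete cookies by exact name.
        cookie.delete("has_js");
        cookie.delete("PHPSESSID");
        cookie.delete("_ga");
        cookie.delete("wordpress_test_cookie");
        cookie.delete("wp-settings-1");
        cookie.delete("wp-settings-time-1");

        # Write the remaining cookies back to the request.
        set req.http.Cookie = cookie.get_string();

        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```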