I'm using Traefik 2.2 on a bare-metal Kubernetes cluster with a single master node. I don't have a physical or virtual load balancer, so the Traefik pod takes in all requests on ports 80 and 443. I have an example WordPress installed with Helm. As you can see at http://wp-example.cryptexlabs.com/feed/, exactly every other request is a 500 error. I can confirm that the failing requests never reach the WordPress container, so I know this has something to do with Traefik. The Traefik logs just show that there was a 500 error.
I have one pod in the traefik namespace, a service in the default namespace, and an ExternalName service in the default namespace that points to the example WordPress site, which is in the wp-example namespace.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: traefik
    chart: traefik-0.2.0
    heritage: Tiller
    release: traefik
  name: traefik
  namespace: traefik
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: traefik
      release: traefik
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: traefik
        chart: traefik-0.2.0
        heritage: Tiller
        release: traefik
    spec:
      containers:
      - args:
        - --api.insecure
        - --accesslog
        - --entrypoints.web.Address=:80
        - --entrypoints.websecure.Address=:443
        - --providers.kubernetescrd
        - --certificatesresolvers.default.acme.tlschallenge
        - [email protected]
        - --certificatesresolvers.default.acme.storage=acme.json
        - --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
        image: traefik:2.2
        imagePullPolicy: IfNotPresent
        name: traefik
        ports:
        - containerPort: 80
          hostPort: 80
          name: web
          protocol: TCP
        - containerPort: 443
          hostPort: 443
          name: websecure
          protocol: TCP
        - containerPort: 8088
          name: admin
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: traefik-service-account
      serviceAccountName: traefik-service-account
      terminationGracePeriodSeconds: 60
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: wp-example.cryptexlabs.com
  namespace: wp-example
spec:
  entryPoints:
  - web
  routes:
  - kind: Rule
    match: Host(`wp-example.cryptexlabs.com`)
    services:
    - name: wp-example
      port: 80
    - name: wp-example
      port: 443
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/instance: wp-example
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: wordpress
    helm.sh/chart: wordpress-9.3.14
  name: wp-example-wordpress
  namespace: wp-example
spec:
  clusterIP: 10.101.142.74
  externalTrafficPolicy: Cluster
  ports:
  - name: http
    nodePort: 31862
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 32473
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/instance: wp-example
    app.kubernetes.io/name: wordpress
  sessionAffinity: None
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/instance: wp-example
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: wordpress
    helm.sh/chart: wordpress-9.3.14
  name: wp-example-wordpress
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: wp-example
      app.kubernetes.io/name: wordpress
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: wp-example
        app.kubernetes.io/managed-by: Tiller
        app.kubernetes.io/name: wordpress
        helm.sh/chart: wordpress-9.3.14
    spec:
      containers:
      - env:
        - name: ALLOW_EMPTY_PASSWORD
          value: "yes"
        - name: MARIADB_HOST
          value: wp-example-mariadb
        - name: MARIADB_PORT_NUMBER
          value: "3306"
        - name: WORDPRESS_DATABASE_NAME
          value: bitnami_wordpress
        - name: WORDPRESS_DATABASE_USER
          value: bn_wordpress
        - name: WORDPRESS_DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              key: mariadb-password
              name: wp-example-mariadb
        - name: WORDPRESS_USERNAME
          value: user
        - name: WORDPRESS_PASSWORD
          valueFrom:
            secretKeyRef:
              key: wordpress-password
              name: wp-example-wordpress
        - name: WORDPRESS_EMAIL
          value: [email protected]
        - name: WORDPRESS_FIRST_NAME
          value: FirstName
        - name: WORDPRESS_LAST_NAME
          value: LastName
        - name: WORDPRESS_HTACCESS_OVERRIDE_NONE
          value: "no"
        - name: WORDPRESS_HTACCESS_PERSISTENCE_ENABLED
          value: "no"
        - name: WORDPRESS_BLOG_NAME
          value: "User's Blog!"
        - name: WORDPRESS_SKIP_INSTALL
          value: "no"
        - name: WORDPRESS_TABLE_PREFIX
          value: wp_
        - name: WORDPRESS_SCHEME
          value: http
        image: docker.io/bitnami/wordpress:5.4.2-debian-10-r6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 6
          httpGet:
            path: /wp-login.php
            port: http
            scheme: HTTP
          initialDelaySeconds: 120
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: wordpress
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 8443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 6
          httpGet:
            path: /wp-login.php
            port: http
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 300m
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bitnami/wordpress
          name: wordpress-data
          subPath: wordpress
      dnsPolicy: ClusterFirst
      hostAliases:
      - hostnames:
        - status.localhost
        ip: 127.0.0.1
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      terminationGracePeriodSeconds: 30
      volumes:
      - name: wordpress-data
        persistentVolumeClaim:
          claimName: wp-example-wordpress
Output of `kubectl describe svc wp-example-wordpress -n wp-example`:
Name:                     wp-example-wordpress
Namespace:                wp-example
Labels:                   app.kubernetes.io/instance=wp-example
                          app.kubernetes.io/managed-by=Tiller
                          app.kubernetes.io/name=wordpress
                          helm.sh/chart=wordpress-9.3.14
Annotations:              <none>
Selector:                 app.kubernetes.io/instance=wp-example,app.kubernetes.io/name=wordpress
Type:                     LoadBalancer
IP:                       10.101.142.74
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31862/TCP
Endpoints:                10.32.0.17:8080
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32473/TCP
Endpoints:                10.32.0.17:8443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
josh@Joshs-MacBook-Pro-2:$ ab -n 10000 -c 10 http://wp-example.cryptexlabs.com/
This is ApacheBench, Version 2.3 <$Revision: 1874286 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking wp-example.cryptexlabs.com (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software: Apache/2.4.43
Server Hostname: wp-example.cryptexlabs.com
Server Port: 80
Document Path: /
Document Length: 26225 bytes
Concurrency Level: 10
Time taken for tests: 37.791 seconds
Complete requests: 10000
Failed requests: 5000
(Connect: 0, Receive: 0, Length: 5000, Exceptions: 0)
Non-2xx responses: 5000
Total transferred: 133295000 bytes
HTML transferred: 131230000 bytes
Requests per second: 264.61 [#/sec] (mean)
Time per request: 37.791 [ms] (mean)
Time per request: 3.779 [ms] (mean, across all concurrent requests)
Transfer rate: 3444.50 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    6   8.1      5     239
Processing:     4   32  29.2     39     315
Waiting:        4   29  26.0     34     307
Total:          7   38  31.6     43     458
Percentage of the requests served within a certain time (ms)
50% 43
66% 49
75% 51
80% 52
90% 56
95% 60
98% 97
99% 180
100% 458 (longest request)
The Traefik debug logs (https://pastebin.com/QUaAR6G0) show something about SSL and x509 certificates, even though I'm making the request via HTTP, not HTTPS.
I did a test with an nginx container that uses the same pattern and I did not have any issues. So this has something to do specifically with the relationship between wordpress and traefik.
I also saw a Traefik-related reference suggesting this can happen when Keep-Alive is not enabled on the downstream server, since Traefik has Keep-Alive enabled by default. I tried enabling Keep-Alive by extending the WordPress image. When I access the WordPress container through `kubectl port-forward`, I can see that the Keep-Alive headers are being sent, so I know it's enabled, but I am still seeing 50% of the requests fail.
3 Answers
I'm not really sure why and I can't explain it but when I add this option the random failures stop:
I saw in the Traefik logs that HTTP connections are fine, but when HTTPS redirections happen (for the favicon, etc.) you get an x509 "certificate not valid" error. That's because the WordPress pod has an SSL certificate that's not valid.
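For what it's worth, the every-other-request pattern also fits the IngressRoute in the question: it lists two `services` entries (ports 80 and 443), Traefik load-balances across the entries of a route, and as far as I know the Kubernetes CRD provider defaults to the `https` scheme for a backend on port 443. Half the requests would then hit the pod's certificate even though the client used plain HTTP. A minimal sketch of the route with only the HTTP port, assuming the backend is the `wp-example-wordpress` Service shown in the question (the question's IngressRoute references `wp-example`, so use whichever Service actually exists):

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: wp-example.cryptexlabs.com
  namespace: wp-example
spec:
  entryPoints:
    - web
  routes:
    - kind: Rule
      match: Host(`wp-example.cryptexlabs.com`)
      services:
        # Assumption: Service name taken from the question's Service manifest.
        - name: wp-example-wordpress
          port: 80   # single plain-HTTP backend; no port 443 entry
```

With a single backend entry there is nothing to alternate between, so this is a quick way to check whether the port 443 entry is the source of the failures.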
You can use
--serversTransport.insecureSkipVerify=true
safely inside your cluster, since traffic will be encrypted and outside traffic is HTTP. If you need to use a trusted certificate in the future, deploy it with the WordPress app and use Traefik with SSL passthrough so that traffic is decrypted at the pod level. Then you can remove the insecure option on Traefik.
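To make the two options concrete, here are two hedged sketches. First, where the flag would go, assuming the `traefik` Deployment from the question (only the relevant part of the container spec is shown, and the ACME arguments are omitted for brevity):

```yaml
# Fragment of the traefik Deployment's pod template (spec.template.spec).
containers:
  - name: traefik
    image: traefik:2.2
    args:
      - --api.insecure
      - --accesslog
      - --entrypoints.web.Address=:80
      - --entrypoints.websecure.Address=:443
      - --providers.kubernetescrd
      # Disables certificate verification for every HTTPS backend Traefik talks to.
      - --serversTransport.insecureSkipVerify=true
```

Second, the passthrough variant. The resource name is hypothetical, and the sketch assumes the `websecure` entry point and the `wp-example-wordpress` Service from the question, the `IngressRouteTCP` CRD being installed, and a trusted certificate already served by the pod:

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
metadata:
  name: wp-example-passthrough   # hypothetical name
  namespace: wp-example
spec:
  entryPoints:
    - websecure
  routes:
    - match: HostSNI(`wp-example.cryptexlabs.com`)
      services:
        - name: wp-example-wordpress
          port: 443
  tls:
    passthrough: true   # Traefik forwards the TLS stream without terminating it
```

Note that the flag is global: it applies to all backends, not just WordPress, which is worth keeping in mind before leaving it on permanently.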
Using the Bitnami Helm Argo CD image, the following configuration solved the ‘internal server error’ message on my side.
Helm values.yaml
Ingress manifest:
Redirect:
To redirect http to https I’m using a redirect middleware (this works on k3s):
Middleware spec:
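A minimal sketch of such a redirect middleware for Traefik v2, assuming the `traefik.containo.us/v1alpha1` CRDs are installed. The name and namespace are placeholders for illustration, not necessarily what was used here:

```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: redirect-https   # hypothetical name
  namespace: default     # hypothetical namespace
spec:
  redirectScheme:
    scheme: https    # redirect incoming requests to HTTPS
    permanent: true  # use a permanent redirect instead of a temporary one
```

An IngressRoute would then reference it by name from the `middlewares` list of the route that handles plain HTTP traffic.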