
I am running Airflow on Kubernetes in an AWS environment, deployed from the stable Helm chart. The error below occurs with and without mounting any external volume for log storage. I tried to configure the [logs] section to point to an EFS volume that I created; the PV gets mounted through a PVC, but my scheduler and web containers are crashing with the following error:

*** executing Airflow initdb...
Unable to load the config, contains a configuration error.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/logging/config.py", line 565, in configure
    handler = self.configure_handler(handlers[name])
  File "/usr/local/lib/python3.6/logging/config.py", line 738, in configure_handler
    result = factory(**kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/log/file_processor_handler.py", line 50, in __init__
    os.makedirs(self._get_log_directory())
  File "/usr/local/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/opt/airflow/logs/scheduler/2020-08-20'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 25, in <module>
    from airflow.configuration import conf
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__init__.py", line 47, in <module>
    settings.initialize()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/settings.py", line 374, in initialize
    LOGGING_CLASS_PATH = configure_logging()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/logging_config.py", line 68, in configure_logging
    raise e
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/logging_config.py", line 63, in configure_logging
    dictConfig(logging_config)
  File "/usr/local/lib/python3.6/logging/config.py", line 802, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/local/lib/python3.6/logging/config.py", line 573, in configure
    '%r: %s' % (name, e))
ValueError: Unable to configure handler 'processor': [Errno 13] Permission denied: '/opt/airflow/logs/scheduler/2020-08-20'

Persistent Volume (created manually, not from the stable/airflow chart):

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"efs-pv"},"spec":{"accessModes":["ReadWriteMany"],"capacity":{"storage":"5Gi"},"csi":{"driver":"efs.csi.aws.com","volumeHandle":"fs-e476a166"},"persistentVolumeReclaimPolicy":"Retain","storageClassName":"efs-sc","volumeMode":"Filesystem"}}
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-08-20T15:47:21Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: efs-pv
  resourceVersion: "49476860"
  selfLink: /api/v1/persistentvolumes/efs-pv
  uid: 45d9f5ea-66c1-493e-a2f5-03e17f397747
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: efs-claim
    namespace: airflow
    resourceVersion: "49476857"
    uid: 354103ea-f8a9-47f1-a7cf-8f449f9a2e8b
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-e476a166
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  volumeMode: Filesystem
status:
  phase: Bound

Persistent Volume Claim for logs (created manually, not from the stable/airflow chart):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"efs-claim","namespace":"airflow"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"5Gi"}},"storageClassName":"efs-sc"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2020-08-20T15:47:46Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: efs-claim
  namespace: airflow
  resourceVersion: "49476866"
  selfLink: /api/v1/namespaces/airflow/persistentvolumeclaims/efs-claim
  uid: 354103ea-f8a9-47f1-a7cf-8f449f9a2e8b
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: efs-sc
  volumeMode: Filesystem
  volumeName: efs-pv
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  phase: Bound

My values.yaml below:

airflow:
  image:
    repository: apache/airflow
    tag: 1.10.10-python3.6
    ## values: Always or IfNotPresent
    pullPolicy: IfNotPresent
    pullSecret: ""

  executor: KubernetesExecutor

  fernetKey: "XXXXXXXXXHIVb8jK6lfmSAvx4mO6Arehnc="

  config:
    AIRFLOW__CORE__REMOTE_LOGGING: "True"
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: "s3://mybucket/airflow/logs"
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: "MyS3Conn"
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: "apache/airflow"
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: "1.10.10-python3.6"
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: "IfNotPresent"
    AIRFLOW__KUBERNETES__WORKER_PODS_CREATION_BATCH_SIZE: "10"
    AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM: "efs-claim"
    AIRFLOW__KUBERNETES__GIT_REPO: "git@github.com:org/myrepo.git"
    AIRFLOW__KUBERNETES__GIT_BRANCH: "develop"
    AIRFLOW__KUBERNETES__GIT_DAGS_FOLDER_MOUNT_POINT: "/opt/airflow/dags"
    AIRFLOW__KUBERNETES__DAGS_VOLUME_SUBPATH: "repo/"
    AIRFLOW__KUBERNETES__GIT_SSH_KEY_SECRET_NAME: "airflow-git-keys"
    AIRFLOW__KUBERNETES__NAMESPACE: "airflow"
    AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: "True"
    AIRFLOW__KUBERNETES__RUN_AS_USER: "50000"
    AIRFLOW__CORE__LOAD_EXAMPLES: "False"
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "60"
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: "airflow"

  podAnnotations: {}
  extraEnv: []
  extraConfigmapMounts: []
  extraContainers: []
  extraPipPackages: []
  extraVolumeMounts: []
  extraVolumes: []
scheduler:
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  labels: {}
  podLabels: {}
  annotations: {}
  podAnnotations: {}
  podDisruptionBudget:
    enabled: true
    maxUnavailable: "100%"
    minAvailable: ""
  connections:
    - id: MyS3Conn
      type: aws
      extra: |
        {
        "aws_access_key_id": "XXXXXXXXX",
        "aws_secret_access_key": "XXXXXXXX",
        "region_name":"us-west-1"
        }

  refreshConnections: true
  variables: |
    {}

  pools: |
    {}

  numRuns: -1
  initdb: true
  preinitdb: false
  initialStartupDelay: 0
  extraInitContainers: []
web:
  resources: {}
  replicas: 1
  nodeSelector: {}
  affinity: {}
  tolerations: []
  labels: {}
  podLabels: {}
  annotations: {}
  podAnnotations: {}
  service:
    annotations: {}
    sessionAffinity: "None"
    sessionAffinityConfig: {}
    type: ClusterIP
    externalPort: 8080
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    nodePort:
      http: ""

  baseUrl: "http://localhost:8080"
  serializeDAGs: false
  extraPipPackages: []
  initialStartupDelay: 0
  minReadySeconds: 5
  readinessProbe:
    enabled: false
    scheme: HTTP
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 3

  livenessProbe:
    enabled: true
    scheme: HTTP
    initialDelaySeconds: 300
    periodSeconds: 30
    timeoutSeconds: 3
    successThreshold: 1
    failureThreshold: 2

  secretsDir: /var/airflow/secrets
  secrets: []
  secretsMap:

workers:
  enabled: false
  resources: {}
  replicas: 1
  nodeSelector: {}
  affinity: {}
  tolerations: []
  labels: {}
  podLabels: {}
  annotations: {}
  podAnnotations: {}
  autoscaling:
    enabled: false
    maxReplicas: 2
    metrics: []
  initialStartupDelay: 0
  celery:
    instances: 1
    gracefullTermination: false
    gracefullTerminationPeriod: 600
  terminationPeriod: 60
  secretsDir: /var/airflow/secrets
  secrets: []
  secretsMap:

flower:
  enabled: false
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  labels: {}
  podLabels: {}
  annotations: {}
  podAnnotations: {}
  basicAuthSecret: ""
  basicAuthSecretKey: ""
  urlPrefix: ""
  service:
    annotations: {}
    type: ClusterIP
    externalPort: 5555
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    nodePort:
      http: ""

  initialStartupDelay: 0
  extraConfigmapMounts: []

logs:
  path: /opt/airflow/logs
  persistence:
    enabled: true
    existingClaim: efs-claim
    subPath: ""
    storageClass: efs-sc
    accessMode: ReadWriteMany
    size: 1Gi
dags:
  path: /opt/airflow/dags
  doNotPickle: false
  installRequirements: false
  persistence:
    enabled: false
    existingClaim: ""
    subPath: ""
    storageClass: ""
    accessMode: ReadOnlyMany
    size: 1Gi
  git:
    url: git@github.com:org/myrepo.git
    ref: develop
    secret: airflow-git-keys
    sshKeyscan: false
    privateKeyName: id_rsa
    repoHost: github.com
    repoPort: 22
    gitSync:
      enabled: true
      resources: {}
      image:
        repository: alpine/git
        tag: latest
        pullPolicy: Always
      refreshTime: 60
  initContainer:
    enabled: false
    resources: {}
    image:
      repository: alpine/git
      tag: latest
      pullPolicy: Always
    mountPath: "/dags"
    syncSubPath: ""
ingress:
  enabled: false
  web:
    annotations: {}
    path: ""
    host: ""
    livenessPath: ""
    tls:
      enabled: false
      secretName: ""
    precedingPaths: []
    succeedingPaths: []
  flower:
    annotations: {}
    path: ""
    host: ""
    livenessPath: ""
    tls:
      enabled: false
      secretName: ""
rbac:
  create: true
serviceAccount:
  create: true
  name: ""
  annotations: {}
extraManifests: []

postgresql:

  enabled: true
  postgresqlDatabase: airflow
  postgresqlUsername: postgres
  postgresqlPassword: airflow
  existingSecret: ""
  existingSecretKey: "postgresql-password"
  persistence:
    enabled: true
    storageClass: ""
    accessModes:
      - ReadWriteOnce
    size: 5Gi

externalDatabase:
  type: postgres
  host: localhost
  port: 5432
  database: airflow
  user: airflow
  passwordSecret: ""
  passwordSecretKey: "postgresql-password"

redis:
  enabled: false
  password: airflow
  existingSecret: ""
  existingSecretKey: "redis-password"
  cluster:
    enabled: false
    slaveCount: 1
  master:
    resources: {}
    persistence:
      enabled: false
      storageClass: ""
      accessModes:
        - ReadWriteOnce

      size: 8Gi

  slave:
    resources: {}
    persistence:
      enabled: false
      storageClass: ""
      accessModes:
        - ReadWriteOnce

      size: 8Gi

externalRedis:
  host: localhost
  port: 6379
  databaseNumber: 1
  passwordSecret: ""
  passwordSecretKey: "redis-password"

serviceMonitor:
  enabled: false
  selector:
    prometheus: kube-prometheus
  path: /admin/metrics
  interval: "30s"

prometheusRule:
  enabled: false
  additionalLabels: {}
  groups: []

I’m not really sure what to do here. Does anyone know how to fix the permission error?

3 Answers


  1. You can use extraInitContainers with the scheduler to change the permissions, something like this:

     extraInitContainers:
        - name: volume-logs
          image: busybox
          command: ["sh", "-c", "chown -R 50000:50000 /opt/airflow/logs/"]
          volumeMounts:
            - mountPath: /opt/airflow/logs/
              name: logs-data 
    

    This will change the permissions of the mount point.
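
    After the init container has run, you can spot-check the ownership from inside the scheduler pod, for example (the pod name below is a placeholder for your actual scheduler pod):

        kubectl exec -n airflow <scheduler-pod-name> -- ls -ldn /opt/airflow/logs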

  2. You can try setting workers.persistence.fixPermissions: true in values.yaml; that works as well.
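
    If your chart version exposes that flag (it does not appear in the question's values.yaml, so treat this as an assumption about the chart version), the relevant values.yaml fragment would look something like:

        workers:
          persistence:
            fixPermissions: true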

  3. I have had this issue with the Google Cloud Platform and the Airflow Helm chart 1.2.0 (which uses Airflow 2).
    What ended up working was:

    extraInitContainers:
      - name: fix-volume-logs-permissions
        image: busybox
        command: [ "sh", "-c", "chown -R 50000:0 /opt/airflow/logs/" ]
        securityContext:
          runAsUser: 0
        volumeMounts:
          - mountPath: /opt/airflow/logs/
            name: logs
    

    This is based on Ajay’s answer, with some tweaks. Please note that:

    • the values 50000:0 are based on the uid and gid set up in your values.yaml
    • you need to use extraInitContainers under scheduler, not worker (see the placement sketch after this list)
    • "logs" seems to be the volume name automatically used by the Helm logging config when enabled
    • the securityContext was necessary for me; otherwise the chown failed due to insufficient privileges
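
    Putting that together, a minimal sketch of where this goes in values.yaml (assuming your chart exposes extraInitContainers under the scheduler key):

        scheduler:
          extraInitContainers:
            - name: fix-volume-logs-permissions
              image: busybox
              command: ["sh", "-c", "chown -R 50000:0 /opt/airflow/logs/"]
              securityContext:
                runAsUser: 0
              volumeMounts:
                - mountPath: /opt/airflow/logs/
                  name: logs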