I am trying to set up remote logging with the stable/airflow Helm chart on Airflow v1.10.9. I am using the Kubernetes executor and the puckel/docker-airflow image. Here is my values.yaml file:
airflow:
  image:
    repository: airflow-docker-local
    tag: 1.10.9
  executor: Kubernetes
  service:
    type: LoadBalancer
  config:
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.9
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
    AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
    AIRFLOW__KUBERNETES__NAMESPACE: airflow
    AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: "s3://xxx"
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: "s3://aws_access_key_id:aws_secret_access_key@bucket"
    AIRFLOW__CORE__ENCRYPT_S3_LOGS: False

persistence:
  enabled: true
  existingClaim: ''

postgresql:
  enabled: true

workers:
  enabled: false

redis:
  enabled: false

flower:
  enabled: false
But my logs don't get exported to S3; all I get in the UI is:
*** Log file does not exist: /usr/local/airflow/logs/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log
*** Fetching from: http://icpjobdagicpkubejob-f4144a374f7a4ac9b18c94f058bc7672:8793/log/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='icpjobdagicpkubejob-f4144a374f7a4ac9b18c94f058bc7672', port=8793): Max retries exceeded with url: /log/icp_job_dag/icp-kube-job/2019-02-13T00:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f511c883710>: Failed to establish a new connection: [Errno -2] Name or service not known'))
Does anyone have more insight into what I could be missing?
Edit: following @trejas's suggestion below, I created a separate connection and am using that. Here is what the airflow config in my values.yaml looks like:
airflow:
  image:
    repository: airflow-docker-local
    tag: 1.10.9
  executor: Kubernetes
  service:
    type: LoadBalancer
  connections:
    - id: my_aws
      type: aws
      extra: '{"aws_access_key_id": "xxxx", "aws_secret_access_key": "xxxx", "region_name": "us-west-2"}'
  config:
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: airflow-docker-local
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 1.10.9
    AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY: Never
    AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: airflow
    AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM: airflow
    AIRFLOW__KUBERNETES__NAMESPACE: airflow
    AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: s3://airflow.logs
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws
    AIRFLOW__CORE__ENCRYPT_S3_LOGS: False
I still have the same issue.
2 Answers
Your remote_log_conn_id needs to be the ID of a connection defined in the connections form/list, not a connection string.
https://airflow.apache.org/docs/stable/howto/write-logs.html
https://airflow.apache.org/docs/stable/howto/connection/index.html
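For illustration, a minimal sketch of what that looks like in the chart values (the connection ID my_aws is just an example name; any ID that matches a defined connection works):

airflow:
  connections:
    - id: my_aws
      type: aws
      extra: '{"aws_access_key_id": "xxxx", "aws_secret_access_key": "xxxx", "region_name": "us-west-2"}'
  config:
    # reference the connection by its ID, not by a URI
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws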
I was running into the same issue and thought I’d follow up with what ended up working for me. The connection is correct but you need to make sure that the worker pods have the same environment variables:
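The original snippet isn't shown here; as a sketch, one way to propagate these settings to the workers with the KubernetesExecutor is Airflow's [kubernetes_environment_variables] config section, which the executor copies into each worker pod (bucket and connection ID taken from the question above):

airflow:
  config:
    # injected as env vars into every worker pod spawned by the KubernetesExecutor
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_LOGGING: True
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: s3://airflow.logs
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__REMOTE_LOG_CONN_ID: my_aws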
I also had to set the Fernet key for the workers (and in general); otherwise I got an invalid token error:
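Again the original snippet isn't included; as a sketch, the key just needs to be identical for the webserver/scheduler and the worker pods, e.g. set it in the chart config and mirror it for the workers (placeholder value shown, generate your own with python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"):

airflow:
  config:
    # same key for webserver/scheduler and for every worker pod
    AIRFLOW__CORE__FERNET_KEY: "<your-fernet-key>"
    AIRFLOW__KUBERNETES_ENVIRONMENT_VARIABLES__AIRFLOW__CORE__FERNET_KEY: "<your-fernet-key>"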