
I'd like to re-run (or run) a DAG from Cloud Composer. The command below is what I used, but I got an exception like this:

kubeconfig entry generated for europe-west1-leo-stage-bi-db7ea92f-gke.
Executing within the following Kubernetes cluster namespace: composer-1-7-7-airflow-1-10-1-db7ea92f
command terminated with exit code 2
[2020-07-14 12:44:34,472] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2020-07-14 12:44:35,624] {default_celery.py:80} WARNING - You have configured a result_backend of redis://airflow-redis-service.default.svc.cluster.local:6379/0, it is highly recommended to use an alternative result_backend (i.e. a database).
[2020-07-14 12:44:35,628] {__init__.py:51} INFO - Using executor CeleryExecutor
[2020-07-14 12:44:35,860] {app.py:51} WARNING - Using default Composer Environment Variables. Overrides have not been applied.
[2020-07-14 12:44:35,867] {configuration.py:516} INFO - Reading the config from /etc/airflow/airflow.cfg
[2020-07-14 12:44:35,895] {configuration.py:516} INFO - Reading the config from /etc/airflow/airflow.cfg
usage: airflow [-h]
               {backfill,list_tasks,clear,pause,unpause,trigger_dag,delete_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,scheduler,worker,flower,version,connections,create_user}
               ...
airflow: error: unrecognized arguments: --yes

ERROR: (gcloud.composer.environments.run) kubectl returned non-zero status code.

This is my command; the second line shows which parameters I have specified. Can anyone help with this?
Thank you.

    gcloud composer environments run leo-stage-bi --location=europe-west1 backfill -- regulatory_spain_monthly -s 20190701 -e 20190702 -t "regulatory_spain_rud_monthly_materialization" --reset_dagruns


    gcloud composer environments run <ENVIRONMENT_NAME> --location=<LOCATION> backfill -- <DAG_NAME> -s <START_DATE> -e <END_DATE> -t <TASK_ID> --reset_dagruns

2 Answers


  1. To trigger a manual run you can use the trigger_dag sub-command:

    gcloud composer environments run <COMPOSER_INSTANCE_NAME> --location <LOCATION> trigger_dag -- <DAG_NAME>
    
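    For example, with the environment, location, and DAG name taken from the question, that would be:

    gcloud composer environments run leo-stage-bi --location europe-west1 trigger_dag -- regulatory_spain_monthly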
  2. I've checked the Airflow backfill sub-command functionality through the gcloud utility from the Google Cloud SDK 300.0.0 tool set, and my test attempts at running the backfill action finished with the same error:

    airflow: error: unrecognized arguments: --yes

    Digging into this issue by running the gcloud composer environments run command with --verbosity=debug, I found the cause of the failure:

    gcloud composer environments run <ENVIRONMENT> --location=<LOCATION> --verbosity=debug backfill -- <DAG> -s <start_date> -e <end_date> -t "task_id" --reset_dagruns
    

    DEBUG: Executing command: ['/usr/bin/kubectl', '--namespace',
    '', 'exec', 'airflow-worker-*', '--stdin', '--tty',
    '--container', 'airflow-worker', '--', 'airflow', 'backfill', '',
    '-s', '<start_date>', '-e', '<end_date>', '-t', 'task_id',
    '--reset_dagruns', '--yes']

    The above output shows how gcloud decomposes the command-line arguments and dispatches them to the underlying kubectl invocation. Given this, I assume the --yes argument was propagated for an unknown reason and, worse, positioned after the rest of the parameters, where the installed Airflow 1.10.1 CLI does not recognize it.

    Looking for a workaround, I composed the equivalent kubectl call against a particular Airflow worker Pod, passing the Airflow command-line parameters manually:

    kubectl exec -it $(kubectl get po -l run=airflow-worker -o jsonpath='{.items[0].metadata.name}' \
        -n $(kubectl get ns | grep composer | awk '{print $1}')) -n $(kubectl get ns | grep composer | awk '{print $1}') \
        -c airflow-worker -- airflow backfill <DAG> -s <start_date> -e <end_date> -t "task_id" --reset_dagruns
    
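    For readability, the same workaround can be split into steps. This is only a sketch, assuming the Composer 1.x defaults visible in the logs above: a namespace whose name starts with composer- and worker Pods carrying the run=airflow-worker label:

    #!/usr/bin/env bash
    # Resolve the Composer namespace (assumes it starts with "composer-").
    NAMESPACE=$(kubectl get ns | grep composer | awk '{print $1}')

    # Pick the first Airflow worker Pod (assumes the run=airflow-worker label).
    WORKER=$(kubectl get po -n "$NAMESPACE" -l run=airflow-worker \
        -o jsonpath='{.items[0].metadata.name}')

    # Run the backfill inside the worker container, bypassing gcloud's
    # argument handling entirely.
    kubectl exec -it "$WORKER" -n "$NAMESPACE" -c airflow-worker -- \
        airflow backfill <DAG> -s <start_date> -e <end_date> -t "task_id" --reset_dagruns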

    With that, the airflow backfill command now succeeds without throwing any error.
