
I updated my Airflow setup from 2.3.3 to 2.4.0 and started to get this error in the UI: DAG <dag name> seems to be missing from DagBag. The scheduler log shows: ERROR - DAG <dag name> not found in serialized_dag table

One of my Airflow instances seemed to work well for the old DAGs, but when I add new DAGs I get the error. On the other Airflow instance, every DAG produced this error, and the only way out of this mess was to delete the db and initialize it again. The error message appears when I click the DAG in the main view.

Deleting the db is not a solution I want to use in the future. Is there any other way to fix this?

Side note:
It’s also weird that, although I use the same Airflow image in both of my instances, one instance has the newly added Datasets menu in the top bar and the other doesn’t.

My setup:
Two isolated Airflow main instances (dev, prod) with CeleryExecutor, each with 10 worker machines. I run the setup on each machine using a docker compose configuration and a shared .env file, which ensures that the setup is the same on the main machine and the worker machines.

Airflow version: 2.4.0 (same error in 2.4.1)
PSQL: 13
Redis: 6.2.4

UPDATE:
Still unresolved. The new DAG is shown in the Airflow UI and can be activated, but running it is not possible. I think there's no other solution than to reset the db.

2 Answers


  1. Given your latest comment, it sounds like you are running two Airflow versions with two different schedulers connected to the same database.

    If one scheduler has access to DAGs that the other doesn’t, that alone would explain the missing-DAG errors you are seeing.

    Please share some more details on your setup and we can look into this more in depth.

  2. I have found no official references for this fix, so use it carefully and back up your db first 🙂

    I encountered the same problem after upgrading to Airflow 2.4.1 (from 2.3.4). Pre-existing DAGs still worked properly, but for new DAGs I saw the error you mentioned.

    Debugging, I found in the scheduler logs:

    
    {manager.py:419} ERROR - Add View Menu Error: (psycopg2.errors.NotNullViolation) null value in column "id" of relation "ab_view_menu" violates not-null constraint
    DETAIL:  Failing row contains (null, DAG:my-new-dag-id).
    [SQL: INSERT INTO public.ab_view_menu (name) VALUES (%(name)s) RETURNING public.ab_view_menu.id]
    [parameters: {'name': 'DAG:my-new-dag-id'}]
    
    

    This seems to be the cause of the problem: the insert produces a null value for the id column, which prevents the DAG from being loaded.
    I also saw similar errors when running airflow db upgrade.

    After a check on the ab_view_menu database table I noticed that a sequence exists for its primary key (ab_view_menu_id_seq), but it was not linked to the column.
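
A check like this can be sketched in Python. The query below reads the default of every `id` column on the `ab_*` tables from `information_schema.columns`, and the helper flags the tables whose default does not call `nextval()`. The names `CHECK_SQL` and `tables_missing_sequence` are made up for this illustration, the `public` schema is taken from the log above, and running the query (for example through psycopg2) is left to the reader.

```python
# Hypothetical helper, not part of Airflow: the query lists each ab_* table
# in the public schema together with the default of its "id" column.
CHECK_SQL = r"""
SELECT table_name, column_default
FROM information_schema.columns
WHERE table_schema = 'public'
  AND column_name = 'id'
  AND table_name LIKE 'ab\_%'
ORDER BY table_name;
"""


def tables_missing_sequence(rows):
    """Return the tables whose id default does not call nextval().

    `rows` is an iterable of (table_name, column_default) tuples, i.e. the
    result of CHECK_SQL; column_default is None when no default is set.
    """
    return [
        table
        for table, default in rows
        if default is None or "nextval(" not in default.lower()
    ]
```

A table reported by this helper has an `id` column that is not linked to any sequence.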

    So I linked it:

    ALTER TABLE ab_view_menu ALTER COLUMN id SET DEFAULT NEXTVAL('public.ab_view_menu_id_seq'::REGCLASS);
    ALTER SEQUENCE ab_view_menu_id_seq OWNED BY ab_view_menu.id;
    SELECT setval('ab_view_menu_id_seq', (SELECT max(id) FROM ab_view_menu));
    

    The same consideration applies to other tables:

    • ab_permission
    • ab_permission_view
    • ab_permission_view_role
    • ab_register_user
    • ab_role
    • ab_user
    • ab_user_role
    • ab_view_menu
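
To avoid typing the three statements eight times, they can be generated. This is only a sketch: it assumes every table has a sequence named `<table>_id_seq` in the `public` schema, which the answer confirms only for `ab_view_menu`, so verify the sequence names first. `relink_statements` is a name made up for this example.

```python
# Tables listed in the answer above.
TABLES = [
    "ab_permission",
    "ab_permission_view",
    "ab_permission_view_role",
    "ab_register_user",
    "ab_role",
    "ab_user",
    "ab_user_role",
    "ab_view_menu",
]


def relink_statements(table, schema="public"):
    """Build the three SQL statements from the answer for one table.

    Assumption: the sequence is called <table>_id_seq (unverified for
    every table except ab_view_menu).
    """
    seq = f"{table}_id_seq"
    return [
        f"ALTER TABLE {table} ALTER COLUMN id "
        f"SET DEFAULT NEXTVAL('{schema}.{seq}'::REGCLASS);",
        f"ALTER SEQUENCE {seq} OWNED BY {table}.id;",
        f"SELECT setval('{seq}', (SELECT max(id) FROM {table}));",
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run in psql.
    for table in TABLES:
        print("\n".join(relink_statements(table)))
        print()
```

The script only prints SQL. Review the output and run it manually against a backed-up database.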

    With the sequences relinked this way, the problem seems to be solved.


    NOTE: the database used is PostgreSQL
