I upgraded my Airflow setup from 2.3.3 to 2.4.0 and started getting this error in the UI: "DAG <dag name> seems to be missing from DagBag". The scheduler log shows: "ERROR - DAG <dag name> not found in serialized_dag table".
One of my Airflow instances seemed to work well for the old DAGs, but when I add new DAGs I get the error. On the other Airflow instance, every DAG produced this error, and the only way out was to delete the db and initialize it again. The error message appears when I click the DAG in the main view.
Deleting the db is not a solution I want to use in the future. Is there any other way to fix this?
Side note:
It’s also odd that I use the same Airflow image in both of my instances, yet one instance has the newly added Datasets menu in the top bar and the other doesn’t.
My setup:
Two isolated Airflow main instances (dev, prod) with CeleryExecutor, each with 10 worker machines. I run the setup on each machine using a docker compose configuration and a shared .env file, which ensures that the setup is the same on the main machine and the worker machines.
Airflow version: 2.4.0 (same error in 2.4.1)
PSQL: 13
Redis: 6.2.4
UPDATE:
Still unresolved. The new DAG is shown in the Airflow UI and can be activated, but running it is not possible. I think there's no solution other than resetting the db.
2 Answers
Given your latest comment, it sounds like you are running two airflow versions with two different schedulers connected to the same database.
If one scheduler has access to DAGs that the other doesn’t, that alone would explain the missing-DAG errors you are seeing.
Please share some more details on your setup and we can look into this more in depth.
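One quick way to test the shared-database hypothesis is to compare the metadata DB connection string in the .env file of each instance. The following is only a rough sketch: the helper names and the sample connection strings are hypothetical, and it assumes the connection is configured through the AIRFLOW__DATABASE__SQL_ALCHEMY_CONN environment variable (or the older AIRFLOW__CORE__SQL_ALCHEMY_CONN) rather than through airflow.cfg.

```python
# Hypothetical helpers, not part of Airflow: compare the metadata DB
# connection configured in two .env files.

CONN_KEYS = (
    "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN",  # section name used since Airflow 2.3
    "AIRFLOW__CORE__SQL_ALCHEMY_CONN",      # older, deprecated location
)


def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def metadata_db(env):
    """Return the first configured connection string, or None if none is set."""
    for key in CONN_KEYS:
        if key in env:
            return env[key]
    return None


def share_metadata_db(env_a_text, env_b_text):
    """True only if both files define a connection and the strings are identical."""
    a = metadata_db(parse_env(env_a_text))
    b = metadata_db(parse_env(env_b_text))
    return a is not None and a == b


if __name__ == "__main__":
    # Made-up sample contents; in practice read the real .env files instead.
    dev_env = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:pw@dev-db/airflow\n"
    prod_env = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:pw@prod-db/airflow\n"
    print("shared metadata DB:", share_metadata_db(dev_env, prod_env))
```

If this reports a shared database, two schedulers are writing to the same serialized_dag table, and the one that cannot see a given DAG file may be the one reporting it as missing.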
I have encountered the same problem after the upgrade to Airflow 2.4.1 (from 2.3.4). Pre-existing DAGs still worked properly, but for new DAGs I saw the error you mentioned.
Debugging, I found in the scheduler logs:
which seems to be the cause of the problem: a null value for the id column, which prevents the DAG from being loaded.
I also saw similar errors when running airflow db upgrade. After checking the ab_view_menu database table, I noticed that a sequence exists for its primary key (ab_view_menu_id_seq), but it was not linked to the column. So I linked it:
The same consideration applies to other tables:
ab_permission
ab_permission_view
ab_permission_view_role
ab_register_user
ab_role
ab_user
ab_user_role
ab_view_menu
With this fix on the sequences the problem seems to be solved.
NOTE: the database used is PostgreSQL
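For reference, here is a rough sketch of how the re-linking statements for all the listed tables could be generated. It is not necessarily what was run above. It assumes that each table's primary key column is named id and that each sequence follows the <table>_id_seq pattern (as ab_view_menu_id_seq does). The function name is hypothetical, and the final setval step is an extra precaution that is not mentioned above. Review the output before running it against a database, and take a backup first.

```python
# Hypothetical generator for PostgreSQL statements that re-attach each
# ab_* sequence to its table's id column. It only prints SQL; nothing is executed.

AB_TABLES = (
    "ab_permission",
    "ab_permission_view",
    "ab_permission_view_role",
    "ab_register_user",
    "ab_role",
    "ab_user",
    "ab_user_role",
    "ab_view_menu",
)


def relink_statements(table, column="id"):
    """Build the SQL that links <table>_<column>_seq to table.column."""
    seq = f"{table}_{column}_seq"  # assumed naming pattern
    return [
        # Make the column draw its default value from the sequence.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT nextval('{seq}'::regclass);",
        # Tie the sequence's lifetime to the column.
        f"ALTER SEQUENCE {seq} OWNED BY {table}.{column};",
        # Extra precaution: make the next generated id larger than any existing one.
        f"SELECT setval('{seq}', COALESCE((SELECT MAX({column}) FROM {table}), 0) + 1, false);",
    ]


if __name__ == "__main__":
    for table in AB_TABLES:
        for statement in relink_statements(table):
            print(statement)
```

The printed statements can be pasted into psql. Generating them from a single list keeps all eight tables consistent and makes it easy to review them before anything changes.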