I’m using the Jupyter extension (v2022.9.1303220346) in Visual Studio Code (v1.73.1).
To reproduce this issue, make any modification to the notebook and check it into git. You’ll observe that you get an extra difference for execution_count. For example (display from Git Gui):
- "execution_count": 7,
+ "execution_count": 9,
The execution count doesn’t appear to be useful and is noise in the git history. Can Jupyter or VS Code be configured to stop updating this value or (better) ignore it altogether?
2 Answers
I’m not sure about VS Code, but I think the answer for VS Code config options is currently no. I reached that conclusion after reading some discussions in GitHub feature-request issue tickets for Jupyter notebooks: the fact that they are feature requests suggests that no built-in option exists yet, but they also show that there are plenty of approaches to tackling the problem:
In jupyter/notebook: Suggestion: Separate file for notebook executed cell outputs. #5677
In jupyterlab/jupyterlab: Using a notebook & git creates too many diff #9444
In jupyterlab/jupyterlab-git: Cleaning Notebook cell outputs #392

For your learning purposes / reference, I found this info by googling "github issues jupyter notebook put execution_count in separate file" and looking through the top search results and the GitHub issues linked in their discussion threads.

The
.ipynb format contains your input code cells, output data, and a variety of metadata needed to reproduce exactly what you see when running the notebook interactively.

Unfortunately, "execution_count" is only one of these fields. Many more (collapsed-cell state, extension metadata, and others) are stored even though they do not represent any difference in the code of the notebook. It is therefore not really possible to preserve all of this information and still generate meaningful differences in git. While there are discussions about which data to keep or throw out for version control purposes, the underlying JSON format is not ideal for this purpose anyway: each line in each cell is stored as a separate quoted JSON string with an escaped newline, which is rather hard to read compared to the underlying code.
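To make the point above concrete, here is a minimal sketch using only the Python standard library. It builds a made-up code cell in the nbformat v4 layout, prints how its source is stored on disk, and shows one possible way to blank out the execution counters before committing. The cell contents and the helper name `strip_execution_counts` are assumptions for illustration, not something Jupyter or VS Code provides.

```python
import json

# A minimal code cell in the nbformat v4 layout (contents invented for illustration).
cell = {
    "cell_type": "code",
    "execution_count": 7,
    "metadata": {},
    "outputs": [],
    "source": ["import math\n", "print(math.pi)"],
}
notebook = {"cells": [cell], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}


def strip_execution_counts(nb: dict) -> dict:
    """Hypothetical helper: return a copy of nb with all execution counters set to None."""
    cleaned = json.loads(json.dumps(nb))  # cheap deep copy for JSON-compatible data
    for c in cleaned.get("cells", []):
        if c.get("cell_type") == "code":
            c["execution_count"] = None
            # execute_result outputs carry their own counter as well
            for out in c.get("outputs", []):
                if "execution_count" in out:
                    out["execution_count"] = None
    return cleaned


# How the two-line cell is stored in the .ipynb file: a list of quoted strings.
print(json.dumps(cell["source"], indent=1))

# The code as the author wrote it.
print("".join(cell["source"]))

# Re-running the cell only bumps the counter; after stripping, nothing differs.
before = strip_execution_counts(notebook)
notebook["cells"][0]["execution_count"] = 9
after = strip_execution_counts(notebook)
print(before == after)
```

A function like this could be run by hand or wired into a git clean filter so that the committed copy never carries the counter. Whether that is worth the setup depends on how much of the other metadata you also want to drop.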
One way out of this is to use the Jupytext extension. It pairs your .ipynb file with a regular .py file while keeping some of the metadata intact. The paired .py file can be viewed and edited with any editor, works well with git, and does not depend on the complete Jupyter infrastructure.
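To illustrate why a paired script diffs cleanly, the sketch below hand-rolls a much simplified version of the "# %%" cell-marker layout that Jupytext can write. It is not Jupytext's implementation (the real tool also writes a metadata header and supports round-tripping), and the function name `to_percent_script` and the sample notebooks are assumptions made up for this example. With Jupytext itself, pairing is typically set up with something like `jupytext --set-formats ipynb,py:percent notebook.ipynb`.

```python
def to_percent_script(nb: dict) -> str:
    """Simplified illustration: render notebook cells as a '# %%'-delimited script."""
    chunks = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines or one string
            source = "".join(source)
        if cell.get("cell_type") == "code":
            chunks.append("# %%\n" + source)
        elif cell.get("cell_type") == "markdown":
            # Markdown text becomes Python comments so the file stays valid Python
            commented = "\n".join(
                "# " + line if line else "#" for line in source.split("\n")
            )
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


def make_notebook(execution_count: int) -> dict:
    """Build a tiny sample notebook; only the execution counter varies."""
    return {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [],
                "source": ["x = 1\n", "print(x)"],
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }


script_first_run = to_percent_script(make_notebook(7))
script_second_run = to_percent_script(make_notebook(9))

print(script_first_run)
# The counter changed from 7 to 9, yet the script text is identical.
print(script_first_run == script_second_run)
```

Because the script contains only the cell sources, re-running cells leaves it unchanged, so git reports a difference only when the code or markdown text is actually edited.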