I want to create Python scripts, based on a notebook, to get a runtime using the same .pkl
file.
On this line:
learn = load_learner('model.pkl', cpu=True)
I get this error:
(project) me@ubuntu-pcs:~/PycharmProjects/project$ python main.py
Traceback (most recent call last):
File "main.py", line 6, in <module>
from src.train.train_model import train
File "/home/me/PycharmProjects/project/src/train/train_model.py", line 17, in <module>
learn = load_learner('yasmine-sftp/export_2.pkl', cpu=True) # to run on GPU
File "/home/me/miniconda3/envs/project/lib/python3.6/site-packages/fastai/learner.py", line 384, in load_learner
res = torch.load(fname, map_location='cpu' if cpu else None, pickle_module=pickle_module)
File "/home/me/miniconda3/envs/project/lib/python3.6/site-packages/torch/serialization.py", line 607, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/me/miniconda3/envs/project/lib/python3.6/site-packages/torch/serialization.py", line 882, in _load
result = unpickler.load()
File "/home/me/miniconda3/envs/project/lib/python3.6/site-packages/torch/serialization.py", line 875, in find_class
return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'Tf' on <module '__main__' from 'main.py'>
This is because in order to open the .pkl
file, I need the original function that was used to train it.
Thankfully, looking back at the notebook, Tf(o)
is there:
def Tf(o):
return '/mnt/scratch2/DLinTHDP/PathLAKE/Version_4_fastai/Dataset/CD8/Train/masks/'+f'{o.stem}_P{o.suffix}'
However, anywhere I place Tf(o)
in my Python scripts I still get the same error.
Where should I put Tf(o)
?
In error message: <module '__main__' from 'main.py'>
seems to suggest to put it in main()
or under if __name__ ...
.
I’ve tried everywhere. Importing Tf(o)
also doesn’t work.
Python Scripts
main.py
:
import glob
from pathlib import Path
from train_model import train
ROOT = Path("folder/path") # Detection Folder
def main(root: Path):
train(root)
if __name__ == '__main__':
main(ROOT)
train_model.py
:
from pathlib import Path
from fastai.vision.all import *
folder_path = Path('.')
learn = load_learner('model.pkl', cpu=True) # AttributeError
learn.load('model_3C_34_CELW_V_1.1') # weights
def train(root: Path):
# ...
I cannot inspect the file:
(project) me@ubuntu-pcs:~/PycharmProjects/project$ python -m pickletools -a model.pkl
Traceback (most recent call last):
File "/home/me/miniconda3/envs/project/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/me/miniconda3/envs/project/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/me/miniconda3/envs/project/lib/python3.6/pickletools.py", line 2830, in <module>
args.indentlevel, annotate)
File "/home/me/miniconda3/envs/project/lib/python3.6/pickletools.py", line 2394, in dis
for opcode, arg, pos in genops(pickle):
File "/home/me/miniconda3/envs/project/lib/python3.6/pickletools.py", line 2242, in _genops
arg = opcode.arg.reader(data)
File "/home/me/miniconda3/envs/project/lib/python3.6/pickletools.py", line 373, in read_stringnl_noescape
return read_stringnl(f, stripquotes=False)
File "/home/me/miniconda3/envs/project/lib/python3.6/pickletools.py", line 359, in read_stringnl
data = codecs.escape_decode(data)[0].decode("ascii")
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 63: ordinal not in range(128)
3
Answers
Problem
Why I get that error is bc the
Tf()
function was used to train themodel.pkl
file, in the same namespace (because it was done in a notebook file).This article states:
Solution
There's an extension to pickle called dill, that does serialise Python objects and functions etc. (not references) PyPI
I solved this for a similar problem by modifying load_learner from fastai.basic_train.py (fastai==1.0.61), according to ecatkins Edward Atkins
https://forums.fast.ai/t/error-loading-saved-model-with-custom-loss-function/37627/7
load_learner fails when classes used to define the model are not found in the original modules. So, override the pickle class loader (find_class) to search in the specified module.
Another approach is to manually replicate the training environment’s namespace.
In my case