I have two source files that I am running in Python 3.9. (The files are big…)
File one (fileOne.py)
# ...
sessionID = uuid.uuid4().hex
# ...
File two (fileTwo.py)
# ...
from fileOne import sessionID
# ...
File two is executed using the multiprocessing module.
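Roughly, file two starts its processes like this (the real worker code is much bigger; this only shows the shape of it):
# ...
from multiprocessing import Process
from fileOne import sessionID

def worker():
    print(sessionID)
# ...
if __name__ == "__main__":
    Process(target=worker).start()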
- When I run on my local machine and print the UUID in file two, it is always unique.
- When I run the script on CentOS, it somehow remains the same.
- If I restart the service, the UUID will change once.
My question: Why does this work locally (Windows OS) as expected, but not on a CentOS VM?
UPDATE 1.0:
To make it clear.
For each separate process, I need the UUID to be the same across fileOne and fileTwo. Which means:
processOne = UUID in file one and in file two will be 1q2w3e
processTwo = UUID in file one and in file two will be r4t5y6 (a different one)
2 Answers
When you run your script directly, it generates a new value for the UUID on every run, but when you run it inside a service the module is imported only once, so the value stays fixed for the life of the service. To fix the issue you can wrap the UUID generation in a function and call it where the value is needed, for example in your second file.
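A sketch of what that could look like (get_session_id is just an illustrative name):

fileOne.py:

import uuid

def get_session_id():
    # a fresh UUID every time the function is called,
    # instead of once at import time
    return uuid.uuid4().hex

fileTwo.py:

from fileOne import get_session_id

def worker():
    sessionID = get_session_id()   # a new value each time this runs
    print(sessionID)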
Your riddle is most likely caused by the way multiprocessing works on different operating systems. You don't say so explicitly, but your "run locally" is certainly Windows or macOS, not Linux or another Unix flavor.
The thing is that multiprocessing on Linux (and, until a while ago, on macOS, but that changed in Python 3.8) uses the system fork call when starting new processes: the current process is duplicated "as is", with all its defined variables and classes. Since your sessionID is defined at import time, it stays the same in all subprocesses.

Windows lacks the fork call, so multiprocessing resorts to starting a new Python interpreter, which re-imports all modules from the current process (this leads to another, more common cause of confusion, where any code in the entry Python file not guarded by an if __name__ == "__main__": block is re-executed). In your case, the value of sessionID is regenerated in each new process.

Check the docs at: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
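To see the difference directly, you can pick a start method explicitly; a minimal sketch (the worker function here is only illustrative):

import multiprocessing as mp
from fileOne import sessionID

def worker():
    # under the "fork" start method this prints the parent's value in every child;
    # under "spawn", fileOne is re-imported in each child and a new value is generated
    print(sessionID)

if __name__ == "__main__":
    mp.set_start_method("spawn")   # or "fork" (the default on Linux)
    for _ in range(3):
        mp.Process(target=worker).start()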
So, if you want the variable to behave reliably and have the same value across all processes when running multiprocessing, you should either pass it as a parameter to the target functions in the other processes, or use a proper structure meant to share values across processes, as documented here:
https://docs.python.org/3/library/multiprocessing.html#sharing-state-between-processes
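Passing it as a parameter could look roughly like this (a sketch; the worker function is illustrative):

import multiprocessing as mp
import uuid

def worker(session_id):
    # every process receives the same value, regardless of the start method
    print(session_id)

if __name__ == "__main__":
    sessionID = uuid.uuid4().hex   # generated once, in the parent
    procs = [mp.Process(target=worker, args=(sessionID,)) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()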
(you can also check this recent question about the same topic: why is a string printing 3 times instead of 1 when using time.sleep with multiprocessing imported?)
If you need a unique ID across files for each different process:
(As made clearer by the edit and the comments)
Have a global (plain) dictionary which works as a per-process registry for the IDs, and use a function to retrieve the ID – the function can use os.getpid() as the key into the registry. A sketch of the two files is below.
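Something along these lines (a sketch; get_session_id and _session_registry are illustrative names):

file 1 (fileOne.py):

import os
import uuid

# per-process registry: maps a PID to the ID generated for that process
_session_registry = {}

def get_session_id():
    # setdefault stores a new ID only the first time this PID asks for one,
    # and returns the stored value on every later call
    return _session_registry.setdefault(os.getpid(), uuid.uuid4().hex)

file 2 (fileTwo.py):

from fileOne import get_session_id

def worker():
    # the same value on every call within one process,
    # a different value in a different process
    sessionID = get_session_id()
    print(sessionID)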
(the setdefault dict method takes care of providing a new ID value if none was set)
NB: the registry set up in this way will hold at most the master process's ID (if multiprocessing is using fork mode) plus the process's own – no data on the siblings, as each process holds its own copy of the registry. If you need a working inter-process dictionary (which could hold a live registry for all processes, for example), you will probably be better off using Redis for it (https://redis.io – at least one of the Python bindings offers a transparent Python-mapping-over-Redis, so you don't have to worry about its semantics).