
I am facing an issue when starting the slurmd service on my compute nodes.

× slurmd.service - Slurm node daemon

Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2022-10-12 04:10:25 EDT; 7s ago
Process: 5839 ExecStart=/usr/sbin/slurmd -D -s $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
Main PID: 5839 (code=exited, status=1/FAILURE)
CPU: 3ms
Oct 12 04:10:25 compute1.ghpcv3.au.dk systemd[1]: Started Slurm node daemon.
Oct 12 04:10:25 compute1.ghpcv3.au.dk systemd[1]: slurmd.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 04:10:25 compute1.ghpcv3.au.dk systemd[1]: slurmd.service: Failed with result 'exit-code'.

#slurmd -D -vv
slurmd: debug: Log file re-opened
slurmd: debug: CPUs:1 Boards:1 Sockets:1 CoresPerSocket:1 ThreadsPerCore:1
slurmd: error: Couldn't find the specified plugin name for cgroup/v2 looking at all files
slurmd: error: cannot find cgroup plugin for cgroup/v2
slurmd: error: cannot create cgroup context for cgroup/v2
slurmd: error: Unable to initialize cgroup plugin
slurmd: error: slurmd initialization failed

What am I missing?

2 Answers


  1. You may have to manually create cgroup.conf in your Slurm config directory; see https://stackoverflow.com/a/65226055/5749775

    I fixed this by creating a fairly simple conf:

    # /etc/slurm-llnl/cgroup.conf
    
    CgroupAutomount=yes
    # CgroupReleaseAgentDir="/etc/slurm/cgroup"
    
    ConstrainCores=yes
    ConstrainDevices=yes
    # TaskAffinity=yes
    ConstrainRAMSpace=yes
    # ConstrainSwapSpace=yes
    MaxRAMPercent=98
    AllowedSwapSpace=0
    AllowedRAMSpace=100
    MemorySwappiness=0
    
  2. I had the same problem. Slurm supports both cgroup/v1 and cgroup/v2, but v2 support is only compiled in if the dbus development files are present at build time. So first install dbus-devel:

    dnf install dbus-devel
    

    and then run a clean Slurm build.
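
After rebuilding, it may help to confirm that the cgroup/v2 plugin was actually produced. The sketch below is a rough check, not an official Slurm tool: `has_cgroup_v2_plugin` is a hypothetical helper, the plugin file name `cgroup_v2.so` and the default directory `/usr/lib64/slurm` are assumptions (check `PluginDir` in your slurm.conf), and `SLURM_PLUGIN_DIR` is just an illustrative variable for overriding that path.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given directory contains the
# cgroup/v2 plugin. The default path is an assumption; adjust it to
# the PluginDir of your installation.
has_cgroup_v2_plugin() {
    plugin_dir="${1:-/usr/lib64/slurm}"
    [ -f "$plugin_dir/cgroup_v2.so" ]
}

# Show which cgroup hierarchy the node mounts: "cgroup2fs" indicates the
# unified v2 hierarchy, "tmpfs" usually indicates v1.
fs_type=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo unknown)
echo "cgroup filesystem type: $fs_type"

if has_cgroup_v2_plugin "${SLURM_PLUGIN_DIR:-/usr/lib64/slurm}"; then
    echo "cgroup/v2 plugin found"
else
    echo "cgroup/v2 plugin missing: rebuild Slurm with dbus-devel installed"
fi
```

If the plugin is still reported missing after a clean build, the configure step probably did not detect the dbus headers.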
