
I learnt that to run a container as a non-root user, you need to either set securityContext.runAsUser: 1000 or specify the USER directive in the Dockerfile.

My question is this: there is no UID 1000 on the Kubernetes/Docker host system itself.

I learnt before that Linux user namespaces allow a user to have a different UID outside its original namespace.

Hence, how does UID 1000 exist under the hood? Did the original root (UID 0) create a new user namespace which is represented by UID 1000 in the container?

What happens if we specify UID 2000 instead?

2 Answers


  1. Hope this answer helps you

    I learnt that to run a container as a non-root user, you need to
    either set securityContext.runAsUser: 1000 or specify the USER
    directive in the Dockerfile

    You are correct, except that runAsUser is not limited to 1000; you can specify any UID. The kernel does not require that UID to have an /etc/passwd entry, but the files the process needs must be accessible to it, so in practice it is best to pick a UID the image was built for.


    Often, base images will already have a user created and available but leave it up to the development or deployment teams to leverage it. For example, the official Node.js image comes with a user named node at UID 1000 that you can run as, but they do not explicitly set the current user to it in their Dockerfile. We will either need to configure it at runtime with a runAsUser setting or change the current user in the image using a derivative Dockerfile.

    runAsUser: 1001          # hardcode user to non-root if not set in Dockerfile
    runAsGroup: 1001         # hardcode group to non-root if not set in Dockerfile
    runAsNonRoot: true       # hardcode to non-root. Redundant to above if Dockerfile is set USER 1000
    

    Remember that runAsUser and runAsGroup make the container processes run with the given non-root IDs, but don’t rely on those two settings alone to guarantee this. Be sure to also set runAsNonRoot: true, which makes the kubelet refuse to start a container that would run as UID 0.


    Here is a full example of securityContext:

    # generic pod spec that's usable inside a deployment or other higher level k8s spec
    
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
    
    spec:
    
      containers:
    
          # basic container details
        - name: my-container-name
          # never use reusable tags like latest or stable
          image: my-image:tag
          # hardcode the listening port if Dockerfile isn't set with EXPOSE
          ports:
            - containerPort: 8080
              protocol: TCP
    
          readinessProbe:        # I always recommend using these, even if your app has no listening ports (this affects any rolling update)
            httpGet:             # Lots of timeout values with defaults, be sure they are ideal for your workload
              path: /ready
              port: 8080
          livenessProbe:         # only needed if your app tends to go unresponsive or you don't have a readinessProbe, but this is up for debate
            httpGet:             # Lots of timeout values with defaults, be sure they are ideal for your workload
              path: /alive
              port: 8080
    
          resources:             # Because if limits = requests then QoS is set to "Guaranteed"
            limits:
              memory: "500Mi"    # If container uses over 500MB it is killed (OOM)
              #cpu: "2"          # Not normally needed, unless you need to protect other workloads or QoS must be "Guaranteed"
            requests:
              memory: "500Mi"    # Scheduler finds a node where 500MB is available
              cpu: "1"           # Scheduler finds a node where 1 vCPU is available
    
          # per-container security context
          # lock down privileges inside the container
          securityContext:
            allowPrivilegeEscalation: false # prevent sudo, etc.
            privileged: false               # prevent acting like host root
      
      terminationGracePeriodSeconds: 600 # default is 30, but you may need more time to gracefully shutdown (HTTP long polling, user uploads, etc)
    
      # per-pod security context
      # enable seccomp and force non-root user
      securityContext:
    
        seccompProfile:
          type: RuntimeDefault   # enable seccomp and the runtime's default profile
    
        runAsUser: 1001          # hardcode user to non-root if not set in Dockerfile
        runAsGroup: 1001         # hardcode group to non-root if not set in Dockerfile
        runAsNonRoot: true       # hardcode to non-root. Redundant to above if Dockerfile is set USER 1000
    

  2. Something at the container layer calls the setuid(2) system call with that numeric user ID. There’s no particular requirement to "create" a user; if you are able to call setuid() at all, you can call it with any numeric uid you want.

    You can demonstrate this with plain Docker pretty easily. The docker run -u option takes any numeric uid, and you can docker run -u 2000 and your container will (probably) still run. It’s common enough to docker run -u $(id -u) to run a container with the same numeric user ID as the host user even though that uid doesn’t exist in the container’s /etc/passwd file.

    At a Kubernetes layer this is a little less common. A container can’t usefully access host files in a clustered environment (…on which host?) so there’s no need to have a user ID matching the host’s. If the image already sets up a non-root user ID, you should be able to just use it as-is without setting it at the Kubernetes layer.
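
The point in the second answer, that a numeric UID works whether or not it "exists", can be illustrated with a small sketch. A UID only has a name if some passwd file maps it to one; tools that print user names fall back to the bare number otherwise. The function name resolve_uid and the sample passwd contents below are hypothetical, chosen only for illustration (the node entry mirrors the Node.js image mentioned in the first answer).

```python
def resolve_uid(uid: int, passwd_text: str) -> str:
    """Return the user name mapped to uid, or the bare number if none exists."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd format: name:password:uid:gid:gecos:home:shell
        fields = line.split(":")
        if len(fields) >= 3 and fields[2] == str(uid):
            return fields[0]
    # No entry: the UID is still a valid number for a process to run as,
    # it simply has no name in this passwd file.
    return str(uid)


# Hypothetical /etc/passwd contents for an image that ships a "node" user.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/sh
node:x:1000:1000::/home/node:/bin/bash
"""

print(resolve_uid(1000, SAMPLE_PASSWD))  # node
print(resolve_uid(2000, SAMPLE_PASSWD))  # 2000
```

Under these assumptions, UID 1000 resolves to a name only because the image's passwd file says so, while UID 2000 stays a plain number. Neither lookup involves the host's own user database.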
