Environment information:
Computer detail: One master node and four slave nodes. All are CentOS Linux release 7.8.2003 (Core).
Kubernetes version: v1.18.0.
Zero to JupyterHub version: 0.9.0.
Helm version: v2.11.0
I recently tried to deploy an online coding environment (similar to Google Colab) on new lab servers via Zero to JupyterHub. Unfortunately, I failed to deploy a Persistent Volume (PV) for JupyterHub and got the failure message shown below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4s (x27 over 35m) default-scheduler running "VolumeBinding" filter plugin for pod "hub-7b9cbbcf59-747jl": pod has unbound immediate PersistentVolumeClaims
I followed the installation process from the JupyterHub tutorial and used Helm to install JupyterHub on Kubernetes. The config file is shown below:
config.yaml
proxy:
  secretToken: "2fdeb3679d666277bdb1c93102a08f5b894774ba796e60af7957cb5677f40706"
singleuser:
  storage:
    dynamic:
      storageClass: local-storage
Here, I configured a local-storage class for JupyterHub; the local-storage class follows the Kubernetes documentation: Link. Its YAML file looks like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Then I ran kubectl get storageclass to check that it worked, and got the message below:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 64m
So I thought I had deployed storage for JupyterHub, but I was naive. I am disappointed because all my other (JupyterHub) pods are running. I have been searching for a solution for a long time, but still failed.
So now, my problems are:
-
What is the correct way to solve the PV problem? (Preferably using local storage.)
-
Will the local storage approach use the disks of the other nodes, not only the master's?
-
In fact, my lab has a cloud storage service, so if the answer to Q2 is No, how can I use my lab's cloud storage service to deploy the PV?
I have addressed the above problem with @Arghya Sadhu's solution. But now I have a new problem: the pod using hub-db-dir is also pending, which causes my service proxy-public to be pending as well.
The description of that pod is shown below:
Name: hub-7b9cbbcf59-jv49z
Namespace: jhub
Priority: 0
Node: <none>
Labels: app=jupyterhub
component=hub
hub.jupyter.org/network-access-proxy-api=true
hub.jupyter.org/network-access-proxy-http=true
hub.jupyter.org/network-access-singleuser=true
pod-template-hash=7b9cbbcf59
release=jhub
Annotations: checksum/config-map: c20a64c7c9475201046ac620b057f0fa65ad6928744f7d265bc8705c959bce2e
checksum/secret: 1beaebb110d06103988476ec8a3117eee58d97e7dbc70c115c20048ea04e79a4
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/hub-7b9cbbcf59
Containers:
hub:
Image: jupyterhub/k8s-hub:0.9.0
Port: 8081/TCP
Host Port: 0/TCP
Command:
jupyterhub
--config
/etc/jupyterhub/jupyterhub_config.py
--upgrade-db
Requests:
cpu: 200m
memory: 512Mi
Readiness: http-get http://:hub/hub/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
PYTHONUNBUFFERED: 1
HELM_RELEASE_NAME: jhub
POD_NAMESPACE: jhub (v1:metadata.namespace)
CONFIGPROXY_AUTH_TOKEN: <set to the key 'proxy.token' in secret 'hub-secret'> Optional: false
Mounts:
/etc/jupyterhub/config/ from config (rw)
/etc/jupyterhub/cull_idle_servers.py from config (rw,path="cull_idle_servers.py")
/etc/jupyterhub/jupyterhub_config.py from config (rw,path="jupyterhub_config.py")
/etc/jupyterhub/secret/ from secret (rw)
/etc/jupyterhub/z2jh.py from config (rw,path="z2jh.py")
/srv/jupyterhub from hub-db-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hub-token-vlgwz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hub-config
Optional: false
secret:
Type: Secret (a volume populated by a Secret)
SecretName: hub-secret
Optional: false
hub-db-dir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: hub-db-dir
ReadOnly: false
hub-token-vlgwz:
Type: Secret (a volume populated by a Secret)
SecretName: hub-token-vlgwz
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 61s (x43 over 56m) default-scheduler 0/5 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 4 node(s) didn't find available persistent volumes to bind.
The output of kubectl get pv,pvc,sc is:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/hub-db-dir Pending local-storage 162m
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/local-storage (default) kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 8h
So, how can I fix it?
2
Answers
Make local-storage the default storage class:
kubectl patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Local storage will use the local disk of the node where the pod gets scheduled.
Hard to tell without more details. You can either create a PV manually or use a storage class that does dynamic volume provisioning.
In addition to @Arghya Sadhu's answer, in order to make it work using local storage you have to create a PersistentVolume manually. For example:
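A minimal sketch of such a PersistentVolume, assuming a local directory path and node name that you must replace with values from your own cluster:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-pv           # hypothetical name, choose your own
spec:
  capacity:
    storage: 1Gi            # must be at least the size the hub-db-dir PVC requests
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/hub-db # this directory must already exist on the node
  nodeAffinity:             # local volumes require node affinity
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1   # replace with one of your worker node names
```

With WaitForFirstConsumer, the PVC stays Pending until a pod is scheduled; the scheduler then binds it to a PV whose nodeAffinity matches a schedulable node.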
Then you can deploy the chart:
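For instance, with Helm 2 (the question uses Helm v2.11.0), using the release name jhub and namespace jhub that appear in the pod description above:

```shell
helm upgrade --install jhub jupyterhub/jupyterhub \
  --namespace jhub \
  --version 0.9.0 \
  --values config.yaml
```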
The config.yaml file can be left as is:
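That is, the same config.yaml shown in the question:

```yaml
proxy:
  secretToken: "2fdeb3679d666277bdb1c93102a08f5b894774ba796e60af7957cb5677f40706"
singleuser:
  storage:
    dynamic:
      storageClass: local-storage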