
I run my service app in the Docker container app-api.

Result of top:

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
6420 root      20   0 30.572g 0.028t  38956 S  47.8 92.5 240:40.95 app
...

Result of htop:

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 6420 root       20   0 30.6G 29.0G 38956 S 47.1 92.5  4h21:53 app
 6554 root       20   0 30.6G 29.0G 38956 S  6.6 92.5 23:04.15 app
 6463 root       20   0 30.6G 29.0G 38956 S  2.0 92.5 27:29.53 app
 6430 root       20   0 30.6G 29.0G 38956 S  0.0 92.5 25:30.61 app
 6429 root       20   0 30.6G 29.0G 38956 S  5.3 92.5 26:36.17 app
 6428 root       20   0 30.6G 29.0G 38956 S 10.0 92.5 23:56.10 app
 6426 root       20   0 30.6G 29.0G 38956 S  6.0 92.5  8:09.12 app
 6427 root       20   0 30.6G 29.0G 38956 S  0.0 92.5 23:03.81 app
 6425 root       20   0 30.6G 29.0G 38956 S  0.0 92.5  0:00.00 app
 6424 root       20   0 30.6G 29.0G 38956 S  0.0 92.5 25:42.46 app
 6423 root       20   0 30.6G 29.0G 38956 S  4.6 92.5 26:10.82 app
 6422 root       20   0 30.6G 29.0G 38956 S 12.0 92.5 23:24.68 app
 6421 root       20   0 30.6G 29.0G 38956 S  2.0 92.5  4:32.47 app
 2202 gitlab-ru  20   0  231M 70132 53620 S  5.3  0.2  4h54:21 nginx: worker process
 2203 gitlab-ru  20   0  228M 59040 47680 S  0.7  0.2 54:44.83 nginx: worker process
  281 root       19  -1  175M 58104 47728 S  0.0  0.2  0:17.76 /lib/systemd/systemd-journald
 1036 root       20   0 1893M 38164 13332 S  0.0  0.1  0:38.17 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
...

Result of docker stats:

$ docker stats --no-stream
CONTAINER ID   NAME             CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
14654b8a4bfb   app-letsencrypt  13.08%    244.5MiB / 31.41GiB   0.76%     183GB / 192GB     12.4GB / 4.64MB   23
a932dabbced8   app-api          60.50%    7.258GiB / 31.41GiB   23.10%    53.2GB / 10.6GB   48.1MB / 0B       14
2cebc542dda6   app-redis        0.12%     3.902MiB / 31.41GiB   0.01%     24.2kB / 0B       1.84GB / 655kB    4

As you can see, 0.028t (~29G) in top is much more than 7.258GiB in docker stats. The difference is about 29 – 7.258 > 20G of RAM.

Please help me understand how to find out what this phantom is that takes 20G of RAM. Or point me to where to dig: is the problem in my application, in Docker (20.10.1), or in the operating system (Ubuntu 18.04)?

UPD

Output of pprof (heap):

# runtime.MemStats
# Alloc = 7645359160
# TotalAlloc = 2552206192400
# Sys = 31227357832
# Lookups = 0
# Mallocs = 50990505448
# Frees = 50882282691
# HeapAlloc = 7645359160
# HeapSys = 29526425600
# HeapIdle = 21707890688
# HeapInuse = 7818534912
# HeapReleased = 9017090048
# HeapObjects = 108222757
# Stack = 1474560 / 1474560
# MSpan = 101848496 / 367820800
# MCache = 13888 / 16384
# BuckHashSys = 10697838
# GCSys = 1270984696
# OtherSys = 49937954
# NextGC = 11845576832
# LastGC = 1615583458331895138
# PauseNs = ..................
# NumGC = 839
# NumForcedGC = 0
# GCCPUFraction = 0.027290987331299785
# DebugGC = false
# MaxRSS = 31197982720
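
These counters come from Go's runtime.MemStats and can also be read in-process with runtime.ReadMemStats; a minimal sketch (assuming the service is a plain Go binary, as the output above suggests):

package main

import (
    "fmt"
    "runtime"
)

func main() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    // HeapInuse is what the heap actually occupies right now.
    // HeapIdle - HeapReleased estimates memory the Go runtime still
    // holds reserved from the OS without currently using it.
    fmt.Printf("HeapInuse:     %d\n", m.HeapInuse)
    fmt.Printf("HeapIdle:      %d\n", m.HeapIdle)
    fmt.Printf("HeapReleased:  %d\n", m.HeapReleased)
    fmt.Printf("Held but idle: %d\n", m.HeapIdle-m.HeapReleased)
}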

2 Answers


  1. You are comparing the top/htop RES memory (man):

    The non-swapped physical memory a task has used.
    RES = CODE + DATA.

    with the docker stats CLI output (doc):

    On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage.

    Use the docker stats API and you will get a much more granular view, e.g. the memory stats:

    {
        "total_pgmajfault": 0,
        "cache": 0,
        "mapped_file": 0,
        "total_inactive_file": 0,
        "pgpgout": 414,
        "rss": 6537216,
        "total_mapped_file": 0,
        "writeback": 0,
        "unevictable": 0,
        "pgpgin": 477,
        "total_unevictable": 0,
        "pgmajfault": 0,
        "total_rss": 6537216,
        "total_rss_huge": 6291456,
        "total_writeback": 0,
        "total_inactive_anon": 0,
        "rss_huge": 6291456,
        "hierarchical_memory_limit": 67108864,
        "total_pgfault": 964,
        "total_active_file": 0,
        "active_anon": 6537216,
        "total_active_anon": 6537216,
        "total_pgpgout": 414,
        "total_cache": 0,
        "inactive_anon": 0,
        "active_file": 0,
        "pgfault": 964,
        "inactive_file": 0,
        "total_pgpgin": 477
    }
    

    As you can see, memory is not just one number: there are many types of memory, and each tool may report its own set and combination of them. I guess you will find the missing memory in the app's cache memory allocation.
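
    If you want to pull those raw counters yourself, here is a minimal sketch of querying the stats endpoint over the local Docker socket (Go is used here to match the app; the socket path and the container name app-api are assumptions taken from the question):

    package main

    import (
        "context"
        "fmt"
        "io"
        "net"
        "net/http"
    )

    func main() {
        // Talk to the Docker Engine API over its Unix socket
        // (default path; adjust if your daemon listens elsewhere).
        client := &http.Client{
            Transport: &http.Transport{
                DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
                    return (&net.Dialer{}).DialContext(ctx, "unix", "/var/run/docker.sock")
                },
            },
        }

        // One-shot stats for the container (a name or an ID both work).
        resp, err := client.Get("http://localhost/containers/app-api/stats?stream=false")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        body, err := io.ReadAll(resp.Body)
        if err != nil {
            panic(err)
        }
        // memory_stats.stats in the returned JSON holds the per-type
        // counters (cache, rss, active_anon, ...) like the example above.
        fmt.Println(string(body))
    }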

    You can check the overall basic memory allocation with the free command:

    $ free -m
                  total        used        free      shared  buff/cache   available
    Mem:           2000        1247          90         178         662         385
    Swap:             0           0           0
    

    It is a normal state for Linux to use otherwise unused memory for buff/cache.

  2. docker stats reports the resource usage of the container’s cgroup:

    $ docker run -it -m 1g --cpus 1.5 --name test-stats busybox /bin/sh
    
    / # cat /sys/fs/cgroup/memory/memory.usage_in_bytes
    2629632
    
    / # cat /sys/fs/cgroup/memory/memory.limit_in_bytes
    1073741824
    
    / # cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
    150000
    
    / # cat /sys/fs/cgroup/cpu/cpu.cfs_period_us
    100000
    

    From another window (the numbers vary slightly since the cat commands have stopped):

    $ docker stats --no-stream test-stats
    CONTAINER ID   NAME         CPU %     MEM USAGE / LIMIT   MEM %     NET I/O         BLOCK I/O   PIDS
    9a69d1323422   test-stats   0.00%     2.395MiB / 1GiB     0.23%     5.46kB / 796B   3MB / 0B    1
    

    Note that this will differ from the overall host memory and CPU if you have specified limits on your containers. Without limits, the CPU quota will be -1 (unrestricted), and the memory limit will be set to the page counter's maximum value.
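
    To see where those numbers come from, here is a small sketch that reads the same cgroup files from inside a container (Go, to keep one language across the examples; the paths match the cgroup v1 layout in the busybox example and may differ on a cgroup v2 host):

    package main

    import (
        "fmt"
        "os"
        "strconv"
        "strings"
    )

    // readUint parses the single integer a cgroup v1 memory file contains.
    func readUint(path string) (uint64, error) {
        b, err := os.ReadFile(path)
        if err != nil {
            return 0, err
        }
        return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
    }

    func main() {
        usage, err := readUint("/sys/fs/cgroup/memory/memory.usage_in_bytes")
        if err != nil {
            panic(err)
        }
        limit, err := readUint("/sys/fs/cgroup/memory/memory.limit_in_bytes")
        if err != nil {
            panic(err)
        }

        fmt.Printf("usage: %d bytes\n", usage)
        if limit > 1<<62 {
            // A value this large is the "no limit set" sentinel (the kernel's
            // page counter maximum), so the host's total memory applies.
            fmt.Println("limit: unrestricted")
        } else {
            fmt.Printf("limit: %d bytes\n", limit)
        }
    }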

    Trying to add up memory usage from the top command is very error prone. There are different types of memory in the Linux kernel (including disk cache), memory gets shared between multiple threads (which is why you likely see multiple PIDs for app, each with the exact same memory), some memory may be mmap'ed without being backed by RAM, and there is a long list of other challenges. People who know much more about this than me will say that the kernel doesn't even know when it's actually out of memory until it attempts to reclaim memory from many processes and those attempts all fail.
