I run my service app in docker container app-api.
Result of top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6420 root 20 0 30.572g 0.028t 38956 S 47.8 92.5 240:40.95 app
...
Result of htop:
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
6420 root 20 0 30.6G 29.0G 38956 S 47.1 92.5 4h21:53 app
6554 root 20 0 30.6G 29.0G 38956 S 6.6 92.5 23:04.15 app
6463 root 20 0 30.6G 29.0G 38956 S 2.0 92.5 27:29.53 app
6430 root 20 0 30.6G 29.0G 38956 S 0.0 92.5 25:30.61 app
6429 root 20 0 30.6G 29.0G 38956 S 5.3 92.5 26:36.17 app
6428 root 20 0 30.6G 29.0G 38956 S 10.0 92.5 23:56.10 app
6426 root 20 0 30.6G 29.0G 38956 S 6.0 92.5 8:09.12 app
6427 root 20 0 30.6G 29.0G 38956 S 0.0 92.5 23:03.81 app
6425 root 20 0 30.6G 29.0G 38956 S 0.0 92.5 0:00.00 app
6424 root 20 0 30.6G 29.0G 38956 S 0.0 92.5 25:42.46 app
6423 root 20 0 30.6G 29.0G 38956 S 4.6 92.5 26:10.82 app
6422 root 20 0 30.6G 29.0G 38956 S 12.0 92.5 23:24.68 app
6421 root 20 0 30.6G 29.0G 38956 S 2.0 92.5 4:32.47 app
2202 gitlab-ru 20 0 231M 70132 53620 S 5.3 0.2 4h54:21 nginx: worker process
2203 gitlab-ru 20 0 228M 59040 47680 S 0.7 0.2 54:44.83 nginx: worker process
281 root 19 -1 175M 58104 47728 S 0.0 0.2 0:17.76 /lib/systemd/systemd-journald
1036 root 20 0 1893M 38164 13332 S 0.0 0.1 0:38.17 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
...
Result of docker stats:
$ docker stats --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
14654b8a4bfb app-letsencrypt 13.08% 244.5MiB / 31.41GiB 0.76% 183GB / 192GB 12.4GB / 4.64MB 23
a932dabbced8 app-api 60.50% 7.258GiB / 31.41GiB 23.10% 53.2GB / 10.6GB 48.1MB / 0B 14
2cebc542dda6 app-redis 0.12% 3.902MiB / 31.41GiB 0.01% 24.2kB / 0B 1.84GB / 655kB 4
As you can see, 0.028t (~29G) in top is much more than 7.258GiB in docker stats. The difference is about 29 - 7.258 > 20G of RAM.
Please help me understand how to detect what this phantom is that takes 20G of RAM. Or maybe point me to where to dig: into problems with my application, with Docker (20.10.1), or with the operating system (Ubuntu 18.04)?
UPD
Output of pprof (heap):
# runtime.MemStats
# Alloc = 7645359160
# TotalAlloc = 2552206192400
# Sys = 31227357832
# Lookups = 0
# Mallocs = 50990505448
# Frees = 50882282691
# HeapAlloc = 7645359160
# HeapSys = 29526425600
# HeapIdle = 21707890688
# HeapInuse = 7818534912
# HeapReleased = 9017090048
# HeapObjects = 108222757
# Stack = 1474560 / 1474560
# MSpan = 101848496 / 367820800
# MCache = 13888 / 16384
# BuckHashSys = 10697838
# GCSys = 1270984696
# OtherSys = 49937954
# NextGC = 11845576832
# LastGC = 1615583458331895138
# PauseNs = ..................
# NumGC = 839
# NumForcedGC = 0
# GCCPUFraction = 0.027290987331299785
# DebugGC = false
# MaxRSS = 31197982720
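For easier comparison with top and docker stats, the key figures from this dump can be converted to GiB. The snippet below is only a sketch: the numbers are copied from the dump above, the helper name to_gib is made up for illustration, and the one interpretation it relies on (HeapIdle minus HeapReleased being idle heap that the Go runtime still holds) comes from the runtime.MemStats documentation, not from anything measured here.

```python
GIB = 1024 ** 3

# Values copied from the runtime.MemStats dump above (bytes).
memstats = {
    "Sys": 31227357832,
    "HeapSys": 29526425600,
    "HeapIdle": 21707890688,
    "HeapInuse": 7818534912,
    "HeapReleased": 9017090048,
    "MaxRSS": 31197982720,
}


def to_gib(n_bytes):
    """Convert a byte count to GiB (hypothetical helper)."""
    return n_bytes / GIB


# Idle heap spans the runtime holds but has not returned to the OS.
retained_idle = memstats["HeapIdle"] - memstats["HeapReleased"]

for name, value in memstats.items():
    print(f"{name:<16}{to_gib(value):6.2f} GiB")
print(f"{'Idle - Released':<16}{to_gib(retained_idle):6.2f} GiB")
```

Sys and MaxRSS land close to the ~29G that top reports, while HeapInuse is close to the figure from docker stats.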
2 Answers
You are comparing the top/htop RES memory (see the man page) with the docker stats CLI output (see the doc). Use the docker stats API and you will get a much more granular view, e.g. the stats for memory.
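As a rough illustration of why the CLI number is smaller than the raw usage: as far as I understand it, the docker CLI does not print the raw usage field from the stats API, but subtracts file cache from it first. The sketch below mimics that under stated assumptions: the field names total_inactive_file (cgroup v1) and inactive_file (cgroup v2) reflect my reading of recent CLI versions, and the sample values are made up, not taken from the container in the question.

```python
GIB = 1024 ** 3


def cli_memory_usage(memory_stats):
    """Approximate the MEM USAGE value that `docker stats` prints.

    `memory_stats` is the "memory_stats" object returned by the
    /containers/<id>/stats API endpoint.
    """
    usage = memory_stats["usage"]
    stats = memory_stats.get("stats", {})
    # Assumption: cgroup v1 exposes total_inactive_file, v2 inactive_file.
    inactive = stats.get("total_inactive_file")
    if inactive is None:
        inactive = stats.get("inactive_file", 0)
    # Only subtract when the result stays positive.
    if inactive < usage:
        return usage - inactive
    return usage


# Hypothetical sample, not real output from app-api.
sample = {
    "usage": 10 * GIB,
    "limit": 32 * GIB,
    "stats": {"total_inactive_file": 2 * GIB, "rss": 7 * GIB},
}

print(cli_memory_usage(sample) / GIB, "GiB shown by the CLI")
```

The full stats object also carries the other counters (rss, cache, mapped_file and so on), which is where the more granular view comes from.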
Memory is not a single number: it has many types, and each tool may report its own set and combination of memory types. I guess you will find the missing memory in the app's cache memory allocation.
You can check the overall basic memory allocations with the free command. It is a normal state for Linux to use otherwise unused memory for buff/cache.
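The buff/cache column of free can be reproduced from /proc/meminfo. Here is a minimal sketch, assuming a procps version where cache is Cached plus SReclaimable (as the free man page describes); the sample text is made up and parse_meminfo is a name chosen for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue
        fields[name.strip()] = int(rest.split()[0])
    return fields


# Hypothetical sample; on a real host read the text from /proc/meminfo.
SAMPLE = """\
MemTotal:       32939008 kB
MemFree:          512000 kB
MemAvailable:   20480000 kB
Buffers:          204800 kB
Cached:         18432000 kB
SReclaimable:    1024000 kB
"""

info = parse_meminfo(SAMPLE)
buff_cache_kb = info["Buffers"] + info["Cached"] + info["SReclaimable"]

print(f"buff/cache: {buff_cache_kb / 1024 / 1024:.2f} GiB")
print(f"available:  {info['MemAvailable'] / 1024 / 1024:.2f} GiB")
```

A large buff/cache next to a healthy MemAvailable is the normal state described above: the kernel hands that memory back when applications need it.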
docker stats is reporting the resource usage of the container’s cgroup:
From another window (there’s a small variation with the cat command stopped):
Note that this will differ from the overall host memory and CPU if you have specified limits on your containers. Without limits, the CPU quota will be -1 (unrestricted), and the memory limit will be set to the page counter max value.
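To look at the same cgroup counters directly, something like the following could be used. It is a sketch under assumptions: cgroup v1 file names (memory.usage_in_bytes, memory.limit_in_bytes, memory.stat), and an "unlimited" constant that matches the page counter max on 64-bit kernels with 4 KiB pages. The demo writes made-up values into a temporary directory so it runs anywhere.

```python
import tempfile
from pathlib import Path

# Assumption: unrestricted cgroup v1 memory limit on 64-bit, 4 KiB pages.
UNLIMITED_V1 = 9223372036854771712


def read_cgroup_memory(cgroup_dir):
    """Read usage, limit and memory.stat counters from a cgroup v1 dir."""
    base = Path(cgroup_dir)
    usage = int((base / "memory.usage_in_bytes").read_text())
    limit = int((base / "memory.limit_in_bytes").read_text())
    stat = {}
    for line in (base / "memory.stat").read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:
            stat[parts[0]] = int(parts[1])
    return {
        "usage": usage,
        # None means no memory limit was set on the container.
        "limit": None if limit >= UNLIMITED_V1 else limit,
        "stat": stat,
    }


# Demo against a throwaway directory with hypothetical values.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "memory.usage_in_bytes").write_text("8000000000\n")
    (d / "memory.limit_in_bytes").write_text(f"{UNLIMITED_V1}\n")
    (d / "memory.stat").write_text(
        "cache 1500000000\nrss 6000000000\ntotal_inactive_file 200000000\n"
    )
    result = read_cgroup_memory(d)

print(result["usage"], result["limit"], result["stat"]["rss"])
```

On a real cgroup v1 host the container's directory usually sits somewhere under /sys/fs/cgroup/memory/, but the exact path depends on the cgroup driver, so treat any concrete path as something to verify locally.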
Trying to add up memory usage from the top command is very error-prone. There are different types of memory in the Linux kernel (including disk cache), memory gets shared between multiple threads (which is why you likely see multiple PIDs for app, each with the exact same memory), some memory may be mmap’d without being backed by RAM, and there is a long list of other challenges. People who know much more about this than I do will say that the kernel doesn’t even know when it’s actually out of memory until it attempts to reclaim memory from many processes and those attempts all fail.
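The thread point can be made concrete with /proc/<pid>/status, which carries both the thread group id (Tgid) and VmRSS. The sketch below counts RSS once per thread group instead of once per row; the two status snippets are hypothetical, loosely modelled on the htop rows for PIDs 6420 and 6421, and the function names are made up for illustration.

```python
def parse_status(text):
    """Parse /proc/<pid>/status-style text into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kib(value):
    """Turn a value such as '30408704 kB' into an int number of kB."""
    return int(value.split()[0])


def total_rss_kib(statuses):
    """Sum VmRSS once per thread group (process), not once per thread."""
    per_process = {}
    for status in statuses:
        per_process[status["Tgid"]] = kib(status["VmRSS"])
    return sum(per_process.values())


# Hypothetical samples: two threads of the same process.
THREAD_A = "Name:\tapp\nTgid:\t6420\nPid:\t6420\nVmRSS:\t30408704 kB\n"
THREAD_B = "Name:\tapp\nTgid:\t6420\nPid:\t6421\nVmRSS:\t30408704 kB\n"

statuses = [parse_status(THREAD_A), parse_status(THREAD_B)]
naive_kib = sum(kib(s["VmRSS"]) for s in statuses)

print("naive sum:       ", naive_kib, "kB")
print("per thread group:", total_rss_kib(statuses), "kB")
```

Even the deduplicated figure still mixes anonymous, file-backed and shared pages, so it remains only a rough guide to what the process really costs.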