
I am testing my server app in a Docker container, and I saw that it stopped with exit code 137.

root@debian:~# docker ps -a
CONTAINER ID        IMAGE                   COMMAND                 CREATED             STATUS                      PORTS               NAMES
821959f20624        webserver-in-c_server   "./webserver -p 8080"   2 weeks ago         Exited (137) 40 hours ago                       server
root@debian:~# 

Here is the docker inspect output for the dead container; OOMKilled is set to false:

root@debian:~# docker inspect server
[
    {
        "Id": "821959f206244d90297cfa0e31a89f4c8e06e3459cd8067e92b7cbb2e6fca3e0",
        "Created": "2020-11-25T15:13:10.989199751Z",
        "Path": "./webserver",
        "Args": [
            "-p",
            "8080"
        ],
        "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 137,
            "Error": "",
            "StartedAt": "2020-11-25T15:13:12.321234415Z",
            "FinishedAt": "2020-12-09T17:55:30.649883125Z"
        },

So my question is: could kernel messages in dmesg like the ones below also cause the container to be killed?

...
[1969112.586796] TCP: out of memory -- consider tuning tcp_mem
[1969122.585736] TCP: out of memory -- consider tuning tcp_mem
[1969132.585344] TCP: out of memory -- consider tuning tcp_mem
[1969142.585455] TCP: out of memory -- consider tuning tcp_mem
[1969152.598334] TCP: out of memory -- consider tuning tcp_mem
[1969162.585242] TCP: out of memory -- consider tuning tcp_mem

Thanks in advance!

2 Answers


  1. I have never seen a "TCP: out of memory" message before, so I will share a few pointers that may help. First, regarding those fields from the inspect output:

            "OOMKilled": false,
            "Dead": false,
    

    Those fields are set to their defaults when you run a container (for example "docker run xxx"). I also found a related question that was already answered; in its last answer you will find more information about why a container can hit OOM while those flags remain false. As BNT's reply quotes from the official Docker documentation: "By default, kernel kills processes in a container if an out-of-memory (OOM) error occurs".
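
    For example, a quick way to check both fields at once from the command line (just a sketch; substitute your own container name):

        # Print only the OOM flag and exit code of the "server" container
        docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' server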

    Second, regarding the "TCP: out of memory -- consider tuning tcp_mem" message: as far as I can see, that is an issue at the kernel TCP level, explained in this SAP tech article, where in their case changing the network kernel parameters solved the problem. I would recommend checking your current settings with:

    sysctl -a | grep rmem
    

    Try changing the parameters through /proc just to test, and if you then need them to persist across reboots, make the changes in /etc/sysctl.conf. Here is more info about TCP/IP kernel parameters.
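
    As a sketch of what that tuning could look like (the numbers below are placeholders, not recommendations; pick values appropriate for your host's RAM):

        # Inspect the current TCP memory limits: min / pressure / max, in pages
        sysctl net.ipv4.tcp_mem

        # Temporarily raise them for testing (placeholder values)
        sysctl -w net.ipv4.tcp_mem="786432 1048576 1572864"

        # To persist across reboots, add the same setting to /etc/sysctl.conf and reload
        echo 'net.ipv4.tcp_mem = 786432 1048576 1572864' >> /etc/sysctl.conf
        sysctl -p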

    Moreover, whether or not you set the flag that prevents the container from being killed, you will keep getting the "TCP OOM" messages until you tune the TCP socket parameters as described above. Additionally, I am sharing a pretty good analysis here that explains the flow of tcp_mem and the function tcp_check_oom(); basically there will not be a SIGKILL, just a TCP OOM. One quote from there says:

    In addition, the Linux kernel is transitioning to memory pressure mode
    when TCP OOM occurs, limiting the memory allocated to the send /
    receive buffer of the TCP socket. Therefore, there is a penalty for
    sending and receiving performance over TCP.
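
    One way to see how close the host is to that pressure mode is to compare the kernel's current TCP page usage with the tcp_mem thresholds (a quick sketch; both figures are counted in pages):

        # Current TCP socket memory usage; the "mem" field is counted in pages
        cat /proc/net/sockstat

        # The thresholds it is compared against: min / pressure / max, also in pages
        cat /proc/sys/net/ipv4/tcp_mem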

    I hope this information is useful to you. Whether or not it gets marked as the answer, it remains open to review/edit so a better answer can be found.

  2. If you don’t set a memory limit on the container, then docker will never be the one to kill the process, and OOMKilled is expected to always be false.
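
    For example, starting the container with an explicit limit makes the per-container OOM killer, and therefore the OOMKilled flag, meaningful; the 512m value below is purely illustrative:

        # Hypothetical: cap the container at 512 MiB so a runaway process is killed
        # inside the container's cgroup and OOMKilled is reported as true
        docker run -d --name server --memory 512m webserver-in-c_server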

    Whether or not the memory limit is set in docker, it’s always possible for the Linux kernel itself to kill the process when the Linux host runs out of memory. (The reason for configuring a docker memory limit is to avoid this, or at least have some control over which containers get killed before this happens.) When the kernel kills your process, you’ll get a signal 9, aka SIGKILL, which the application cannot trap, and it will immediately exit. This will be seen as exit code 137 (128 + 9).
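
    You can reproduce that 128 + 9 mapping in any shell, independent of Docker:

        # Start a long-running process, kill it with SIGKILL, then read its exit status
        sleep 60 &
        kill -9 $!
        wait $!
        echo $?    # prints 137, i.e. 128 + 9 (SIGKILL)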

    You can dig more into syslog, various kernel logs under /var/log, and dmesg to find more evidence of the kernel encountering an OOM and killing processes on the host. When this happens, the options are to avoid running memory-hungry processes, adjust the app to use less memory, or add more memory to the host.
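
    A few places to look (a sketch; exact log file names vary by distribution):

        # Kernel ring buffer: host-level OOM kills log "Out of memory: Killed process ..."
        dmesg -T | grep -i -E 'out of memory|killed process'

        # Persistent kernel logs (kern.log on Debian/Ubuntu, messages on RHEL-like systems)
        grep -i -E 'out of memory|killed process' /var/log/kern.log /var/log/messages 2>/dev/null

        # Or via journald, if available
        journalctl -k | grep -i oom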
