I am installing a brand-new Elasticsearch 7.5 on Red Hat Enterprise Linux Server release 7.8 (Maipo).
The service fails hard at startup. Here is what the service status shows:
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
Active: failed (Result: signal) since Tue 2020-08-25 11:34:39 CEST; 7min ago
Docs: http://www.elastic.co
Process: 102777 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet (code=killed, signal=ABRT)
Main PID: 102777 (code=killed, signal=ABRT)
CGroup: /system.slice/elasticsearch.service
Aug 25 11:34:34 sv-1348lvd44.esante.local systemd[1]: Starting Elasticsearch...
Aug 25 11:34:35 sv-1348lvd44.esante.local elasticsearch[102777]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated...lease.
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: elasticsearch.service: main process exited, code=killed, status=6/ABRT
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: Failed to start Elasticsearch.
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: Unit elasticsearch.service entered failed state.
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: elasticsearch.service failed.
journalctl -xe shows:
Aug 25 11:34:38 sv-1348lvd44.esante.local audispd[824]: node=sv-1348lvd44.esante.local type=ANOM_ABEND msg=audit(1598348078.836:208066): auid=429496 uid=995 gid=991 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 pid=102777 comm="java" reason="memory violation" sig=6
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: elasticsearch.service: main process exited, code=killed, status=6/ABRT
Aug 25 11:34:39 sv-1348lvd44.esante.local systemd[1]: Failed to start Elasticsearch.
The crash dump hs_err_pidXXXX contains:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f4818939b85, pid=52870, tid=52933
#
# JRE version: OpenJDK Runtime Environment (13.0.1+9) (build 13.0.1+9)
# Java VM: OpenJDK 64-Bit Server VM (13.0.1+9, mixed mode, sharing, tiered, compressed oops, concurrent mark sweep gc, linux-amd64)
# Problematic frame:
# C [jna515356041985641679.tmp+0x12b85] ffi_prep_closure_loc+0x15
[OS:Red Hat Enterprise Linux Server release 7.8 (Maipo)
uname:Linux 3.10.0-1127.10.1.el7.x86_64 #1 SMP Tue May 26 15:05:43 EDT 2020 x86_64
libc:glibc 2.17 NPTL 2.17
rlimit: STACK 8192k, CORE 0k, NPROC 4096, NOFILE 65535, AS infinity, DATA infinity, FSIZE infinity
load average:0.08 0.03 0.05
.../...
It works like a charm on CentOS without doing anything.
For RHEL, I already worked around the JNA issue by adding ES_TMPDIR=/var/es-temp to /etc/sysconfig/elasticsearch.
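A quick way to check whether a candidate temp directory such as /var/es-temp can actually hold native code is to try mapping a file from it as executable, which is roughly what loading an extracted JNA library requires. The following is only a sketch under that assumption: the function name allows_exec_mapping is made up for illustration, and a False result can also come from other restrictions (for example a security policy), not only from a noexec mount.

```python
import mmap
import os
import tempfile


def allows_exec_mapping(directory: str) -> bool:
    """Return True if a file created in `directory` can be mapped executable.

    Hypothetical helper: on a noexec mount, mmap() with PROT_EXEC is
    expected to fail with a permission error.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # mmap needs a non-empty file, so write one page of zero bytes.
        os.write(fd, b"\0" * mmap.PAGESIZE)
        try:
            mapping = mmap.mmap(
                fd,
                mmap.PAGESIZE,
                flags=mmap.MAP_PRIVATE,
                prot=mmap.PROT_READ | mmap.PROT_EXEC,
            )
        except OSError:
            # Covers PermissionError (EPERM) raised for noexec mounts.
            return False
        mapping.close()
        return True
    finally:
        os.close(fd)
        os.unlink(path)
```

Running allows_exec_mapping("/var/es-temp") as the elasticsearch user, and comparing it with the result for /tmp, shows whether the two directories behave differently for executable mappings.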
Memory seems fine; this is a brand-new VM. (There are no application logs in /var/logs.)
It seems that this version is supposed to be supported.
I tested with -Xms2g -Xmx2g, -Xms1g -Xmx1g, and -Xms512m -Xmx512m, but got the same error.
I don't understand what is going wrong. My next step is to test with another 7.x version of Elasticsearch.
2 Answers
After one day of struggling, I found the solution at https://discuss.elastic.co/t/elasticsearch-v7-6-2-failed-to-start-killed-by-sigabrt-on-rhel-7-7-urgent/231039/11, posted by Ivan_A_Carrazana_C.
Here is a copy of the steps to perform:
This seems to be known by Elastic but is not documented correctly. I don't understand why noexec on the tmpfs should matter. It would be good to have feedback from a JNA expert.
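To see which mount a given path lives on and whether that mount carries noexec, the mount table can be parsed directly. This is a minimal sketch, not part of the linked solution: the helper names and the SAMPLE text are invented for illustration, and on a real host the text would come from /proc/mounts.

```python
from typing import List, Tuple

# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
rootfs / rootfs rw 0 0
/dev/mapper/rhel-root / xfs rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,noexec 0 0
/dev/mapper/rhel-tmp /tmp xfs rw,nosuid,nodev,noexec,relatime 0 0
"""


def mount_options(path: str, mounts_text: str) -> Tuple[str, List[str]]:
    """Return (mount_point, options) for the deepest mount containing `path`."""
    best: Tuple[str, List[str]] = ("", [])
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # ">=" lets a later entry for the same mount point win,
        # since later mounts are stacked on top of earlier ones.
        if inside and len(mount_point) >= len(best[0]):
            best = (mount_point, options)
    return best


def is_noexec(path: str, mounts_text: str) -> bool:
    """True if the mount holding `path` has the noexec option."""
    return "noexec" in mount_options(path, mounts_text)[1]
```

On a live system, something like is_noexec(os.path.realpath("/tmp"), open("/proc/mounts").read()) would answer the question for /tmp; resolving symlinks first matters because the mount table lists real paths.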
For some reason, adding a TMPDIR variable to /etc/sysconfig/elasticsearch and pointing it to the same location as -Djava.io.tmpdir worked (on 7.7.1).
i.e.
TMPDIR="/usr/share/elasticsearch/tmp"
(in my case I actually used /var/lib/elasticsearch/tmp with 0755 permissions on it).
I can’t say why, and it doesn’t change the command line shown by ‘ps -aef’. But just having -Djava.io.tmpdir wasn’t enough.
This allowed me to get it to work without removing noexec on /tmp and /dev/shm.
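That ‘ps -aef’ shows no difference is expected: environment variables are not part of a process's command line. On Linux they can be read from /proc/<pid>/environ instead, which holds NUL-separated KEY=VALUE entries as they were when the process started. The sketch below assumes that layout; the function names are made up for illustration, and reading another user's process normally needs root or the same user.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip the empty trailing entry and malformed items
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_process_environ(pid: int) -> Dict[str, str]:
    """Read the start-up environment of a running process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return parse_environ(handle.read())
```

Calling read_process_environ with the Elasticsearch PID and looking up "TMPDIR" confirms whether the variable from /etc/sysconfig/elasticsearch actually reached the Java process.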