When a database server shuts down unexpectedly on Linux, you need to find out why. There can be several reasons. One is SIGSEGV, a crash caused by a bug in the backend server, but this is rare. More often than not, disk space or memory simply runs out. If disk space has run out, the only way out is to free up space and restart the database.
When a server or process runs out of memory, Linux offers two ways to deal with it: crash the entire system, or terminate the process (application) that is eating up the memory. It is better, of course, to terminate the process and save the OS from abnormal termination. In a nutshell, Out-Of-Memory Killer (OOM-Killer) is the mechanism that terminates an application to save the kernel from a crash. It sacrifices the application to keep the OS running. Let's first discuss how OOM works and how to control it, and then see how OOM-Killer decides which application to end.
One of Linux’s main tasks is to allocate memory to processes when they ask for it. A process or application usually requests memory from the OS but does not use all of it. If the OS actually handed out memory to everyone who asked for it without intending to use it, memory would run out very quickly and the system would fail. To avoid this, the OS reserves memory for the process but does not actually hand it over. Memory is allocated only when the process is really about to use it. The OS may therefore promise memory that it does not currently have free and allocate it later, when the process needs it, if it can. The downside is that sometimes the OS has reserved memory but has none free at the moment it is needed, and the system crashes. OOM-Killer plays an important role in this scenario: it terminates processes to keep the kernel from panicking.
When a PostgreSQL process is forcibly terminated, a message appears in the log:
Out of Memory: Killed process 12345 (postgres).
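The gap between reserved and available memory described above can be inspected in /proc/meminfo, where Committed_AS is the amount of memory the kernel has promised to processes and CommitLimit is the ceiling that applies in strict overcommit mode. The following is a minimal sketch; the function name commit_headroom is made up for this illustration.

```shell
# Print CommitLimit, Committed_AS and their difference, all in kB.
# $1: path to a meminfo-style file (defaults to /proc/meminfo).
commit_headroom() {
    local meminfo="${1:-/proc/meminfo}"
    awk '
        $1 == "CommitLimit:"  { limit = $2 }
        $1 == "Committed_AS:" { used  = $2 }
        END {
            printf "CommitLimit=%.0f kB\n",  limit
            printf "Committed_AS=%.0f kB\n", used
            printf "Headroom=%.0f kB\n",     limit - used
        }
    ' "$meminfo"
}
```

Calling commit_headroom without arguments reads the live values. A negative headroom is not an error by itself: unless strict mode is enabled, the kernel is allowed to promise more than CommitLimit.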
If the system is low on memory and cannot free any, the out_of_memory function is called. At this stage it has only one option left: to terminate one or more processes. Should OOM-Killer terminate a process immediately, or can it wait? When out_of_memory is called, the cause is sometimes just a wait for an I/O operation or for pages to be swapped to disk. Therefore OOM-Killer must first perform a series of checks and decide on that basis whether a process has to be terminated. If all of the checks below give a positive result, OOM-Killer terminates the process.
When memory runs out, the out_of_memory() function is called. It has a select_bad_process() function, which receives a score from the badness() function. The most "bad" process becomes the target. The badness() function selects a process according to certain rules:
- The kernel needs to keep a minimum amount of memory for itself.
- A large amount of memory should be reclaimed.
- Processes that use little memory should not be terminated.
- As few processes as possible should be terminated.
- Additional heuristics raise the chances of termination for processes that the user himself wants to terminate.
Having completed all these checks, OOM-Killer examines the score (oom_score). It assigns an oom_score to each process and then multiplies this value by the amount of memory the process uses. Processes with higher values are more likely to become victims of OOM-Killer. Processes that belong to a privileged user get a lower score and are less likely to be forcibly terminated.
postgres=# SELECT pg_backend_pid();
 pg_backend_pid
----------------
           3813
(1 row)
The Postgres process ID is 3813, so in another shell you can read its score from the oom_score kernel parameter:
vagrant@vagrant:~$ sudo cat /proc/3813/oom_score
2
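To see which processes are currently the most likely victims, the same oom_score file can be read for every process. The sketch below is one way to do that; the function name top_oom_scores and its output layout (score, PID, command name) are choices made for this example.

```shell
# List the processes with the highest oom_score, highest first.
# $1: how many to show (default 5); $2: proc root (default /proc).
top_oom_scores() {
    local count="${1:-5}" root="${2:-/proc}" dir score name
    for dir in "$root"/[0-9]*; do
        # A process may exit while we iterate, so skip unreadable entries.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s\t%s\t%s\n' "$score" "${dir##*/}" "$name"
    done | sort -k1,1nr | head -n "$count"
}
```

Running top_oom_scores 10 prints the ten highest-scoring processes on the live system.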
If you want to keep OOM-Killer away from a process, there is another kernel parameter: oom_score_adj. Write a large negative value to it to reduce the chances that the process you care about is terminated.
echo -100 | sudo tee /proc/3813/oom_score_adj
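The kernel accepts oom_score_adj values from -1000 to 1000, where -1000 exempts the process from OOM-Killer entirely. A small wrapper can check the value before writing it. This is only a sketch; set_oom_score_adj is a hypothetical helper name, and the optional third argument exists purely so the function can be tried against a scratch directory.

```shell
# Write an oom_score_adj value for a process after validating it.
# $1: PID; $2: adjustment (-1000..1000); $3: proc root (default /proc).
set_oom_score_adj() {
    local pid="$1" adj="$2" root="${3:-/proc}"
    # Accept only an optional leading minus sign followed by digits.
    case $adj in
        ''|-|*[!0-9-]*|?*-*)
            echo "not an integer: $adj" >&2
            return 2 ;;
    esac
    if [ "$adj" -lt -1000 ] || [ "$adj" -gt 1000 ]; then
        echo "oom_score_adj must be between -1000 and 1000" >&2
        return 2
    fi
    printf '%s\n' "$adj" > "$root/$pid/oom_score_adj"
}
```

Lowering the value of a real process normally requires root privileges, so on a live system the function has to run in a root shell.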
To set the oom_score_adj value for a service, set OOMScoreAdjust in the service block of its systemd unit, or use oomprotect in the rcctl command:
rcctl set servicename oomprotect -1000
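For the OOMScoreAdjust route, one common approach is a systemd drop-in file rather than an edit to the packaged unit. The sketch below writes such a file; the helper name write_oom_dropin, the file name oom.conf and the unit name used in the usage note are assumptions made for this example, so check the actual unit name on your system.

```shell
# Write a systemd drop-in that sets OOMScoreAdjust for one unit.
# $1: unit name; $2: adjustment; $3: config root (default /etc/systemd/system).
write_oom_dropin() {
    local unit="$1" adj="$2" base="${3:-/etc/systemd/system}"
    local dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    # oom.conf is an arbitrary file name chosen for this sketch.
    printf '[Service]\nOOMScoreAdjust=%s\n' "$adj" > "$dir/oom.conf"
}
```

After something like write_oom_dropin postgresql.service -1000 run as root, systemctl daemon-reload followed by a restart of the service makes the setting take effect.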
Forced process termination
When one or more processes have been selected, OOM-Killer calls the oom_kill_task() function, which sends a termination signal to the process. When memory is short, oom_kill() calls this function to send a SIGKILL signal to the process. A message is written to the kernel log:
Out of Memory: Killed process [pid] [name].
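To find out after the fact which processes were terminated, the kernel log can be filtered for this message. The sketch below assumes the message has the shape shown above, with the process name in parentheses as in the PostgreSQL example; the exact wording varies between kernel versions, so the pattern may need adjusting. The function name oom_victims is made up for this example.

```shell
# Read kernel log text on stdin and print "PID NAME" for each OOM kill.
oom_victims() {
    # Match "Killed process <pid> (<name>)" anywhere in the line.
    sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}
```

On a live system the input can come from dmesg | oom_victims or journalctl -k | oom_victims.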
How to control OOM-Killer
On Linux, you can enable or disable OOM-Killer (although disabling it is not recommended). On kernels that provide it, the vm.oom-kill option is used for this. To enable OOM-Killer at runtime, run the sysctl command:
sudo sysctl -w vm.oom-kill=1
To disable OOM-Killer, specify the value 0 in the same command:
sudo sysctl -w vm.oom-kill=0
The result of this command is not permanent; it lasts only until the next reboot. If you need it to persist, add this line to the /etc/sysctl.conf file:
echo "vm.oom-kill = 1" >> /etc/sysctl.conf
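Which sysctl keys exist depends on the kernel version and distribution, so it can be worth checking for a key before trying to set it. A minimal sketch, with sysctl_key_exists as a hypothetical helper name:

```shell
# Succeed if a sysctl key exists on this kernel.
# $1: key such as vm.panic_on_oom; $2: sysctl root (default /proc/sys).
sysctl_key_exists() {
    local key="$1" root="${2:-/proc/sys}"
    # Keys map to paths under /proc/sys with dots replaced by slashes.
    [ -e "$root/$(printf '%s' "$key" | tr . /)" ]
}
```

For example, sysctl_key_exists vm.oom-kill || echo "not available on this kernel" reports a missing key instead of letting sysctl fail.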
Another way to control this behavior is to write to the panic_on_oom variable. Its current value can always be checked in /proc:
$ cat /proc/sys/vm/panic_on_oom
0
If you set the value to 0, the kernel will not panic when memory runs out:
$ echo 0 > /proc/sys/vm/panic_on_oom
If you set the value to 1, the kernel will panic when memory runs out:
$ echo 1 > /proc/sys/vm/panic_on_oom
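The current setting can also be turned into a readable description. The sketch below covers the two values discussed above plus the value 2, which the kernel documentation describes as panicking even when the shortage is confined to a cpuset or memory policy. The function name describe_panic_on_oom and the wording of the messages are this example's own.

```shell
# Describe the current panic_on_oom setting.
# $1: file to read (default /proc/sys/vm/panic_on_oom).
describe_panic_on_oom() {
    local file="${1:-/proc/sys/vm/panic_on_oom}" value
    value=$(cat "$file") || return 1
    case $value in
        0) echo "0: OOM-Killer terminates a process, no kernel panic" ;;
        1) echo "1: kernel panics on a system-wide out-of-memory condition" ;;
        2) echo "2: kernel always panics on out-of-memory" ;;
        *) echo "$value: unrecognized value"; return 1 ;;
    esac
}
```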
OOM-Killer can be tuned, not just turned on and off. As already mentioned, Linux can reserve more memory for processes than it actually has without really allocating it. This behavior is controlled by a Linux kernel parameter, the vm.overcommit_memory variable.
You can specify the following values for it:
0: The kernel itself decides whether to reserve more memory than it has. This is the default value on most versions of Linux.
1: The kernel always reserves extra memory. This is risky, because sooner or later the processes are likely to claim what they were promised, and memory can run out.
2: The kernel does not reserve more memory than the vm.overcommit_ratio parameter allows. In this parameter you specify the percentage of memory that may be overcommitted. If there is no room left, the reservation is refused and no memory is allocated. This is the safest option and the one recommended for PostgreSQL.
OOM-Killer is also affected by swapping, which is controlled by the swappiness variable (cat /proc/sys/vm/swappiness). Its value tells the kernel how to handle paging. The larger the value, the less likely it is that OOM-Killer will terminate a process, but the extra I/O hurts database performance. Conversely, the lower the value, the more likely OOM-Killer is to intervene, but database performance is better. The default value is 60, but if the entire database fits in memory, it is best to set the value to 1.
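The settings discussed above can be read in one go. The sketch below prints the overcommit mode, the ratio when strict mode is active, and the swappiness value; show_vm_settings is a name invented for this example, and the sketch ignores the alternative overcommit_kbytes limit.

```shell
# Print the overcommit and swap settings.
# $1: directory holding the vm sysctl files (default /proc/sys/vm).
show_vm_settings() {
    local dir="${1:-/proc/sys/vm}" mode ratio
    mode=$(cat "$dir/overcommit_memory") || return 1
    case $mode in
        0) echo "overcommit_memory=0 (kernel heuristic)" ;;
        1) echo "overcommit_memory=1 (always overcommit)" ;;
        2) ratio=$(cat "$dir/overcommit_ratio")
           echo "overcommit_memory=2 (strict, overcommit_ratio=${ratio}%)" ;;
        *) echo "overcommit_memory=$mode (unrecognized)" ;;
    esac
    echo "swappiness=$(cat "$dir/swappiness")"
}
```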
Don't be scared by the "killer" in OOM-Killer. In this case, the killer is the savior of your system. It "kills" the worst processes and saves the system from abnormal termination. To keep OOM-Killer from having to terminate PostgreSQL, set vm.overcommit_memory to 2. This does not guarantee that OOM-Killer will never intervene, but it reduces the likelihood that a PostgreSQL process is forcibly terminated.
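One way to make this recommendation survive a reboot is a file under /etc/sysctl.d. The sketch below writes one; the helper name write_overcommit_conf and the file name 90-overcommit.conf are assumptions for this example, and the default ratio of 50 simply mirrors the kernel default rather than a tuned value.

```shell
# Write a sysctl configuration file that enables strict overcommit.
# $1: overcommit ratio in percent (default 50); $2: target file.
write_overcommit_conf() {
    local ratio="${1:-50}" file="${2:-/etc/sysctl.d/90-overcommit.conf}"
    printf 'vm.overcommit_memory = 2\nvm.overcommit_ratio = %s\n' "$ratio" > "$file"
}
```

After writing the file as root, sysctl --system reloads all configuration files. It is worth choosing the ratio with the machine's RAM and swap in mind, because a value that is too low makes allocations fail early.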