Restarting Services
cghealth restarts services when the total memory consumption approaches the configured limit.
For example, the following setting defines the high water mark (hiwm) for total memory consumption:
[memory]
hiwm = 90%
The percentage is in relation to the hard limit. Absolute values are also possible.
If the total memory consumption of all services in pld.slice exceeds the high water mark, cghealth uses systemd to restart all services in that slice whose individual memory consumption is high.
Here is an example of the individual limit for the vsi service:
[memory_high]
pld-vsi = 30%
This defines that the memory consumption of pld-vsi.service is considered high if it is above 30% of the configured total limit. Again, absolute values can also be used.
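The two settings shown so far can be read programmatically. The following is a minimal Python sketch, assuming the configuration file is INI-compatible (the section and key names are taken from the examples above; the helper read_percent is hypothetical and not part of cghealth). It handles only percentage values, because the format of absolute values is not shown here.

```python
import configparser

# Sample text mirroring the two settings shown above.
SAMPLE = """\
[memory]
hiwm = 90%

[memory_high]
pld-vsi = 30%
"""

def read_percent(parser, section, key):
    """Return a 'NN%' setting as an integer percentage (hypothetical helper)."""
    raw = parser.get(section, key).strip()
    if not raw.endswith("%"):
        raise ValueError(f"{section}.{key} is not a percentage: {raw!r}")
    return int(raw[:-1])

# interpolation=None stops configparser from treating '%' as special syntax.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

hiwm_percent = read_percent(parser, "memory", "hiwm")
vsi_high_percent = read_percent(parser, "memory_high", "pld-vsi")
print(hiwm_percent, vsi_high_percent)  # 90 30
```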
For services that do not have an explicit value set, a default value is used. cghealth writes a system log entry whenever services are restarted. The following is an example of such a log entry from /var/log/messages:
Jun 12 15:21:21 ocsm journal: OCSM memory usage: 20,498,919,424 bytes
Jun 12 15:21:21 ocsm journal: * 7,272,095,744 pld-vsi.service
...
Jun 12 15:21:21 ocsm journal: 5,971,968 pld-enum-probe.service
Jun 12 15:21:21 ocsm journal: Restarting marked services.
Here is a calculation example:
For this example, we will assume that the entire system memory size is 36 GB.
You can view the existing limits in the file /opt/oracle/ocsm/etc/iptego/cghealth.conf:
limit = 60% (from all memory, this means 60% x 36 GB = 21.6 GB)
hiwm = 90% (from limit, this means 90% x 21.6 GB = 19.44 GB)
pld-vsi = 30% (from limit, this means 30% x 21.6 GB = 6.48 GB)
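The arithmetic above can be reproduced with a short Python sketch. The names are illustrative and not part of cghealth. The sketch assumes decimal units (1 GB = 1000 MB) and works in whole megabytes so that the percentages divide exactly.

```python
# Assumption: decimal units, 1 GB = 1000 MB, to keep the arithmetic exact.
TOTAL_MEMORY_MB = 36 * 1000  # 36 GB of system memory, as in the example

def percent_of(base_mb, percent):
    """Return percent% of base_mb using integer arithmetic."""
    return base_mb * percent // 100

limit_mb = percent_of(TOTAL_MEMORY_MB, 60)  # hard limit: 60% of all memory
hiwm_mb = percent_of(limit_mb, 90)          # high water mark: 90% of the limit
vsi_high_mb = percent_of(limit_mb, 30)      # pld-vsi threshold: 30% of the limit

print(limit_mb, hiwm_mb, vsi_high_mb)  # 21600 19440 6480
```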
This means that as soon as the memory usage of all OCSM services together exceeds 19.44 GB, cghealth checks each service against its individual limit and restarts those services that are above it.
For example, because the limit for pld-vsi is 30%, that service is restarted if its current memory usage is over 6.48 GB.
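The restart decision described here can be sketched in Python as follows. This is an illustration of the rule only, not cghealth's actual implementation. The function name, the decimal byte values for 19.44 GB and 6.48 GB, and the default threshold are assumptions. The usage figures are taken from the sample log entry, assuming it describes the same 36 GB system.

```python
def services_to_restart(total_bytes, hiwm_bytes, usage_bytes,
                        high_bytes, default_high_bytes):
    """Return the services above their individual threshold, but only
    when total usage exceeds the high water mark (hypothetical helper)."""
    if total_bytes <= hiwm_bytes:
        return []  # below the high water mark: nothing is restarted
    return sorted(
        name for name, used in usage_bytes.items()
        if used > high_bytes.get(name, default_high_bytes)
    )

GB = 1000**3  # assumption: decimal gigabytes

marked = services_to_restart(
    total_bytes=20_498_919_424,           # total reported in the sample log
    hiwm_bytes=19_440_000_000,            # 19.44 GB high water mark
    usage_bytes={"pld-vsi": 7_272_095_744, "pld-enum-probe": 5_971_968},
    high_bytes={"pld-vsi": 6_480_000_000},  # 6.48 GB threshold for pld-vsi
    default_high_bytes=2 * GB,            # hypothetical default, not from the source
)
print(marked)  # ['pld-vsi']
```

With these figures the total is above the high water mark and only pld-vsi is above its individual threshold, so it is the only service marked for restart.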


 
