System State Framework (SSF)¶
Introduction¶
This document describes an extensible design for tracking and publishing the system state for NG800 and OEM products derived from NG800.
The system state is a string variable that reflects the run-level of the overall system (off, booting, starting, up, shutdown-pending, shutting-down, powering-down). This value is published to user applications via sysfs.
At the core of the design, a state machine tracks the system state and processes multiple inputs such as the ignition signal. Before shutting down Linux because of a de-asserted ignition signal, the state machine grants user-space applications time to shut down properly. User applications can prolong the shutdown timer if they need more time to terminate. If the timer elapses, the state machine instructs the kernel to shut down.
File System Entries¶
All the entries are available under the directory /sys/kernel/broker:
ignition
status of the ignition signal
- 1 = asserted
- 0 = de-asserted
system-state
state of the system
- starting –> operating system, applications, etc are starting up
- up –> system start-up finished, i.e. fully booted, up and running
- shutdown-pending –> system was told to shut down and is giving applications time to terminate, see also shutdown-delay
- shutting-down –> shut down in progress
system-state-target
interface to “command” the SSF, i.e. the following values can be written to it:
- up –> triggers the SSF to enter the up state (transition from starting to up)
- reboot –> triggers an immediate reboot
- powerdown –> triggers an immediate power-off
shutdown-delay [seconds]
- set or read the default shutdown-delay
- this value is initialized in the device-tree
extend-shutdown-delay [seconds]
- delay the shutdown to have more time to terminate applications
remaining-shutdown-delay [seconds]
- countdown with the remaining time until the device shuts down
start-reason
information about the reason for the start-up
- power –> ignition and power are both attached to the device
- reboot –> device is rebooting (reboot command, ignition signal or RTC alarm during shut down process)
- watchdog –> device is reset by watchdog (see watchdog feature below)
- wakeup;ignition –> the ignition signal was asserted while the device was powered down (power supply still attached)
- wakeup;rtc-alarm –> the device was woken up by an RTC alarm (power supply still attached)
ping-request
- used by the watchdog feature to test correct operation of the kernel modules
- writing a test string triggers the kernel modules (response shown in ping-response)
ping-response
- used by the watchdog feature to test correct operation of the kernel modules
- holds the response triggered by writing to ping-request
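The entries above are ordinary text files, so a user application can read them with standard file I/O. The following is a minimal sketch in C; the helper name ssf_read_entry is hypothetical, and it assumes that each entry returns a single line of text, possibly terminated by a newline.

```c
#include <stdio.h>
#include <string.h>

/* Directory named in this document; pass another one when testing off-target. */
#define SSF_SYSFS_DIR "/sys/kernel/broker"

/*
 * Read one SSF entry (for example "system-state") from `dir` into `buf`.
 * A trailing newline, if present, is stripped.
 * Returns 0 on success and -1 if the entry cannot be opened or read.
 * Hypothetical helper, not part of the SSF itself.
 */
int ssf_read_entry(const char *dir, const char *entry, char *buf, size_t len)
{
    char path[256];
    FILE *f;
    int n;

    if (len == 0)
        return -1;

    n = snprintf(path, sizeof(path), "%s/%s", dir, entry);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL)
    {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

On the target, a call such as ssf_read_entry(SSF_SYSFS_DIR, "system-state", buf, sizeof(buf)) would then fill buf with one of the state strings listed above.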
SSF Components¶
The SSF consists of two kernel modules and a user space application:
SSF broker (kernel module)
- exposes all important SSF topics as sysfs files
- distributes SSF notifications to registered components
SSF sysstate (kernel module)
- exposes the current core system state in the sysfs file
SSF manager (user space application)
writes the state to the sysfs system-state file, such as
- up as soon as the system is completely booted
- powerdown as soon as a powerdown is in progress
- reboot as soon as a reboot is in progress
handles the watchdog including the ping check (ping-request and ping-response)
Note
The SSF manager is provided with our OEM Linux Release. If custom handling of the SSF is needed, the manager can be configured through its command line options, see section SSF Manager below.
Device Tree Entries¶
At the moment there are only two relevant options to set in the device-tree. The rest of the device tree entries should be left as is or the device may not function properly.
default-shutdown-delay-s
- the shutdown-delay that applies when no extension of the shutdown-delay is requested.
- sets the value of shutdown-delay on startup.
max-shutdown-delay-s
- sets the maximum time of the shutdown-delay. This is used to make sure the shutdown delay can’t be extended forever.
Pending Shutdown¶
When the ignition signal is de-asserted, system-state shows shutdown-pending for the time given in the file remaining-shutdown-delay. If the ignition signal is re-asserted during this time, system-state changes back to up.
Prolonging a pending shutdown is described in the next section.
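The transitions described so far can be summarised in a small lookup function. This is a simplified sketch, not the kernel module's implementation: the event names are hypothetical, only the transitions mentioned in this document are modelled, and it assumes that the state becomes shutting-down once the delay has elapsed.

```c
#include <string.h>

/* Events considered by this simplified sketch (names are hypothetical). */
enum ssf_event {
    SSF_EV_TARGET_UP,     /* "up" written to system-state-target */
    SSF_EV_IGNITION_OFF,  /* ignition signal de-asserted */
    SSF_EV_IGNITION_ON,   /* ignition signal re-asserted */
    SSF_EV_DELAY_ELAPSED  /* remaining-shutdown-delay reached zero */
};

/*
 * Return the system-state that follows `state` when `event` occurs.
 * Any combination not described in the text leaves the state unchanged.
 */
const char *ssf_next_state(const char *state, enum ssf_event event)
{
    if (strcmp(state, "starting") == 0 && event == SSF_EV_TARGET_UP)
        return "up";

    if (strcmp(state, "up") == 0 && event == SSF_EV_IGNITION_OFF)
        return "shutdown-pending";

    if (strcmp(state, "shutdown-pending") == 0)
    {
        if (event == SSF_EV_IGNITION_ON)
            return "up";
        if (event == SSF_EV_DELAY_ELAPSED)
            return "shutting-down"; /* assumed follow-up state */
    }

    return state;
}
```

A table like this is mainly useful in application tests, where the expected reaction to an ignition change can be checked without the real hardware.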
Extending a Shutdown¶
As mentioned above, the shutdown can be delayed to gain time to terminate applications properly. The following example shows how to use this:
Example: Assume the default shutdown delay is 60s and, 30s into the countdown, we notice that the shutdown needs to be delayed by 75s. Run the following command:
echo "75" > /sys/kernel/broker/extend-shutdown-delay
With this command the shutdown countdown starts again from 75s.
Note
The maximum total delay is configured in the device-tree or is 300s by default.
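An application that needs more time can issue the same request from C instead of the shell. The sketch below is an assumption-based equivalent of the echo command above; the function name ssf_extend_shutdown is hypothetical, and the directory is passed in so that the code can be tried against an ordinary directory.

```c
#include <stdio.h>

/*
 * Write `seconds` as a decimal string to <dir>/extend-shutdown-delay.
 * On the target, `dir` would be "/sys/kernel/broker".
 * Returns 0 on success and -1 on failure.
 * Hypothetical helper, equivalent to: echo "75" > .../extend-shutdown-delay
 */
int ssf_extend_shutdown(const char *dir, unsigned int seconds)
{
    char path[256];
    FILE *f;
    int n;
    int ok;

    n = snprintf(path, sizeof(path), "%s/extend-shutdown-delay", dir);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    ok = (fprintf(f, "%u\n", seconds) > 0);

    /* fclose() flushes the buffer, so a rejected write is reported here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

How the SSF reacts to a value above the configured maximum is not specified here, so the caller should read remaining-shutdown-delay afterwards rather than assume the full extension was granted.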
RTC wake-up¶
The SSF provides a start reason to differentiate between an RTC wake-up and the ignition signal. To set up an RTC wake-up, use the Linux command rtcwake.
Example: To power the device off and wake it up again 90s from now, call:
rtcwake -s 90 -m off
After this wake-up, the value read from start-reason is wakeup;rtc-alarm.
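An application that behaves differently depending on the start reason can map the string to an enumeration first. The following sketch covers the values listed for start-reason in this document; the enum and function names are hypothetical.

```c
#include <string.h>

/* Start reasons listed in this document (enum names are hypothetical). */
enum ssf_start_reason {
    SSF_START_POWER,
    SSF_START_REBOOT,
    SSF_START_WATCHDOG,
    SSF_START_WAKEUP_IGNITION,
    SSF_START_WAKEUP_RTC,
    SSF_START_UNKNOWN
};

/*
 * Map the text read from start-reason to an enum value.
 * A trailing newline is ignored; unknown strings yield SSF_START_UNKNOWN.
 */
enum ssf_start_reason ssf_parse_start_reason(const char *text)
{
    static const struct {
        const char *name;
        enum ssf_start_reason reason;
    } table[] = {
        { "power",            SSF_START_POWER },
        { "reboot",           SSF_START_REBOOT },
        { "watchdog",         SSF_START_WATCHDOG },
        { "wakeup;ignition",  SSF_START_WAKEUP_IGNITION },
        { "wakeup;rtc-alarm", SSF_START_WAKEUP_RTC },
    };
    size_t len = strcspn(text, "\n");
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
    {
        if (strlen(table[i].name) == len &&
            strncmp(text, table[i].name, len) == 0)
            return table[i].reason;
    }
    return SSF_START_UNKNOWN;
}
```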
Device is Shutting down¶
The system reboots if one of the following events occurs while it is shutting down:
- re-assertion of the ignition signal
- wake-up event of an RTC alarm
- reboot commanded
Powering the Device Off¶
The system powers off on the following events:
- poweroff commanded
- RTC alarm set up with mode to power off
- de-assertion of the ignition signal
Ping Request/Response¶
The kernel modules of the SSF can be tested by writing a string to ping-request and reading the response from ping-response. Any request is taken by the SSF broker and forwarded to the SSF sysstate, which finally writes the response to sysfs.
Watchdog Feature¶
The provided SSF manager includes a watchdog feature that is linked to the ping request/response mechanism, which checks that the kernel modules are working as expected. At every watchdog feed interval, the manager compares the ping response with the corresponding request before feeding the watchdog. If the ping response and the request do not match, the watchdog is not fed and will starve. This leads to a watchdog reset of the device. In this case the start-reason will be shown as watchdog.
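The check performed before each feed can be sketched as follows. This is not the SSF manager's source code: the function name ssf_ping_ok is hypothetical, and the sketch assumes that a healthy system returns the request string verbatim in ping-response.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write `token` to <dir>/ping-request, read <dir>/ping-response and
 * compare both strings. On the target, `dir` would be "/sys/kernel/broker".
 * Returns 1 if the response matches the request, 0 otherwise
 * (including any I/O error).
 * Assumption: the response echoes the request string verbatim.
 */
int ssf_ping_ok(const char *dir, const char *token)
{
    char path[256];
    char response[128];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/ping-request", dir);
    if (n < 0 || n >= (int)sizeof(path))
        return 0;
    f = fopen(path, "w");
    if (f == NULL)
        return 0;
    if (fputs(token, f) == EOF)
    {
        fclose(f);
        return 0;
    }
    if (fclose(f) != 0)
        return 0;

    n = snprintf(path, sizeof(path), "%s/ping-response", dir);
    if (n < 0 || n >= (int)sizeof(path))
        return 0;
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fgets(response, (int)sizeof(response), f) == NULL)
    {
        fclose(f);
        return 0;
    }
    fclose(f);

    response[strcspn(response, "\n")] = '\0';
    return strcmp(response, token) == 0;
}
```

A feed loop built on this would kick the watchdog only when the function returns 1, and would use a fresh token for every interval so that a stale response cannot pass the check.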
SSF Manager¶
The SSF manager currently provides two features:
- marking the system state of the SSF
- handling the system watchdog by using the SSF ping mechanism
See the following help for further details:
root@am335x:~# ssf-mgr -h
Usage: ssf-mgr [args]
-h | --help Show this help
-d | --daemonize Run as daemon
-p | --pidfile=path The PID file, see -d
-m | --mark-sys-state Mark the system state for the SSF
-w | --with-watchdog Enable watchdog and supervise SSF modules
-t | --wd-timeout=TIMEOUT_MS Configure watchdog timeout to TIMEOUT_MS
default=8000ms
-i | --wd-feed-interval=INTERVAL_MS Set watchdog feed interval to INTERVAL_MS
default=4000ms
Used loggers: - evtloop
- initSys
- systemState
- watchdogMgr
- brokerPinger
SysLogger OPTIONS:
--loglevel=n Set the max application log level (used for all logger
instances as default) to n (0=emcy, 7=dbg). If comma separated
list of separate logger instances is provided after this
number, the log level for each such instance will be overruled
accordingly (e.g. --loglevel=7,evtloop.5,fileOp.6).
--disable-syslog disable the log output to syslog
--enable-stdout enable output on stdout (e.g. for debugging purposes)
SysLogger Examples:
prog-name --disable-syslog
prog-name --disable-syslog --enable-stdout
prog-name --loglevel=6,config.5,serial.7
In our OEM Linux Release the ssf-mgr.service starts with the default configuration, in which marking of the system state is activated and the watchdog feature is enabled:
root@am335x:~# cat /etc/default/ssf-mgr.conf
# Default settings for system-state-framework manager
# for details run ssf-mgr --help
MARK_SYS_STATE="-m"
WATCHDOG_CONFIG="-w"
LOGGER_CONFIG="--loglevel=6,evtloop.5,systemState.6,initSys.6,brokerPinger.6,watchdogMgr.6"
root@am335x:~# cat /usr/lib/systemd/system/ssf-mgr.service
[Unit]
Description=SystemStateFramework Manager daemon
[Service]
Type=forking
EnvironmentFile=-/etc/default/ssf-mgr.conf
ExecStart=/usr/bin/ssf-mgr $MARK_SYS_STATE $WATCHDOG_CONFIG -d -p /run/ssf-mgr.pid $LOGGER_CONFIG
[Install]
WantedBy=multi-user.target
Starting Options¶
Disable System State Marking and Watchdog Feature¶
To start the SSF manager without the watchdog and without marking the system state, remove the options -w and -m:
root@am335x:~# cat /etc/default/ssf-mgr.conf
# Default settings for system-state-framework manager
# for details run ssf-mgr --help
MARK_SYS_STATE=""
WATCHDOG_CONFIG=""
LOGGER_CONFIG="--loglevel=6,evtloop.5,systemState.6,initSys.6,brokerPinger.6,watchdogMgr.6"
Disable System State Marking¶
To start the SSF manager so that it handles only the watchdog, remove the -m option:
root@am335x:~# cat /etc/default/ssf-mgr.conf
# Default settings for system-state-framework manager
# for details run ssf-mgr --help
MARK_SYS_STATE=""
WATCHDOG_CONFIG="-w"
LOGGER_CONFIG="--loglevel=6,evtloop.5,systemState.6,initSys.6,brokerPinger.6,watchdogMgr.6"
Timeout Settings¶
The watchdog feed interval and the watchdog timeout are related: the watchdog timeout must be higher than the feed interval. Both times can be changed with the following command line options:
-t watchdog timeout in [ms]
- default = 8000ms
-i watchdog feed interval in [ms]
- default = 4000ms
Example setting the timeout to 30s and the interval to 15s:
root@am335x:~# cat /etc/default/ssf-mgr.conf
# Default settings for system-state-framework manager
# for details run ssf-mgr --help
MARK_SYS_STATE="-m"
WATCHDOG_CONFIG="-w -t 30000 -i 15000"
LOGGER_CONFIG="--loglevel=6,evtloop.5,systemState.6,initSys.6,brokerPinger.6,watchdogMgr.6"
Note
The timeout may vary due to the PMIC setting which is a multiple of a specific base time, see the datasheet of the PMIC for more details.
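The relation between the two values can be checked before a configuration is deployed. The helpers below are a hypothetical sketch and not part of ssf-mgr. The second one converts milliseconds to whole seconds because the Linux watchdog ioctl WDIOC_SETTIMEOUT takes seconds; rounding up is an assumption made here, and the PMIC may still adjust the value as described in the note above.

```c
/*
 * Return 1 if the feed interval is non-zero and strictly smaller than
 * the watchdog timeout, 0 otherwise. Both values are in milliseconds,
 * as for the -t and -i options.
 */
int wd_timing_valid(unsigned int timeout_ms, unsigned int interval_ms)
{
    return interval_ms > 0 && timeout_ms > interval_ms;
}

/*
 * Convert a timeout in milliseconds to whole seconds, rounding up so
 * that the resulting timeout is never shorter than requested.
 */
unsigned int wd_timeout_seconds(unsigned int timeout_ms)
{
    return timeout_ms / 1000u + (timeout_ms % 1000u != 0 ? 1u : 0u);
}
```

With the defaults (8000 ms timeout, 4000 ms interval) and with the 30 s / 15 s example above, the check passes; equal values are rejected.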
Source for the System State Marking¶
The init system is systemd, and the states of a finished start-up, rebooting or powering off can be collected from D-Bus messages. The D-Bus registration parameters are listed below.
State of finished start-up:
D-Bus registration parameters:
- sender/service = "org.freedesktop.systemd1";
- object path = "/org/freedesktop/systemd1";
- interface = "org.freedesktop.systemd1.Manager";
- signal = "StartupFinished"
- item = "up" // this is not necessary as StartupFinished does not have any other items, it is just for the internal list
The same mechanism is used for poweroff and reboot; only the signal and the items differ:
- signal = "UnitNew" // same signal for poweroff and reboot
- itemPoweroff = "poweroff.target"
- itemReboot = "reboot.target"
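Once a signal has been received, the mapping from the registration parameters to the value marked by the SSF manager can be sketched as a plain lookup. The function name is hypothetical, and the D-Bus subscription itself is deliberately left out because it depends on the D-Bus library in use.

```c
#include <stddef.h>
#include <string.h>

/*
 * Map a systemd D-Bus signal (and, for UnitNew, the unit name) to the
 * state that the SSF manager marks: "up", "powerdown" or "reboot".
 * Returns NULL for signals or units that are not of interest.
 * `item` may be NULL for StartupFinished.
 */
const char *ssf_target_for_signal(const char *signal, const char *item)
{
    if (strcmp(signal, "StartupFinished") == 0)
        return "up";

    if (strcmp(signal, "UnitNew") == 0 && item != NULL)
    {
        if (strcmp(item, "poweroff.target") == 0)
            return "powerdown";
        if (strcmp(item, "reboot.target") == 0)
            return "reboot";
    }

    return NULL;
}
```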
System Watchdog Usage¶
The system watchdog works according to the following principle:
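#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Cleared by the signal handler to end the feed loop */
static volatile sig_atomic_t isLoopRunning = 1;

static void stopLoop(int sig)
{
    (void)sig;
    isLoopRunning = 0;
}

int main(void)
{
    const char *dev = "/dev/watchdog"; /* assumed watchdog device file */
    int interval = 8;                  /* watchdog timeout in seconds (example value) */
    int fd;

    signal(SIGINT, stopLoop);
    signal(SIGTERM, stopLoop);

    /* The watchdog will be activated when opening the watchdog device file */
    fd = open(dev, O_RDWR);
    if (-1 == fd)
    {
        fprintf(stderr, "Error: %s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }

    /* Setting the watchdog interval */
    fprintf(stdout, "Set watchdog interval to %d\n", interval);
    if (ioctl(fd, WDIOC_SETTIMEOUT, &interval) != 0)
    {
        fprintf(stderr, "Error: Set watchdog interval failed\n");
        exit(EXIT_FAILURE);
    }

    /* Getting the current watchdog interval - which might be advisable when the
     * watchdog timeout is based on a factor of a base time, as is the case
     * for the PMIC watchdog.
     */
    if (ioctl(fd, WDIOC_GETTIMEOUT, &interval) == 0)
    {
        fprintf(stdout, "Current watchdog interval is %d\n", interval);
    }
    else
    {
        fprintf(stderr, "Error: Cannot read watchdog interval\n");
        exit(EXIT_FAILURE);
    }

    /* Interval loop feeding the watchdog
     * There are two ways to kick the watchdog (one of them is sufficient):
     * - by writing any dummy value into watchdog device file, or
     * - by using IOCTL WDIOC_KEEPALIVE
     */
    do
    {
        /* the device file way: */
        if (write(fd, "w", 1) == 1)
            fprintf(stdout, "Feed watchdog through writing over device file\n");

        /* OR the ioctl way: */
        if (ioctl(fd, WDIOC_KEEPALIVE, NULL) == 0)
            fprintf(stdout, "Kick watchdog through IOCTL\n");

        /* Feed again after half the timeout (added here to avoid a busy loop) */
        sleep((unsigned int)interval / 2);
    } while (isLoopRunning);

    /* The 'V' value needs to be written into watchdog device file to
     * indicate that we intend to close/stop the watchdog
     */
    if (write(fd, "V", 1) != 1)
        fprintf(stderr, "Error: Cannot announce watchdog close\n");

    /* Closing the watchdog device deactivates the watchdog */
    close(fd);
    return EXIT_SUCCESS;
}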