Manjusaka

A Brief Discussion on Signal Handling in Processes V2

Last time I wrote a filler post, A Simple Discussion on Signal Handling in Processes. My mentor scolded me after reading it, saying that its examples were too old-fashioned, too simple, too naive, and that if readers are led astray later, I would bear part of the responsibility. That scared me so much that I skipped writing a post for my anniversary with my girlfriend and hurried to write this follow-up, which discusses better and more convenient ways to handle signals.

Background#

First, let's take a look at the example from the previous article.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

void handler(int sig) {
  int olderrno = errno;
  sigset_t mask_all, prev_all;
  pid_t pid;
  sigfillset(&mask_all);
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    deletejob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

int main(int argc, char **argv) {
  int pid;
  sigset_t mask_all, prev_all;
  sigfillset(&mask_all);
  signal(SIGCHLD, handler);
  while (1) {
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
      _exit(127); /* only reached if execve fails; keeps the child out of the fork loop */
    }
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    addjob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
}

Next, let's review a few key calls.

  1. signal: Registers a handler for a specific signal in the current process. When that signal is delivered, the system calls the handler to run the corresponding logic.
  2. sigfillset: One of the functions used to manipulate signal sets. Here, it adds every supported signal to a signal set.
  3. fork: A familiar API that creates a new process and returns a pid. In the parent process, the returned value is the child process's pid. In the child process, it is 0.
  4. execve: Replaces the current process image with the specified executable file.
  5. sigprocmask: Examines or changes the process's signal mask. When the first parameter is SIG_BLOCK, the signals in the set passed as the second parameter are added to the current mask. When the first parameter is SIG_SETMASK, the current mask is replaced by the set passed as the second parameter. In both cases, if the third parameter is not NULL, the previous mask is saved there.
  6. waitpid: As an imprecise summary, it reaps terminated child processes and releases their resources.

Now that we've reviewed the key points, let's move on to the main part of this article.

More Elegant Signal Handling Methods#

More Elegant Handler#

First, let's take another look at the signal handling code above.

void handler(int sig) {
  int olderrno = errno;
  sigset_t mask_all, prev_all;
  pid_t pid;
  sigfillset(&mask_all);
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    deletejob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

Here, to keep deletejob from being interrupted by other signals, the handler blocks all signals with sigprocmask and SIG_BLOCK while it runs. Logically, this works, but there is a problem: once we have many different handlers, we end up repeating the same block-and-restore code in every one of them. So, is there a more elegant way to keep our handlers safe?

Yes (very loudly), let me introduce a new syscall: sigaction.

Without further ado, let's look at the code.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

void handler(int sig) {
  int olderrno = errno;
  pid_t pid;
  /* no sigprocmask calls here: sa_mask blocks the signals for us */
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    deletejob(pid);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

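int main(int argc, char **argv) {
  int pid;
  sigset_t mask_all, prev_all;
  sigfillset(&mask_all);
  struct sigaction new_action;
  memset(&new_action, 0, sizeof(new_action)); /* clear sa_flags and the other fields */
  new_action.sa_handler = handler;
  new_action.sa_mask = mask_all; /* block every signal while the handler runs */
  /* install with sigaction; calling signal() here would ignore sa_mask */
  sigaction(SIGCHLD, &new_action, NULL);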
  while (1) {
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
      _exit(127); /* only reached if execve fails; keeps the child out of the fork loop */
    }
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    addjob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
}

Great! Very energetic! You may have noticed that, compared with the previous version, this code adds some sigaction-related settings. What do they mean?

Yep, with sigaction we can set sa_mask to specify which signals are blocked while the signal handling function runs. The signal that triggered the handler is also blocked during that time, unless the SA_NODEFER flag is set.

See, our code is indeed more elegant than before. Of course, sigaction has many other useful settings, and you can check them out.

Faster Signal Handling Methods#

In our previous example, we have solved the problem of elegantly setting the signal handling function. Now we face a brand new problem.

As mentioned earlier, while our signal handling function runs, we block other signals. This causes a problem when the logic in the handler takes a long time, does not actually need to run atomically inside the handler, and signals arrive at a high frequency. Blocked signals then pile up as pending: real-time signals queue up, while repeated instances of the same standard signal are merged into a single pending signal, so some of them may be lost. Either way, the consequences are hard to predict.

So, is there a better way to handle this?

Suppose we open a file, and in the signal handling function we do only one thing: write a specific value to this file. Then we poll the file, and when it changes, we read the value back, determine which signal it stands for, and run the actual handling logic outside the handler. This way, we still get notified of signals while keeping the time spent with signals blocked to a minimum.

Of course, the community knows that writing all this by hand is a chore, so Linux provides a dedicated syscall: signalfd.

As usual, let's look at an example.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h> /* abort, calloc, exit */
#include <string.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h> /* read, close, fork, execve */

#define MAXEVENTS 64
void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

int main(int argc, char **argv) {
  int pid;
  struct epoll_event event;
  struct epoll_event *events;
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGCHLD);
  if (sigprocmask(SIG_SETMASK, &mask, NULL) < 0) {
    perror("sigprocmask");
    return 1;
  }
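  int sfd = signalfd(-1, &mask, 0);
  if (sfd == -1) {
    perror("signalfd");
    return 1;
  }
  int epoll_fd = epoll_create(MAXEVENTS);
  if (epoll_fd == -1) {
    perror("epoll_create");
    return 1;
  }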
  event.events = EPOLLIN | EPOLLEXCLUSIVE | EPOLLET;
  event.data.fd = sfd;
  int s = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, sfd, &event);
  if (s == -1) {
    abort();
  }
  events = calloc(MAXEVENTS, sizeof(event));
  while (1) {
    int n = epoll_wait(epoll_fd, events, MAXEVENTS, 1);
    if (n == -1) {
      if (errno == EINTR) {
        fprintf(stderr, "epoll EINTR error\n");
      } else if (errno == EINVAL) {
        fprintf(stderr, "epoll EINVAL error\n");
      } else if (errno == EFAULT) {
        fprintf(stderr, "epoll EFAULT error\n");
        exit(-1);
      } else if (errno == EBADF) {
        fprintf(stderr, "epoll EBADF error\n");
        exit(-1);
      }
    }
    printf("%d\n", n);
    for (int i = 0; i < n; i++) {
      if ((events[i].events & EPOLLERR) || (events[i].events & EPOLLHUP) ||
          (!(events[i].events & EPOLLIN))) {
        printf("%d\n", i);
        fprintf(stderr, "epoll err\n");
        close(events[i].data.fd);
        continue;
      } else if (sfd == events[i].data.fd) {
        struct signalfd_siginfo si;
        ssize_t res = read(sfd, &si, sizeof(si));
        if (res < 0) {
          fprintf(stderr, "read error\n");
          continue;
        }
        if (res != sizeof(si)) {
          fprintf(stderr, "Something wrong\n");
          continue;
        }
        if (si.ssi_signo == SIGCHLD) {
          printf("Got SIGCHLD\n");
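          /* one SIGCHLD may stand for several exited children, so reap them
             all without blocking */
          pid_t child_pid;
          while ((child_pid = waitpid(-1, NULL, WNOHANG)) > 0) {
            deletejob(child_pid);
          }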
        }
      }
    }
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
      _exit(127); /* only reached if execve fails; keeps the child out of the fork loop */
    }
    addjob(pid);
  }
}

Now, let's introduce some key points in this code.

  1. signalfd returns a special file descriptor that is readable and can be monitored with select. When one of the specified signals arrives, we can read its details from the returned fd.
  2. signalfd has a lower priority than the signal handling function. In other words, if we register a signal handling function for SIGCHLD and also create a signalfd for it, the signal handling function is called first when the signal arrives. Therefore, when using signalfd, we need to block the signal with sigprocmask first so that it is not delivered the usual way.
  3. As mentioned above, this file descriptor can be monitored, meaning we can use select, poll, epoll, etc., to watch the fd. In the code above, we use epoll to monitor the signalfd.

Of course, it's also worth noting that many languages may not provide an official signalfd API (like Python), but they may offer equivalent alternatives, a typical example being Python's signal.set_wakeup_fd.

Here's a thought question for you: besides using signalfd, what other methods can achieve efficient and safe signal handling?

Conclusion#

I believe that signal handling is a fundamental skill for developers. We need to handle the various signals a program encounters safely and reliably, and the system provides many well-designed APIs to ease that burden. However, we must understand that signals are essentially a means of communication, and their inherent drawback is that they carry very little information. When we have a lot of high-frequency information to transmit, signals are often not the best choice. Of course, this is not a hard rule; the trade-off has to be made case by case.

That's about it for now; I've finished my second filler post this week (runs away).
