Initialize TaskTracer failed due to SIGURG signal handler already registered #3118

Description

@ZhengweiZhu

Describe the bug
When BRPC_BTHREAD_TRACER is enabled and the app is linked with a cgo library (for example, the etcd client library, which is written in Go), the app fails to start because TaskTracer finds that a SIGURG signal handler is already registered (LINE 346).

    bool TaskTracer::RegisterSignalHandler() {
        // Set up the signal handler.
        struct sigaction old_sa{};
        struct sigaction sa{};
        sa.sa_sigaction = SignalHandler;
        sa.sa_flags = SA_SIGINFO;
        sigfillset(&sa.sa_mask);
        if (sigaction(SIGURG, &sa, &old_sa) != 0) {
            PLOG(ERROR) << "Failed to sigaction";
            return false;
        }
        if (NULL != old_sa.sa_handler || NULL != old_sa.sa_sigaction) {
            LOG(ERROR) << "Signal handler of SIGURG is already registered";
            return false;
        }
        return true;
    }
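
One detail worth noting in the function above: it calls `sigaction` with the new handler before it inspects `old_sa`, so on the failure path the previous SIGURG handler has already been replaced. A minimal sketch of a probe-before-install variant follows. `HasCustomHandler`, `InstallIfFree`, and `SigInfoHandler` are hypothetical names for illustration and are not part of brpc; this only avoids clobbering an existing handler and does not by itself let TaskTracer and the Go runtime share SIGURG.

```cpp
#include <signal.h>

// Hypothetical helper (not part of brpc): reports whether `signo` already has
// a handler other than SIG_DFL / SIG_IGN, without changing the disposition.
bool HasCustomHandler(int signo) {
    struct sigaction cur{};
    // Passing a null `act` makes sigaction() a pure query.
    if (sigaction(signo, nullptr, &cur) != 0) {
        return true;  // Be conservative: treat an unreadable slot as taken.
    }
    if (cur.sa_flags & SA_SIGINFO) {
        return true;  // A three-argument (sa_sigaction) handler is installed.
    }
    return cur.sa_handler != SIG_DFL && cur.sa_handler != SIG_IGN;
}

using SigInfoHandler = void (*)(int, siginfo_t*, void*);

// Hypothetical variant of the registration step: install `handler` only when
// the slot is free, so an existing handler (e.g. the Go runtime's) is kept.
bool InstallIfFree(int signo, SigInfoHandler handler) {
    if (HasCustomHandler(signo)) {
        return false;
    }
    struct sigaction sa{};
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigfillset(&sa.sa_mask);
    return sigaction(signo, &sa, nullptr) == 0;
}
```

The query and the install are two separate calls, so another thread could still register a handler in between; the sketch assumes registration happens once during startup.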

The root cause is that the Go runtime uses SIGURG internally as sigPreempt, its signal for non-cooperative preemption.

file: https://go.dev/src/runtime/signal_unix.go

    // sigPreempt is the signal used for non-cooperative preemption.
    //
    // There's no good way to choose this signal, but there are some
    // heuristics:
    //
    // 1. It should be a signal that's passed-through by debuggers by
    // default. On Linux, this is SIGALRM, SIGURG, SIGCHLD, SIGIO,
    // SIGVTALRM, SIGPROF, and SIGWINCH, plus some glibc-internal signals.
    //
    // 2. It shouldn't be used internally by libc in mixed Go/C binaries
    // because libc may assume it's the only thing that can handle these
    // signals. For example SIGCANCEL or SIGSETXID.
    //
    // 3. It should be a signal that can happen spuriously without
    // consequences. For example, SIGALRM is a bad choice because the
    // signal handler can't tell if it was caused by the real process
    // alarm or not (arguably this means the signal is broken, but I
    // digress). SIGUSR1 and SIGUSR2 are also bad because those are often
    // used in meaningful ways by applications.
    //
    // 4. We need to deal with platforms without real-time signals (like
    // macOS), so those are out.
    //
    // We use SIGURG because it meets all of these criteria, is extremely
    // unlikely to be used by an application for its "real" meaning (both
    // because out-of-band data is basically unused and because SIGURG
    // doesn't report which socket has the condition, making it pretty
    // useless), and even if it is, the application has to be ready for
    // spurious SIGURG. SIGIO wouldn't be a bad choice either, but is more
    // likely to be used for real.

Any ideas for a fix? Since we can't change Go's implementation, maybe TaskTracer could use a different signal number for stack tracing. @chenBright
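
To make the "another signal number" idea concrete, here is a rough sketch, not a proposed patch. `IsSignalFree`, `PickTraceSignal`, and `RegisterTraceHandler` are hypothetical names, and the candidate list is left to the caller. The sketch also sets `SA_ONSTACK`, since Go's `os/signal` documentation asks non-Go code to install its handlers with that flag.

```cpp
#include <signal.h>
#include <initializer_list>

// Hypothetical helper: true when `signo` still has the default or ignore
// disposition, i.e. nobody (such as the Go runtime) has claimed it.
static bool IsSignalFree(int signo) {
    struct sigaction cur{};
    if (sigaction(signo, nullptr, &cur) != 0) {
        return false;  // Invalid or unreadable signal: do not use it.
    }
    if (cur.sa_flags & SA_SIGINFO) {
        return false;  // An sa_sigaction-style handler is installed.
    }
    return cur.sa_handler == SIG_DFL || cur.sa_handler == SIG_IGN;
}

// Hypothetical helper: returns the first free signal among `candidates`,
// in order of preference, or -1 if every candidate is already taken.
int PickTraceSignal(std::initializer_list<int> candidates) {
    for (int signo : candidates) {
        if (IsSignalFree(signo)) {
            return signo;
        }
    }
    return -1;
}

// Installs `handler` for `signo`. SA_ONSTACK is added for coexistence with
// the Go runtime in cgo-linked processes.
bool RegisterTraceHandler(int signo, void (*handler)(int, siginfo_t*, void*)) {
    struct sigaction sa{};
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigfillset(&sa.sa_mask);
    return sigaction(signo, &sa, nullptr) == 0;
}
```

A caller could try SIGURG first and fall back to another signal only when it is taken, or the signal number could be exposed as a configuration option. Which fallback is acceptable is an open question, given the trade-offs listed in the Go comment above.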

To Reproduce

Expected behavior

Versions
OS:
Compiler:
brpc:
protobuf:

Additional context/screenshots
