fix: isolate event handler errors per spec 5.2.5#600
Conversation
Wrap handler invocations in try/except within run_client_handlers, run_global_handlers, and _run_immediate_handler so that a failing handler does not prevent subsequent handlers from executing. Log handler name, event type, and full traceback via logger.warning with exc_info=True for debugging visibility. Closes open-feature#596
There was a problem hiding this comment.
Code Review
This pull request introduces exception handling and logging when executing client, global, and immediate event handlers, preventing unhandled exceptions in user-defined callbacks from disrupting the execution flow. The feedback suggests a critical improvement: avoiding the execution of user-defined callbacks while holding locks (_client_lock and _global_lock) to prevent potential deadlocks and performance bottlenecks. Instead, the handler lists should be copied and executed outside the lock context. Additionally, the reviewer recommends using __qualname__ over __name__ for more descriptive logging of class methods and nested functions.
| for handler in _client_handlers[client][event]: | ||
| handler(details) | ||
| try: | ||
| handler(details) | ||
| except Exception: | ||
| logger.warning( | ||
| "Error in client handler %s for event %s", | ||
| getattr(handler, "__name__", repr(handler)), | ||
| event, | ||
| exc_info=True, | ||
| ) |
There was a problem hiding this comment.
Executing user-defined callbacks/handlers while holding a lock (_client_lock) is a known anti-pattern. If a handler blocks, performs a slow operation, or attempts to acquire another lock (or re-acquire this lock from another thread), it can lead to deadlocks or severely degrade performance by blocking other threads.
To prevent this, we should copy the list of handlers while holding the lock, and then iterate and execute them outside of the lock context.
Additionally, using __qualname__ instead of __name__ provides a more descriptive name for class methods or nested functions during logging.
| for handler in _client_handlers[client][event]: | |
| handler(details) | |
| try: | |
| handler(details) | |
| except Exception: | |
| logger.warning( | |
| "Error in client handler %s for event %s", | |
| getattr(handler, "__name__", repr(handler)), | |
| event, | |
| exc_info=True, | |
| ) | |
| handlers = list(_client_handlers[client][event]) | |
| for handler in handlers: | |
| try: | |
| handler(details) | |
| except Exception: | |
| logger.warning( | |
| "Error in client handler %s for event %s", | |
| getattr(handler, "__qualname__", getattr(handler, "__name__", repr(handler))), | |
| event, | |
| exc_info=True, | |
| ) |
| for handler in _global_handlers[event]: | ||
| handler(details) | ||
| try: | ||
| handler(details) | ||
| except Exception: | ||
| logger.warning( | ||
| "Error in global handler %s for event %s", | ||
| getattr(handler, "__name__", repr(handler)), | ||
| event, | ||
| exc_info=True, | ||
| ) |
There was a problem hiding this comment.
Similarly to run_client_handlers, executing user-defined global handlers while holding _global_lock can lead to deadlocks or performance issues if a handler blocks or performs slow operations.
We should copy the list of global handlers while holding the lock, and then execute them outside of the lock context. We can also use __qualname__ for better logging details.
| for handler in _global_handlers[event]: | |
| handler(details) | |
| try: | |
| handler(details) | |
| except Exception: | |
| logger.warning( | |
| "Error in global handler %s for event %s", | |
| getattr(handler, "__name__", repr(handler)), | |
| event, | |
| exc_info=True, | |
| ) | |
| handlers = list(_global_handlers[event]) | |
| for handler in handlers: | |
| try: | |
| handler(details) | |
| except Exception: | |
| logger.warning( | |
| "Error in global handler %s for event %s", | |
| getattr(handler, "__qualname__", getattr(handler, "__name__", repr(handler))), | |
| event, | |
| exc_info=True, | |
| ) |
| logger.warning( | ||
| "Error in immediate handler %s for event %s", | ||
| getattr(handler, "__name__", repr(handler)), | ||
| event, | ||
| exc_info=True, | ||
| ) |
There was a problem hiding this comment.
For consistency and more descriptive logging (especially for class methods or nested functions), use __qualname__ if available.
| logger.warning( | |
| "Error in immediate handler %s for event %s", | |
| getattr(handler, "__name__", repr(handler)), | |
| event, | |
| exc_info=True, | |
| ) | |
| logger.warning( | |
| "Error in immediate handler %s for event %s", | |
| getattr(handler, "__qualname__", getattr(handler, "__name__", repr(handler))), | |
| event, | |
| exc_info=True, | |
| ) |
What
Wrap handler invocations in
try/exceptto isolate errors per OpenFeature spec 5.2.5.If a handler function terminates abnormally, other handler functions MUST run.
Changes
run_client_handlers()— wraphandler(details)in try/exceptrun_global_handlers()— same_run_immediate_handler()— samelogger.warning()with handler name, event type, and full traceback (exc_info=True)Testing
Closes #596