Skip to content

[runtime] Support per-event-type configurable log levels for event log#609

Open
weiqingy wants to merge 8 commits intoapache:mainfrom
weiqingy:541-impl
Open

[runtime] Support per-event-type configurable log levels for event log#609
weiqingy wants to merge 8 commits intoapache:mainfrom
weiqingy:541-impl

Conversation

@weiqingy
Copy link
Copy Markdown
Collaborator

@weiqingy weiqingy commented Apr 7, 2026

Linked issue: #541 , #552

Purpose of change

Add per-event-type configurable log levels (OFF, STANDARD, VERBOSE) to the event log system, as designed in #552.

Key changes:

  • EventLogLevel enum (OFF, STANDARD, VERBOSE) with hierarchical config key inheritance (event-log.type.<EVENT_TYPE>.level)
  • At STANDARD level, large content fields are truncated using configurable per-field thresholds (max-string-length, max-array-elements, max-depth) with JSON-parseable wrapper objects (truncatedString, truncatedList, truncatedObject)
  • Top-level logLevel and eventType fields added to JSON log record schema for downstream filtering
  • EventFilter removed — fully subsumed by the log level system
  • eventLogTruncatedEvents counter metric for observability
  • Python config options and E2E tests

Config options:

Key Default Description
event-log.level STANDARD Root default log level
event-log.type.<EVENT_TYPE>.level (inherits) Per-type override
event-log.standard.max-string-length 2000 Max string chars at STANDARD
event-log.standard.max-array-elements 20 Max array elements at STANDARD
event-log.standard.max-depth 5 Max nesting depth at STANDARD

Tests

  • EventLogLevelResolverTest — 11 tests: exact match, hierarchy inheritance, root/built-in defaults, case insensitivity, edge cases
  • JsonTruncatorTest — 8 tests: string/array/depth truncation, wrappers, protected fields, composition, disabled thresholds
  • FileEventLoggerTest — 7 new tests: STANDARD truncation, VERBOSE preservation, OFF suppression, per-type override, hierarchical inheritance, JSON schema fields, backward-compatible deserialization
  • Python E2E tests — 3 new: VERBOSE, OFF, STANDARD truncation

API

  • New public API: EventLogLevel enum in api module
  • Removed public API: EventFilter interface (0.x pre-release, no known users)
  • Modified: EventLoggerConfig — removed eventFilter field/builder method

Documentation

  • doc-needed

weiqingy added 5 commits April 6, 2026 22:15
- Create EventLogLevel enum (OFF, STANDARD, VERBOSE) with case-insensitive
  fromString() parser in api module
- Add 4 new config options to AgentConfigOptions: event-log.level,
  event-log.standard.max-string-length, event-log.standard.max-array-elements,
  event-log.standard.max-depth
- Delete EventFilter.java — subsumed by the log level system
- Remove eventFilter field and references from EventLoggerConfig and
  FileEventLogger
- Remove 6 EventFilter-related tests from FileEventLoggerTest

Part of apache#541: per-event-type configurable log levels.
- EventLogLevelResolver: resolves effective EventLogLevel for event type
  strings using hierarchical config key inheritance (exact match → parent
  package walk-up → root default → built-in STANDARD). Caches results in
  ConcurrentHashMap. Takes Map<String, Object> to avoid runtime→plan
  module dependency.
- JsonTruncator: truncates Jackson JsonNode trees per configurable
  thresholds (max-string-length, max-array-elements, max-depth). Produces
  JSON-parseable wrapper objects (truncatedString, truncatedList,
  truncatedObject). Protected fields (eventType, id, attributes) at top
  level are never truncated.
- Comprehensive unit tests for both classes covering exact match,
  hierarchy inheritance, edge cases, all truncation types, composition,
  and disabled thresholds.

Part of apache#541: per-event-type configurable log levels.
…operator

- EventLogRecord: add logLevel, eventType, truncator, and
  truncatedEventsCounter fields. Old 2-arg constructor defaults to
  VERBOSE for backward compatibility.
- EventLogRecordJsonSerializer: write logLevel and eventType as
  top-level JSON fields. Apply JsonTruncator when level is STANDARD,
  increment truncatedEventsCounter when truncation occurs.
- EventLogRecordJsonDeserializer: read optional logLevel field,
  default to VERBOSE for old records (backward compat).
- FileEventLogger: resolve log level per event via
  EventLogLevelResolver, skip OFF events, initialize truncator from
  agent config. Accept truncation counter via setter.
- ActionExecutionOperator: pass full AgentConfiguration.getConfData()
  to logger via agentConfig property. Wire truncation counter from
  BuiltInMetrics to FileEventLogger after open().
- BuiltInMetrics: add eventLogTruncatedEvents counter with getter.
- FileEventLoggerTest: add 7 new tests for STANDARD truncation,
  VERBOSE preservation, OFF suppression, per-type override,
  hierarchical inheritance, new JSON fields, backward deserialization.

Part of apache#541: per-event-type configurable log levels.
- Add EVENT_LOG_LEVEL, EVENT_LOG_MAX_STRING_LENGTH,
  EVENT_LOG_MAX_ARRAY_ELEMENTS, EVENT_LOG_MAX_DEPTH config options to
  Python AgentConfigOptions (keys and defaults match Java)
- Update existing test_python_event_logging to verify new logLevel and
  top-level eventType fields in JSON output
- Add test_event_log_verbose_level: verifies VERBOSE produces full
  content without truncation
- Add test_event_log_off_level: verifies OFF suppresses all event
  logging
- Add test_event_log_standard_truncation: verifies STANDARD with low
  max-string-length threshold produces truncatedString wrappers
- Extract _run_event_logging_pipeline() and _read_log_records() helpers

Part of apache#541: per-event-type configurable log levels.
- FileEventLogger: use PythonEvent.getEventType() (Python module path)
  for level resolution instead of EventContext.getEventType() (Java
  wrapper class name). Without this, per-type Python config keys like
  event-log.type.flink_agents.api.events.event.OutputEvent.level would
  never match.
- EventLogRecordJsonSerializer: use Python module path for top-level
  eventType field for PythonEvents, consistent with the eventType
  inside the event object.
- BuiltInMetrics: remove dead markEventTruncated() method — counter is
  incremented via serializer path directly.

Part of apache#541: per-event-type configurable log levels.
@github-actions github-actions bot added doc-needed Your PR changes impact docs. fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. labels Apr 7, 2026
@weiqingy
Copy link
Copy Markdown
Collaborator Author

weiqingy commented Apr 7, 2026

Hey @wenjin272 @xintongsong could you please take a look at the PR? Feedback is welcome. Thanks!

Copy link
Copy Markdown
Collaborator

@wenjin272 wenjin272 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @weiqingy, thanks for your work. I left some minor comments.

Map<String, Object> data = confData != null ? confData : Collections.emptyMap();

// Parse root default
Object rootValue = data.get(ROOT_LEVEL_KEY);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we just use EVENT_LOG_LEVEL.getKey() here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Changed to ConfigOption<EventLogLevel> with EventLogLevel.STANDARD as the default. AgentConfiguration.get() already supports enum types via ConfigurationUtils.convertValue(), so this works end-to-end.

if (pythonType != null) {
topLevelEventType = pythonType;
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for extracting the python event type is the same as in FileEventLogger. We can extract a common util function.

Copy link
Copy Markdown
Collaborator Author

@weiqingy weiqingy Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Extracted EventUtil.resolveEventType(Event, EventContext) and replaced the duplicated logic in both FileEventLogger.append() and EventLogRecordJsonSerializer.serialize().

private final EventLogLevel logLevel;
private final String eventType;
private final transient JsonTruncator truncator;
private final transient Counter truncatedEventsCounter;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can mark truncator and truncatedEventsCounter as @Nullable.
Additionally, these two members are actually required variables for the serialization of EventLogRecord, rather than part of its content. During deserialization, we have to set them to null. I wonder if they could be removed from within EventLogRecord itself.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added @Nullable on both fields.

Regarding removing them from EventLogRecord — I agree they are serialization plumbing rather than content. However, Jackson's custom serializer (EventLogRecordJsonSerializer) receives only the record object via serialize(EventLogRecord record, ...). Since the ObjectMapper is shared/static, there's no clean way to pass the truncator and counter to the serializer except through the record itself. The transient + @Nullable markers signal that these are not part of the logical content. Happy to explore alternatives if you have a preferred pattern in mind.

* values are "OFF", "STANDARD", and "VERBOSE". Defaults to "STANDARD".
*/
public static final ConfigOption<String> EVENT_LOG_LEVEL =
new ConfigOption<>("event-log.level", String.class, "STANDARD");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just use enum here

public static final ConfigOption<EventLogLevel> EVENT_LOG_LEVEL =
            new ConfigOption<>("event-log.level", EventLogLevel.class, EventLogLevel.STANDARD);

Copy link
Copy Markdown
Collaborator Author

@weiqingy weiqingy Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR.

this.rootDefault =
rootValue != null
? EventLogLevel.fromString(rootValue.toString())
: BUILT_IN_DEFAULT;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we just use EVENT_LOG_LEVEL.getDefaultValue() here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Replaced BUILT_IN_DEFAULT with AgentConfigOptions.EVENT_LOG_LEVEL.getDefaultValue(). Both removed constants are now derived from the single config option definition.

return;
}
eventLogger.open(new EventLoggerOpenParams(runtimeContext));
if (eventLogger instanceof FileEventLogger) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to specifically check the type of eventLogger here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instanceof check is needed because setTruncatedEventsCounter() is specific to FileEventLogger — it's not on the EventLogger interface. The truncation counter is registered centrally in BuiltInMetrics (alongside numOfEventProcessed, numOfActionsExecuted, etc.) so all operator metrics live in one place under the same FlinkAgentsMetricGroupImpl hierarchy. The operator then passes the counter reference to FileEventLogger after open().

We considered adding the counter to the EventLogger interface or having the logger self-register, but both options would either leak Flink metrics concerns into the public API or break the centralized metrics pattern. The instanceof check keeps the interface clean while allowing FileEventLogger to participate in the operator's metrics.

Open to suggestions if you'd prefer a different approach here.

- Use ConfigOption<EventLogLevel> enum type instead of String
- Derive root key and default from EVENT_LOG_LEVEL config constant
- Add @nullable on transient truncator/counter fields in EventLogRecord
- Extract EventUtil.resolveEventType() for PythonEvent type resolution
@weiqingy
Copy link
Copy Markdown
Collaborator Author

@wenjin272 Thanks for the review! I’ve addressed your comments - could you take another look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-needed Your PR changes impact docs. fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants