add trace log to track event size expansion #49

Merged
kaisecheng merged 6 commits into logstash-plugins:main from kaisecheng:add_event_size_trace_log
Nov 19, 2025

Conversation

@kaisecheng kaisecheng commented Nov 18, 2025

Tested the following pipeline, run with bin/logstash -f split.conf --log.level trace:

input {
    generator {
        lines => [
            '{"kubernetes" : {"label": {"app": "somevalue" }}, "split": ["1","2","3"] }'
        ]
        count => 1
        codec => json
    }
}
filter { split { field => "split" }}
output { stdout { } }

Log output:

[2025-11-18T15:29:49,196][TRACE][logstash.filters.split   ][main][e42465377ab12961a4ebbb80634862df2eacd4cc029b3817f2a772eb317df8e7] Event is split into 3 {:split_bytes=>819, :original_bytes=>261, :ratio=>3.14}
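
The fields in this log line can be illustrated with a minimal Ruby sketch. It is not the plugin's implementation: `approx_bytes` and `split_with_stats` are hypothetical helpers, and measuring size as the byte length of the JSON serialization is an assumption, so the numbers it prints will differ from the 819 and 261 logged above. The final line only reproduces the ratio arithmetic from the log.

```ruby
require "json"

# Hypothetical helper: approximates an event's size by the byte length of its
# JSON serialization. The plugin may measure size differently.
def approx_bytes(event)
  JSON.generate(event).bytesize
end

# Hypothetical helper: makes one copy of `event` per element of the array in
# `field` and reports the expansion using the same keys as the trace log.
def split_with_stats(event, field)
  original_bytes = approx_bytes(event)
  clones = event[field].map { |value| event.merge(field => value) }
  split_bytes = clones.sum { |clone| approx_bytes(clone) }
  stats = {
    split_bytes: split_bytes,
    original_bytes: original_bytes,
    ratio: (split_bytes.to_f / original_bytes).round(2)
  }
  [clones, stats]
end

event = {
  "kubernetes" => { "label" => { "app" => "somevalue" } },
  "split" => ["1", "2", "3"]
}
clones, stats = split_with_stats(event, "split")
puts "Event is split into #{clones.size} #{stats}"

# Ratio arithmetic from the log line above: 819 / 261, rounded to 2 places.
puts((819.0 / 261).round(2)) # prints 3.14
```

Every clone carries the unsplit fields again (here the whole kubernetes hash), which is why the total size grows with the number of splits.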

@donoghuc donoghuc left a comment

I think this will be helpful. I wonder if we should add a similar log message before the split, stating that we are about to split the event into splits.size events. That way, if an OOM or a similar failure occurs during inflation, there would be a log message showing that the split was being attempted.

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>
@kaisecheng

@donoghuc Thanks for your considerate suggestion. I have updated the log message and removed an overly verbose debug entry that didn’t add much value. Please have a look.

@kaisecheng kaisecheng requested a review from donoghuc November 18, 2025 18:20
event_target = @target.nil? ? @field : @target

split_bytes = 0
logger.trace? && logger.trace("Event being split into #{splits.size} events")
Contributor

Just out of curiosity, why are we guarding this with the logger.trace? check? I understand it for the other computations, since they are somewhat expensive. If I understand correctly, the logging library decides which messages to emit based on the configured level. For example, a message sent with logger.trace('foo') would only show up in the logs when trace-level logging is configured.

Contributor Author

I guard it to avoid paying the cost of building the interpolated string ("Event being split into " + #{splits.size} + " events"), which would be discarded at other levels such as info or debug.
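
The point about the guard can be checked with a small sketch. It uses Ruby's standard library Logger as a stand-in for the Logstash logger, and debug as a stand-in for trace because the standard Logger has no trace level; `build_message` and `$builds` are hypothetical names used only to count how often the string is built. The block form at the end is a standard library Logger feature and is not assumed to exist on the Logstash logger.

```ruby
require "logger"
require "stringio"

# Counts how many times the message string is actually built.
$builds = 0

def build_message(n)
  $builds += 1
  "Event being split into #{n} events"
end

sink = StringIO.new
logger = Logger.new(sink)
logger.level = Logger::INFO # debug (stand-in for trace) is disabled

# Unguarded: Ruby evaluates the argument before the call, so the string is
# built even though the logger then drops the message.
logger.debug(build_message(3))
unguarded_builds = $builds

# Guarded, as in the PR: the level check is false, so && short-circuits and
# the string is never built.
logger.debug? && logger.debug(build_message(3))
guarded_builds = $builds - unguarded_builds

# Block form (stdlib Logger): the block runs only if the level is enabled.
logger.debug { build_message(3) }
block_builds = $builds - unguarded_builds - guarded_builds

puts "unguarded=#{unguarded_builds} guarded=#{guarded_builds} block=#{block_builds}"
# prints "unguarded=1 guarded=0 block=0"
```

In all three cases nothing is written to the log at info level; only the unguarded call does the work of building the message first.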

@kaisecheng kaisecheng merged commit 74051b1 into logstash-plugins:main Nov 19, 2025
3 checks passed