Skip to content

Conversation

@chowyu12
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Sep 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@kevwan
Copy link
Contributor

kevwan commented Sep 25, 2025

Thank you for your contribution and the effort to improve performance! 🙏

However, I have some concerns about this implementation that I'd like to discuss:

Memory Safety Issues

The current buffer pooling approach has a potential memory leak issue:

  1. Buffer Growth Problem: When bytes.Buffer processes large JSON payloads (e.g., 10MB), it grows its internal slice to accommodate the data and doesn't shrink back automatically.

  2. Pool Retention: sync.Pool keeps these enlarged buffers for reuse. A single large request can create a 10MB buffer that stays in the pool indefinitely.

  3. Memory Waste: Subsequent small requests (e.g., 1KB responses) will reuse the 10MB buffer, wasting 99.99% of allocated memory.

  4. Accumulation Effect: Multiple large requests will create multiple oversized buffers in the pool, potentially exhausting memory even when most requests are small.

Suggested Improvements

If we want to implement buffer pooling, we should add size limits:

const maxBufferSize = 64 * 1024 // 64KB

func putBuffer(buf *bytes.Buffer) {
    // Only return to pool if buffer isn't too large
    if buf.Cap() <= maxBufferSize {
        buf.Reset()
        bufferPool.Put(buf)
    }
    // Large buffers are discarded and will be GC'd
}

Performance Data Request

Could you provide benchmark data comparing:

  1. Current jsonx.Marshal() approach
  2. Your pooled buffer approach
  3. Pooled buffer with size limits

This would help us understand the actual performance benefits vs. memory safety trade-offs.

Current Recommendation

I'm not recommending to merge this PR as-is due to the memory safety concerns. The current approach of creating new buffers might actually be safer for production systems, as:

  • Small, frequent allocations are handled efficiently by Go's allocator
  • No risk of memory waste from buffer retention
  • Simpler code with fewer edge cases

Would you be interested in implementing the size-limited version, or would you prefer to explore other optimization approaches?

Thanks again for contributing to go-zero! 🚀

@kevwan kevwan changed the title 新增 MarshalWithBuffer 函数以支持使用缓冲区进行 JSON 序列化,并更新相关测试用例以验证新功能和错误处理。 Sep 25, 2025
@chowyu12
Copy link
Contributor Author

great advice

@chowyu12 chowyu12 closed this Sep 26, 2025
… MarshalWithBuffer function and replacing it with a built-in buffer pool to improve performance and simplify the code. Meanwhile, update the relevant test cases to verify the correctness of the new implementation.
@chowyu12
Copy link
Contributor Author

As seen from the benchmark:

  • Small to medium datasets (1-100 items): 4-10% performance improvement and over 60% memory savings.
  • Concurrent scenarios: Performance improvement of up to 26.6%, which is the most significant advantage.
  • High-frequency invocation scenarios: Reduced GC pressure and better overall system performance.
@chowyu12 chowyu12 reopened this Sep 26, 2025
@chowyu12
Copy link
Contributor Author

but when Simplified Pool Management

// before
if buf.Cap() <= maxBufferSize {
    bufferPool.Put(buf)
}

// after
bufferPool.Put(buf)

// before
return bytes.NewBuffer(make([]byte, 0, 64*1024))

// after
return new(bytes.Buffer)


📈 Performance Improvement Summary

Scenario Performance Improvement Memory Savings Key Advantage
Small Data 8-12% 60%+ Avoids over-allocation
Medium Data 4-5% 59-61% Precise size matching
Large Data 3.2% 59.7% Breakthrough improvement
Concurrency 28.5% 47% Reduces lock contention

🎯 Key Insights

  • The "less is more" principle: simplified designs are often more effective than complex optimizations
  • Dynamic adaptation: Go's bytes.Buffer has an efficient built-in expansion mechanism
  • Pool efficiency: simple pool management strategies avoid unnecessary complexity

✅ Final Recommendation
Your current simplified design is the optimal choice:

Reasons:
✅ Performance improvement across all scenarios (3-28%)
✅ Significant memory savings (47-62%)
✅ Concise and maintainable code
✅ Particularly suitable for large dataset processing

This is a perfect optimization result! The simplified design not only improves performance but also makes the code clearer and more maintainable.

@kevwan
Copy link
Contributor

kevwan commented Sep 26, 2025

In production, a key service using fasthttp crashed when an upstream returned large responses. fasthttp’s buffer pooling caused all pods to OOM, leading to a major outage.

What's the time of each operation in benchmark? I'd like to keep it as simple as possible.

@chowyu12
Copy link
Contributor Author

image
@yguilai
Copy link
Contributor

yguilai commented Oct 14, 2025

Hi @chowyu12
I was also considering using sync.Pool to optimize JSON serialization performance, but I noticed there's already an identical PR. Regarding your code, I have a question I'd like to discuss with you:

buf.Bytes() returns a slice pointing to the shared internal memory of the bytes.Buffer. When this buffer is reclaimed by pool.Put and subsequently retrieved by another goroutine via pool.Get, the next write operation will overwrite the content of the previously returned slice, leading to severe data corruption. Shouldn't we copy the data before returning the result to avoid this issue, as shown in the example below?

func MarshalWithBuffer(v any) ([]byte, error) {
	// ....

	bs := buf.Bytes()
	// Remove trailing newline added by json.Encoder.Encode
	if len(bs) > 0 && bs[len(bs)-1] == '\n' {
		bs = bs[:len(bs)-1]
	}

	res := make([]byte, len(bs))
	copy(res, bs)
	return res, nil
}
@chowyu12
Copy link
Contributor Author

@yguilai So, you must perform a reset operation before put.

func putBuffer(buf *bytes.Buffer) {
	buf.Reset()
	if buf.Cap() <= maxBufferSize {
		bufferPool.Put(buf)
	}
}

https://github.com/ickey-cn/go-zero/blob/sync_pool/core/jsonx/json.go#L24C1-L29C2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants