Skip to content

Conversation

@TakaHiR07
Copy link
Contributor

Motivation

Currently the replicateRate of batchRead in auto-recover may exceed Integer.MAX_VALUE and then become negative, which would result in batchRead operation stuck.

Changes

deal with the negative situation

@StevenLuMT
Copy link
Member

good jobs, I have a question: should maxBytesToReplicate use long type?

@TakaHiR07
Copy link
Contributor Author

good jobs, I have a question: should maxBytesToReplicate use long type?

rateLimiter.acquire(int) just support int value. Besides, Integer.max_value means 2GB throughput, it is enough for most situation.

@TakaHiR07
Copy link
Contributor Author

good jobs, I have a question: should maxBytesToReplicate use long type?

rateLimiter.acquire(int) just support int value. Besides, Integer.max_value means 2GB throughput, it is enough for most situation because replicate throughput should not be too high.

Comment on lines 458 to 460
int bytesToReplicateCnt = averageEntrySize.get() * entriesToReplicateCnt;
bytesToReplicateCnt = bytesToReplicateCnt >= 0 ? bytesToReplicateCnt : Integer.MAX_VALUE;
maxBytesToReplicate = Math.min(maxBytesToReplicate, bytesToReplicateCnt);
Copy link
Member

@lhotari lhotari Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making this calculation with a long value would produce more correct results.
It's not really realistic, but it I guess it's possible that the resulting value could "wrap around" the negative range and become positive.

Something like this:

Suggested change
int bytesToReplicateCnt = averageEntrySize.get() * entriesToReplicateCnt;
bytesToReplicateCnt = bytesToReplicateCnt >= 0 ? bytesToReplicateCnt : Integer.MAX_VALUE;
maxBytesToReplicate = Math.min(maxBytesToReplicate, bytesToReplicateCnt);
long bytesToReplicateCnt = (long) averageEntrySize.get() * entriesToReplicateCnt;
maxBytesToReplicate = bytesToReplicateCnt > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) bytesToReplicateCnt;
Copy link
Contributor Author

@TakaHiR07 TakaHiR07 Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This modification is incorrect. maxBytesToReplicate still may be negative if averageEntrySize.get() is negative, and then replicationThrottle.acquire(maxBytesToReplicate) is stuck.

I think we should "wrap around" the negative value and make it become positive. We should keep the bookie availability first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can averageEntrySize.get() be negative?

Copy link
Member

@lhotari lhotari Aug 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should "wrap around" the negative value and make it become positive.

The point is that such value could be positive, but it wouldn't match a meaningful value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example 4294967297 would be just 1 with a "wrapped around" value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point. you are right. 4294967297 is a corner case for int value.

Make it to long value would be better.

@TakaHiR07 TakaHiR07 force-pushed the fix_replicateRate_in_batchRead_is_negative branch from e21eb64 to 709bef6 Compare August 26, 2025 06:51
@TakaHiR07 TakaHiR07 requested a review from lhotari August 26, 2025 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants