Fix uniquify to handle multiple successive duplicates #126889
Merged
rjernst merged 5 commits into elastic:main on Apr 16, 2025
Conversation
CollectionUtils.uniquify is based on C++ std::unique. However, C++ iterators are not quite the same as Java iterators. In particular, a Java iterator hands back each element only once, at the moment it advances, whereas a C++ iterator can be dereferenced repeatedly without moving. This commit reworks uniquify to be based on list indices instead of iterators. closes elastic#126883
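To make the index-based approach concrete, here is a minimal sketch of a std::unique-style pass over a sorted list using a read index and a write index. It is an illustration only, not the code from this PR: the class name `UniquifyByIndex` and the method body are assumptions, and only the `(list, comparator)` shape is taken from the test calls in the conversation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class UniquifyByIndex {
    // Hypothetical stand-in for CollectionUtils.uniquify.
    // Assumes the list is mutable and already sorted according to cmp,
    // so that equal elements are adjacent.
    static <T> void uniquify(List<T> list, Comparator<? super T> cmp) {
        if (list.size() <= 1) {
            return;
        }
        int write = 0; // index of the last element kept so far
        for (int read = 1; read < list.size(); read++) {
            // Keep the element only if it differs from the last kept one.
            if (cmp.compare(list.get(write), list.get(read)) != 0) {
                write++;
                list.set(write, list.get(read));
            }
        }
        // Drop the leftover tail in a single range removal.
        list.subList(write + 1, list.size()).clear();
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(List.of(1, 2, 2, 3, 3, 5));
        uniquify(values, Comparator.naturalOrder());
        System.out.println(values); // prints [1, 2, 3, 5]
    }
}
```

Because the comparison always reads the last kept element by index, several successive runs of duplicates are handled the same way as a single run. The trade-off is that `list.get(i)` is only cheap on random-access lists such as `ArrayList`.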
Collaborator
Pinging @elastic/es-core-infra (Team:Core/Infra)
Collaborator
Hi @rjernst, I've created a changelog YAML for you.
lkts reviewed on Apr 16, 2025
assertUniquify(List.of(1, 1, 1), Comparator.naturalOrder(), 1);
assertUniquify(List.of(1, 2, 2, 3), Comparator.naturalOrder(), 3);
assertUniquify(List.of(1, 2, 2, 2), Comparator.naturalOrder(), 2);
assertUniquify(List.of(1, 2, 2, 3, 3, 5), Comparator.naturalOrder(), 4);
Contributor
I think this needs a stronger test. I was thinking about this:
1. Generate a random int s for the number of unique items in the list
2. Generate s unique numbers
3. For each number, generate a random number of repeats (1 to 10)
4. Put everything in the list
5. uniquify the list
6. Compare the resulting list with the list of only unique numbers that we have from step 2
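The steps proposed in this comment could be sketched roughly as follows. Everything here is hypothetical: the class `RandomizedUniquifyCheck`, the bounds on s and on the generated values, and the `referenceUniquify` stand-in (which ignores the comparator) are assumptions made so the example runs on its own; a real test would pass in `CollectionUtils::uniquify` and use the project's own randomness helpers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;
import java.util.function.BiConsumer;

public class RandomizedUniquifyCheck {
    // One randomized round; throws AssertionError if the result is wrong.
    static void checkOnce(Random random, BiConsumer<List<Integer>, Comparator<Integer>> uniquify) {
        int s = random.nextInt(20) + 1; // number of unique items (1..20, an assumed bound)
        TreeSet<Integer> unique = new TreeSet<>(); // sorted, so the input list is sorted too
        while (unique.size() < s) {
            unique.add(random.nextInt(1000));
        }
        List<Integer> input = new ArrayList<>();
        for (Integer value : unique) {
            int repeats = random.nextInt(10) + 1; // 1 to 10 copies of each value
            for (int i = 0; i < repeats; i++) {
                input.add(value);
            }
        }
        uniquify.accept(input, Comparator.naturalOrder());
        List<Integer> expected = new ArrayList<>(unique);
        if (!input.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + input);
        }
    }

    // Stand-in implementation used only to make this sketch self-contained.
    static void referenceUniquify(List<Integer> list, Comparator<Integer> cmp) {
        List<Integer> distinct = new ArrayList<>(new LinkedHashSet<>(list));
        list.clear();
        list.addAll(distinct);
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int round = 0; round < 100; round++) {
            checkOnce(random, RandomizedUniquifyCheck::referenceUniquify);
        }
        System.out.println("ok");
    }
}
```

Comparing whole lists rather than only the resulting size means an implementation that keeps the right number of elements but the wrong values would also be caught.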
Member
Author
@ldematte I reworked this back to using iterators. I remembered that the reason to use iterators is to support linked lists, since doing …
rjernst added a commit to rjernst/elasticsearch that referenced this pull request on Apr 16, 2025
CollectionUtils.uniquify is based on C++ std::unique. However, C++ iterators are not quite the same as Java iterators. In particular, advancing them only allows grabbing the value once. This commit reworks uniquify to be based on list indices instead of iterators. closes elastic#126883
elasticsearchmachine pushed a commit that referenced this pull request on Apr 16, 2025
CollectionUtils.uniquify is based on C++ std::unique. However, C++ iterators are not quite the same as Java iterators. In particular, advancing them only allows grabbing the value once. This commit reworks uniquify to be based on list indices instead of iterators. closes #126883
elasticsearchmachine pushed a commit that referenced this pull request on Apr 17, 2025
CollectionUtils.uniquify is based on C++ std::unique. However, C++ iterators are not quite the same as Java iterators. In particular, advancing them only allows grabbing the value once. This commit reworks uniquify to be based on list indices instead of iterators. closes #126883 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
elasticsearchmachine pushed a commit that referenced this pull request on Apr 22, 2025
CollectionUtils.uniquify is based on C++ std::unique. However, C++ iterators are not quite the same as Java iterators. In particular, advancing them only allows grabbing the value once. This commit reworks uniquify to be based on list indices instead of iterators. closes #126883 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
CollectionUtils.uniquify is based on C++ std::unique. This commit fixes the implementation to correctly match std::unique, so that successive ranges of duplicates are handled.
closes #126883
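For readers following the iterator discussion, one way a std::unique-style pass can be written with Java iterators is sketched below. This is not the merged implementation; the class name `UniquifyByIterator` and the details of the loop are assumptions. The point it illustrates is that the last kept value is cached in a local variable, since a Java iterator yields each element only once, and that no `get(i)` calls are needed, so the pass stays linear on a `LinkedList`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class UniquifyByIterator {
    // Hypothetical sketch; assumes a mutable list sorted according to cmp.
    static <T> void uniquify(List<T> list, Comparator<? super T> cmp) {
        if (list.size() <= 1) {
            return;
        }
        ListIterator<T> resultItr = list.listIterator();   // write position
        ListIterator<T> currentItr = list.listIterator(1); // read position
        T lastValue = resultItr.next(); // cached, because next() yields it only once
        int count = 1;                  // number of unique elements kept so far
        while (currentItr.hasNext()) {
            T currentValue = currentItr.next();
            if (cmp.compare(lastValue, currentValue) != 0) {
                lastValue = currentValue;
                resultItr.next();            // advance the write position
                resultItr.set(currentValue); // set() is not a structural change
                count++;
            }
        }
        // Both iterators are finished, so truncating the tail is safe.
        if (count != list.size()) {
            list.subList(count, list.size()).clear();
        }
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>(List.of(1, 2, 2, 3, 3, 5));
        uniquify(linked, Comparator.naturalOrder());
        System.out.println(linked); // prints [1, 2, 3, 5]

        List<Integer> array = new ArrayList<>(List.of(1, 1, 1));
        uniquify(array, Comparator.naturalOrder());
        System.out.println(array); // prints [1]
    }
}
```

Using `set` through one iterator while another is still reading works because replacing an element does not change the list's structure; only the final `subList(...).clear()` does, and it runs after both iterators are done.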