
Add max_batch_size setting to EIS sparse service settings #141185

Merged
dimitris-athanasiou merged 10 commits into elastic:main from dimitris-athanasiou:max-batch-size-for-eis on Jan 27, 2026

Add max_batch_size setting to EIS sparse service settings#141185
dimitris-athanasiou merged 10 commits intoelastic:mainfrom
dimitris-athanasiou:max-batch-size-for-eis

Conversation

dimitris-athanasiou (Contributor) commented on Jan 23, 2026

Adds a new setting `max_batch_size` to the sparse service settings for EIS. The setting allows values in the range [1, 512]. The default value remains `16`.

Tweaking this setting can help users improve performance.

Relates #1592
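To make the range constraint concrete, here is a minimal sketch of how a range-validated integer setting like this could be parsed out of a service-settings map. The class and method names below are illustrative only and do not correspond to the actual Elasticsearch implementation.

```java
import java.util.Map;

// Minimal sketch of a range-validated integer setting like max_batch_size.
// Names (MaxBatchSizeSetting, parseMaxBatchSize) are hypothetical, not the
// real Elasticsearch classes.
public final class MaxBatchSizeSetting {
    static final int MIN_BATCH_SIZE = 1;
    static final int MAX_BATCH_SIZE = 512;
    static final int DEFAULT_BATCH_SIZE = 16;

    // Reads "max_batch_size" from the service settings, applying the default
    // when absent and rejecting out-of-range values.
    public static int parseMaxBatchSize(Map<String, Object> serviceSettings) {
        Object value = serviceSettings.get("max_batch_size");
        if (value == null) {
            return DEFAULT_BATCH_SIZE;
        }
        int batchSize = ((Number) value).intValue();
        if (batchSize < MIN_BATCH_SIZE || batchSize > MAX_BATCH_SIZE) {
            throw new IllegalArgumentException(
                "[max_batch_size] must be between " + MIN_BATCH_SIZE
                    + " and " + MAX_BATCH_SIZE + "; got " + batchSize);
        }
        return batchSize;
    }

    public static void main(String[] args) {
        System.out.println(parseMaxBatchSize(Map.of()));                      // default: 16
        System.out.println(parseMaxBatchSize(Map.of("max_batch_size", 256))); // in range
    }
}
```

Returning the default when the key is absent keeps existing endpoints behaving exactly as before the setting was introduced.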

dimitris-athanasiou added labels on Jan 23, 2026: >enhancement, auto-backport (Automatically create backport pull requests when merged), :SearchOrg/Inference (Label for the Search Inference team), v9.1.9, v9.3.1, v9.4.0, v8.19.11, v9.2.5
elasticsearchmachine (Collaborator) commented:

Pinging @elastic/search-inference-team (Team:Search - Inference)

elasticsearchmachine (Collaborator) commented:

Hi @dimitris-athanasiou, I've created a changelog YAML for you.

jonathan-buttner (Contributor) left a comment:

Thanks for the changes, just left a few questions and suggestions.

DonalEvans (Contributor) commented:

Should the new max_batch_size service settings field also be included in the output of the ElasticInferenceService.createConfiguration() method? Right now we only include model_id and max_input_tokens there, but we should probably also include max_batch_size for both sparse_embedding and text_embedding, plus similarity and dimensions for text_embedding. Maybe the latter two fields can be left to another PR to keep the scope of this one from expanding too much.
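As a rough illustration of the suggestion above, a configuration description could expose max_batch_size per task type along these lines. This is a hypothetical sketch, not the real ElasticInferenceService.createConfiguration() code; the method name and field descriptions are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of exposing max_batch_size in a per-task-type
// configuration description. Not the actual ElasticInferenceService code.
public final class ConfigurationSketch {
    public static Map<String, String> configurableFields(String taskType) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("model_id", "required string");
        fields.put("max_input_tokens", "optional integer");
        // Batch size applies to both embedding task types.
        if (taskType.equals("sparse_embedding") || taskType.equals("text_embedding")) {
            fields.put("max_batch_size", "optional integer in [1, 512], default 16");
        }
        // The two fields deferred to a follow-up PR in this discussion.
        if (taskType.equals("text_embedding")) {
            fields.put("similarity", "optional string");
            fields.put("dimensions", "optional integer");
        }
        return fields;
    }
}
```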

dimitris-athanasiou (Contributor, Author) commented:

@DonalEvans Thank you for the review! I have addressed all of your points. I have added max_batch_size to the configuration reported by the services API, but not the other two fields. I'd rather follow up and add those in a separate PR.

dimitris-athanasiou changed the title from "Add max_batch_size setting to EIS dense and sparse service settings" to "Add max_batch_size setting to EIS sparse service settings" on Jan 26, 2026
dimitris-athanasiou (Contributor, Author) commented:

@jonathan-buttner @DonalEvans I have removed the setting for dense models. Let me know if it's still good to go.

dimitris-athanasiou merged commit 01ace39 into elastic:main on Jan 27, 2026
35 checks passed
elasticsearchmachine (Collaborator) commented:

💔 Backport failed

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 141185`

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 27, 2026
(cherry picked from commit 01ace39, with conflicts)
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 27, 2026
(cherry picked from commit 01ace39, with conflicts)
dimitris-athanasiou (Contributor, Author) commented:

💚 All backports created successfully

Backport branches: 9.3, 9.2, 8.19

Questions? Please refer to the Backport tool documentation.

dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 27, 2026
dimitris-athanasiou added a commit that referenced this pull request Jan 27, 2026
(#141335; cherry picked from commit 01ace39)
dimitris-athanasiou added a commit that referenced this pull request Jan 27, 2026
(#141362; cherry picked from commit 01ace39)
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 27, 2026
jonathan-buttner pushed a commit that referenced this pull request Jan 27, 2026
schase-es pushed a commit to schase-es/elasticsearch that referenced this pull request Jan 28, 2026
dimitris-athanasiou added a commit that referenced this pull request Jan 28, 2026
(#141341; cherry picked from commit 01ace39)
dimitris-athanasiou deleted the max-batch-size-for-eis branch on January 28, 2026 at 08:41
dimitris-athanasiou added a commit that referenced this pull request Jan 28, 2026
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 29, 2026
We want to allow batches to be up to `1024` in size.

Relates elastic#141185
dimitris-athanasiou added a commit that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit that referenced this pull request Jan 29, 2026
dimitris-athanasiou added a commit that referenced this pull request Jan 30, 2026

4 participants