Add max_batch_size setting to EIS sparse service settings #141185
dimitris-athanasiou merged 10 commits into elastic:main from
Conversation
Adds a new setting `max_batch_size` to the dense and sparse service settings for EIS. The setting accepts values in the range [1, 512]; the default remains `16`. Tweaking this setting can help users improve performance. Relates elastic#1592
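The shape of the change is a bounded integer service setting with a default. A rough sketch of the parse/validate logic is below; the class and method names are hypothetical for illustration, not the actual `ElasticInferenceServiceSparseEmbeddingsServiceSettings` implementation, and the real code uses Elasticsearch's settings-validation utilities rather than a plain `Map`:

```java
import java.util.Map;

// Hypothetical sketch of parsing a bounded integer setting such as `max_batch_size`.
public final class MaxBatchSizeSetting {

    static final String MAX_BATCH_SIZE = "max_batch_size";
    static final int DEFAULT_MAX_BATCH_SIZE = 16;
    static final int MIN_VALUE = 1;
    static final int MAX_VALUE = 512;

    // Returns the validated value from the settings map, or the default when absent.
    static int parseMaxBatchSize(Map<String, Object> settings) {
        Object raw = settings.get(MAX_BATCH_SIZE);
        if (raw == null) {
            return DEFAULT_MAX_BATCH_SIZE;
        }
        if (raw instanceof Number == false) {
            throw new IllegalArgumentException("[" + MAX_BATCH_SIZE + "] must be an integer");
        }
        int value = ((Number) raw).intValue();
        if (value < MIN_VALUE || value > MAX_VALUE) {
            throw new IllegalArgumentException(
                "[" + MAX_BATCH_SIZE + "] must be between " + MIN_VALUE + " and " + MAX_VALUE + ", got [" + value + "]"
            );
        }
        return value;
    }
}
```

With this shape, omitting the setting keeps today's behavior (`16`), while out-of-range values are rejected at model-creation time rather than at inference time.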
Pinging @elastic/search-inference-team (Team:Search - Inference)

Hi @dimitris-athanasiou, I've created a changelog YAML for you.
jonathan-buttner left a comment
Thanks for the changes, just left a few questions and suggestions.
...ervices/elastic/sparseembeddings/ElasticInferenceServiceSparseEmbeddingsServiceSettings.java
...s/elastic/densetextembeddings/ElasticInferenceServiceDenseTextEmbeddingsServiceSettings.java
...st/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java
Should the new
.../inference/services/elastic/ElasticInferenceServiceSparseEmbeddingsServiceSettingsTests.java
.../inference/services/elastic/ElasticInferenceServiceSparseEmbeddingsServiceSettingsTests.java
...s/elastic/densetextembeddings/ElasticInferenceServiceDenseTextEmbeddingsServiceSettings.java
...stic/densetextembeddings/ElasticInferenceServiceDenseTextEmbeddingsServiceSettingsTests.java
.../inference/services/elastic/ElasticInferenceServiceSparseEmbeddingsServiceSettingsTests.java
...stic/densetextembeddings/ElasticInferenceServiceDenseTextEmbeddingsServiceSettingsTests.java
@DonalEvans Thank you for the review! I have addressed all of your points. I have added the `max_batch_size` setting to EIS dense and sparse service settings.
@jonathan-buttner @DonalEvans I have removed the setting for dense models. Let me know if it's still good to go.
💔 Backport failed. You can use sqren/backport to manually backport by running
…141185) Adds a new setting `max_batch_size` to the sparse service settings for EIS. The setting allows values of [1, 512]. The default value remains `16`. Tweaking this setting can help users improve performance. Relates elastic#1592 (cherry picked from commit 01ace39)
# Conflicts:
# server/src/main/resources/transport/upper_bounds/9.4.csv
…141185) Adds a new setting `max_batch_size` to the sparse service settings for EIS. The setting allows values of [1, 512]. The default value remains `16`. Tweaking this setting can help users improve performance. Relates elastic#1592 (cherry picked from commit 01ace39)
# Conflicts:
# server/src/main/resources/transport/upper_bounds/9.4.csv
# x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/InferenceUtils.java
# x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/inference/InferenceUtilsTests.java
# x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java
# x-pack/plugin/inference/src/internalClusterTest/java/org/elasticsearch/xpack/inference/integration/ModelRegistryIT.java
# x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java
# x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elastic/authorization/ElasticInferenceServiceAuthorizationModel.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceSparseEmbeddingsServiceSettingsTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceServiceTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/action/ElasticInferenceServiceActionCreatorTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/authorization/AuthorizationPollerTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/authorization/ElasticInferenceServiceAuthorizationModelTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/elastic/response/ElasticInferenceServiceAuthorizationResponseEntityTests.java
# x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/embeddings/GoogleVertexAiEmbeddingsServiceSettingsTests.java
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
…141185) Adds a new setting `max_batch_size` to the sparse service settings for EIS. The setting allows values of [1, 512]. The default value remains `16`. Tweaking this setting can help users improve performance. Relates elastic#1592
We want to allow batches to be up to `1024` in size. Relates elastic#141185
We want to allow batches to be up to `1024` in size. Relates elastic#141185 (cherry picked from commit e7f0bdd)
Adds a new setting `max_batch_size` to the sparse service settings for EIS. The setting allows values of [1, 512]. The default value remains `16`. Tweaking this setting can help users improve performance.
Relates #1592