Add Azure AI Rerank support by Evgenii-Kazannik · Pull Request #129848 · elastic/elasticsearch

Evgenii-Kazannik · 2025-06-23T12:28:01Z

As of now, it appears that only the Cohere provider is applicable for reranking.
Azure AI Foundry Models available for standard deployment
Cohere docs

PUT {{base-url}}/_inference/rerank/cohere
{
  "service": "azureaistudio",
  "service_settings": {
    "target": "https://Cohere-rerank-v3-5-samwq.swedencentral.models.ai.azure.com",
    "provider": "COHERE",
    "endpoint_type": "token",
    "api_key": "{{cohere-api-key}}"
  },
  "task_settings": {
    "top_n": 2,
    "return_documents": true
  }
}

POST {{base-url}}/_inference/rerank/cohere
{
  "input": ["Luke", "like", "leia", "chewy", "r2d2", "star", "wars"],
  "query": "star wars main character",
  "top_n": 7,
  "return_documents": true
}

# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

dan-rubinstein · 2025-07-07T18:07:29Z

...main/java/org/elasticsearch/xpack/inference/services/azureaistudio/AzureAiStudioService.java

            return completionModel;
        }

+        if (taskType == TaskType.RERANK) {


Can we simplify the logic in this method with a switch statement to make the model and then a single call to checkProviderAndEndpointTypeForTask?

It will require casting for the service settings in the checkProviderAndEndpointTypeForTask so it won't be one single call unfortunately

Can we not cast it to AzureAiStudioServiceSettings? Something like:

    AzureAiStudioModel model;
    switch (taskType) {
        case TEXT_EMBEDDING -> {
            model = new AzureAiStudioEmbeddingsModel(
                inferenceEntityId,
                taskType,
                NAME,
                serviceSettings,
                taskSettings,
                chunkingSettings,
                secretSettings,
                context
            );
        }
        case COMPLETION -> { ... }
        default -> throw new ElasticsearchStatusException(failureMessage, RestStatus.BAD_REQUEST);
    }
    AzureAiStudioServiceSettings azureAiStudioServiceSettings = (AzureAiStudioServiceSettings) model.getServiceSettings();
    checkProviderAndEndpointTypeForTask(
        taskType,
        azureAiStudioServiceSettings.provider(),
        azureAiStudioServiceSettings.endpointType()
    );

Well. My bad. Refactored.
Thank you
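
For readers following this thread, here is a minimal, self-contained sketch of the pattern being suggested: one switch builds the task-specific model, then one shared call validates provider and endpoint type. Every type and name below (`SketchTaskType`, `SketchServiceSettings`, `ModelFactorySketch`, and so on) is a hypothetical stand-in rather than the actual Elasticsearch code, and the validation rule is assumed purely for illustration.

```java
// Hypothetical stand-ins for the real service classes; only the control flow matters here.
class ModelFactorySketch {
    static SketchModel createModel(SketchTaskType taskType, SketchServiceSettings settings) {
        // One switch picks the model class for the task type ...
        SketchModel model = switch (taskType) {
            case TEXT_EMBEDDING -> new SketchEmbeddingsModel(settings);
            case COMPLETION -> new SketchCompletionModel(settings);
            case RERANK -> new SketchRerankModel(settings);
            default -> throw new IllegalArgumentException("unsupported task type [" + taskType + "]");
        };
        // ... and a single shared call validates it, reading the settings through the common interface.
        checkProviderAndEndpointType(taskType, model.serviceSettings());
        return model;
    }

    // Assumed rule for illustration only: rerank requires the cohere provider with a token endpoint.
    static void checkProviderAndEndpointType(SketchTaskType taskType, SketchServiceSettings settings) {
        boolean supported = taskType != SketchTaskType.RERANK
            || ("cohere".equals(settings.provider()) && "token".equals(settings.endpointType()));
        if (!supported) {
            throw new IllegalArgumentException(
                "endpoint type [" + settings.endpointType() + "] is not available for task type [" + taskType + "]"
            );
        }
    }

    public static void main(String[] args) {
        System.out.println(createModel(SketchTaskType.RERANK, new SketchServiceSettings("cohere", "token")));
    }
}

enum SketchTaskType { TEXT_EMBEDDING, COMPLETION, RERANK, SPARSE_EMBEDDING }

record SketchServiceSettings(String provider, String endpointType) {}

// All model variants expose the same settings accessor, so no per-branch validation is needed.
interface SketchModel {
    SketchServiceSettings serviceSettings();
}

record SketchEmbeddingsModel(SketchServiceSettings serviceSettings) implements SketchModel {}

record SketchCompletionModel(SketchServiceSettings serviceSettings) implements SketchModel {}

record SketchRerankModel(SketchServiceSettings serviceSettings) implements SketchModel {}
```

In this sketch the shared accessor removes the need for a cast; the snippet in the thread instead casts the result of `getServiceSettings()` once after the switch, which achieves the same single validation call.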

dan-rubinstein · 2025-07-07T18:11:19Z

.../elasticsearch/xpack/inference/services/azureaistudio/AzureAiStudioProviderCapabilities.java

    // these providers have chat completion inference (all providers at the moment)
    public static final List<AzureAiStudioProvider> chatCompletionProviders = List.of(AzureAiStudioProvider.values());

+    // these providers allow token ("pay as you go") embeddings endpoints


Can you clarify what this comment means? Why do they have to support token ("pay as you go") billing to be valid for rerank? Why are they embeddings endpoints instead of rerank endpoints? Can this just say // these providers have rerank inference?

Cohere Rerank billing is based on search_units, so effectively we pay for what we use. That's why I think "pay as you go" is applicable. All in all, this comment is no longer relevant since the constant it described has been deleted, as suggested in another comment. Thanks

dan-rubinstein · 2025-07-07T18:12:40Z

.../elasticsearch/xpack/inference/services/azureaistudio/AzureAiStudioProviderCapabilities.java

+    public static final List<AzureAiStudioProvider> rerankProviders = List.of(AzureAiStudioProvider.COHERE);
+
+    // these providers allow token ("pay as you go") embeddings endpoints
+    public static final List<AzureAiStudioProvider> tokenRerankProviders = List.of(AzureAiStudioProvider.COHERE);


Why does this need to be split from rerankProviders? Do we expect there to be rerankProviders that don't offer token rerank capabilities?

Currently we use only Cohere for reranking, so it's wiser not to split. Thank you. Done

dan-rubinstein · 2025-07-07T18:13:06Z

.../elasticsearch/xpack/inference/services/azureaistudio/AzureAiStudioProviderCapabilities.java

+    public static final List<AzureAiStudioProvider> tokenRerankProviders = List.of(AzureAiStudioProvider.COHERE);
+
+    // these providers allow realtime rerank endpoints (none at the moment)
+    public static final List<AzureAiStudioProvider> realtimeRerankProviders = List.of();


Do we suspect these will be added at some point? If not do we need to have this in code until any are added?

I think it should be left.
It's used to check whether the realtime type is valid for a provider, which lets us show a more descriptive error,
e.g. when we try to set realtime as the endpoint_type while creating an inference endpoint:
"The [realtime] endpoint type with [rerank] task type for provider [cohere] is not available"

Makes sense, thanks for clarifying!
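
A minimal sketch of why an intentionally empty list can still be useful: the lookup stays uniform across endpoint types, and a rejected combination yields the message quoted above. The enum and class names (`SketchProvider`, `SketchEndpointType`, `RerankCapabilitiesSketch`) are hypothetical, the provider set is a made-up subset, and the real `AzureAiStudioProviderCapabilities` may be structured differently.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical capability table; names are illustrative, not the actual Elasticsearch class.
class RerankCapabilitiesSketch {
    // Providers offering rerank over token ("pay as you go") endpoints.
    static final List<SketchProvider> RERANK_PROVIDERS = List.of(SketchProvider.COHERE);
    // Providers offering rerank over realtime endpoints (none at the moment).
    static final List<SketchProvider> REALTIME_RERANK_PROVIDERS = List.of();

    static boolean rerankEndpointAllowed(SketchProvider provider, SketchEndpointType endpointType) {
        // The empty list keeps this lookup uniform: REALTIME simply never matches.
        return switch (endpointType) {
            case TOKEN -> RERANK_PROVIDERS.contains(provider);
            case REALTIME -> REALTIME_RERANK_PROVIDERS.contains(provider);
        };
    }

    static void checkRerankEndpoint(SketchProvider provider, SketchEndpointType endpointType) {
        if (rerankEndpointAllowed(provider, endpointType) == false) {
            throw new IllegalArgumentException(
                "The [" + lower(endpointType) + "] endpoint type with [rerank] task type for provider ["
                    + lower(provider) + "] is not available"
            );
        }
    }

    private static String lower(Enum<?> value) {
        return value.name().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        try {
            checkRerankEndpoint(SketchProvider.COHERE, SketchEndpointType.REALTIME);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}

enum SketchProvider { COHERE, OPENAI }

enum SketchEndpointType { TOKEN, REALTIME }
```

If a provider later gains realtime rerank support, only the list contents would need to change under this design.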

dan-rubinstein · 2025-07-08T14:53:52Z

...elasticsearch/xpack/inference/services/azureaistudio/request/AzureAiStudioRerankRequest.java

+    }
+
+    private AzureAiStudioRerankRequestEntity createRequestEntity() {
+        var taskSettings = rerankModel.getTaskSettings();


Nit: Do we need this stored in a separate variable if it's only referenced once?

Agreed. It's a bit cleaner not to introduce it; corrected. Thx

dan-rubinstein · 2025-07-08T18:24:59Z

...arch/xpack/inference/services/azureaistudio/rerank/AzureAiStudioRerankTaskSettingsTests.java

+
+    private static AzureAiStudioRerankTaskSettings createRandom() {
+        return new AzureAiStudioRerankTaskSettings(
+            randomFrom(randomFrom(new Boolean[] { null, randomBoolean() })),


Do we need the second randomFrom here?

Well, that was overkill. I removed the unnecessary randomFrom. Thank you

dan-rubinstein · 2025-07-08T18:28:41Z

.../elasticsearch/xpack/inference/services/azureaistudio/AzureAiStudioProviderCapabilities.java

                    ? tokenEmbeddingsProviders.contains(provider)
                    : realtimeEmbeddingsProviders.contains(provider);
            }
+            case RERANK -> {


Is this logic covered in testing anywhere right now?

Yep. It is covered here:

AzureAiStudioServiceTests#testParseRequestConfig_ThrowsWhenEndpointTypeIsNotValidForRerankProvider

My mistake, thanks for clarifying.

dan-rubinstein · 2025-07-08T18:34:45Z

...ack/inference/services/azureaistudio/rerank/AzureAiStudioRerankRequestTaskSettingsTests.java

+        assertThat(settings, is(AzureAiStudioRerankRequestTaskSettings.EMPTY_SETTINGS));
+    }
+
+    public void testFromMap_ReturnsDoSample() {


What does this test name mean by DoSample?

I referenced AzureAiStudioChatCompletionRequestTaskSettingsTests and forgot to rename the test method. Apologies. Renamed

dan-rubinstein · 2025-07-08T18:36:22Z

...ack/inference/services/azureaistudio/rerank/AzureAiStudioRerankRequestTaskSettingsTests.java

+
+    public void testFromMap_ReturnsDoSample() {
+        final var settings = AzureAiStudioRerankRequestTaskSettings.fromMap(new HashMap<>(Map.of(RETURN_DOCUMENTS_FIELD, true)));
+        assertThat(settings.returnDocuments(), is(true));


Should we just compare the settings here to an expected settings object similar to the tests above?

That seems better. Done. Thank you
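
A small sketch of the difference between asserting one field and comparing against an expected settings object. The record `RerankRequestTaskSettingsSketch` and its `fromMap` are hypothetical simplifications of the class under test, and the `"return_documents"` and `"top_n"` keys are assumed from the request body in the PR description.

```java
import java.util.Map;

// Hypothetical illustration of whole-object comparison; not the real test class.
class WholeObjectAssertionSketch {
    public static void main(String[] args) {
        var actual = RerankRequestTaskSettingsSketch.fromMap(Map.of("return_documents", true));
        var expected = new RerankRequestTaskSettingsSketch(true, null);
        // One equality check covers every field, including topN, which must remain unset.
        if (actual.equals(expected) == false) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
        System.out.println("settings match: " + actual);
    }
}

// Records derive equals() from all components, so comparing objects compares every field.
record RerankRequestTaskSettingsSketch(Boolean returnDocuments, Integer topN) {
    static RerankRequestTaskSettingsSketch fromMap(Map<String, Object> map) {
        return new RerankRequestTaskSettingsSketch((Boolean) map.get("return_documents"), (Integer) map.get("top_n"));
    }
}
```

Comparing against an expected object also catches a field that is set unexpectedly, which a single-field assertion would miss.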

dan-rubinstein · 2025-07-08T18:38:23Z

...icsearch/xpack/inference/services/azureaistudio/request/AzureAiStudioRerankRequestTests.java

+        assertThat(requestMap.get(INPUT), is(List.of(input)));
+    }
+
+    public void testCreateRequest_WithCohereProviderTokenEndpoint_WithTopNParam() throws IOException {


Can we have a test for creating the request with the other parameter?

Sure. Added the test for the return documents parameter. Thanks

elasticsearchmachine · 2025-07-11T17:13:11Z

Pinging @elastic/ml-core (Team:ML)

dan-rubinstein · 2025-07-15T18:44:43Z

...arch/xpack/inference/services/azureaistudio/rerank/AzureAiStudioRerankTaskSettingsTests.java

+    public void testUpdatedTaskSettings_WithAllValues() {
+        final AzureAiStudioRerankTaskSettings initialSettings = createRandom();
+        AzureAiStudioRerankTaskSettings newSettings;
+        int retries = 0;


Instead of running retries, which can require multiple loops, can we just create the newSettings object using randomValueOtherThan ourselves:

    AzureAiStudioRerankTaskSettings newSettings = new AzureAiStudioRerankTaskSettings(
        randomValueOtherThan(initialSettings.returnDocuments(), () -> randomFrom(new Boolean[] { null, randomBoolean() })),
        randomValueOtherThan(initialSettings.topN(), () -> randomFrom(new Integer[] { null, randomNonNegativeInt() }))
    );

Adjust the above based on whether we want to allow null values. The same comment applies to the other tests below. If we find we're reusing this we can create a helper function to do this for us across the various tests.
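
To make the suggestion concrete outside the test framework, here is a self-contained sketch. The local `randomValueOtherThan` is a simplified stand-in for the helper of the same name in the Elasticsearch test framework, and `TaskSettingsSketch`, `randomNullableBoolean`, `randomNullableNonNegativeInt`, and `differentFrom` are hypothetical names introduced only for this example.

```java
import java.util.Objects;
import java.util.Random;
import java.util.function.Supplier;

// Simplified stand-in for the test-framework helper; all names here are illustrative.
class RandomOtherThanSketch {
    static final Random RANDOM = new Random();

    // Keep drawing from the supplier until the value differs from `input` (null-safe).
    static <T> T randomValueOtherThan(T input, Supplier<T> supplier) {
        T candidate;
        do {
            candidate = supplier.get();
        } while (Objects.equals(candidate, input));
        return candidate;
    }

    // Returns null, true, or false.
    static Boolean randomNullableBoolean() {
        return RANDOM.nextBoolean() ? null : RANDOM.nextBoolean();
    }

    // Returns null or a non-negative int.
    static Integer randomNullableNonNegativeInt() {
        return RANDOM.nextBoolean() ? null : RANDOM.nextInt(Integer.MAX_VALUE);
    }

    // Builds settings in which every field is guaranteed to differ from `initial`.
    static TaskSettingsSketch differentFrom(TaskSettingsSketch initial) {
        return new TaskSettingsSketch(
            randomValueOtherThan(initial.returnDocuments(), RandomOtherThanSketch::randomNullableBoolean),
            randomValueOtherThan(initial.topN(), RandomOtherThanSketch::randomNullableNonNegativeInt)
        );
    }

    public static void main(String[] args) {
        TaskSettingsSketch initial = new TaskSettingsSketch(randomNullableBoolean(), randomNullableNonNegativeInt());
        System.out.println(initial + " -> " + differentFrom(initial));
    }
}

record TaskSettingsSketch(Boolean returnDocuments, Integer topN) {}
```

The helper still loops internally, but the test body no longer carries its own retry counter, and each field is redrawn independently rather than regenerating the whole object.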

dan-rubinstein

LGTM

elasticsearchmachine added the v9.1.0 and external-contributor labels Jun 23, 2025

Add Azure AI Rerank support

9ba481d

Evgenii-Kazannik force-pushed the Add-Azure-AI-Foundry-Rerank-support branch from 3c5f0eb to 9ba481d Compare June 25, 2025 09:02

Evgenii-Kazannik marked this pull request as ready for review June 25, 2025 09:16

elasticsearchmachine added the needs:triage and v9.2.0 labels and removed the v9.1.0 label Jun 25, 2025

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

c3809d7

# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

dan-rubinstein reviewed Jul 8, 2025

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

36a4304

# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

PeteGillinElastic added the :ml label and removed the needs:triage label Jul 11, 2025

elasticsearchmachine added the Team:ML label Jul 11, 2025

Evgenii-Kazannik added 2 commits July 14, 2025 23:43

address comments

af24b42

address comments

e6b861b

dan-rubinstein reviewed Jul 15, 2025

Evgenii-Kazannik added 3 commits July 16, 2025 09:16

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

14b967d

# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

refactor azure ai studio service

20148c0

update rerank task settings test

ce407dc

dan-rubinstein self-assigned this Jul 16, 2025

dan-rubinstein added the >enhancement label Jul 16, 2025

dan-rubinstein approved these changes Jul 16, 2025

Evgenii-Kazannik added 6 commits July 16, 2025 17:30

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

6b4e84a

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

78917b8

add provider for rerank

b00bf73

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

325179b

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

966092f

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

f4a5033

Merge branch 'main' into Add-Azure-AI-Foundry-Rerank-support

7a818df

# Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java

dan-rubinstein merged commit d06b0c8 into elastic:main Jul 17, 2025
35 checks passed

Add Azure AI Rerank support #129848
dan-rubinstein merged 15 commits into elastic:main from Evgenii-Kazannik:Add-Azure-AI-Foundry-Rerank-support
