[ML] Integrate OpenAi Chat Completion in SageMaker #127767
prwhelan merged 10 commits into elastic:main
Conversation
SageMaker now supports Completion and Chat Completion using the OpenAI interfaces. Additionally:
- Fixed a bug related to timeouts being nullable; default to a 30s timeout
- Exposed the existing OpenAI request/response parsing logic for reuse
Hi @prwhelan, I've created a changelog YAML for you.
Pinging @elastic/ml-core (Team:ML) |
jonathan-buttner left a comment:
Left a couple of questions.
} catch (IOException e) {
    throw new ElasticsearchStatusException(
        "Failed to parse event from inference provider: {}",
        RestStatus.INTERNAL_SERVER_ERROR,
We've talked about switching to 502s; do you think that'd be appropriate here?
I don't think so? This IOException is an error with our parsing logic, which may or may not mean there is something wrong with their response; it could be that we're out of date.
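To make the distinction being drawn here concrete, a minimal self-contained sketch (these names are illustrative, not the actual Elasticsearch code): 502 Bad Gateway blames the upstream provider's response, while 500 Internal Server Error owns the failure as ours, e.g. parsing logic that has drifted out of date.

```java
// Hypothetical sketch of the status-code decision discussed above; the
// enum and class names are made up and not part of the Elasticsearch codebase.
enum FailureOrigin { UPSTREAM_RESPONSE, LOCAL_PARSER }

final class StatusMapping {
    // 502 Bad Gateway: the upstream inference provider misbehaved.
    // 500 Internal Server Error: our own parsing logic failed.
    static int statusFor(FailureOrigin origin) {
        return origin == FailureOrigin.UPSTREAM_RESPONSE ? 502 : 500;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(FailureOrigin.LOCAL_PARSER)); // prints 500
    }
}
```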
  Map<String, Object> taskSettings,
  InputType inputType,
- TimeValue timeout,
+ @Nullable TimeValue timeout,
Hmm, I thought the timeout is defaulted in the InferenceAction. Can it be null here?
Yeah, I believe I was hitting an issue when I was using curl; I think it can be null through this path:
We should be defaulting it there too I think:
static TimeValue parseTimeout(RestRequest restRequest) {
    return restRequest.paramAsTime(InferenceAction.Request.TIMEOUT.getPreferredName(), InferenceAction.Request.DEFAULT_TIMEOUT);
}
I think we should consider it a bug if it's null once it gets to the infer() calls. We should make sure it's defaulted prior to those calls.
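A minimal sketch of that "default it before infer()" idea, with java.time.Duration standing in for TimeValue and a made-up helper name, so the example has no Elasticsearch dependencies:

```java
import java.time.Duration;

// Sketch only: Duration stands in for TimeValue, and the helper name is
// hypothetical. The point is to normalize a nullable timeout once, at the
// REST layer, so a null reaching infer() can be treated as a bug.
final class TimeoutDefaults {
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(30);

    static Duration orDefault(Duration timeout) {
        return timeout == null ? DEFAULT_TIMEOUT : timeout;
    }

    public static void main(String[] args) {
        System.out.println(orDefault(null));                  // prints PT30S
        System.out.println(orDefault(Duration.ofSeconds(5))); // prints PT5S
    }
}
```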
@Override
public TransportVersion getMinimalSupportedVersion() {
    return TransportVersions.ML_INFERENCE_SAGEMAKER;
I think we need to create a new transport version, right?
I don't think so, but I did anyway. In theory, since the name and parsing logic hadn't changed, both node versions should be able to parse the input/output. But in practice, I couldn't create a multi-node cluster with the same version (9.1.0) and different docker hashes, so I have no way to verify this assumption.
        }
    ]
}
""".replaceAll("\\s+", "").replaceAll("\\n+", "") + "\n\n";
Would XContentHelper.stripWhitespace() work here?
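One difference worth keeping in mind (a self-contained sketch, no ES dependencies): a blanket `replaceAll("\\s+", "")` also eats whitespace inside JSON string values, whereas `XContentHelper.stripWhitespace`, if I remember its behavior correctly, re-serializes the document and only drops whitespace between tokens. The two are therefore only interchangeable when no string value contains whitespace.

```java
// Self-contained illustration of the caveat: blanket regex stripping also
// removes whitespace embedded inside JSON string values.
final class WhitespaceDemo {
    public static void main(String[] args) {
        String json = "{ \"message\": \"hello world\" }";

        // Regex stripping corrupts the embedded string value:
        System.out.println(json.replaceAll("\\s+", ""));
        // prints {"message":"helloworld"}
    }
}
```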