[ML] Integrate OpenAi Chat Completion in SageMaker #127767
prwhelan merged 10 commits into elastic:main
Conversation
SageMaker now supports Completion and Chat Completion using the OpenAI interfaces. Additionally:
- Fixed a bug related to timeouts being nullable; default to a 30s timeout
- Exposed the existing OpenAI request/response parsing logic for reuse
Hi @prwhelan, I've created a changelog YAML for you.
Pinging @elastic/ml-core (Team:ML) |
jonathan-buttner left a comment:
Left a couple of questions.
} catch (IOException e) {
    throw new ElasticsearchStatusException(
        "Failed to parse event from inference provider: {}",
        RestStatus.INTERNAL_SERVER_ERROR,
We've talked about switching to 502s; do you think that'd be appropriate here?
I don't think so? This IOException is an error with our parsing logic, which may or may not mean there is something wrong with their response; it could be that we're out of date.
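To make the distinction being drawn here concrete, a minimal self-contained sketch (these names are illustrative, not the actual Elasticsearch code): 502 Bad Gateway blames the upstream provider's response, while 500 Internal Server Error owns the failure as ours, e.g. parsing logic that has drifted out of date.

```java
// Hypothetical sketch of the status-code decision discussed above; the
// enum and class names are made up and not part of the Elasticsearch codebase.
enum FailureOrigin { UPSTREAM_RESPONSE, LOCAL_PARSER }

final class StatusMapping {
    // 502 Bad Gateway: the upstream inference provider misbehaved.
    // 500 Internal Server Error: our own parsing logic failed.
    static int statusFor(FailureOrigin origin) {
        return origin == FailureOrigin.UPSTREAM_RESPONSE ? 502 : 500;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(FailureOrigin.LOCAL_PARSER)); // prints 500
    }
}
```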
  Map<String, Object> taskSettings,
  InputType inputType,
- TimeValue timeout,
+ @Nullable TimeValue timeout,
Hmm, I thought the timeout is defaulted in the InferenceAction. Can it be null here?
Yeah, I believe I was hitting an issue when I was using curl; I think it can be null through this path:
We should be defaulting it there too I think:
static TimeValue parseTimeout(RestRequest restRequest) {
    return restRequest.paramAsTime(InferenceAction.Request.TIMEOUT.getPreferredName(), InferenceAction.Request.DEFAULT_TIMEOUT);
}
I think we should consider it a bug if it's null once it gets to the infer() calls. We should make sure it's defaulted prior to those calls.
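A minimal sketch of that "default it before infer()" idea, with java.time.Duration standing in for TimeValue and a made-up helper name, so the example has no Elasticsearch dependencies:

```java
import java.time.Duration;

// Sketch only: Duration stands in for TimeValue, and the helper name is
// hypothetical. The point is to normalize a nullable timeout once, at the
// REST layer, so a null reaching infer() can be treated as a bug.
final class TimeoutDefaults {
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(30);

    static Duration orDefault(Duration timeout) {
        return timeout == null ? DEFAULT_TIMEOUT : timeout;
    }

    public static void main(String[] args) {
        System.out.println(orDefault(null));                  // prints PT30S
        System.out.println(orDefault(Duration.ofSeconds(5))); // prints PT5S
    }
}
```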
@Override
public TransportVersion getMinimalSupportedVersion() {
    return TransportVersions.ML_INFERENCE_SAGEMAKER;
I think we need to create a new transport version, right?
I don't think so, but I did anyway. In theory, since the name and parsing logic hadn't changed, both node versions should be able to parse the input/output. But in practice, I couldn't create a multi-node cluster with the same version (9.1.0) and different docker hashes, so I have no way to verify this assumption.
        }
    ]
}
""".replaceAll("\\s+", "").replaceAll("\\n+", "") + "\n\n";
Would XContentHelper.stripWhitespace() work here?
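One difference worth keeping in mind (a self-contained sketch, no ES dependencies): a blanket `replaceAll("\\s+", "")` also eats whitespace inside JSON string values, whereas `XContentHelper.stripWhitespace`, if I remember its behavior correctly, re-serializes the document and only drops whitespace between tokens. The two are therefore only interchangeable when no string value contains whitespace.

```java
// Self-contained illustration of the caveat: blanket regex stripping also
// removes whitespace embedded inside JSON string values.
final class WhitespaceDemo {
    public static void main(String[] args) {
        String json = "{ \"message\": \"hello world\" }";

        // Regex stripping corrupts the embedded string value:
        System.out.println(json.replaceAll("\\s+", ""));
        // prints {"message":"helloworld"}
    }
}
```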