[ML] Integrate SageMaker with OpenAI Embeddings (#126856)#127610

Merged
elasticsearchmachine merged 1 commit into elastic:8.19 from prwhelan:backport/8.19/126856
May 1, 2025
Conversation


@prwhelan prwhelan commented May 1, 2025

Integrating with SageMaker.

Current design:
- SageMaker accepts any byte payload, which can be text, CSV, or JSON. `api` represents the structure of the payload that we will send: for example `openai`, `elastic`, or `common`, and probably `cohere` or `huggingface` as well.
- `api` implementations are extensions of `SageMakerSchemaPayload`, which supports:
  - "extra" service and task settings specific to the payload structure, so `cohere` would require `embedding_type` and `openai` would require `dimensions` in the `service_settings`
  - conversion logic from model, service settings, task settings, and input to `SdkBytes`
  - conversion logic from responding `SdkBytes` to `InferenceServiceResults`
- Everything else is tunneling: there are a number of base `service_settings` and `task_settings`, independent of the `api` format, that we will store and set.
- We let the SDK do the bulk of the work in terms of connection details, rate limiting, retries, etc.
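The payload abstraction above can be sketched roughly as follows. This is an illustrative sketch only, not the actual Elasticsearch classes: names, signatures, and the use of `byte[]` in place of the AWS SDK's `SdkBytes` (and `List<float[]>` in place of `InferenceServiceResults`) are assumptions made to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch of the SageMakerSchemaPayload idea described above.
// The real implementation converts to/from the AWS SDK's SdkBytes and
// returns InferenceServiceResults; both are simplified here.
interface SageMakerSchemaPayload {
    // Which payload structure this implements, e.g. "openai" or "elastic".
    String api();

    // Build the request body the SageMaker endpoint expects.
    // (Model, service settings, and task settings are omitted for brevity.)
    byte[] requestBytes(List<String> input);

    // Parse the endpoint's response bytes back into embeddings.
    List<float[]> parseResponse(byte[] response);
}

// An OpenAI-style payload: wraps the inputs in the JSON body an
// OpenAI-compatible SageMaker endpoint would expect.
class OpenAiPayload implements SageMakerSchemaPayload {
    @Override
    public String api() {
        return "openai";
    }

    @Override
    public byte[] requestBytes(List<String> input) {
        StringBuilder sb = new StringBuilder("{\"input\":[");
        for (int i = 0; i < input.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append('"').append(input.get(i)).append('"');
        }
        sb.append("]}");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public List<float[]> parseResponse(byte[] response) {
        // The real code would parse the endpoint's JSON response; elided here.
        return List.of();
    }
}
```

Under this shape, adding a new `api` format is a matter of implementing one more payload class; everything around it (connection handling, retries, rate limiting) stays in the SDK layer.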
@prwhelan prwhelan added the >enhancement, :ml (Machine learning), backport, Team:ML (meta label for the ML team), auto-merge-without-approval (automatically merge pull request when CI checks pass; NB doesn't wait for reviews!), and v8.19.0 labels on May 1, 2025
@elasticsearchmachine elasticsearchmachine merged commit 577a6f8 into elastic:8.19 May 1, 2025
15 checks passed
@prwhelan prwhelan deleted the backport/8.19/126856 branch May 1, 2025 18:37
