Add Minimal Service Settings to the Model Registry by jimczi · Pull Request #120560 · elastic/elasticsearch

jimczi · 2025-01-21T21:25:21Z

This commit introduces minimal service settings in the model registry, accessible without querying the inference index. These settings are now available for the default models exposed by the inference service.

The ability to access settings without an inference index query is needed for the semantic text field, as it would benefit from eager validation of configuration during field creation. This is not feasible currently because retrieving service settings relies on an asynchronous call to the inference index.

Follow-Up Plans:

1. Extend this capability to include minimal service settings for all newly added models, making them accessible via the cluster state.
2. Update the semantic text field to eagerly retrieve service settings directly from the model registry.
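
To illustrate the idea of settings that can be read without an asynchronous index query, here is a minimal sketch. It is not the PR's implementation: `SketchModelRegistry`, `SketchMinimalSettings`, `SketchTaskType`, and the endpoint id used in the test are hypothetical names chosen for illustration, and the real classes may carry different fields.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins; these are assumptions, not the PR's actual classes.
enum SketchTaskType { TEXT_EMBEDDING, SPARSE_EMBEDDING }

record SketchMinimalSettings(SketchTaskType taskType, Integer dimensions) {}

final class SketchModelRegistry {
    // Settings of the default endpoints, held in memory once services report them.
    private final Map<String, SketchMinimalSettings> defaults;

    SketchModelRegistry(Map<String, SketchMinimalSettings> defaults) {
        this.defaults = Map.copyOf(defaults);
    }

    // Synchronous lookup: no index query is involved, so a caller such as
    // field creation can validate its configuration eagerly.
    Optional<SketchMinimalSettings> minimalSettings(String inferenceId) {
        return Optional.ofNullable(defaults.get(inferenceId));
    }
}
```

A caller that gets an empty result would still have to fall back to the asynchronous path, which is what the first follow-up item aims to remove.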

elasticsearchmachine · 2025-01-21T21:25:45Z

Pinging @elastic/ml-core (Team:ML)

davidkyle

LGTM

Left some minor comments, mainly about sometimes using minimalModelSettings instead of minimalServiceSettings.

...n/inference/src/test/java/org/elasticsearch/xpack/inference/registry/ModelRegistryTests.java

...earch/xpack/inference/services/elasticsearch/MultilingualE5SmallInternalServiceSettings.java

davidkyle · 2025-01-22T09:17:24Z

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/registry/ModelRegistry.java

+     * If the {@code inferenceEntityId} is not found, the method behaves as follows:
+     * <ul>
+     *   <li>Returns {@code null} if the id might exist but its configuration is not available locally.</li>
+     *   <li>Throws a {@link ResourceNotFoundException} if it is certain that the id does not exist in the cluster.</li>
+     * </ul>


This comment appears to be out of date, as the function does not throw.

It's just adding a note here to allow for throwing an exception if we ever determine that the inference ID cannot exist.
This isn't possible with the current implementation, but it ensures that future API consumers (e.g., semantic text field validation) handle this case upfront.

We could throw from this method today for internal inference IDs (i.e. those that start with .) that we can't find in defaultConfigIds
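
The contract discussed in this thread can be sketched as follows. This is an illustration only: `SketchLookup` and `SketchNotFoundException` are hypothetical stand-ins (the latter for `ResourceNotFoundException`), settings are reduced to plain strings, and the throwing branch implements the suggestion about internal ids rather than the behaviour of the merged code.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for ResourceNotFoundException.
class SketchNotFoundException extends RuntimeException {
    SketchNotFoundException(String message) {
        super(message);
    }
}

final class SketchLookup {
    private final Map<String, String> localSettings; // id -> settings available locally
    private final Set<String> defaultConfigIds;      // ids of known default endpoints

    SketchLookup(Map<String, String> localSettings, Set<String> defaultConfigIds) {
        this.localSettings = Map.copyOf(localSettings);
        this.defaultConfigIds = Set.copyOf(defaultConfigIds);
    }

    // Returns the settings if known locally, null if the id might exist elsewhere,
    // and throws only when the id certainly cannot exist.
    String getMinimalSettings(String inferenceEntityId) {
        String settings = localSettings.get(inferenceEntityId);
        if (settings != null) {
            return settings;
        }
        // Assumption from the review suggestion: ids starting with "." are internal,
        // so an internal id missing from defaultConfigIds cannot exist.
        if (inferenceEntityId.startsWith(".") && defaultConfigIds.contains(inferenceEntityId) == false) {
            throw new SketchNotFoundException("inference endpoint [" + inferenceEntityId + "] not found");
        }
        return null;
    }
}
```

Callers therefore need to handle three outcomes: a value, `null`, and an exception.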

Mikep86

LGTM, except I would like to see test coverage for an invalid task type added before we merge

Mikep86 · 2025-01-22T17:26:19Z

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/registry/ModelRegistry.java

+     * If the {@code inferenceEntityId} is not found, the method behaves as follows:
+     * <ul>
+     *   <li>Returns {@code null} if the id might exist but its configuration is not available locally.</li>
+     *   <li>Throws a {@link ResourceNotFoundException} if it is certain that the id does not exist in the cluster.</li>
+     * </ul>


We could throw from this method today for internal inference IDs (i.e. those that start with .) that we can't find in defaultConfigIds

Mikep86 · 2025-01-22T17:34:01Z

...inference/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldTests.java

+            new MinimalServiceSettings(TaskType.TEXT_EMBEDDING, 10, SimilarityMeasure.COSINE, null);
        });
        assertThat(ex.getMessage(), containsString("required [element_type] field is missing"));
    }


Can we add a test for an invalid task_type? That got removed in the refactoring.
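
For context, constructor-level validation of the kind this test exercises might look like the sketch below. Only the `required [element_type] field is missing` message and the argument order (task type, dimensions, similarity, element type) come from the snippet above. The record name, the set of task types, the rule that other task types are rejected, and the remaining messages are assumptions made for illustration.

```java
// Hypothetical task types; the real TaskType enum has its own members.
enum SketchTask { TEXT_EMBEDDING, SPARSE_EMBEDDING, COMPLETION }

// Hypothetical stand-in for MinimalServiceSettings, with enums reduced to strings.
record SketchSettings(SketchTask taskType, Integer dimensions, String similarity, String elementType) {

    SketchSettings {
        if (taskType == null) {
            throw new IllegalArgumentException("required [task_type] field is missing");
        }
        switch (taskType) {
            case TEXT_EMBEDDING -> {
                // Dense embeddings need all three vector properties.
                require(dimensions, "dimensions");
                require(similarity, "similarity");
                require(elementType, "element_type");
            }
            case SPARSE_EMBEDDING -> {
                // Assumed: sparse embeddings carry no vector properties to check.
            }
            default -> throw new IllegalArgumentException("invalid [task_type]: " + taskType);
        }
    }

    private static void require(Object value, String field) {
        if (value == null) {
            throw new IllegalArgumentException("required [" + field + "] field is missing");
        }
    }
}
```

A test for an invalid task type would then construct the settings with an unsupported value and assert on the exception message, in the same style as the `element_type` check.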

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/registry/ModelRegistry.java

kderusso

I like this approach!

A couple of high-level comments:

* Would a name like InferenceModelSettings or something be more descriptive? It's not quite clear to me from the name that MinimalServiceSettings applies to models.
* As we add service settings in the future we may want to think about a class hierarchy here, but for now, since text embedding models are the only models with settings attached to them, I think this is fine.

Thanks for adding this!

This commit integrates `MinimalServiceSettings` (introduced in elastic#120560) into the cluster state for all registered models in the `ModelRegistry`. These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations. To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index. If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
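
The synchronization step described above can be pictured with a small sketch. The source does not say how a mismatch is detected, so comparing the sets of ids is an assumption, as are the `SketchMetadataSync` name and the use of plain strings for settings.

```java
import java.util.HashMap;
import java.util.Map;

final class SketchMetadataSync {
    // Returns the metadata to keep in the cluster state: the current metadata when
    // it already covers the same ids as the index, otherwise a copy rebuilt from
    // the index (the "upgrade" performed by the master node).
    static Map<String, String> reconcile(Map<String, String> clusterStateSettings, Map<String, String> indexSettings) {
        if (clusterStateSettings.keySet().equals(indexSettings.keySet())) {
            return clusterStateSettings;
        }
        return new HashMap<>(indexSettings);
    }
}
```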

* Add ModelRegistryMetadata to Cluster State (#121106)

  This commit integrates `MinimalServiceSettings` (introduced in #120560) into the cluster state for all registered models in the `ModelRegistry`. These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations. To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index. If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.

* fix test compil
* fix serialisation
* Exclude Default Inference Endpoints from Cluster State Storage (#125242)

  When retrieving a default inference endpoint for the first time, the system automatically creates the endpoint. However, unlike the `put inference model` action, the `get` action does not redirect the request to the master node. Since #121106, we rely on the assumption that every model creation (`put model`) must run on the master node, as it modifies the cluster state. However, this assumption led to a bug where the get action tries to store default inference endpoints from a different node.

  This change resolves the issue by preventing default inference endpoints from being added to the cluster state. These endpoints are not strictly needed there, as they are already reported by inference services upon startup.

  **Note:** This bug did not prevent the default endpoints from being used, but it caused repeated attempts to store them in the index, resulting in logging errors on every usage.
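
The fix described in #125242 amounts to skipping the cluster state update for default endpoints. A minimal sketch of that rule, assuming a hypothetical `SketchEndpointStore` in which the index and the cluster state are modelled as in-memory maps and default endpoints are still written to the index:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class SketchEndpointStore {
    private final Set<String> defaultIds;                           // ids reported by services at startup
    private final Map<String, String> index = new HashMap<>();      // stands in for the inference index
    private final Map<String, String> clusterState = new HashMap<>(); // stands in for cluster state metadata

    SketchEndpointStore(Set<String> defaultIds) {
        this.defaultIds = Set.copyOf(defaultIds);
    }

    void store(String id, String settings) {
        index.put(id, settings);
        // Default endpoints are already known from the services, so they are
        // not duplicated into the cluster state.
        if (defaultIds.contains(id) == false) {
            clusterState.put(id, settings);
        }
    }

    boolean inIndex(String id) {
        return index.containsKey(id);
    }

    boolean inClusterState(String id) {
        return clusterState.containsKey(id);
    }
}
```

Because storing a default endpoint no longer touches the cluster state in this sketch, it does not matter which node handles the `get` request that triggers the creation.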

jimczi added >non-issue :ml Machine learning v9.0.0 v8.18.0 labels Jan 21, 2025

jimczi requested review from Mikep86 and davidkyle January 21, 2025 21:25

elasticsearchmachine added the Team:ML Meta label for the ML team label Jan 21, 2025

[CI] Auto commit changes from spotless

1683081

davidkyle approved these changes Jan 22, 2025

jimczi added 3 commits January 22, 2025 10:50

address review comments

f8c393a

Merge branch 'main' into minimal_service_settings_model_registry

9a81434

Merge branch 'main' into minimal_service_settings_model_registry

eab5ba4

Mikep86 approved these changes Jan 22, 2025

kderusso approved these changes Jan 23, 2025

jimczi added 3 commits January 27, 2025 10:10

Merge remote-tracking branch 'upstream/main' into minimal_service_settings_model_registry

d2bcad2

add test for task type

78d505a

Merge branch 'main' into minimal_service_settings_model_registry

99b34cd

jimczi merged commit d28e9ed into elastic:main Jan 27, 2025
16 checks passed

jimczi deleted the minimal_service_settings_model_registry branch January 27, 2025 17:59

jimczi added the backport pending label Jan 27, 2025

jimczi mentioned this pull request Jan 27, 2025

[8.x] Add Minimal Service Settings to the Model Registry #120946

Merged

jimczi mentioned this pull request Jan 28, 2025

Add ModelRegistryMetadata to Cluster State #121106

Merged

jimczi merged 8 commits into elastic:main from jimczi:minimal_service_settings_model_registry

5 participants