Skip to content

Add Minimal Service Settings to the Model Registry#120560

Merged
jimczi merged 8 commits intoelastic:mainfrom
jimczi:minimal_service_settings_model_registry
Jan 27, 2025
Merged

Add Minimal Service Settings to the Model Registry#120560
jimczi merged 8 commits intoelastic:mainfrom
jimczi:minimal_service_settings_model_registry

Conversation

@jimczi
Copy link
Contributor

@jimczi jimczi commented Jan 21, 2025

This commit introduces minimal service settings in the model registry, accessible without querying the inference index. These settings are now available for the default models exposed by the inference service.

The ability to access settings without an inference index query is needed for the semantic text field, as it would benefit from eager validation of configuration during field creation. This is not feasible currently because retrieving service settings relies on an asynchronous call to the inference index.

Follow-Up Plans:

  1. Extend this capability to include minimal service settings for all newly added models, making them accessible via the cluster state.
  2. Update the semantic text field to eagerly retrieve service settings directly from the model registry.
This commit introduces minimal service settings in the model registry, accessible without querying the inference index.
These settings are now available for the default models exposed by the inference service.

The ability to access settings without an inference index query is needed for the semantic text field, as it would benefit from eager validation of configuration during field creation.
This is not feasible currently because retrieving service settings relies on an asynchronous call to the inference index.

### Follow-Up Plans:
1. Extend this capability to include minimal service settings for all newly added models, making them accessible via the cluster state.
2. Update the semantic text field to eagerly retrieve service settings directly from the model registry.
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jan 21, 2025
Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Left some minor comments mainly about using sometimes using minimalModelSettings instead of minimalServiceSettings

Comment on lines +204 to +208
* If the {@code inferenceEntityId} is not found, the method behaves as follows:
* <ul>
* <li>Returns {@code null} if the id might exist but its configuration is not available locally.</li>
* <li>Throws a {@link ResourceNotFoundException} if it is certain that the id does not exist in the cluster.</li>
* </ul>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment appears to be out of date as the function does not throw

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just adding a note here to allow for throwing an exception if we ever determine that the inference ID cannot exist.
This isn't possible with the current implementation, but it ensures that future API consumers (e.g., semantic text field validation) handle this case upfront.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could throw from this method today for internal inference IDs (i.e. those that start with .) that we can't find in defaultConfigIds

Copy link
Contributor

@Mikep86 Mikep86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except I would like to see test coverage for an invalid task type added before we merge

Comment on lines +204 to +208
* If the {@code inferenceEntityId} is not found, the method behaves as follows:
* <ul>
* <li>Returns {@code null} if the id might exist but its configuration is not available locally.</li>
* <li>Throws a {@link ResourceNotFoundException} if it is certain that the id does not exist in the cluster.</li>
* </ul>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could throw from this method today for internal inference IDs (i.e. those that start with .) that we can't find in defaultConfigIds

new MinimalServiceSettings(TaskType.TEXT_EMBEDDING, 10, SimilarityMeasure.COSINE, null);
});
assertThat(ex.getMessage(), containsString("required [element_type] field is missing"));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test for an invalid task_type? That got removed in the refactoring.

Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach!

A couple of high level comments:

  • Would a name like InferenceModelSettings or something be more descriptive? It's not quite clear to me from the name that MinimalServiceSettings applies to models.
  • As we add service settings in the future we may want to think about a class hierarchy here, but for now since text embedding models are the only models with settings attached to them, I think this is fine.

Thanks for adding this!

@jimczi jimczi merged commit d28e9ed into elastic:main Jan 27, 2025
16 checks passed
@jimczi jimczi deleted the minimal_service_settings_model_registry branch January 27, 2025 17:59
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Jan 27, 2025
This commit introduces minimal service settings in the model registry, accessible without querying the inference index.
These settings are now available for the default models exposed by the inference service.

The ability to access settings without an inference index query is needed for the semantic text field, as it would benefit from eager validation of configuration during field creation.
This is not feasible currently because retrieving service settings relies on an asynchronous call to the inference index.

### Follow-Up Plans:
1. Extend this capability to include minimal service settings for all newly added models, making them accessible via the cluster state.
2. Update the semantic text field to eagerly retrieve service settings directly from the model registry.
elasticsearchmachine pushed a commit that referenced this pull request Jan 28, 2025
This commit introduces minimal service settings in the model registry, accessible without querying the inference index.
These settings are now available for the default models exposed by the inference service.

The ability to access settings without an inference index query is needed for the semantic text field, as it would benefit from eager validation of configuration during field creation.
This is not feasible currently because retrieving service settings relies on an asynchronous call to the inference index.

### Follow-Up Plans:
1. Extend this capability to include minimal service settings for all newly added models, making them accessible via the cluster state.
2. Update the semantic text field to eagerly retrieve service settings directly from the model registry.

Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Jan 28, 2025
This commit integrates `MinimalServiceSettings` (introduced in elastic#120560) into the cluster state for all registered models in the `ModelRegistry`.
These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index.
If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Jan 28, 2025
This commit integrates `MinimalServiceSettings` (introduced in elastic#120560) into the cluster state for all registered models in the `ModelRegistry`.
These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index.
If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
jimczi added a commit that referenced this pull request Mar 18, 2025
This commit integrates `MinimalServiceSettings` (introduced in #120560) into the cluster state for all registered models in the `ModelRegistry`.
These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index.
If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
elasticsearchmachine pushed a commit that referenced this pull request Mar 20, 2025
* Add ModelRegistryMetadata to Cluster State (#121106)

This commit integrates `MinimalServiceSettings` (introduced in #120560) into the cluster state for all registered models in the `ModelRegistry`.
These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index.
If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.

* fix test compil

* fix serialisation

* Exclude Default Inference Endpoints from Cluster State Storage (#125242)

When retrieving a default inference endpoint for the first time, the system automatically creates the endpoint.
However, unlike the `put inference model` action, the `get` action does not redirect the request to the master node.

Since #121106, we rely on the assumption that every model creation (`put model`) must run on the master node, as it modifies the cluster state. However, this assumption led to a bug where the get action tries to store default inference endpoints from a different node.

This change resolves the issue by preventing default inference endpoints from being added to the cluster state. These endpoints are not strictly needed there, as they are already reported by inference services upon startup.

**Note:** This bug did not prevent the default endpoints from being used, but it caused repeated attempts to store them in the index, resulting in logging errors on every usage.
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
This commit integrates `MinimalServiceSettings` (introduced in elastic#120560) into the cluster state for all registered models in the `ModelRegistry`.
These settings allow consumers to access configuration details without requiring asynchronous calls to retrieve full model configurations.

To ensure consistency, the cluster state metadata must remain synchronized with the models in the inference index.
If a mismatch is detected during startup, the master node performs an upgrade to load all model settings from the index.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

5 participants