ES|QL: add support for include_execution_metadata parameter#134446
ES|QL: add support for include_execution_metadata parameter#134446luigidellaquila merged 14 commits intoelastic:mainfrom
Conversation
|
Hi @luigidellaquila, I've created a changelog YAML for you. |
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
| private boolean columnar; | ||
| private boolean profile; | ||
| private boolean includeCCSMetadata; | ||
| private Boolean includeCPSMetadata; |
There was a problem hiding this comment.
Is there a reason why we need both of these and can't just have include_cps_metadata option turn on includeCCSMetadata?
There was a problem hiding this comment.
The only practical reason is that I'm doing the validation at a later stage, in TransportEsqlQueryAction.java (just because I already have the cluster state there, and I can tell if we are in serverless), so at that point I need to know which of the flags was set.
I don't know if that's a strong enough reason, maybe we can try to inject the cluster state in RequestXContent.objectParserCommon() and do the validation at parse time, but it seems a bit inconvenient
There was a problem hiding this comment.
I see. I just wonder how important it is that include_cps_metadata would only work in CPS instead of just being a synonym for the other option and make the whole thing simple.
There was a problem hiding this comment.
I don't have a strong preference on this.
In S2D38 it was decided that it will be available only on Serverless, so I'm just sticking to this decision
There was a problem hiding this comment.
I just wonder how important it is that include_cps_metadata would only work in CPS instead of just being a synonym for the other option and make the whole thing simple.
As Luigi said, we don't want to expose it in stateful. And ideally I think we would put in the API spec that
include_ccs_metadatais only present in statefulinclude_cps_metadatais only present in serverless
so that clients/users generally use the correct parameter.
However, practically we would accept include_ccs_metadata in serverless if the user puts that (a hidden option), but we wouldn't accept include_cps_metadata in stateful. I think this is consistent with Luigi's current implementation.
There was a problem hiding this comment.
Thanks @quux00, I like the idea of having a more generic name that we can also apply to Stateful, but the concept of "execution" in ES|QL is pretty broad, and includes compute details (that today you can retrieve with profile=true, and I'm pretty sure we don't want in this context).
I wonder if include_execution_info could be confusing for this reason, and if we could find a more specific name. include_execution_metadata could be an option
There was a problem hiding this comment.
I agree with @luigidellaquila, execution sounds too broad and could mean a lot of things. It sounds like we are trying to find a common name for a remote project and remote cluster. I wonder if we can use include_remote_metadata? Possibly we need to come up with a name for operations happening across multiple clusters/projects without explicitly mentioning cluster/projects.
There was a problem hiding this comment.
@quux00 @luigidellaquila @idegtiarenko @naj-h
I like the idea of using a generic name, but I also agree that include_execution_info is a very, very broad name. In the context of ESQL, "execution" can mean a lot of things, and is far too broad about query execution in linked projects.
If we need a new name, include_execution_info is not my preference. Let's find something else.
I do not mind include_execution_metadata, even though that is also pretty generic.
include_project_metadata may be an option too.
QQ: do we still use _clusters in the response? This doesn't feel super compatible with project model in CPS.
There was a problem hiding this comment.
include_remote_metadata
Would be problematic because this section also includes the local cluster info and a potential future goal of this additional info is to include it even when it is a local only search, as I mentioned above, as not doing it is a violation of flat world principles in serverless.
include_project_metadata
Would be problematic as it has the same problem of include_cps_metadata, namely the "project" concept doesn't exist in stateful and we want a generic name that can work in both.
include_execution_metadata
This is fine with me.
QQ: do we still use _clusters in the response?
We asked this of Product and the decision was to leave it, since modifying it would result in different response bodies between serverless and stateful.
There was a problem hiding this comment.
I am OK with include_execution_metadata
|
|
||
| public Boolean includeCPSMetadata() { | ||
| return includeCPSMetadata; | ||
| } |
There was a problem hiding this comment.
Should we add a validation check either in this class or in the RestEsqlQueryAction that the user hasn't set both include_ccs_metadata and include_cps_metadata? It's a minor edge case, but we probably don't want users in serverless to be able to set them both and potentially to different values, so we should throw an exception if both are set (in serverless) with an error message like "Both include_cps_metadata and include_ccs_metadata query parameters are set. Use only include_cps_metadata."
There was a problem hiding this comment.
It reduces chances of user errors, so ++, let me add it
There was a problem hiding this comment.
++ both should not be allowed at the same time
|
Let's also do an API spec ticket for this in https://github.com/elastic/elasticsearch-specification I think we should docuement there that:
so that clients/users generally use the correct parameter. |
| * the functionality to do it the right way is not yet ready -- replace this code when it's ready. | ||
| */ | ||
| this.canUseSkipUnavailable = settings.getAsBoolean("serverless.cross_project.enabled", false) == false; | ||
| this.crossProjectEnabled = settings.getAsBoolean("serverless.cross_project.enabled", false); |
There was a problem hiding this comment.
I'm no longer using this for this PR, but I'll need it for include_execution_metadata, so I'm keeping it here.
| return CollectionUtils.appendToCopy(super.nodePlugins(clusterAlias), CpsPlugin.class); | ||
| } | ||
|
|
||
| protected EsqlQueryResponse runQuery(String query, Boolean cpsMetadataInResponse) { |
There was a problem hiding this comment.
NIT:
| protected EsqlQueryResponse runQuery(String query, Boolean cpsMetadataInResponse) { | |
| protected EsqlQueryResponse runQuery(String query, Boolean includeCpsMetadata) { |
| } | ||
| } | ||
|
|
||
| private static final Setting<String> CpsEnableSetting = Setting.simpleString( |
There was a problem hiding this comment.
NIT: should it be named CPS_ENABLED_SETTING? I am a bit surprised spotless did not complain about it.
There was a problem hiding this comment.
I borrowed that code from other tests, but I agree with you that CPS_ENABLED_SETTING would be better
There was a problem hiding this comment.
As we discussed off-line, I'm moving this test to Serverless
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
Add support for parameter in ES|QL request.
Add support for
include_cps_metadatainclude_execution_metadataparameter in ES|QL request.It will be supported both in Stateful and in Serverless, as a synonym for
include_ccs_metadata(that will remain for bwc)