[Fleet] Add integration knowledge opt out UI setting and enable feature flag #245080

Merged
juliaElastic merged 63 commits into elastic:main from juliaElastic:integration-knowledge-ff
Dec 12, 2025
Conversation

@juliaElastic juliaElastic commented Dec 3, 2025

Summary

Closes https://github.com/elastic/ingest-dev/issues/6276

  • Enable feature flag installIntegrationsKnowledge by default
  • Added setting integration_knowledge_enabled to ingest_manager_settings SO to save the UI setting
  • When the integration knowledge user setting is turned off, new package installations will skip the knowledge base indexing step
  • When the integration knowledge user setting is turned on again, there is a one-time async task scheduled that reindexes knowledge base for installed packages
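
The opt-out behavior described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the Fleet implementation: `FleetSettings`, `isKnowledgeEnabled`, and `onSettingChange` are hypothetical names, and only the `integration_knowledge_enabled` field name comes from this PR. It assumes an unset value means "enabled", which matches the switch being on by default.

```typescript
// Hypothetical shape of the stored settings; only the field name is from the PR.
interface FleetSettings {
  integration_knowledge_enabled?: boolean;
}

// Assumption: an unset value is treated as enabled (opt-out semantics).
function isKnowledgeEnabled(settings: FleetSettings): boolean {
  return settings.integration_knowledge_enabled !== false;
}

type SettingChangeAction = 'schedule_reindex' | 'none';

// A reindex is only needed when the setting goes from off to on.
function onSettingChange(previous: FleetSettings, next: FleetSettings): SettingChangeAction {
  const wasEnabled = isKnowledgeEnabled(previous);
  const isEnabled = isKnowledgeEnabled(next);
  return !wasEnabled && isEnabled ? 'schedule_reindex' : 'none';
}
```

Under these assumptions, turning the switch off never schedules work, and saving the flyout without changing the switch is a no-op.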

The Manage integration knowledge action is added to Installed integrations table.

image

Clicking the action opens a flyout with information about the knowledge base and a switch to opt out.
image

UX design follows Figma

To verify:

  • Open the flyout and verify that the switch is on by default
  • Turn the switch off and click Save on the flyout
  • Install a new package and verify that the knowledge_base ES asset is not created
  • Go back to the flyout and switch the setting on again
  • Verify that the previously installed package now has an ES asset of type knowledge_base
GET .kibana_ingest/_doc/epm-packages:apache

"installed_es": [
...
        {
          "id": "apache-README.md",
          "type": "knowledge_base"
        }
      ],

Kibana logs


[2025-12-03T13:30:34.253+01:00][INFO ][plugins.fleet] Scheduling task to reindex integration knowledge for installed packages

[2025-12-03T13:30:36.584+01:00][DEBUG][plugins.fleet] Successfully indexed 1 knowledge base documents for package apache. Document IDs: apache-README.md
[2025-12-03T13:30:36.585+01:00][DEBUG][plugins.fleet] Knowledge base step: Saved 1 documents to index for package apache@2.1.1
[2025-12-03T13:30:36.587+01:00][DEBUG][plugins.fleet] Successfully indexed 1 knowledge base documents for package system. Document IDs: system-README.md
[2025-12-03T13:30:36.587+01:00][DEBUG][plugins.fleet] Knowledge base step: Saved 1 documents to index for package system@2.8.0

Added tour component and learn more link:

image image

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines
  • Review the backport guidelines and apply applicable backport:* labels.

Identify risks

Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

@juliaElastic added the release_note:feature, backport:version, and v9.3.0 labels Dec 3, 2025
}
];

if (selectedItems.length > 0) {
Contributor Author

Previously, the Actions button was disabled if no integrations were selected.
Now the new Manage integration knowledge action is available even when nothing is selected.

await pMap(
installedPackages.saved_objects,
async ({ attributes: installation }) => {
// TODO archiveIterator is different if `install_source !== 'registry'`
Contributor Author

There is a different way to create archiveIterator for packages depending on install source, which makes this logic a little complex.
Alternatively we could call reinstallPackageForInstallation which would call all package install steps.

Or skip the reindex if the package is not installed from registry.

Contributor Author

Solved this for bundled packages and skipping other types (uploaded, custom).

This is similar to how the reinstall logic works: https://github.com/elastic/kibana/blob/main/x-pack/platform/plugins/shared/fleet/server/services/epm/packages/reinstall.ts#L29
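
As a rough sketch of the resolution described above (reindex registry and bundled packages, skip uploaded and custom ones), the selection could look like this. The `InstallSource` values are the ones named in this thread and are treated as an assumed closed set; `InstallationSummary`, `canReindexKnowledge`, and `partitionForReindex` are hypothetical names, not Fleet code.

```typescript
// Install sources named in this thread; treated here as an assumed closed set.
type InstallSource = 'registry' | 'bundled' | 'upload' | 'custom';

// Hypothetical, trimmed-down view of an installation record.
interface InstallationSummary {
  name: string;
  version: string;
  install_source: InstallSource;
}

// Registry and bundled packages are reindexed; uploaded and custom ones are skipped.
function canReindexKnowledge(installation: InstallationSummary): boolean {
  return installation.install_source === 'registry' || installation.install_source === 'bundled';
}

// Split installations into those to reindex and those to skip.
function partitionForReindex(installations: InstallationSummary[]) {
  const reindex: InstallationSummary[] = [];
  const skipped: InstallationSummary[] = [];
  for (const installation of installations) {
    (canReindexKnowledge(installation) ? reindex : skipped).push(installation);
  }
  return { reindex, skipped };
}
```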

…atus --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/streams --include-path /api/fleet --include-path /api/saved_objects/_import --include-path /api/saved_objects/_export --include-path /api/maintenance_window --include-path /api/agent_builder --update
installation.version,
{ useStreaming: true }
);
await indexKnowledgeBase(
Contributor Author

@juliaElastic juliaElastic Dec 3, 2025

Is there a way to determine if an installed package already has the latest knowledge base assets created?
If so, we could skip the reindexing for those packages.
@Supplementing You might know this?

We could check the existence of any knowledge_base type assets, but it's not guaranteed it belongs to the latest package version.

{
    "id": "apache-README.md",
    "type": "knowledge_base"
  }
Contributor

Yes, there's an internal endpoint that returns the indexed KB docs for the package, along with the current version for each asset.

GET /internal/fleet/epm/packages/{pkgName}/knowledge_base

Contributor Author

Thanks, I could use this to check if knowledge base items exist for the currently installed package version, and if so, skip the reindexing.

[2025-12-03T16:28:40.479+01:00][DEBUG][plugins.fleet] Skipping reindexing knowledge base for package apm@8.16.1 - already indexed
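
The skip check described in this reply might look something like the sketch below. The response shape of the internal endpoint is not shown in this thread, so `KnowledgeBaseItem` and its fields are assumptions, and `isAlreadyIndexed` is a hypothetical helper rather than the function added in the PR.

```typescript
// Hypothetical item shape; the real endpoint response is not shown in this thread.
interface KnowledgeBaseItem {
  fileName: string;
  version: string;
}

// True when at least one indexed item belongs to the currently installed version,
// in which case reindexing that package can be skipped.
function isAlreadyIndexed(installedVersion: string, items: KnowledgeBaseItem[]): boolean {
  return items.some((item) => item.version === installedVersion);
}
```

Using `some` rather than `every` is a design choice in this sketch: it follows the wording "knowledge base items exist for the currently installed package version" and tolerates stale items left over from an older version.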
</EuiButtonEmpty>
</EuiFlexItem>
<EuiFlexItem grow={false}>
<EuiButton
Contributor Author

Currently the setting is changed only when the Save button is clicked.
I'm wondering if it would be simpler to change the setting when the switch is clicked, and get rid of the Save button.
cc @sileschristian

> Currently the setting is changed only when the Save button is clicked. I'm wondering if it would be simpler to change the setting when the switch is clicked, and get rid of the Save button. cc @sileschristian

Makes sense, Julia. A question about the details/explanations: is someone working on what we should explain and how? FYI, don't treat my design copy as the text we should place there; it was only meant to indicate that we need to add explanations.

Contributor Author

@nimarezainia @kpollich Could you review the copy in the flyout and let me know if we should change anything?

Member

cc @vishaangelova @karenzone as well for copy review

One thing we should clarify: EIS is only going to be used for cloud/serverless deployments. Self-hosted deployments will generate embeddings on a local ML node (ES will nominate one if it doesn't exist). So there needs to be some conditional language here.

Maybe an easy way to approach this would be to change the language slightly and only show the "cost" section for ECH/serverless deployments, e.g.

## How it works?

Integration documentation and metadata are processed by Elasticsearch to provide context about your installed integrations to [agent builder](https://www.elastic.co/docs/solutions/search/elastic-agent-builder) and [AI assistant](https://www.elastic.co/docs/explore-analyze/ai-features/ai-assistant).

## What gets indexed

- Integration documentation
- Configuration metadata
- Field definitions

{ if isECH || isServerless }
## Cost

Indexing uses Elastic Inference Service and incurs minimal per token charges.
{ /if }

☑️ Use integration knowledge in all installed integrations
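
The conditional language proposed above could be modeled as a small pure function that decides which flyout sections to render. This is only a sketch under assumed names: `DeploymentContext`, `FlyoutSection`, and `visibleSections` are hypothetical, and how the cloud and serverless flags are obtained is left out.

```typescript
// Hypothetical flags describing the deployment type.
interface DeploymentContext {
  isCloud: boolean;
  isServerless: boolean;
}

type FlyoutSection = 'how_it_works' | 'what_gets_indexed' | 'cost';

// The cost section only applies where Elastic Inference Service is used,
// that is, cloud (ECH) and serverless deployments.
function visibleSections(ctx: DeploymentContext): FlyoutSection[] {
  const sections: FlyoutSection[] = ['how_it_works', 'what_gets_indexed'];
  if (ctx.isCloud || ctx.isServerless) {
    sections.push('cost');
  }
  return sections;
}
```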
Contributor Author

Updated the copy and made it conditional.

In cloud/serverless:
image

In self-managed:
image

@juliaElastic juliaElastic marked this pull request as ready for review December 3, 2025 16:00
@juliaElastic juliaElastic requested review from a team as code owners December 3, 2025 16:00
@juliaElastic juliaElastic requested a review from a team as a code owner December 4, 2025 09:02
@botelastic bot added the Team:Fleet label Dec 4, 2025
@elasticmachine

Pinging @elastic/fleet (Team:Fleet)

Member

@jgowdyelastic jgowdyelastic left a comment

LGTM

spong added a commit that referenced this pull request Dec 11, 2025
## Summary

Adds an integration knowledge tool to Agent Builder that retrieves
documentation from Fleet-installed integrations using semantic search on
the `.integration_knowledge` index. The tool uses the conditional
availability pattern and is only available when the integration
knowledge index exists.


<p align="center">
<img width="405"
src="https://github.com/user-attachments/assets/640d4f54-34cc-47e3-b731-b3913139e84e"
/> <img width="395"
src="https://github.com/user-attachments/assets/fd66c044-5536-4947-98d8-45e4b168b34c"
/>
</p> 


## Changes

* Added `platform.core.integration_knowledge` builtin tool to
`agent_builder_platform` that searches Fleet integration documentation
* Tool is registered in plugin `setup()` with conditional availability
using the `availability` configuration pattern
* Availability is checked at runtime via ES search on
`.integration_knowledge` index (using `size: 0` query)
* Returns structured resource results with package name, version,
filename, and content

## Technical Details

* Tool registration added to `registerTools()` in plugin `setup()`
phase, following the same pattern as `productDocumentationTool`
* Uses `availability` configuration with `cacheMode: 'space'` to
conditionally show/hide the tool based on index availability
* Searches using Elasticsearch semantic search on the `content` field
* `esClient.asInternalUser` is used for both handler execution and
availability checking (index permissions require internal user)
* Results include reference URLs to integration detail pages
(`/app/integrations/detail/{package_name}`)
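
To illustrate the per-space caching idea behind `cacheMode: 'space'`, here is a generic memoization sketch. It is not the Agent Builder API: `AvailabilityCheck`, `cachePerSpace`, and the time-to-live parameter are assumptions made for illustration, and the injected `check` function stands in for the `size: 0` search against `.integration_knowledge`.

```typescript
// Hypothetical signature for an availability probe, keyed by space ID.
type AvailabilityCheck = (spaceId: string) => Promise<boolean>;

// Wraps a check so that each space's result is reused until it expires.
// The TTL and the injectable clock are assumptions of this sketch.
function cachePerSpace(
  check: AvailabilityCheck,
  ttlMs: number,
  now: () => number = Date.now
): AvailabilityCheck {
  const cache = new Map<string, { value: boolean; expiresAt: number }>();
  return async (spaceId: string) => {
    const hit = cache.get(spaceId);
    if (hit !== undefined && hit.expiresAt > now()) {
      return hit.value;
    }
    const value = await check(spaceId);
    cache.set(spaceId, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

Because entries expire, a tool wrapped this way can appear or disappear without a restart once the index is created or removed.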

## Considerations

* Tool requires Fleet to have indexed integration knowledge into
`.integration_knowledge`
* Tool availability is checked per-space and cached for performance
* No Kibana restart required - tool appears/disappears dynamically based
on index availability
* This is the onechat/Agent Builder equivalent of the existing
`IntegrationKnowledgeTool` in Security Solution's Assistant
(#236197) and Observability
Solution's Assistant (#237085)
added in `9.2`.

---

## Testing

> [!NOTE]
> You must enable the `xpack.fleet.enableExperimental:
["installIntegrationsKnowledge"]` feature flag until this PR enabling it
by default is merged (#245080).


1. Upload this sample
[system-2.3.3-NEXT.zip](https://github.com/user-attachments/files/22546766/system-2.3.3-NEXT.zip)
package via Integrations > Create new integration
- The test package just copies the existing `docs/README.md` to
`docs/knowledge_base/README.md` so that Fleet ingests it into
`.integration_knowledge`
2. Create new Agent with the new Integration Knowledge tool and ask
questions related to system integrations, such as:
    - How can I collect CPU and memory data for my windows host?
    - What OS can I run the system integration on?
    - What does the system integration do?
3. Observe that the responses returned contain relevant information that
is cited from the system integration.





_PR developed with Cursor + Opus 4.5_

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
@pmuellr pmuellr requested review from pmuellr and removed request for pmuellr December 11, 2025 23:31
Contributor

@pmuellr pmuellr left a comment

LGTM - I tried to remove my review, but it wasn't good enough, it seems.

I understand my issues will be addressed in a future PR ...

@juliaElastic juliaElastic requested review from a team as code owners December 12, 2025 09:24
@kibanamachine

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#10064

[❌] x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/serverless/oblt.ai_assistant.serverless.config.ts: 0/1 tests passed.
[❌] x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.ai_assistant.stateful.config.ts: 0/1 tests passed.

see run history

- it('emits 5 messageAdded events', () => {
-   expect(messageAddedEvents.length).to.be(5);
+ it('emits at least 3 messageAdded events', () => {
+   expect(messageAddedEvents.length).to.be.greaterThan(2);
Member

Probably fine.
@arturoliduena Since you wrote this, WDYT?

@kibanamachine

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#10066

[✅] x-pack/solutions/observability/test/api_integration_deployment_agnostic/feature_flag_configs/serverless/oblt.ai_assistant.serverless.config.ts: 1/1 tests passed.

see run history

@elasticmachine

elasticmachine commented Dec 12, 2025

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #6 / GroupedAlertsTable resets all levels pagination when global query updates
  • [job] [logs] Jest Tests #6 / GroupedAlertsTable resets all levels pagination when selected group changes
  • [job] [logs] Jest Tests #6 / GroupedAlertsTable resets innermost level's current page when that level's page size updates
  • [job] [logs] Jest Tests #6 / GroupedAlertsTable resets only most inner group pagination when its parent groups open/close
  • [job] [logs] Jest Tests #6 / GroupedAlertsTable resets outermost level's current page when that level's page size updates

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
fleet 1375 1377 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
fleet 2.1MB 2.1MB +6.3KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
fleet 179.3KB 179.4KB +31.0B

History

@juliaElastic juliaElastic merged commit 544fe9f into elastic:main Dec 12, 2025
12 checks passed
@kibanamachine added the backport:skip label and removed the backport:version label Dec 12, 2025
seanrathier pushed a commit to seanrathier/kibana that referenced this pull request Dec 15, 2025
)

seanrathier pushed a commit to seanrathier/kibana that referenced this pull request Dec 15, 2025
…re flag (elastic#245080)

## Summary

Closes elastic/ingest-dev#6276

- Enable feature flag `installIntegrationsKnowledge` by default
- Added setting `integration_knowledge_enabled` to
`ingest_manager_settings` SO to save the UI setting
- When the integration knowledge user setting is turned off, new package
installations will skip the knowledge base indexing step
- When the integration knowledge user setting is turned on again, there
is a one-time async task scheduled that reindexes knowledge base for
installed packages

The `Manage integration knowledge` action is added to Installed
integrations table.

<img width="1174" height="759" alt="image"
src="https://github.com/user-attachments/assets/18d8822d-762c-4bb8-a9b0-f9a44e88be98"
/>

Clicking the action opens the flyout about the Knowledge base
information and the switch to be able to opt out.
<img width="1179" height="864" alt="image"
src="https://github.com/user-attachments/assets/3d9b918e-22b8-4fd0-a124-ab3ead660709"
/>

UX design follows
[Figma](https://www.figma.com/design/HC1aLbWSn9mKFhv0cZN1gF/Ingest-small-improvements?node-id=264-299&p=f&t=4Y476O5LHWPQDKgp-0)

To verify:
- Open the flyout and verify that the switch is on by default
- Turn the switch off and click Save on the flyout
- Install a new package and verify that the knowledge_base ES asset is
not created
- Go back to the flyout and switch the setting on again
- Verify that the previously installed package now has an ES asset of type
knowledge_base

```
GET .kibana_ingest/_doc/epm-packages:apache

"installed_es": [
...
        {
          "id": "apache-README.md",
          "type": "knowledge_base"
        }
      ],
```

Kibana logs

```

[2025-12-03T13:30:34.253+01:00][INFO ][plugins.fleet] Scheduling task to reindex integration knowledge for installed packages

[2025-12-03T13:30:36.584+01:00][DEBUG][plugins.fleet] Successfully indexed 1 knowledge base documents for package apache. Document IDs: apache-README.md
[2025-12-03T13:30:36.585+01:00][DEBUG][plugins.fleet] Knowledge base step: Saved 1 documents to index for package apache@2.1.1
[2025-12-03T13:30:36.587+01:00][DEBUG][plugins.fleet] Successfully indexed 1 knowledge base documents for package system. Document IDs: system-README.md
[2025-12-03T13:30:36.587+01:00][DEBUG][plugins.fleet] Knowledge base step: Saved 1 documents to index for package system@2.8.0
```

Added tour component and learn more link:

<img width="1690" height="814" alt="image"
src="https://github.com/user-attachments/assets/f98ddded-8e01-4c29-ad2c-2ceedbff1e24"
/>
<img width="1785" height="1150" alt="image"
src="https://github.com/user-attachments/assets/66250391-39a6-4192-adec-a1f45406440f"
/>



---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: florent-leborgne <florent.leborgne@elastic.co>
Co-authored-by: Cristina Amico <criamico@users.noreply.github.com>
juliaElastic added a commit that referenced this pull request Dec 18, 2025
## Summary

Closes #246383
Closes #246204
Closes #235915
Closes #246213
Closes #246183
Closes #246272

Use retry instead of wait in API tests that wait for knowledge_base docs
when installing packages, as indexing is an async step.
Relates #245080
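
The retry approach can be sketched with a small polling helper. This is a generic stand-in, not the helper used by the Kibana tests: `retryUntil` and its `retries` and `delayMs` options are hypothetical names chosen for illustration.

```typescript
// Polls `attempt` until `isDone` accepts its result or the attempts run out.
// Hypothetical helper; the default values are arbitrary.
async function retryUntil<T>(
  attempt: () => Promise<T>,
  isDone: (value: T) => boolean,
  { retries = 10, delayMs = 500 }: { retries?: number; delayMs?: number } = {}
): Promise<T> {
  for (let i = 0; i < retries; i++) {
    const value = await attempt();
    if (isDone(value)) {
      return value;
    }
    // Give the async indexing step time to finish before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Condition not met after ${retries} attempts`);
}
```

Compared with a fixed wait, polling returns as soon as the documents appear and only fails once the whole retry budget is spent.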

It seems the knowledge base indexing is flaky, sometimes failing with
errors.

```
[00:04:20]           │ proc [kibana] [2025-12-16T12:24:05.592+00:00][ERROR][plugins.fleet] Bulk index operation failed: {"index":{"_index":".integration_knowledge","_id":"all_assets-README.md","status":400,"error":{"type":"inference_exception","reason":"Exception when running inference id [.elser-2-elasticsearch] on field [content]","caused_by":{"type":"status_exception","reason":"Model definition truncated. Unable to deserialize trained model definition [.elser_model_2_linux-x86_64]"}}}} {"service":{"node":{"roles":["background_tasks","ui"]}}}
[00:04:20]           │ proc [kibana] [2025-12-16T12:24:05.592+00:00][ERROR][plugins.fleet] 1 out of 1 documents failed to index for package all_assets {"service":{"node":{"roles":["background_tasks","ui"]}}}

[00:02:46]         │ proc [kibana] [2025-12-16T10:27:11.732+00:00][WARN ][plugins.fleet] Retrying Elasticsearch operation after [8s] due to error: TimeoutError: Request timed out TimeoutError: Request timed out
[00:02:46]         │ proc [kibana]     at KibanaTransport._request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:564:50)
[00:02:46]         │ proc [kibana]     at processTicksAndRejections (node:internal/process/task_queues:105:5)
[00:02:46]         │ proc [kibana]     at /opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:631:32
[00:02:46]         │ proc [kibana]     at KibanaTransport.request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:627:20)
[00:02:46]         │ proc [kibana]     at KibanaTransport.request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/core-elasticsearch-client-server-internal/src/create_transport.js:60:16)
[00:02:46]         │ proc [kibana]     at ClientTraced.BulkApi [as bulk] (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/elasticsearch/lib/api/api/bulk.js:75:12)
[00:02:46]         │ proc [kibana]     at retryTransientEsErrors (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/elasticsearch/retry.js:36:12)
[00:02:46]         │ proc [kibana]     at saveKnowledgeBaseContentToIndex (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/knowledge_base_index.js:60:26)
[00:02:46]         │ proc [kibana]     at indexKnowledgeBase (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/install_state_machine/steps/step_save_knowledge_base.js:110:27)
[00:02:46]         │ proc [kibana]     at stepSaveKnowledgeBase (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/install_state_machine/steps/step_save_knowledge_base.js:98:10) {"service":{"node":{"roles":["background_tasks","ui"]}}}
```

There is also some flakiness when deleting ingest pipelines of old
package versions, so some asserts now check that the expected assets
are present in the `installed_es` array instead of requiring exact equality.

```
 { id: 'logs-all_assets.test_logs-0.1.0',
       type: 'ingest_pipeline' },
     { id: 'logs-all_assets.test_logs-0.1.0-pipeline1',
       type: 'ingest_pipeline' },
     { id: 'logs-all_assets.test_logs-0.1.0-pipeline2',
       type: 'ingest_pipeline' },
```
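
A containment check of this kind might be written as below. It is a sketch with assumed names: `EsAssetReference` mirrors the `{ id, type }` entries shown in `installed_es`, and `missingAssets` is a hypothetical helper, not the assertion used in the tests.

```typescript
// Mirrors the { id, type } entries of the installed_es array.
interface EsAssetReference {
  id: string;
  type: string;
}

// Returns the expected assets that are absent from `actual`.
// Extra entries in `actual` (such as pipelines left over from an old
// package version) are ignored, unlike with an exact-equality assertion.
function missingAssets(actual: EsAssetReference[], expected: EsAssetReference[]): EsAssetReference[] {
  const present = new Set(actual.map((asset) => `${asset.type}:${asset.id}`));
  return expected.filter((asset) => !present.has(`${asset.type}:${asset.id}`));
}
```

A test would then assert that `missingAssets(installedEs, expectedAssets)` is empty.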

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 19, 2025
## Summary

Closes elastic#246383
Closes elastic#246204
Closes elastic#235915
Closes elastic#246213
Closes elastic#246183
Closes elastic#246272

In the API tests, use retry instead of a wait when checking for
`knowledge_base` docs after installing packages, because knowledge base
indexing is an async step.
Relates elastic#245080
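
The retry approach can be sketched as a small polling helper. This is a minimal illustration, not the code from this change: `retryUntil`, `fetchInstalledEs`, `waitForKnowledgeBase`, and the timing values are hypothetical, and the stand-in fetch function only simulates the async indexing step. Kibana's test suites have their own retry service, which this sketch does not use.

```typescript
// Hypothetical asset shape, modelled on the `installed_es` entries in this PR.
interface EsAsset {
  id: string;
  type: string;
}

// Poll `check` until it resolves to true or `timeoutMs` elapses.
// Unlike a single wait of fixed length, this returns as soon as the
// condition holds and fails with a clear error if it never does.
async function retryUntil(
  check: () => Promise<boolean>,
  timeoutMs = 10000,
  intervalMs = 200
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for reading the package saved object (assumption): the
// knowledge_base asset only appears on the third read.
let reads = 0;
async function fetchInstalledEs(): Promise<EsAsset[]> {
  reads += 1;
  return reads < 3 ? [] : [{ id: 'apache-README.md', type: 'knowledge_base' }];
}

// Wait until a knowledge_base asset is listed, then report how many reads it took.
async function waitForKnowledgeBase(): Promise<number> {
  await retryUntil(
    async () => (await fetchInstalledEs()).some((a) => a.type === 'knowledge_base'),
    2000,
    10
  );
  return reads;
}
```

In a test, the assertion on `installed_es` would run only after `waitForKnowledgeBase()` resolves, so a slow indexing step delays the test instead of failing it.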

Knowledge base indexing appears to be flaky: it sometimes fails with
errors such as the following.

```
[00:04:20]           │ proc [kibana] [2025-12-16T12:24:05.592+00:00][ERROR][plugins.fleet] Bulk index operation failed: {"index":{"_index":".integration_knowledge","_id":"all_assets-README.md","status":400,"error":{"type":"inference_exception","reason":"Exception when running inference id [.elser-2-elasticsearch] on field [content]","caused_by":{"type":"status_exception","reason":"Model definition truncated. Unable to deserialize trained model definition [.elser_model_2_linux-x86_64]"}}}} {"service":{"node":{"roles":["background_tasks","ui"]}}}
[00:04:20]           │ proc [kibana] [2025-12-16T12:24:05.592+00:00][ERROR][plugins.fleet] 1 out of 1 documents failed to index for package all_assets {"service":{"node":{"roles":["background_tasks","ui"]}}}

[00:02:46]         │ proc [kibana] [2025-12-16T10:27:11.732+00:00][WARN ][plugins.fleet] Retrying Elasticsearch operation after [8s] due to error: TimeoutError: Request timed out TimeoutError: Request timed out
[00:02:46]         │ proc [kibana]     at KibanaTransport._request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:564:50)
[00:02:46]         │ proc [kibana]     at processTicksAndRejections (node:internal/process/task_queues:105:5)
[00:02:46]         │ proc [kibana]     at /opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:631:32
[00:02:46]         │ proc [kibana]     at KibanaTransport.request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/transport/lib/Transport.js:627:20)
[00:02:46]         │ proc [kibana]     at KibanaTransport.request (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/core-elasticsearch-client-server-internal/src/create_transport.js:60:16)
[00:02:46]         │ proc [kibana]     at ClientTraced.BulkApi [as bulk] (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@elastic/elasticsearch/lib/api/api/bulk.js:75:12)
[00:02:46]         │ proc [kibana]     at retryTransientEsErrors (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/elasticsearch/retry.js:36:12)
[00:02:46]         │ proc [kibana]     at saveKnowledgeBaseContentToIndex (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/knowledge_base_index.js:60:26)
[00:02:46]         │ proc [kibana]     at indexKnowledgeBase (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/install_state_machine/steps/step_save_knowledge_base.js:110:27)
[00:02:46]         │ proc [kibana]     at stepSaveKnowledgeBase (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1765880237283723306/elastic/kibana-flaky-test-suite-runner/kibana-build-xpack/node_modules/@kbn/fleet-plugin/server/services/epm/packages/install_state_machine/steps/step_save_knowledge_base.js:98:10) {"service":{"node":{"roles":["background_tasks","ui"]}}}
```

There is also some flakiness in deleting the ingest pipelines of old
package versions, so some asserts were changed to check that the expected
assets are in the `installed_es` array instead of asserting exact equality.

```
{ id: 'logs-all_assets.test_logs-0.1.0',
  type: 'ingest_pipeline' },
{ id: 'logs-all_assets.test_logs-0.1.0-pipeline1',
  type: 'ingest_pipeline' },
{ id: 'logs-all_assets.test_logs-0.1.0-pipeline2',
  type: 'ingest_pipeline' },
```
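
The relaxed assertion can be sketched as a containment check. This is an illustration under assumptions, not the test code from this change: `InstalledAsset` and `missingAssets` are hypothetical names, and the `0.2.0` pipeline id is made-up sample data.

```typescript
// Hypothetical asset shape, modelled on the `installed_es` entries above.
interface InstalledAsset {
  id: string;
  type: string;
}

// Return the expected assets that are absent from `installed`.
// Extra entries in `installed` (for example, pipelines of an old package
// version that have not been deleted yet) are ignored.
function missingAssets(
  installed: InstalledAsset[],
  expected: InstalledAsset[]
): InstalledAsset[] {
  const present = new Set(installed.map((a) => `${a.type}:${a.id}`));
  return expected.filter((a) => !present.has(`${a.type}:${a.id}`));
}

// Sample data: the 0.2.0 pipeline id is invented for illustration.
const installed: InstalledAsset[] = [
  { id: 'logs-all_assets.test_logs-0.1.0', type: 'ingest_pipeline' }, // leftover
  { id: 'logs-all_assets.test_logs-0.2.0', type: 'ingest_pipeline' },
  { id: 'all_assets-README.md', type: 'knowledge_base' },
];
const expected: InstalledAsset[] = [
  { id: 'logs-all_assets.test_logs-0.2.0', type: 'ingest_pipeline' },
  { id: 'all_assets-README.md', type: 'knowledge_base' },
];

const missing = missingAssets(installed, expected);
console.log(missing.length === 0 ? 'all expected assets present' : missing);
```

An exact-equality assert would fail on the leftover `0.1.0` pipeline here, while the containment check passes as long as every expected asset is listed.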

(cherry picked from commit f128446)

Labels

- `backport:skip`: This PR does not require backporting
- `release_note:feature`: Makes this part of the condensed release notes
- `Team:Fleet`: Team label for Observability Data Collection Fleet team
- `v9.3.0`