repo-mind-light-aw is the agentic workflow surface for Repo Mind Light. It packages the GitHub Agentic Workflows integration as a reusable shared workflow.
This repository intentionally contains workflow packaging and documentation only. It does not contain the Repo Mind Light implementation itself.
The content here is written for two audiences at the same time:
- humans who need to understand what the integration does and when to use it
- agents that need enough operational detail to compose or modify gh-aw workflows correctly
Concretely, this repository provides:
- A reusable gh-aw shared workflow for Repo Mind Light
- Repo Mind Light index preparation and artifact handoff
- MCP server startup, preload readiness checks, and cleanup
- Shared prompt guidance that tells agents to use Repo Mind Light first
- Shared Repo Mind Light tool timeout configuration
Repo Mind Light provides holistic repository understanding over GitHub repository data.
It is designed to answer questions such as:
- what parts of the repository are relevant to this feature or problem
- what historical issues or pull requests are related
- what architectural or implementation decisions shaped the current state
- what code areas are likely to matter for a given symptom or topic
At the workflow level, this integration exposes one MCP server named `repo-mind` with one allowed tool:
- `query`: takes a focused natural-language query and returns retrieved repository context grounded in the prepared Repo Mind Light index
The final answer can combine evidence from:
- indexed issues
- indexed pull requests
- repository content already captured in the Repo Mind Light index
- GitHub code search results used to mine relevant code snippets
Repo Mind Light does not just retrieve raw matches. It also uses embeddings and a generative model to synthesize a final answer from the retrieved evidence.
The intended usage pattern is:
- prepare or restore an index for the target repository
- start the Repo Mind Light MCP server against that index
- ask one focused `query`
- ask at most one follow-up `query` if a specific gap remains
- only then fall back to broader GitHub or repository exploration
- `.github/workflows/shared/repo-mind-light.md`: the shared gh-aw import
- `LICENSE`: MIT license for this repository
Use this shared workflow when you want a gh-aw workflow to query repository context through Repo Mind Light without duplicating the setup in every consumer workflow.
It is a good fit for workflows such as:
- holistic repository understanding
- high-level architectural analysis
- historical investigation across issues, pull requests, and code
- workflows that need repository-specific context before taking another action
- workflows that need help locating relevant subsystems, past decisions, or strongly related code
It is less about narrow issue-routing automation and more about building a high-value repository context layer for an agent.
Import the shared workflow from a consuming gh-aw workflow:
```yaml
imports:
  - uses: githubnext/repo-mind-light-aw/.github/workflows/shared/repo-mind-light.md@main
    with:
      config:
        yaml: |
          slug: ${{ github.repository }}
          store_path: /var/lib/repo-mind-light/index
          refresh_if_older_than: 1d
          query:
            preload_query_sources_on_startup: true
```

Then add consumer-specific permissions, GitHub tool settings, safe outputs, and task instructions in your own workflow.
For stability-sensitive consumers, prefer pinning the import to a commit SHA or release tag instead of @main.
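For example, a pinned import could look like the following sketch (the commit placeholder must be replaced with a real ref; no specific release tag is implied by this repository):

```yaml
imports:
  # Pin to an immutable ref instead of the moving main branch.
  # Replace <commit-sha> with an actual commit SHA or release tag.
  - uses: githubnext/repo-mind-light-aw/.github/workflows/shared/repo-mind-light.md@<commit-sha>
```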
Using Repo Mind Light adds extra cost beyond the base gh-aw workflow run.
That cost exists because Repo Mind Light:
- computes embedding vectors during indexing and refresh
- uses a generative model to produce the final `query` answer
This is why the shared workflow requires `COPILOT_GITHUB_TOKEN`.
Consumer workflows can still use another outer agent runtime such as Claude or Codex.
The important constraint is not the outer workflow engine. The important constraint is that Repo Mind Light itself depends on Copilot-backed model access for embeddings and answer generation.
So even when the surrounding workflow uses a non-Copilot engine, `COPILOT_GITHUB_TOKEN` must still be available to Repo Mind Light, or indexing and query synthesis will not work.
The shared workflow accepts these inputs:
- `config`: Repo Mind Light configuration object. Required.
- `config.yaml`: full Repo Mind Light configuration content stored under the required `yaml` property.
- `image`: optional Repo Mind Light container image override.
- `cache-prefix`: optional prefix for refreshed cache keys.
- `cache-restore-key`: optional restore key for the index cache.
- `container-name`: optional Docker container name for the MCP server.
- `artifact-name`: optional artifact base name for the prepared index.
These inputs are intentionally narrow. Most behavioral tuning happens inside config.yaml, which is passed through directly to Repo Mind Light.
The most important underlying Repo Mind Light configuration fields are:
```yaml
slug: owner/repo
store_path: /var/lib/repo-mind-light/index
chat_model: claude-sonnet-4.6
refresh_if_older_than: 1d
indexing:
  keep_count: 1000
  issue_state: all
  pr_state: all
  issue_labels: null
  pr_labels: null
  ignore_bot_authored: true
query:
  preload_query_sources_on_startup: true
  graph_rag_zero:
    top_k_initial: 1000
    max_chunks: 200
    relative_dropoff_threshold: 0.1
    knn_neighbors: 5
    max_cluster_size: 10
  code_search:
    enabled: true
    limit: 50
```

Important schema notes for workflow authors and agents:
- `slug` is the only required Repo Mind Light field.
- `refresh_if_older_than` controls whether cached indexes are reused or refreshed.
- `issue_state` accepts `open`, `closed`, `all`, `none`, or `null`.
- `pr_state` accepts `open`, `closed`, `merged`, `all`, `none`, or `null`.
- `issue_labels` and `pr_labels` are optional any-of filters.
- `ignore_bot_authored: true` is usually a good default for support and triage workflows.
- `query.preload_query_sources_on_startup: true` reduces first-query latency by warming preloadable sources after server startup.
- `query.code_search.enabled` lets Repo Mind Light use its code-search-backed retrieval path when available.
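As an illustration of how these fields compose, here is a hedged example configuration for a support-oriented consumer. The label names are hypothetical, chosen only to show the any-of filter shape:

```yaml
slug: owner/repo
refresh_if_older_than: 1d
indexing:
  # Index open and closed issues, but only those carrying one of these
  # (hypothetical) labels; skip bot-authored items to reduce noise.
  issue_state: all
  issue_labels: ["bug", "support"]
  pr_state: merged
  ignore_bot_authored: true
query:
  # Warm preloadable sources so the first query is not slow.
  preload_query_sources_on_startup: true
  code_search:
    enabled: true
```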
The query tool is meant for targeted repository-context retrieval, not for broad speculative searching.
Good query patterns include:
- an exact user-visible error message
- a feature or workflow name
- a suspected subsystem or ownership area
- a request for similar incidents or regressions
- a request for files, classes, or modules relevant to a concrete symptom
Agents should prefer precise, discriminating prompts over broad prompts such as "tell me about this repo".
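To make the contrast concrete, these are illustrative (not canonical) query phrasings:

```text
Good: "TypeError: cannot read properties of undefined during index refresh"   (exact symptom; wording invented for illustration)
Good: "Which past issues and pull requests changed the cache refresh behavior?"
Bad:  "tell me about this repo"
```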
This shared workflow has several storage behaviors that matter operationally:
- The prepared index is uploaded as a GitHub Actions artifact with `retention-days: 1`.
- The local working directory uses `.index-store` as the mounted index path during workflow execution.
- Refreshed cache entries are saved only when the Repo Mind Light indexer reports `refreshed=true` in its structured `--result-json` output.
- Cache keys are derived from the refresh date and are intentionally immutable.
- Cache eviction after that point is controlled by GitHub Actions cache retention policy, not by this repository.
This means the workflow gives consumers short-lived artifact handoff plus reusable cache restore behavior, but it does not implement custom cache pruning.
Consuming workflows should provide:
- `COPILOT_GITHUB_TOKEN` so Repo Mind Light can call Copilot model endpoints for embeddings and final answer generation
- GitHub read permissions appropriate for the repository data being indexed
- Docker support on the runner used by the workflow
The shared workflow expects to use the public Repo Mind Light image published at `ghcr.io/githubnext/repo-mind-light` unless the `image` input overrides it.
By default, the shared workflow tracks `ghcr.io/githubnext/repo-mind-light:latest` so it picks up the current published release. Consumers that need stricter reproducibility should override `image` with an explicit tag or digest.
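A hedged sketch of such an override (the digest is a placeholder, not a real published digest):

```yaml
imports:
  - uses: githubnext/repo-mind-light-aw/.github/workflows/shared/repo-mind-light.md@main
    with:
      # Pin the Repo Mind Light image by digest for reproducible runs.
      # Replace <digest> with the digest of a published image.
      image: ghcr.io/githubnext/repo-mind-light@sha256:<digest>
      config:
        yaml: |
          slug: ${{ github.repository }}
```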
Consumers that only need Repo Mind Light for some events can pass a GitHub Actions expression through `run-if`. The expression is applied to the Repo Mind Light prep job; when it is false, the dependent agent job is skipped too.
```yaml
imports:
  - uses: githubnext/repo-mind-light-aw/.github/workflows/shared/repo-mind-light.md@main
    with:
      run-if: >-
        github.event_name == 'workflow_dispatch' ||
        contains(github.event.issue.labels.*.name, 'support-escalation') ||
        contains(github.event.issue.labels.*.name, 'internal-feedback')
      config:
        yaml: |
          slug: ${{ github.repository }}
          store_path: /var/lib/repo-mind-light/index
          refresh_if_older_than: 1d
```

Consumers should also ensure:
- the runner can bind port `8000` for the Repo Mind Light MCP server
- `jq` and `curl` are available on the runner image used by gh-aw jobs
- enough disk space exists for the prepared index and temporary result files
At a high level, the import does four things:
- Refreshes or reuses a Repo Mind Light index in a dedicated prep job.
- Uploads that index as an artifact and restores it into the agent job.
- Starts the Repo Mind Light MCP server and waits for preload readiness.
- Adds prompt instructions telling the agent to query Repo Mind Light before broader repo exploration.
More concretely:
- the prep job writes the config file, restores `.index-store`, and runs `repo-mind-light index --result-json ...`
- the structured JSON result determines whether a refreshed cache key should be saved
- the agent job downloads the prepared index artifact, starts the MCP server, and waits on `/preload-status`
- the import exposes only the `query` tool to the agent from the Repo Mind Light server
- the import sets shared tool timeout values of `600` seconds for startup and tool execution
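The readiness wait can be pictured as the following GitHub Actions step sketch. This is illustrative, not the shared workflow's actual implementation: it assumes the server reports readiness at `/preload-status` on port `8000`, and the `ready` field in the response is a hypothetical shape.

```yaml
- name: Wait for Repo Mind Light preload readiness (illustrative sketch)
  run: |
    # Poll the MCP server's preload-status endpoint until it reports ready,
    # giving up after roughly the shared 600-second startup timeout.
    # The `.ready` response field is an assumed shape, not a documented API.
    for attempt in $(seq 1 60); do
      status=$(curl -fsS http://localhost:8000/preload-status | jq -r '.ready' || echo "false")
      if [ "$status" = "true" ]; then
        exit 0
      fi
      sleep 10
    done
    echo "Repo Mind Light MCP server never became ready" >&2
    exit 1
```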
Agents and workflow authors should assume the following constraints:
- this repository contains integration logic, not the private or implementation-specific source for Repo Mind Light
- the shared workflow assumes Repo Mind Light is enabled for the whole consumer workflow unless the consumer passes `run-if`
- event-specific gating policy belongs in the consuming workflow and can be passed through the `run-if` input
- the import is designed for repository-scoped retrieval, not for cross-repository discovery by itself
- the MCP server startup path depends on a valid prepared index; if no index exists or indexing fails, the consumer workflow should fail fast
To validate the shared workflow locally:
```sh
gh aw compile .github/workflows/shared/repo-mind-light.md
```

To validate end-to-end behavior, import it into a consumer workflow and compile that consumer workflow as well.
For real validation, also exercise:
- one run that reuses a cache
- one run that forces a refresh
- one query that succeeds quickly
- one query against a low-signal input so the consumer workflow can verify its fallback behavior
This repository is the gh-aw integration surface for Repo Mind Light.
MIT. See LICENSE.