Skip to content

[panw] Add state field to IPSec tunnels#48403

Merged
6 commits merged into
mainfrom
unknown repository
Mar 12, 2026
Merged

[panw] Add state field to IPSec tunnels#48403
6 commits merged into
mainfrom
unknown repository

Conversation

@ghost

@ghost ghost commented Jan 13, 2026

Copy link
Copy Markdown

Overview

This PR adds the state field to IPSec tunnels.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@botelastic botelastic Bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 13, 2026
@ghost ghost force-pushed the panw_tunnels_state branch from a7155bc to 34496d1 Compare January 13, 2026 11:51
@github-actions

Copy link
Copy Markdown
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
@ghost ghost self-assigned this Jan 13, 2026
@ghost ghost added the Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team label Jan 13, 2026
@botelastic botelastic Bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 13, 2026
@ghost ghost added enhancement needs_team Indicates that the issue/PR needs a Team:* label labels Jan 13, 2026
@botelastic botelastic Bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 13, 2026
@mergify

mergify Bot commented Jan 13, 2026

Copy link
Copy Markdown
Contributor

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @gpop63? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.
@ghost ghost force-pushed the panw_tunnels_state branch from 1ca7afa to c126334 Compare January 30, 2026 10:48
@ghost ghost marked this pull request as ready for review January 30, 2026 12:34
@ghost ghost self-requested a review as a code owner January 30, 2026 12:34
@ghost

ghost commented Mar 6, 2026

Copy link
Copy Markdown
Author
    "panw": {
      "interfaces": {
        "ipsec_tunnel.TSi_proto": 0,
        "ipsec_tunnel.TSr_prefix": 0,
        "ipsec_tunnel.proto": "ESP",
        "ipsec_tunnel.kb": 0,
        "ipsec_tunnel.hash": "SHA1",
        "ipsec_tunnel.mode": "tunl",
        "ipsec_tunnel.TSi_port": 0,
        "ipsec_tunnel.enc": "AES256",
        "ipsec_tunnel.id": 2,
        "ipsec_tunnel.TSr_proto": 0,
        "ipsec_tunnel.gw": "REDACTED",
        "ipsec_tunnel.life.sec": 8400,
        "ipsec_tunnel.TSr_ip": "0.0.0.0",
        "ipsec_tunnel.TSi_ip": "0.0.0.0",
        "ipsec_tunnel.TSi_prefix": 0,
        "ipsec_tunnel.TSr_port": 0,
        "ipsec_tunnel.state": "init",
        "ipsec_tunnel.name": "REDACTED",
        "ipsec_tunnel.dh": "no-pfs"
      }
@coderabbitai

coderabbitai Bot commented Mar 6, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Adds IPSec tunnel state tracking to the panw Metricbeat module. Introduces a changelog fragment and a State field on TunnelsEntry. Adds new types for tunnel flow responses (TunnelFlowResponse, TunnelFlowResult, TunnelFlowIPSec, TunnelFlowEntry) and functions tunnelFlowQuery and getTunnelState. Implements concurrent per-tunnel flow queries to enrich entries and includes ipsec_tunnel.state in emitted events. Adds extensive tests covering state/no-state cases, multiple entries, empty results, extra fields, monitor data, invalid XML, client/flow errors, and large responses.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • 🛠️ Update Documentation: Commit on current branch
  • 🛠️ Update Documentation: Create PR

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@x-pack/metricbeat/module/panw/interfaces/tunnels.go`:
- Around line 23-40: getTunnelState currently ignores PAN-OS error payloads
because it only checks transport/unmarshal errors; update getTunnelState to
inspect the unmarshaled TunnelFlowResponse.Status (and any error/message fields
on TunnelFlowResponse) and return a formatted error when Status indicates a
failure instead of falling through to return an empty state. Locate
getTunnelState and the TunnelFlowResponse handling, check response.Status (e.g.
!= "success" or == "error") and return an error that includes the PAN-OS error
details so the caller's warning path is exercised rather than emitting an empty
tunnel state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8da48d73-17d8-453d-84a6-9bbfb6494c3c

📥 Commits

Reviewing files that changed from the base of the PR and between f4dbbfc and e1f69eb.

📒 Files selected for processing (4)
  • changelog/fragments/1768219693-add-ipsec-tunnel-state.yaml
  • x-pack/metricbeat/module/panw/interfaces/interface_types.go
  • x-pack/metricbeat/module/panw/interfaces/tunnels.go
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
Comment thread x-pack/metricbeat/module/panw/interfaces/tunnels.go
Comment thread x-pack/metricbeat/module/panw/interfaces/tunnels.go Outdated
@github-actions

Copy link
Copy Markdown
Contributor

golangci-lint failed in run 22891784820 due to 4 lint violations in x-pack/metricbeat/module/panw/interfaces/tunnels_test.go; Ubuntu failed and macOS/Windows were canceled afterward.

Root cause from logs:

  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go:24:18errcheck: return m.opFunc(req.(string))
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go:406:2testifylint: use assert.Empty
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go:442:2testifylint: use assert.Empty
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go:577:2testifylint: use assert.Empty

Recommended minimal fix:

  1. In mockPanwClient.Op, avoid direct return m.opFunc(req.(string)); do explicit type assertion + explicit error path, then call m.opFunc and return values after checking/propagating err.
  2. Replace these assertions with assert.Empty(...):
    • assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"], ...) (line ~406)
    • assert.Equal(t, "", events[1].MetricSetFields["ipsec_tunnel.state"]) (line ~442)
    • assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"]) (line ~577)

After patching, rerun golangci-lint for the module (or push and let this workflow re-run).

Tests run here: workflow-log analysis only (no local test execution).


What is this? | From workflow: PR Actions Detective

Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.

@shmsr

shmsr commented Mar 10, 2026

Copy link
Copy Markdown
Member

A few issues still need to be addressed before I’d be comfortable merging this.

  1. Major: please treat PAN-OS XML status="error" as a real failure, not as “no state”
    Both the per-tunnel flow lookup in x-pack/metricbeat/module/panw/interfaces/tunnels.go and the top-level IPSec tunnel query should validate the response status explicitly. A valid XML error payload should surface as an error path, not silently collapse into an empty state or empty event set.

  2. Major: the per-tunnel flow lookup is on the fetch hot path and currently scales poorly
    In x-pack/metricbeat/module/panw/interfaces/tunnels.go, this adds one blocking RPC per tunnel during collection. On devices with many tunnels, scrape time grows linearly and may miss the metricset interval. Please switch this to a bounded concurrency or batched strategy.

  3. Medium: avoid clobbering an existing base state with an empty flow response
    If the main tunnel response already contains a non-empty state, the flow query should not overwrite it with "". The safer behavior is:

  • skip flow lookup when base state is already present, or
  • only overwrite when the flow response returns a non-empty state
  1. Medium: test coverage should be strengthened around enrichment behavior
    Please add targeted coverage in x-pack/metricbeat/module/panw/interfaces/tunnels_test.go for:
  • flow response with status="error"
  • top-level tunnel response with status="error"
  • mixed input where some entries already have base state and should not trigger flow lookups
  • larger tunnel sets to validate enrichment/query behavior under more realistic counts
  1. Address lint issues and coderabbit reviews

cc: @gpop63

I already have the patch for the same; let me know if you want me to push?

@shmsr

shmsr commented Mar 10, 2026

Copy link
Copy Markdown
Member
diff --git a/x-pack/metricbeat/module/panw/interfaces/tunnels.go b/x-pack/metricbeat/module/panw/interfaces/tunnels.go
index e3bb2f7b7b..7c206c9cca 100644
--- a/x-pack/metricbeat/module/panw/interfaces/tunnels.go
+++ b/x-pack/metricbeat/module/panw/interfaces/tunnels.go
@@ -7,6 +7,7 @@ package interfaces
 import (
 	"encoding/xml"
 	"fmt"
+	"sync"
 	"time"
 
 	"github.com/elastic/beats/v7/metricbeat/mb"
@@ -14,7 +15,10 @@ import (
 	"github.com/elastic/elastic-agent-libs/mapstr"
 )
 
-const IPSecTunnelsQuery = "<show><vpn><tunnel></tunnel></vpn></show>"
+const (
+	IPSecTunnelsQuery               = "<show><vpn><tunnel></tunnel></vpn></show>"
+	maxConcurrentTunnelStateQueries = 5
+)
 
 func tunnelFlowQuery(tunnelID int) string {
 	return fmt.Sprintf("<show><running><tunnel><flow><tunnel-id>%d</tunnel-id></flow></tunnel></running></show>", tunnelID)
@@ -32,6 +36,9 @@ func getTunnelState(m *MetricSet, tunnelID int) (string, error) {
 	if err != nil {
 		return "", fmt.Errorf("error unmarshaling tunnel flow response for tunnel %d: %w", tunnelID, err)
 	}
+	if response.Status != "success" {
+		return "", fmt.Errorf("tunnel flow query for tunnel %d returned status %q", tunnelID, response.Status)
+	}
 
 	if len(response.Result.IPSec.Entries) > 0 {
 		return response.Result.IPSec.Entries[0].State, nil
@@ -41,7 +48,6 @@ func getTunnelState(m *MetricSet, tunnelID int) (string, error) {
 }
 
 func getIPSecTunnelEvents(m *MetricSet) ([]mb.Event, error) {
-
 	var response TunnelsResponse
 
 	output, err := m.client.Op(IPSecTunnelsQuery, panw.Vsys, nil, nil)
@@ -55,21 +61,62 @@ func getIPSecTunnelEvents(m *MetricSet) ([]mb.Event, error) {
 		m.logger.Error("Error: %s", err)
 		return nil, fmt.Errorf("error unmarshaling IPSec tunnels response: %w", err)
 	}
+	if response.Status != "success" {
+		return nil, fmt.Errorf("IPSec tunnels query returned status %q", response.Status)
+	}
+
+	type stateResult struct {
+		index int
+		id    int
+		state string
+		err   error
+	}
+
+	jobs := make(chan int)
+	results := make(chan stateResult, len(response.Result.Entries))
+
+	workers := maxConcurrentTunnelStateQueries
+	if workers > len(response.Result.Entries) {
+		workers = len(response.Result.Entries)
+	}
+
+	var wg sync.WaitGroup
+	for w := 0; w < workers; w++ {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			for i := range jobs {
+				entry := response.Result.Entries[i]
+				state, err := getTunnelState(m, entry.ID)
+				results <- stateResult{index: i, id: entry.ID, state: state, err: err}
+			}
+		}()
+	}
 
-	// Fetch state for each tunnel via individual flow queries
 	for i, entry := range response.Result.Entries {
-		state, err := getTunnelState(m, entry.ID)
-		if err != nil {
-			m.logger.Warnf("Failed to get state for tunnel %d: %s", entry.ID, err)
+		if entry.State != "" {
 			continue
 		}
-		response.Result.Entries[i].State = state
+		jobs <- i
+	}
+	close(jobs)
+
+	wg.Wait()
+	close(results)
+
+	for result := range results {
+		if result.err != nil {
+			m.logger.Warnf("Failed to get state for tunnel %d: %s", result.id, result.err)
+			continue
+		}
+		if result.state != "" {
+			response.Result.Entries[result.index].State = result.state
+		}
 	}
 
 	events := formatIPSecTunnelEvents(m, response.Result.Entries)
 
 	return events, nil
-
 }
 
 func formatIPSecTunnelEvents(m *MetricSet, entries []TunnelsEntry) []mb.Event {
@@ -111,5 +158,4 @@ func formatIPSecTunnelEvents(m *MetricSet, entries []TunnelsEntry) []mb.Event {
 	}
 
 	return events
-
 }
diff --git a/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go b/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
index 7260a1f578..fec701507a 100644
--- a/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
+++ b/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
@@ -7,6 +7,9 @@ package interfaces
 import (
 	"errors"
 	"fmt"
+	"strconv"
+	"strings"
+	"sync"
 	"testing"
 
 	"github.com/stretchr/testify/assert"
@@ -21,7 +24,17 @@ type mockPanwClient struct {
 }
 
 func (m *mockPanwClient) Op(req any, vsys string, extras, ans any) ([]byte, error) {
-	return m.opFunc(req.(string))
+	reqStr, ok := req.(string)
+	if !ok {
+		return nil, fmt.Errorf("unexpected request type %T", req)
+	}
+
+	out, err := m.opFunc(reqStr)
+	if err != nil {
+		return nil, err
+	}
+
+	return out, nil
 }
 
 func flowResponseXML(state string) string {
@@ -35,6 +48,49 @@ func flowResponseXML(state string) string {
 </result></response>`, state)
 }
 
+func largeTunnelEntryXML(id int, state string) string {
+	stateTag := ""
+	if state != "" {
+		stateTag = fmt.Sprintf("<state>%s</state>", state)
+	}
+
+	return fmt.Sprintf(`<entry>
+      %s
+      <gw>gw_%d</gw>
+      <kb>%d</kb>
+      <life>3600</life>
+      <TSr_port>0</TSr_port>
+      <hash>SHA256</hash>
+      <TSi_prefix>0</TSi_prefix>
+      <TSi_ip>0.0.0.0</TSi_ip>
+      <proto>ESP</proto>
+      <TSr_proto>0</TSr_proto>
+      <enc>AES256-GCM16</enc>
+      <TSr_prefix>0</TSr_prefix>
+      <mode>tunl</mode>
+      <TSi_port>0</TSi_port>
+      <TSr_ip>0.0.0.0</TSr_ip>
+      <dh>DH20</dh>
+      <id>%d</id>
+      <TSi_proto>0</TSi_proto>
+      <name>tl_%d</name>
+    </entry>`, stateTag, id, id*10, id, id)
+}
+
+func largeTunnelsXML(total int, withBaseState func(id int) string) string {
+	var b strings.Builder
+	b.WriteString(`<response status="success"><result><ntun>`)
+	b.WriteString(strconv.Itoa(total))
+	b.WriteString(`</ntun><entries>`)
+
+	for i := 1; i <= total; i++ {
+		b.WriteString(largeTunnelEntryXML(i, withBaseState(i)))
+	}
+
+	b.WriteString(`</entries></result></response>`)
+	return b.String()
+}
+
 const emptyFlowResponseXML = `<response status="success"><result>
   <dp>dp0</dp>
   <IPSec>
@@ -71,6 +127,7 @@ const tunnelXMLWithState = `<response status="success"><result>
   <ntun>1</ntun>
   <entries>
     <entry>
+      <state>active</state>
       <gw>gw_NY01502_009953200587246_CA03028945</gw>
       <kb>512</kb>
       <life>7200</life>
@@ -179,6 +236,53 @@ const tunnelXMLMultipleEntries = `<response status="success"><result>
   </entries>
 </result></response>`
 
+const tunnelXMLMixedState = `<response status="success"><result>
+  <ntun>2</ntun>
+  <entries>
+    <entry>
+      <state>active</state>
+      <gw>gw_LA03601_001122334455667_TX04037856</gw>
+      <kb>1024</kb>
+      <life>1800</life>
+      <TSr_port>0</TSr_port>
+      <hash>SHA256</hash>
+      <TSi_prefix>16</TSi_prefix>
+      <TSi_ip>172.16.0.0</TSi_ip>
+      <proto>ESP</proto>
+      <TSr_proto>0</TSr_proto>
+      <enc>AES256-GCM16</enc>
+      <TSr_prefix>16</TSr_prefix>
+      <mode>tunl</mode>
+      <TSi_port>0</TSi_port>
+      <TSr_ip>172.17.0.0</TSr_ip>
+      <dh>DH19</dh>
+      <id>1</id>
+      <TSi_proto>0</TSi_proto>
+      <name>tl_LA03601_001122334455667_TX04037856</name>
+    </entry>
+    <entry>
+      <gw>gw_SF02701_009988776655443_OR05048967</gw>
+      <kb>0</kb>
+      <life>3600</life>
+      <TSr_port>0</TSr_port>
+      <hash>SHA384</hash>
+      <TSi_prefix>0</TSi_prefix>
+      <TSi_ip>0.0.0.0</TSi_ip>
+      <proto>ESP</proto>
+      <TSr_proto>0</TSr_proto>
+      <enc>AES192-CBC</enc>
+      <TSr_prefix>0</TSr_prefix>
+      <mode>tunl</mode>
+      <TSi_port>0</TSi_port>
+      <TSr_ip>0.0.0.0</TSr_ip>
+      <dh>DH21</dh>
+      <id>2</id>
+      <TSi_proto>0</TSi_proto>
+      <name>tl_SF02701_009988776655443_OR05048967</name>
+    </entry>
+  </entries>
+</result></response>`
+
 const tunnelXMLEmptyEntries = `<response status="success"><result>
   <ntun>0</ntun>
   <entries>
@@ -363,8 +467,8 @@ func TestGetIPSecTunnelEvents_WithState(t *testing.T) {
 			if req == IPSecTunnelsQuery {
 				return []byte(tunnelXMLWithState), nil
 			}
-			// Flow query for tunnel 8
-			return []byte(flowResponseXML("active")), nil
+			t.Fatalf("did not expect flow query when base tunnel state is present, got req=%s", req)
+			return nil, nil
 		},
 	}
 	m := newTestMetricSet(client)
@@ -375,7 +479,6 @@ func TestGetIPSecTunnelEvents_WithState(t *testing.T) {
 
 	event := events[0]
 	assert.Equal(t, 8, event.MetricSetFields["ipsec_tunnel.id"])
-	assert.Equal(t, "tl_NY01502_009953200587246_CA03028945", event.MetricSetFields["ipsec_tunnel.name"])
 	assert.Equal(t, "active", event.MetricSetFields["ipsec_tunnel.state"])
 	assert.Equal(t, "gw_NY01502_009953200587246_CA03028945", event.MetricSetFields["ipsec_tunnel.gw"])
 	assert.Equal(t, "ESP", event.MetricSetFields["ipsec_tunnel.proto"])
@@ -403,7 +506,7 @@ func TestGetIPSecTunnelEvents_NoState(t *testing.T) {
 	event := events[0]
 	assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
 	assert.Equal(t, "tl_DC02401_008842100459123_WS02019834", event.MetricSetFields["ipsec_tunnel.name"])
-	assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"], "State should be empty when flow query returns no entries")
+	assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"], "State should be empty when flow query returns no entries")
 }
 
 func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
@@ -439,7 +542,7 @@ func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
 
 	// Entry 2: no state (empty flow response)
 	assert.Equal(t, 2, events[1].MetricSetFields["ipsec_tunnel.id"])
-	assert.Equal(t, "", events[1].MetricSetFields["ipsec_tunnel.state"])
+	assert.Empty(t, events[1].MetricSetFields["ipsec_tunnel.state"])
 
 	// Entry 3: state = "init"
 	assert.Equal(t, 3, events[2].MetricSetFields["ipsec_tunnel.id"])
@@ -450,6 +553,141 @@ func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
 	assert.Equal(t, "down", events[3].MetricSetFields["ipsec_tunnel.state"])
 }
 
+func TestGetIPSecTunnelEvents_QueriesOnlyMissingState(t *testing.T) {
+	var mu sync.Mutex
+	flowQueries := map[string]int{}
+
+	client := &mockPanwClient{
+		opFunc: func(req string) ([]byte, error) {
+			if req == IPSecTunnelsQuery {
+				return []byte(tunnelXMLMixedState), nil
+			}
+
+			mu.Lock()
+			flowQueries[req]++
+			mu.Unlock()
+
+			if req == tunnelFlowQuery(2) {
+				return []byte(flowResponseXML("down")), nil
+			}
+
+			return []byte(emptyFlowResponseXML), nil
+		},
+	}
+	m := newTestMetricSet(client)
+
+	events, err := getIPSecTunnelEvents(m)
+	require.NoError(t, err)
+	require.Len(t, events, 2)
+
+	assert.Equal(t, 1, events[0].MetricSetFields["ipsec_tunnel.id"])
+	assert.Equal(t, "active", events[0].MetricSetFields["ipsec_tunnel.state"])
+
+	assert.Equal(t, 2, events[1].MetricSetFields["ipsec_tunnel.id"])
+	assert.Equal(t, "down", events[1].MetricSetFields["ipsec_tunnel.state"])
+
+	mu.Lock()
+	defer mu.Unlock()
+	assert.Equal(t, 0, flowQueries[tunnelFlowQuery(1)])
+	assert.Equal(t, 1, flowQueries[tunnelFlowQuery(2)])
+}
+
+func TestGetIPSecTunnelEvents_LargeResponseStateEnrichment(t *testing.T) {
+	const totalTunnels = 40
+
+	baseStateForID := func(id int) string {
+		if id%3 == 0 {
+			return "base-active"
+		}
+		return ""
+	}
+
+	var mu sync.Mutex
+	flowQueryCount := 0
+	var callbackErr error
+
+	client := &mockPanwClient{
+		opFunc: func(req string) ([]byte, error) {
+			if req == IPSecTunnelsQuery {
+				return []byte(largeTunnelsXML(totalTunnels, baseStateForID)), nil
+			}
+
+			idTagStart := strings.Index(req, "<tunnel-id>")
+			idTagEnd := strings.Index(req, "</tunnel-id>")
+			if idTagStart == -1 || idTagEnd == -1 {
+				mu.Lock()
+				if callbackErr == nil {
+					callbackErr = fmt.Errorf("missing tunnel-id in request %q", req)
+				}
+				mu.Unlock()
+				return nil, callbackErr
+			}
+
+			idRaw := req[idTagStart+len("<tunnel-id>") : idTagEnd]
+			id, err := strconv.Atoi(idRaw)
+			if err != nil {
+				mu.Lock()
+				if callbackErr == nil {
+					callbackErr = fmt.Errorf("invalid tunnel-id %q: %w", idRaw, err)
+				}
+				mu.Unlock()
+				return nil, callbackErr
+			}
+
+			mu.Lock()
+			flowQueryCount++
+			if id%3 == 0 && callbackErr == nil {
+				callbackErr = fmt.Errorf("unexpected flow query for tunnel with base state: id=%d", id)
+			}
+			mu.Unlock()
+
+			if id%3 == 0 {
+				return nil, fmt.Errorf("unexpected flow query for tunnel with base state: id=%d", id)
+			}
+
+			return []byte(flowResponseXML(fmt.Sprintf("flow-%d", id))), nil
+		},
+	}
+
+	m := newTestMetricSet(client)
+	events, err := getIPSecTunnelEvents(m)
+	require.NoError(t, err)
+	require.NoError(t, callbackErr)
+	require.Len(t, events, totalTunnels)
+
+	for i, event := range events {
+		id := i + 1
+		assert.Equal(t, id, event.MetricSetFields["ipsec_tunnel.id"])
+
+		if id%3 == 0 {
+			assert.Equal(t, "base-active", event.MetricSetFields["ipsec_tunnel.state"])
+		} else {
+			assert.Equal(t, fmt.Sprintf("flow-%d", id), event.MetricSetFields["ipsec_tunnel.state"])
+		}
+	}
+
+	expectedFlowQueries := totalTunnels - (totalTunnels / 3)
+	mu.Lock()
+	defer mu.Unlock()
+	assert.Equal(t, expectedFlowQueries, flowQueryCount)
+}
+
+func TestGetIPSecTunnelEvents_StatusError(t *testing.T) {
+	const errorTunnelsResponseXML = `<response status="error"><result><msg>request failed</msg></result></response>`
+
+	client := &mockPanwClient{
+		opFunc: func(req string) ([]byte, error) {
+			return []byte(errorTunnelsResponseXML), nil
+		},
+	}
+	m := newTestMetricSet(client)
+
+	events, err := getIPSecTunnelEvents(m)
+	require.Error(t, err)
+	assert.Nil(t, events)
+	assert.Contains(t, err.Error(), `IPSec tunnels query returned status "error"`)
+}
+
 func TestGetIPSecTunnelEvents_EmptyEntries(t *testing.T) {
 	client := &mockPanwClient{
 		opFunc: func(req string) ([]byte, error) {
@@ -559,7 +797,7 @@ func TestGetIPSecTunnelEvents_FlowQueryError(t *testing.T) {
 	client := &mockPanwClient{
 		opFunc: func(req string) ([]byte, error) {
 			if req == IPSecTunnelsQuery {
-				return []byte(tunnelXMLWithState), nil
+				return []byte(tunnelXMLNoState), nil
 			}
 			// Flow query fails
 			return nil, errors.New("timeout")
@@ -573,6 +811,28 @@ func TestGetIPSecTunnelEvents_FlowQueryError(t *testing.T) {
 
 	// State should be empty since flow query failed
 	event := events[0]
-	assert.Equal(t, 8, event.MetricSetFields["ipsec_tunnel.id"])
-	assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"])
+	assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
+	assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"])
+}
+
+func TestGetIPSecTunnelEvents_FlowQueryStatusError(t *testing.T) {
+	const errorFlowResponseXML = `<response status="error"><result><msg>invalid tunnel</msg></result></response>`
+
+	client := &mockPanwClient{
+		opFunc: func(req string) ([]byte, error) {
+			if req == IPSecTunnelsQuery {
+				return []byte(tunnelXMLNoState), nil
+			}
+			return []byte(errorFlowResponseXML), nil
+		},
+	}
+	m := newTestMetricSet(client)
+
+	events, err := getIPSecTunnelEvents(m)
+	require.NoError(t, err)
+	require.Len(t, events, 1)
+
+	event := events[0]
+	assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
+	assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"])
 }

@shmsr shmsr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left comments.

Comment thread changelog/fragments/1768219693-add-ipsec-tunnel-state.yaml Outdated
mrg-elastic and others added 2 commits March 10, 2026 12:32
Co-authored-by: subham sarkar <sarkar.subhams2@gmail.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@x-pack/metricbeat/module/panw/interfaces/tunnels.go`:
- Around line 83-90: The worker goroutines iterate over jobs and concurrently
call getTunnelState which uses the shared MetricSet.client (a pango.Firewall.Op)
that is not thread-safe; protect those calls by serializing access to the
client—either run the fetch loop single-threaded or add a mutex around every
call that touches MetricSet.client/ pango.Firewall.Op (e.g., wrap the body of
getTunnelState or the call site in the worker loop with a sync.Mutex
lock/unlock) so only one goroutine mutates or uses the client at a time; update
references in the worker pool (jobs loop, getTunnelState, and MetricSet.client)
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c08498dc-328b-445e-9669-7d1162f147dd

📥 Commits

Reviewing files that changed from the base of the PR and between 67635ca and 40bdd71.

📒 Files selected for processing (3)
  • changelog/fragments/1768219693-add-ipsec-tunnel-state.yaml
  • x-pack/metricbeat/module/panw/interfaces/tunnels.go
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
Comment thread x-pack/metricbeat/module/panw/interfaces/tunnels.go

@shmsr shmsr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; just that one comment from coderabbit

@ghost ghost merged commit 2d47ee9 into elastic:main Mar 12, 2026
34 checks passed
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team

4 participants