[panw] Add state field to IPSec tunnels#48403
Conversation
a7155bc to
34496d1
Compare
🤖 GitHub commentsJust comment with:
|
|
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
1ca7afa to
c126334
Compare
"panw": {
"interfaces": {
"ipsec_tunnel.TSi_proto": 0,
"ipsec_tunnel.TSr_prefix": 0,
"ipsec_tunnel.proto": "ESP",
"ipsec_tunnel.kb": 0,
"ipsec_tunnel.hash": "SHA1",
"ipsec_tunnel.mode": "tunl",
"ipsec_tunnel.TSi_port": 0,
"ipsec_tunnel.enc": "AES256",
"ipsec_tunnel.id": 2,
"ipsec_tunnel.TSr_proto": 0,
"ipsec_tunnel.gw": "REDACTED",
"ipsec_tunnel.life.sec": 8400,
"ipsec_tunnel.TSr_ip": "0.0.0.0",
"ipsec_tunnel.TSi_ip": "0.0.0.0",
"ipsec_tunnel.TSi_prefix": 0,
"ipsec_tunnel.TSr_port": 0,
"ipsec_tunnel.state": "init",
"ipsec_tunnel.name": "REDACTED",
"ipsec_tunnel.dh": "no-pfs"
} |
📝 WalkthroughWalkthroughAdds IPSec tunnel state tracking to the panw Metricbeat module. Introduces a changelog fragment and a ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@x-pack/metricbeat/module/panw/interfaces/tunnels.go`:
- Around line 23-40: getTunnelState currently ignores PAN-OS error payloads
because it only checks transport/unmarshal errors; update getTunnelState to
inspect the unmarshaled TunnelFlowResponse.Status (and any error/message fields
on TunnelFlowResponse) and return a formatted error when Status indicates a
failure instead of falling through to return an empty state. Locate
getTunnelState and the TunnelFlowResponse handling, check response.Status (e.g.
!= "success" or == "error") and return an error that includes the PAN-OS error
details so the caller's warning path is exercised rather than emitting an empty
tunnel state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8da48d73-17d8-453d-84a6-9bbfb6494c3c
📒 Files selected for processing (4)
changelog/fragments/1768219693-add-ipsec-tunnel-state.yamlx-pack/metricbeat/module/panw/interfaces/interface_types.gox-pack/metricbeat/module/panw/interfaces/tunnels.gox-pack/metricbeat/module/panw/interfaces/tunnels_test.go
|
Root cause from logs:
Recommended minimal fix:
After patching, rerun Tests run here: workflow-log analysis only (no local test execution). What is this? | From workflow: PR Actions Detective Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not. |
|
A few issues still need to be addressed before I’d be comfortable merging this.
cc: @gpop63 I already have the patch for the same; let me know if you want me to push? |
diff --git a/x-pack/metricbeat/module/panw/interfaces/tunnels.go b/x-pack/metricbeat/module/panw/interfaces/tunnels.go
index e3bb2f7b7b..7c206c9cca 100644
--- a/x-pack/metricbeat/module/panw/interfaces/tunnels.go
+++ b/x-pack/metricbeat/module/panw/interfaces/tunnels.go
@@ -7,6 +7,7 @@ package interfaces
import (
"encoding/xml"
"fmt"
+ "sync"
"time"
"github.com/elastic/beats/v7/metricbeat/mb"
@@ -14,7 +15,10 @@ import (
"github.com/elastic/elastic-agent-libs/mapstr"
)
-const IPSecTunnelsQuery = "<show><vpn><tunnel></tunnel></vpn></show>"
+const (
+ IPSecTunnelsQuery = "<show><vpn><tunnel></tunnel></vpn></show>"
+ maxConcurrentTunnelStateQueries = 5
+)
func tunnelFlowQuery(tunnelID int) string {
return fmt.Sprintf("<show><running><tunnel><flow><tunnel-id>%d</tunnel-id></flow></tunnel></running></show>", tunnelID)
@@ -32,6 +36,9 @@ func getTunnelState(m *MetricSet, tunnelID int) (string, error) {
if err != nil {
return "", fmt.Errorf("error unmarshaling tunnel flow response for tunnel %d: %w", tunnelID, err)
}
+ if response.Status != "success" {
+ return "", fmt.Errorf("tunnel flow query for tunnel %d returned status %q", tunnelID, response.Status)
+ }
if len(response.Result.IPSec.Entries) > 0 {
return response.Result.IPSec.Entries[0].State, nil
@@ -41,7 +48,6 @@ func getTunnelState(m *MetricSet, tunnelID int) (string, error) {
}
func getIPSecTunnelEvents(m *MetricSet) ([]mb.Event, error) {
-
var response TunnelsResponse
output, err := m.client.Op(IPSecTunnelsQuery, panw.Vsys, nil, nil)
@@ -55,21 +61,62 @@ func getIPSecTunnelEvents(m *MetricSet) ([]mb.Event, error) {
m.logger.Error("Error: %s", err)
return nil, fmt.Errorf("error unmarshaling IPSec tunnels response: %w", err)
}
+ if response.Status != "success" {
+ return nil, fmt.Errorf("IPSec tunnels query returned status %q", response.Status)
+ }
+
+ type stateResult struct {
+ index int
+ id int
+ state string
+ err error
+ }
+
+ jobs := make(chan int)
+ results := make(chan stateResult, len(response.Result.Entries))
+
+ workers := maxConcurrentTunnelStateQueries
+ if workers > len(response.Result.Entries) {
+ workers = len(response.Result.Entries)
+ }
+
+ var wg sync.WaitGroup
+ for w := 0; w < workers; w++ {
+ wg.Add(1)
+ go func() {
+ defer wg.Done()
+ for i := range jobs {
+ entry := response.Result.Entries[i]
+ state, err := getTunnelState(m, entry.ID)
+ results <- stateResult{index: i, id: entry.ID, state: state, err: err}
+ }
+ }()
+ }
- // Fetch state for each tunnel via individual flow queries
for i, entry := range response.Result.Entries {
- state, err := getTunnelState(m, entry.ID)
- if err != nil {
- m.logger.Warnf("Failed to get state for tunnel %d: %s", entry.ID, err)
+ if entry.State != "" {
continue
}
- response.Result.Entries[i].State = state
+ jobs <- i
+ }
+ close(jobs)
+
+ wg.Wait()
+ close(results)
+
+ for result := range results {
+ if result.err != nil {
+ m.logger.Warnf("Failed to get state for tunnel %d: %s", result.id, result.err)
+ continue
+ }
+ if result.state != "" {
+ response.Result.Entries[result.index].State = result.state
+ }
}
events := formatIPSecTunnelEvents(m, response.Result.Entries)
return events, nil
-
}
func formatIPSecTunnelEvents(m *MetricSet, entries []TunnelsEntry) []mb.Event {
@@ -111,5 +158,4 @@ func formatIPSecTunnelEvents(m *MetricSet, entries []TunnelsEntry) []mb.Event {
}
return events
-
}
diff --git a/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go b/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
index 7260a1f578..fec701507a 100644
--- a/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
+++ b/x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
@@ -7,6 +7,9 @@ package interfaces
import (
"errors"
"fmt"
+ "strconv"
+ "strings"
+ "sync"
"testing"
"github.com/stretchr/testify/assert"
@@ -21,7 +24,17 @@ type mockPanwClient struct {
}
func (m *mockPanwClient) Op(req any, vsys string, extras, ans any) ([]byte, error) {
- return m.opFunc(req.(string))
+ reqStr, ok := req.(string)
+ if !ok {
+ return nil, fmt.Errorf("unexpected request type %T", req)
+ }
+
+ out, err := m.opFunc(reqStr)
+ if err != nil {
+ return nil, err
+ }
+
+ return out, nil
}
func flowResponseXML(state string) string {
@@ -35,6 +48,49 @@ func flowResponseXML(state string) string {
</result></response>`, state)
}
+func largeTunnelEntryXML(id int, state string) string {
+ stateTag := ""
+ if state != "" {
+ stateTag = fmt.Sprintf("<state>%s</state>", state)
+ }
+
+ return fmt.Sprintf(`<entry>
+ %s
+ <gw>gw_%d</gw>
+ <kb>%d</kb>
+ <life>3600</life>
+ <TSr_port>0</TSr_port>
+ <hash>SHA256</hash>
+ <TSi_prefix>0</TSi_prefix>
+ <TSi_ip>0.0.0.0</TSi_ip>
+ <proto>ESP</proto>
+ <TSr_proto>0</TSr_proto>
+ <enc>AES256-GCM16</enc>
+ <TSr_prefix>0</TSr_prefix>
+ <mode>tunl</mode>
+ <TSi_port>0</TSi_port>
+ <TSr_ip>0.0.0.0</TSr_ip>
+ <dh>DH20</dh>
+ <id>%d</id>
+ <TSi_proto>0</TSi_proto>
+ <name>tl_%d</name>
+ </entry>`, stateTag, id, id*10, id, id)
+}
+
+func largeTunnelsXML(total int, withBaseState func(id int) string) string {
+ var b strings.Builder
+ b.WriteString(`<response status="success"><result><ntun>`)
+ b.WriteString(strconv.Itoa(total))
+ b.WriteString(`</ntun><entries>`)
+
+ for i := 1; i <= total; i++ {
+ b.WriteString(largeTunnelEntryXML(i, withBaseState(i)))
+ }
+
+ b.WriteString(`</entries></result></response>`)
+ return b.String()
+}
+
const emptyFlowResponseXML = `<response status="success"><result>
<dp>dp0</dp>
<IPSec>
@@ -71,6 +127,7 @@ const tunnelXMLWithState = `<response status="success"><result>
<ntun>1</ntun>
<entries>
<entry>
+ <state>active</state>
<gw>gw_NY01502_009953200587246_CA03028945</gw>
<kb>512</kb>
<life>7200</life>
@@ -179,6 +236,53 @@ const tunnelXMLMultipleEntries = `<response status="success"><result>
</entries>
</result></response>`
+const tunnelXMLMixedState = `<response status="success"><result>
+ <ntun>2</ntun>
+ <entries>
+ <entry>
+ <state>active</state>
+ <gw>gw_LA03601_001122334455667_TX04037856</gw>
+ <kb>1024</kb>
+ <life>1800</life>
+ <TSr_port>0</TSr_port>
+ <hash>SHA256</hash>
+ <TSi_prefix>16</TSi_prefix>
+ <TSi_ip>172.16.0.0</TSi_ip>
+ <proto>ESP</proto>
+ <TSr_proto>0</TSr_proto>
+ <enc>AES256-GCM16</enc>
+ <TSr_prefix>16</TSr_prefix>
+ <mode>tunl</mode>
+ <TSi_port>0</TSi_port>
+ <TSr_ip>172.17.0.0</TSr_ip>
+ <dh>DH19</dh>
+ <id>1</id>
+ <TSi_proto>0</TSi_proto>
+ <name>tl_LA03601_001122334455667_TX04037856</name>
+ </entry>
+ <entry>
+ <gw>gw_SF02701_009988776655443_OR05048967</gw>
+ <kb>0</kb>
+ <life>3600</life>
+ <TSr_port>0</TSr_port>
+ <hash>SHA384</hash>
+ <TSi_prefix>0</TSi_prefix>
+ <TSi_ip>0.0.0.0</TSi_ip>
+ <proto>ESP</proto>
+ <TSr_proto>0</TSr_proto>
+ <enc>AES192-CBC</enc>
+ <TSr_prefix>0</TSr_prefix>
+ <mode>tunl</mode>
+ <TSi_port>0</TSi_port>
+ <TSr_ip>0.0.0.0</TSr_ip>
+ <dh>DH21</dh>
+ <id>2</id>
+ <TSi_proto>0</TSi_proto>
+ <name>tl_SF02701_009988776655443_OR05048967</name>
+ </entry>
+ </entries>
+</result></response>`
+
const tunnelXMLEmptyEntries = `<response status="success"><result>
<ntun>0</ntun>
<entries>
@@ -363,8 +467,8 @@ func TestGetIPSecTunnelEvents_WithState(t *testing.T) {
if req == IPSecTunnelsQuery {
return []byte(tunnelXMLWithState), nil
}
- // Flow query for tunnel 8
- return []byte(flowResponseXML("active")), nil
+ t.Fatalf("did not expect flow query when base tunnel state is present, got req=%s", req)
+ return nil, nil
},
}
m := newTestMetricSet(client)
@@ -375,7 +479,6 @@ func TestGetIPSecTunnelEvents_WithState(t *testing.T) {
event := events[0]
assert.Equal(t, 8, event.MetricSetFields["ipsec_tunnel.id"])
- assert.Equal(t, "tl_NY01502_009953200587246_CA03028945", event.MetricSetFields["ipsec_tunnel.name"])
assert.Equal(t, "active", event.MetricSetFields["ipsec_tunnel.state"])
assert.Equal(t, "gw_NY01502_009953200587246_CA03028945", event.MetricSetFields["ipsec_tunnel.gw"])
assert.Equal(t, "ESP", event.MetricSetFields["ipsec_tunnel.proto"])
@@ -403,7 +506,7 @@ func TestGetIPSecTunnelEvents_NoState(t *testing.T) {
event := events[0]
assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
assert.Equal(t, "tl_DC02401_008842100459123_WS02019834", event.MetricSetFields["ipsec_tunnel.name"])
- assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"], "State should be empty when flow query returns no entries")
+ assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"], "State should be empty when flow query returns no entries")
}
func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
@@ -439,7 +542,7 @@ func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
// Entry 2: no state (empty flow response)
assert.Equal(t, 2, events[1].MetricSetFields["ipsec_tunnel.id"])
- assert.Equal(t, "", events[1].MetricSetFields["ipsec_tunnel.state"])
+ assert.Empty(t, events[1].MetricSetFields["ipsec_tunnel.state"])
// Entry 3: state = "init"
assert.Equal(t, 3, events[2].MetricSetFields["ipsec_tunnel.id"])
@@ -450,6 +553,141 @@ func TestGetIPSecTunnelEvents_MultipleEntries(t *testing.T) {
assert.Equal(t, "down", events[3].MetricSetFields["ipsec_tunnel.state"])
}
+func TestGetIPSecTunnelEvents_QueriesOnlyMissingState(t *testing.T) {
+ var mu sync.Mutex
+ flowQueries := map[string]int{}
+
+ client := &mockPanwClient{
+ opFunc: func(req string) ([]byte, error) {
+ if req == IPSecTunnelsQuery {
+ return []byte(tunnelXMLMixedState), nil
+ }
+
+ mu.Lock()
+ flowQueries[req]++
+ mu.Unlock()
+
+ if req == tunnelFlowQuery(2) {
+ return []byte(flowResponseXML("down")), nil
+ }
+
+ return []byte(emptyFlowResponseXML), nil
+ },
+ }
+ m := newTestMetricSet(client)
+
+ events, err := getIPSecTunnelEvents(m)
+ require.NoError(t, err)
+ require.Len(t, events, 2)
+
+ assert.Equal(t, 1, events[0].MetricSetFields["ipsec_tunnel.id"])
+ assert.Equal(t, "active", events[0].MetricSetFields["ipsec_tunnel.state"])
+
+ assert.Equal(t, 2, events[1].MetricSetFields["ipsec_tunnel.id"])
+ assert.Equal(t, "down", events[1].MetricSetFields["ipsec_tunnel.state"])
+
+ mu.Lock()
+ defer mu.Unlock()
+ assert.Equal(t, 0, flowQueries[tunnelFlowQuery(1)])
+ assert.Equal(t, 1, flowQueries[tunnelFlowQuery(2)])
+}
+
+func TestGetIPSecTunnelEvents_LargeResponseStateEnrichment(t *testing.T) {
+ const totalTunnels = 40
+
+ baseStateForID := func(id int) string {
+ if id%3 == 0 {
+ return "base-active"
+ }
+ return ""
+ }
+
+ var mu sync.Mutex
+ flowQueryCount := 0
+ var callbackErr error
+
+ client := &mockPanwClient{
+ opFunc: func(req string) ([]byte, error) {
+ if req == IPSecTunnelsQuery {
+ return []byte(largeTunnelsXML(totalTunnels, baseStateForID)), nil
+ }
+
+ idTagStart := strings.Index(req, "<tunnel-id>")
+ idTagEnd := strings.Index(req, "</tunnel-id>")
+ if idTagStart == -1 || idTagEnd == -1 {
+ mu.Lock()
+ if callbackErr == nil {
+ callbackErr = fmt.Errorf("missing tunnel-id in request %q", req)
+ }
+ mu.Unlock()
+ return nil, callbackErr
+ }
+
+ idRaw := req[idTagStart+len("<tunnel-id>") : idTagEnd]
+ id, err := strconv.Atoi(idRaw)
+ if err != nil {
+ mu.Lock()
+ if callbackErr == nil {
+ callbackErr = fmt.Errorf("invalid tunnel-id %q: %w", idRaw, err)
+ }
+ mu.Unlock()
+ return nil, callbackErr
+ }
+
+ mu.Lock()
+ flowQueryCount++
+ if id%3 == 0 && callbackErr == nil {
+ callbackErr = fmt.Errorf("unexpected flow query for tunnel with base state: id=%d", id)
+ }
+ mu.Unlock()
+
+ if id%3 == 0 {
+ return nil, fmt.Errorf("unexpected flow query for tunnel with base state: id=%d", id)
+ }
+
+ return []byte(flowResponseXML(fmt.Sprintf("flow-%d", id))), nil
+ },
+ }
+
+ m := newTestMetricSet(client)
+ events, err := getIPSecTunnelEvents(m)
+ require.NoError(t, err)
+ require.NoError(t, callbackErr)
+ require.Len(t, events, totalTunnels)
+
+ for i, event := range events {
+ id := i + 1
+ assert.Equal(t, id, event.MetricSetFields["ipsec_tunnel.id"])
+
+ if id%3 == 0 {
+ assert.Equal(t, "base-active", event.MetricSetFields["ipsec_tunnel.state"])
+ } else {
+ assert.Equal(t, fmt.Sprintf("flow-%d", id), event.MetricSetFields["ipsec_tunnel.state"])
+ }
+ }
+
+ expectedFlowQueries := totalTunnels - (totalTunnels / 3)
+ mu.Lock()
+ defer mu.Unlock()
+ assert.Equal(t, expectedFlowQueries, flowQueryCount)
+}
+
+func TestGetIPSecTunnelEvents_StatusError(t *testing.T) {
+ const errorTunnelsResponseXML = `<response status="error"><result><msg>request failed</msg></result></response>`
+
+ client := &mockPanwClient{
+ opFunc: func(req string) ([]byte, error) {
+ return []byte(errorTunnelsResponseXML), nil
+ },
+ }
+ m := newTestMetricSet(client)
+
+ events, err := getIPSecTunnelEvents(m)
+ require.Error(t, err)
+ assert.Nil(t, events)
+ assert.Contains(t, err.Error(), `IPSec tunnels query returned status "error"`)
+}
+
func TestGetIPSecTunnelEvents_EmptyEntries(t *testing.T) {
client := &mockPanwClient{
opFunc: func(req string) ([]byte, error) {
@@ -559,7 +797,7 @@ func TestGetIPSecTunnelEvents_FlowQueryError(t *testing.T) {
client := &mockPanwClient{
opFunc: func(req string) ([]byte, error) {
if req == IPSecTunnelsQuery {
- return []byte(tunnelXMLWithState), nil
+ return []byte(tunnelXMLNoState), nil
}
// Flow query fails
return nil, errors.New("timeout")
@@ -573,6 +811,28 @@ func TestGetIPSecTunnelEvents_FlowQueryError(t *testing.T) {
// State should be empty since flow query failed
event := events[0]
- assert.Equal(t, 8, event.MetricSetFields["ipsec_tunnel.id"])
- assert.Equal(t, "", event.MetricSetFields["ipsec_tunnel.state"])
+ assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
+ assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"])
+}
+
+func TestGetIPSecTunnelEvents_FlowQueryStatusError(t *testing.T) {
+ const errorFlowResponseXML = `<response status="error"><result><msg>invalid tunnel</msg></result></response>`
+
+ client := &mockPanwClient{
+ opFunc: func(req string) ([]byte, error) {
+ if req == IPSecTunnelsQuery {
+ return []byte(tunnelXMLNoState), nil
+ }
+ return []byte(errorFlowResponseXML), nil
+ },
+ }
+ m := newTestMetricSet(client)
+
+ events, err := getIPSecTunnelEvents(m)
+ require.NoError(t, err)
+ require.Len(t, events, 1)
+
+ event := events[0]
+ assert.Equal(t, 5, event.MetricSetFields["ipsec_tunnel.id"])
+ assert.Empty(t, event.MetricSetFields["ipsec_tunnel.state"])
} |
Co-authored-by: subham sarkar <sarkar.subhams2@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@x-pack/metricbeat/module/panw/interfaces/tunnels.go`:
- Around line 83-90: The worker goroutines iterate over jobs and concurrently
call getTunnelState which uses the shared MetricSet.client (a pango.Firewall.Op)
that is not thread-safe; protect those calls by serializing access to the
client—either run the fetch loop single-threaded or add a mutex around every
call that touches MetricSet.client/ pango.Firewall.Op (e.g., wrap the body of
getTunnelState or the call site in the worker loop with a sync.Mutex
lock/unlock) so only one goroutine mutates or uses the client at a time; update
references in the worker pool (jobs loop, getTunnelState, and MetricSet.client)
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c08498dc-328b-445e-9669-7d1162f147dd
📒 Files selected for processing (3)
changelog/fragments/1768219693-add-ipsec-tunnel-state.yamlx-pack/metricbeat/module/panw/interfaces/tunnels.gox-pack/metricbeat/module/panw/interfaces/tunnels_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- x-pack/metricbeat/module/panw/interfaces/tunnels_test.go
shmsr
left a comment
There was a problem hiding this comment.
Looks good; just that one comment from coderabbit
Overview
This PR adds the
statefield to IPSec tunnels.Checklist
stresstest.shscript to run them under stress conditions and race detector to verify their stability../changelog/fragmentsusing the changelog tool.Disruptive User Impact
Author's Checklist
How to test this PR locally
Related issues
Use cases
Screenshots
Logs