Skip to content

[metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization#45857

Merged
mykola-elastic merged 21 commits intoelastic:mainfrom
mykola-elastic:fix-sqlserver-leak
Aug 22, 2025
Merged

[metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization#45857
mykola-elastic merged 21 commits intoelastic:mainfrom
mykola-elastic:fix-sqlserver-leak

Conversation

@mykola-elastic
Copy link
Contributor

@mykola-elastic mykola-elastic commented Aug 8, 2025

Suggested fix for the connection string leak described in the issue: #45852

After this change not only sensitive string will be replaced in the error message, but the escaped and url-encoded representations of it will be replaced. (Escaped as in fmt.Sprintf("%q", sensitive)). Additionally, this PR also improves sanitizeError by adding pattern-based sanitization.

What triggered the original issue: net/url library (in case of the issue, invoked by mssql driver) has its own error type which uses %q when printing URLs:

// go/src/net/url/url.go
type Error struct {
	Op  string
	URL string
	Err error
}

func (e *Error) Error() string { return fmt.Sprintf("%s %q: %s", e.Op, e.URL, e.Err) }

Additional refactoring:

  • moved SanitizeError to the "github.com/elastic/beats/v7/metricbeat/helper/sql" package as it is a generic helper function, which is also related targeted at errors returned by sql driver libs
  • implemented custom error type sqlSanitizedError so that errors.Is() and errors.As() still work with the errors returned by SanitizeError
  • adding error checks or //nolint:errcheck comments in "github.com/elastic/beats/v7/metricbeat/helper/sql" package, as changes there triggered more CI checks

Proposed commit message

See title.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Use metricbeat module sql with driver set to driver: mssql
  2. Set hosts to some invalid string with symbols (", \, etc.)
  3. The parse error shouldn't contain the host connection string, but contain (redacted) in the place of it

Related issues

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 8, 2025
@mykola-elastic mykola-elastic added Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team and removed needs_team Indicates that the issue/PR needs a Team:* label labels Aug 8, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2025

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@mergify
Copy link
Contributor

mergify bot commented Aug 8, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @mykola-elastic? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.
@mykola-elastic mykola-elastic marked this pull request as ready for review August 8, 2025 14:46
@mykola-elastic mykola-elastic requested a review from a team as a code owner August 8, 2025 14:46
@mykola-elastic mykola-elastic added the backport-active-all Automated backport with mergify to all the active branches label Aug 8, 2025
@mergify
Copy link
Contributor

mergify bot commented Aug 11, 2025

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix-sqlserver-leak upstream/fix-sqlserver-leak
git merge upstream/main
git push upstream fix-sqlserver-leak
@muthu-mps muthu-mps requested a review from Copilot August 14, 2025 10:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a security issue where sensitive connection strings could leak in error messages from the SQL metricbeat module. The fix enhances the sanitizeError function to replace both the original sensitive string and its escaped representation in error messages.

  • Enhances sanitizeError function to replace escaped sensitive strings using fmt.Sprintf("%q", sensitive)
  • Adds comprehensive test cases for both unescaped and escaped sensitive string scenarios
  • Updates changelog to document the security enhancement

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
x-pack/metricbeat/module/sql/query/query.go Enhanced sanitizeError function to handle escaped sensitive strings
x-pack/metricbeat/module/sql/query/query_test.go Added test cases for escaped sensitive string scenarios
CHANGELOG.next.asciidoc Documents the security enhancement

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@shmsr
Copy link
Member

shmsr commented Aug 18, 2025

Can we use sanitizeError here as well; please check if there are any other cases that are missed (across all log levels)

diff --git a/x-pack/metricbeat/module/sql/query/dsn.go b/x-pack/metricbeat/module/sql/query/dsn.go
index f072b472fd..97b43a423a 100644
--- a/x-pack/metricbeat/module/sql/query/dsn.go
+++ b/x-pack/metricbeat/module/sql/query/dsn.go
@@ -88,7 +88,7 @@ func sanitize(host string) string {
 func oracleParseDSN(config ConnectionDetails, host string) (mb.HostData, error) {
 	params, err := godror.ParseDSN(host)
 	if err != nil {
-		return mb.HostData{}, fmt.Errorf("error trying to parse connection string in field 'hosts': %w", err)
+		return mb.HostData{}, fmt.Errorf("error trying to parse connection string in field 'hosts': %w", sanitizeError(err, host))
 	}
 	if params.Username == "" {
 		params.Username = config.Username
@@ -110,7 +110,7 @@ func mysqlParseDSN(config ConnectionDetails, host string, logger *logp.Logger) (
 	c, err := mysql.ParseDSN(host)
 
 	if err != nil {
-		return mb.HostData{}, fmt.Errorf("error trying to parse connection string in field 'hosts': %w", err)
+		return mb.HostData{}, fmt.Errorf("error trying to parse connection string in field 'hosts': %w", sanitizeError(err, host))
 	}
 
 	sanitized := c.Addr
diff --git a/x-pack/metricbeat/module/sql/query/query.go b/x-pack/metricbeat/module/sql/query/query.go
index e86329589b..62de44b3d6 100644
--- a/x-pack/metricbeat/module/sql/query/query.go
+++ b/x-pack/metricbeat/module/sql/query/query.go
@@ -232,7 +232,7 @@ func (m *MetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) (fetchErr
 	if !m.Config.FetchFromAllDatabases {
 		reported, err := m.fetch(ctx, db, reporter, queries)
 		if err != nil {
-			m.Logger().Warn("error while fetching:", err)
+			m.Logger().Warn("error while fetching:", sanitizeError(err, m.HostData().URI))
 		}
 		if !reported {
 			m.Logger().Debug("error trying to emit event")
@@ -274,7 +274,7 @@ func (m *MetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) (fetchErr
 
 		val, err := dbNames[i].GetValue("name")
 		if err != nil {
-			m.Logger().Warn("error with database name:", err)
+			m.Logger().Warn("error with database name:", sanitizeError(err, m.HostData().URI))
 			continue
 		}
 		dbName, ok := val.(string)
@@ -292,7 +292,7 @@ func (m *MetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) (fetchErr
 
 		reported, err := m.fetch(ctx, db, reporter, qs)
 		if err != nil {
-			m.Logger().Warn("error while fetching:", err)
+			m.Logger().Warn("error while fetching:", sanitizeError(err, m.HostData().URI))
 		}
 		if !reported {
 			m.Logger().Debug("error trying to emit event")
@shmsr
Copy link
Member

shmsr commented Aug 18, 2025

Gave few more requirements to Copilot to handle postgres/ sqlserver/ odbc-like connection string formats as well URL encoded secret inputs. I think now we are well covered now that handle common connection strings across the drivers we support through SQL module.

diff --git a/x-pack/metricbeat/module/sql/query/query.go b/x-pack/metricbeat/module/sql/query/query.go
index 62de44b3d6..c99929da98 100644
--- a/x-pack/metricbeat/module/sql/query/query.go
+++ b/x-pack/metricbeat/module/sql/query/query.go
@@ -10,6 +10,8 @@ import (
 	"context"
 	"errors"
 	"fmt"
+	"net/url"
+	"regexp"
 	"strings"
 
 	"github.com/elastic/beats/v7/metricbeat/helper/sql"
@@ -391,23 +393,63 @@ func inferTypeFromMetrics(ms mapstr.M) mapstr.M {
 	return ret
 }
 
// sanitizeError replaces all occurrences of 'sensitive' parameter in err.Error() with "(redacted)"
+// It also sanitizes common patterns that might contain passwords or sensitive data
 func sanitizeError(err error, sensitive string) error {
 	if err == nil {
 		return nil
 	}
 
-	sensitive = strings.TrimSpace(sensitive)
-
-	if sensitive == "" {
-		return err
+	msg := err.Error()
+
+	// First, replace the primary sensitive string if provided (raw, quoted, and URL-encoded forms)
+	if s := strings.TrimSpace(sensitive); s != "" {
+		// raw
+		msg = strings.ReplaceAll(msg, s, "(redacted)")
+		// quoted (fmt %q style)
+		quoted := fmt.Sprintf("%q", s)
+		msg = strings.ReplaceAll(msg, quoted, "(redacted)")
+		// URL-encoded (both query and path escaping just to be safe)
+		qEsc := url.QueryEscape(s)
+		if qEsc != s {
+			msg = strings.ReplaceAll(msg, qEsc, "(redacted)")
+		}
+		pEsc := url.PathEscape(s)
+		if pEsc != s && pEsc != qEsc {
+			msg = strings.ReplaceAll(msg, pEsc, "(redacted)")
+		}
 	}
 
-	escapedSensitive := fmt.Sprintf("%q", sensitive)
+	// Pattern-based sanitization for common secrets in errors (URLs, DSNs, key/value strings, JSON, query params).
+	// Order matters: apply more specific URL userinfo patterns first.
+	patterns := []struct {
+		re   *regexp.Regexp
+		repl string
+	}{
+		// scheme://user:password@host -> redact password
+		{regexp.MustCompile(`([a-z][a-z0-9+\.\-]*://[^:/?#\s]+):([^@/\s]+)@`), `$1:(redacted)@`},
+		// user:password@... (no scheme), common in MySQL DSNs like user:pass@tcp(...)
+		// Important: Disallow '/' in the password part to avoid matching URI schemes like "postgres://..."
+		{regexp.MustCompile(`(^|[\s'\"])([^:/@\s]+):([^@/\s]+)@`), `$1$2:(redacted)@`},
+
+		// Key=Value forms (connection strings): handle quoted and unquoted values.
+		// Single-quoted values: Password='secret'; Token='abc';
+		{regexp.MustCompile(`(?i)\b(password|pwd|pass|passwd|token|secret)\b\s*=\s*'[^']*'`), `$1='(redacted)'`},
+		// Double-quoted values: Password="secret value";
+		{regexp.MustCompile(`(?i)\b(password|pwd|pass|passwd|token|secret)\b\s*=\s*"[^"]*"`), `$1="(redacted)"`},
+		// Unquoted values until delimiter or whitespace: Password=secret123; PASS=foo
+		{regexp.MustCompile(`(?i)\b(password|pwd|pass|passwd|token|secret)\b\s*=\s*[^;,#&\s]+`), `$1=(redacted)`},
+
+		// JSON-style fields: {"password":"secret"}
+		{regexp.MustCompile(`(?i)"(password|pwd|pass|passwd|token|secret)"\s*:\s*"(?:[^"\\]|\\.)*"`), `"$1":"(redacted)"`},
+
+		// Query parameters in URLs: ?password=secret&...
+		{regexp.MustCompile(`(?i)([?&])(password|pwd|pass|passwd|token|secret)\s*=\s*([^&#\s]+)`), `$1$2=(redacted)`},
+	}
 
-	sanitizedMessage := err.Error()
-	sanitizedMessage = strings.ReplaceAll(sanitizedMessage, sensitive, "(redacted)")
-	sanitizedMessage = strings.ReplaceAll(sanitizedMessage, escapedSensitive, "(redacted)")
+	for _, p := range patterns {
+		msg = p.re.ReplaceAllString(msg, p.repl)
+	}
 
-	return errors.New(sanitizedMessage)
+	return errors.New(msg)
 }
diff --git a/x-pack/metricbeat/module/sql/query/query_test.go b/x-pack/metricbeat/module/sql/query/query_test.go
index 47b752f513..09d83d9961 100644
--- a/x-pack/metricbeat/module/sql/query/query_test.go
+++ b/x-pack/metricbeat/module/sql/query/query_test.go
@@ -83,6 +83,34 @@ func TestSanitizeError(t *testing.T) {
 			expectedErr:  "cannot open connection: testing connection: parse (redacted): net/url: invalid userinfo",
 			expectNilErr: false,
 		},
+		{
+			name:         "Pattern-based password sanitization in connection string",
+			err:          errors.New("Failed to connect: Server=localhost;Database=myDB;User Id=admin;Password=secret123;"),
+			sensitive:    "",
+			expectedErr:  "Failed to connect: Server=localhost;Database=myDB;User Id=admin;Password=(redacted);",
+			expectNilErr: false,
+		},
+		{
+			name:         "Pattern-based URL auth sanitization",
+			err:          errors.New("Connection failed for postgres://user:mypassword@localhost:5432/db"),
+			sensitive:    "",
+			expectedErr:  "Connection failed for postgres://user:(redacted)@localhost:5432/db",
+			expectNilErr: false,
+		},
+		{
+			name:         "URL-encoded sensitive data",
+			err:          errors.New("Failed to parse: secret%40123"),
+			sensitive:    "secret@123",
+			expectedErr:  "Failed to parse: (redacted)",
+			expectNilErr: false,
+		},
+		{
+			name:         "Multiple password patterns",
+			err:          errors.New("pwd=test123 failed, also PASS=another456 failed"),
+			sensitive:    "",
+			expectedErr:  "pwd=(redacted) failed, also PASS=(redacted) failed",
+			expectNilErr: false,
+		},
 	}
 
 	for _, test := range tests {
@mykola-elastic
Copy link
Contributor Author

#45857 (comment)

I guess I can add similar defer statements to fetch and ParseDSN functions, what do you think? @shmsr

diff --git a/x-pack/metricbeat/module/sql/query/dsn.go b/x-pack/metricbeat/module/sql/query/dsn.go
index f072b472fd..73137ffc49 100644
--- a/x-pack/metricbeat/module/sql/query/dsn.go
+++ b/x-pack/metricbeat/module/sql/query/dsn.go
@@ -31,7 +31,10 @@ type ConnectionDetails struct {
 const mysqlTLSConfigKey = "custom"
 
 // ParseDSN tries to parse the host
-func ParseDSN(mod mb.Module, host string) (mb.HostData, error) {
+func ParseDSN(mod mb.Module, host string) (_ mb.HostData, fetchErr error) {
+       defer func() {
+               fetchErr = sanitizeError(fetchErr, host)
+       }()
 
        logger := logp.NewLogger("")
        // At the time of writing, mod always is of type *mb.BaseModule.
diff --git a/x-pack/metricbeat/module/sql/query/query.go b/x-pack/metricbeat/module/sql/query/query.go
index e86329589b..d302821a20 100644
--- a/x-pack/metricbeat/module/sql/query/query.go
+++ b/x-pack/metricbeat/module/sql/query/query.go
@@ -146,7 +146,11 @@ func dbSelector(driver, dbName string) string {
        return ""
 }
 
-func (m *MetricSet) fetch(ctx context.Context, db *sql.DbClient, reporter mb.ReporterV2, queries []query) (bool, error) {
+func (m *MetricSet) fetch(ctx context.Context, db *sql.DbClient, reporter mb.ReporterV2, queries []query) (_ bool, fetchErr error) {
+       defer func() {
+               fetchErr = sanitizeError(fetchErr, m.HostData().URI)
+       }()
+
        var ok bool
        merged := make(mapstr.M, 0)
        storeQueries := make([]string, 0, len(queries))

That only doesn't handle the code for val, err := dbNames[i].GetValue("name") you have. But I don't think that case is needed. It is a map with database names, not with connection strings

Copy link
Contributor

@stefans-elastic stefans-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's a good idea for sanitizeError to be to be discarding original (passed to sanitizeError) error. It means that all error info (except message itself) is lost: error type (like url.Error) or it might be a bunch of errors wrapped in each other and this info will be lost.

I think the function shouldn't have any side effects or at least state the side effect in godoc comment so it is easily noticeable to whoever chooses to use this function.

@mykola-elastic mykola-elastic requested a review from a team August 22, 2025 06:08
Copy link
Member

@ishleenk17 ishleenk17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex part is good for adding extra safegaurd. Although it made the code readability quite complex :) But I am ok with it.

Rest looks good, Thanks

Co-authored-by: Aman <38116245+devamanv@users.noreply.github.com>
@mykola-elastic mykola-elastic merged commit bf63860 into elastic:main Aug 22, 2025
51 checks passed
@github-actions
Copy link
Contributor

@Mergifyio backport 8.17 8.18 8.19 9.0 9.1

mergify bot pushed a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

# Conflicts:
#	x-pack/metricbeat/module/sql/query/dsn.go
#	x-pack/metricbeat/module/sql/query/query_test.go
mergify bot pushed a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

# Conflicts:
#	x-pack/metricbeat/module/sql/query/dsn.go
mergify bot pushed a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

# Conflicts:
#	x-pack/metricbeat/module/sql/query/dsn.go
#	x-pack/metricbeat/module/sql/query/query_test.go
mergify bot pushed a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)
mergify bot pushed a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

# Conflicts:
#	x-pack/metricbeat/module/sql/query/dsn.go
#	x-pack/metricbeat/module/sql/query/query_test.go
mykola-elastic added a commit that referenced this pull request Aug 22, 2025
…s escaped, add pattern-based sanitization (#45857) (#46189)

(cherry picked from commit bf63860)

Co-authored-by: Mykola Kmet <mykola.kmet@elastic.co>
@mykola-elastic
Copy link
Contributor Author

Conflicts in backport branches resolved. The backports were additionally manually tested:

  • using correct config with mysql to confirm the metrics are incoming
  • using invalid config which triggered error with the (redacted) inside.
mykola-elastic added a commit that referenced this pull request Aug 22, 2025
…sitive info even if it is escaped, add pattern-based sanitization (#46187)

* [metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

---------

Co-authored-by: Mykola Kmet <mykola.kmet@elastic.co>
mykola-elastic added a commit that referenced this pull request Aug 22, 2025
…sitive info even if it is escaped, add pattern-based sanitization (#46185)

* [metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

---------

Co-authored-by: Mykola Kmet <mykola.kmet@elastic.co>
mykola-elastic added a commit that referenced this pull request Aug 22, 2025
…sitive info even if it is escaped, add pattern-based sanitization (#46186)

* [metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)

---------

Co-authored-by: Mykola Kmet <mykola.kmet@elastic.co>
mykola-elastic added a commit that referenced this pull request Aug 22, 2025
…itive info even if it is escaped, add pattern-based sanitization (#46188)

* [metricbeat] [sql] sanitizeError: replace sensitive info even if it is escaped, add pattern-based sanitization (#45857)

(cherry picked from commit bf63860)


---------

Co-authored-by: Mykola Kmet <mykola.kmet@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-active-all Automated backport with mergify to all the active branches bugfix Metricbeat Metricbeat Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team

7 participants