FIX: Set levels explicitly to plot all classes in distinct colors in DecisionBoundaryDisplay#33300

Merged
ogrisel merged 4 commits into scikit-learn:main from AnneBeyer:fix_plot_levels
Feb 25, 2026

Conversation

@AnneBeyer
Contributor

Reference Issues/PRs

Fixes #32866 (supersedes #32867)

What does this implement/fix? Explain your changes.

Continuing the proposed fix from #32867, this sets the levels parameter explicitly to the unique target values for contour and extends them for contourf with response_method=predict (the other contourf cases are handled differently by plotting every class on a different surface, so they're not affected by this bug). This ensures that all classes (and class boundaries, respectively) are displayed in different colors.

Note that in #33015 we noticed that this can also occur when n_classes < 7. This happens because the default values for the levels don't necessarily correspond to the target classes. I used the data from the example in #33015 for the non-regression test here.
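
To illustrate the point about default levels, here is a minimal sketch in plain matplotlib (not the DecisionBoundaryDisplay code from this PR). The grid `zz` is a hypothetical stand-in for a `predict` response with integer class codes, and the half-integer `explicit_levels` are an assumption based on the contourf case described above. It prints the levels matplotlib picks on its own next to the explicit ones so they can be compared.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

n_classes = 11

# Hypothetical stand-in for a `predict` response: integer class codes on a grid.
xx, yy = np.meshgrid(np.linspace(0, 1, 110), np.linspace(0, 1, 110))
zz = np.minimum(np.floor(xx * n_classes), n_classes - 1)

fig, (ax_default, ax_explicit) = plt.subplots(ncols=2)

# Left: let matplotlib choose the levels automatically.
default_cs = ax_default.contourf(xx, yy, zz)

# Right: one filled band per class, with boundaries halfway between class codes.
explicit_levels = np.arange(n_classes + 1) - 0.5
explicit_cs = ax_explicit.contourf(xx, yy, zz, levels=explicit_levels)

print("default levels: ", default_cs.levels)
print("explicit levels:", explicit_cs.levels)
print("explicit bands: ", len(explicit_cs.levels) - 1)
```

With explicit levels, the number of filled bands equals the number of classes by construction, whereas the automatically chosen levels depend on matplotlib's tick locator and need not line up with the class codes.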

@ogrisel @lucyleeow @ThexXTURBOXx @leweex95

AI usage disclosure

I used AI assistance for:

  • Code generation (e.g., when writing an implementation or fixing a bug)
  • Test/benchmark generation
  • Documentation (including examples)
  • Research and understanding

Any other comments?

@AnneBeyer
Contributor Author

Repeating the visual check from #32867 (review)

Code
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import DecisionBoundaryDisplay


data = np.array(
    [
        [-1, -1],
        [-2, -1],
        [1, 1],
        [2, 1],
        [2, 2],
        [3, 2],
        [3, 3],
        [4, 3],
        [4, 4],
        [5, 4],
        [5, 5],
    ]
)
# target = np.asarray([str(i) for i in range(11)])
target = np.arange(11)
clf = LogisticRegression().fit(data, target)

cmap = "gist_rainbow"
_, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 8), constrained_layout=True)
for plot_method_idx, plot_method in enumerate(["contourf", "contour"]):
    for response_method_idx, response_method in enumerate(
        ["predict_proba", "decision_function", "predict"]
    ):
        ax = axes[plot_method_idx, response_method_idx]
        display = DecisionBoundaryDisplay.from_estimator(
            clf,
            data,
            multiclass_colors=cmap,
            response_method=response_method,
            plot_method=plot_method,
            ax=ax,
            alpha=0.5,
        )
        ax.scatter(
            data[:, 0],
            data[:, 1],
            c=target.astype(int),
            edgecolors="black",
            cmap=cmap,
        )
        if isinstance(display.surface_, list):
            levels = len(display.surface_[0].levels)
        else:
            levels = len(display.surface_.levels)
        ax.set_title(f"plot_method={plot_method}\nresponse_method={response_method}\nlevels={levels}")

main

(now with corrected colors for predict but not displaying all classes correctly)
image

this PR

image
@AnneBeyer AnneBeyer added this to Labs Feb 17, 2026
@github-project-automation github-project-automation bot moved this to Todo in Labs Feb 17, 2026
@AnneBeyer AnneBeyer moved this from Todo to In progress in Labs Feb 17, 2026
@ThexXTURBOXx

I can confirm that this works in my use-case and fixes the mentioned issue. Thank you very much!

Member

@virchan virchan left a comment

LGTM! Thanks, @AnneBeyer!

@virchan virchan added the Waiting for Second Reviewer First reviewer is done, need a second one! label Feb 18, 2026
Member

@ogrisel ogrisel left a comment

Thanks @AnneBeyer! LGTM besides the following.

# if hasattr(disp.surface_, "levels"):
# assert len(disp.surface_.levels) >= disp.n_classes

@pytest.mark.parametrize("y", [np.arange(6), [str(i) for i in np.arange(6)]])
Member

The title of #32866 implies that we need more than 7 distinct classes to reproduce the original problem: have you checked that this version of the test would actually fail on main?

Contributor Author

@AnneBeyer AnneBeyer Feb 24, 2026

Yes, this test now covers what I noticed here #32867 (comment) (and also why I removed the two lines above this test):

Even with 7 or fewer classes, the default levels don't necessarily match the classes, because they will be created as an evenly spaced array.
So we don't need to check the length of levels, but rather that the values actually correspond to our classes (or to the classes shifted by -0.5 and +0.5, as in the contourf case, though I have to admit that I'm not sure how @leweex95 figured that out).
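
One possible way to read the -0.5/+0.5 offsets: contourf fills the intervals between consecutive levels, so placing the boundaries halfway between integer class codes gives each class its own band. The sketch below checks this with NumPy only. The helper `band_index` is hypothetical, `np.digitize` only approximates matplotlib's interval handling, and the evenly spaced `even_levels` are an assumed stand-in for automatically chosen levels, not matplotlib's actual algorithm.

```python
import numpy as np


def band_index(class_codes, levels):
    """Return, for each class code, the index of the band between levels it falls in."""
    return np.digitize(class_codes, levels) - 1


n_classes = 11
class_codes = np.arange(n_classes)

# Assumption: 8 evenly spaced levels stand in for an automatic choice.
even_levels = np.linspace(class_codes.min(), class_codes.max(), 8)

# Boundaries halfway between consecutive integer class codes.
half_levels = np.arange(n_classes + 1) - 0.5

even_bands = band_index(class_codes, even_levels)
half_bands = band_index(class_codes, half_levels)

print("distinct bands, evenly spaced levels:", len(np.unique(even_bands)))
print("distinct bands, half-integer levels: ", len(np.unique(half_bands)))
```

With fewer bands than classes, several classes necessarily share a band (and therefore a color), while the half-integer boundaries map class `i` to band `i`.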

I haven't found a way to check which colors are actually visible in the plot (without actually looking at the plots), so checking the exact match of the levels is the best approach I could think of.

(I'm getting the faint suspicion that these plotting functions were not specifically designed with the multiclass-classification use case in mind.)

Member

The multiclass support was indeed incrementally added and we weren't careful enough in our review and testing process.

Contributor Author

I hope this didn't come across wrong; I was referring to the matplotlib functions in the first place. The levels parameter is not as intuitive as it seems, and the docs are not very extensive on it. I also couldn't find any multiclass-classification examples using contour in the matplotlib examples, so this was very hard to detect.

Member

@lucyleeow lucyleeow Feb 28, 2026

The levels parameter is not as intuitive as it seems, and the docs are not very extensive on it.

Agreed. Digging into the code, you can see that the automatic determination of levels is quite complex.

matplotlib/matplotlib#30996 (I opened an issue about it)

Member

@lucyleeow lucyleeow Feb 28, 2026

I haven't found a way to check which colors are actually visible

I think you can add something like this to your test:

    if plot_method == "contour":
        colours = [collection.get_edgecolor() for collection in disp.ax_.collections]
    elif plot_method == "contourf":
        colours = [collection.get_facecolor() for collection in disp.ax_.collections]

This will give you the RGBA values of each level.

It's a bit of a maze, but contour/contourf return a QuadContourSet, whose base class is ContourSet, which in turn has the bases ContourLabeler and Collection.

Should be a quick PR to just add this to the end of the test. You can then match it with tab10, the default cmap.
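
A rough sketch of what such a color check could look like, using plain matplotlib rather than DecisionBoundaryDisplay. The grid `zz`, the explicit `colors=` argument, and the fallback to `cs.collections` for older matplotlib releases (where the contour set is not itself a Collection) are all assumptions made for illustration, so the actual test in scikit-learn may need a different shape.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np

n_classes = 6
# First n_classes colors of tab10, passed explicitly here (an assumption).
tab10 = list(plt.get_cmap("tab10").colors[:n_classes])

# Hypothetical stand-in for a `predict` response: integer class codes on a grid.
xx, yy = np.meshgrid(np.linspace(0, 1, 60), np.linspace(0, 1, 60))
zz = np.minimum(np.floor(xx * n_classes), n_classes - 1)

fig, ax = plt.subplots()
levels = np.arange(n_classes + 1) - 0.5
cs = ax.contourf(xx, yy, zz, levels=levels, colors=tab10)

if hasattr(cs, "get_facecolor"):
    # The contour set is itself a Collection: one RGBA row per filled band.
    face_colors = np.asarray(cs.get_facecolor())
else:
    # Older matplotlib: one collection per band, each with a single face color.
    face_colors = np.vstack([c.get_facecolor() for c in cs.collections])

expected = mcolors.to_rgba_array(tab10)
print("colors match tab10:", np.allclose(face_colors, expected))
```

For `contour` instead of `contourf`, the same idea would read `get_edgecolor()` rather than `get_facecolor()`, as in the snippet above.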

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel ogrisel merged commit 57aa064 into scikit-learn:main Feb 25, 2026
37 checks passed
@github-project-automation github-project-automation bot moved this from In progress to Done in Labs Feb 25, 2026
@AnneBeyer
Contributor Author

Thank you everyone!

@AnneBeyer AnneBeyer deleted the fix_plot_levels branch February 26, 2026 12:34
Labels

module:inspection Waiting for Second Reviewer First reviewer is done, need a second one!

5 participants