
I have been searching, without much success, for a discussion of the possibility (or impossibility) of a general closed-form analytical solution to the hard-margin (only) support vector machine problem. The texts on the subject that I have come across usually stop at its formulation as an optimization problem, in primal or dual form, and delegate the solution to numerical methods.

So I decided to explore the subject on my own, and what follows is the (highly uncertain) result of that exercise.

I would appreciate any comment or suggestion (including references) on the validity of the following argument.

Defining $\mathbf X \in \mathbb R^{m\times n}$ as the matrix whose columns are $n$ observations of $m$ features, $\mathbf y \in \{-1,1\}^n$ as the vector of labels, $\mathbf Y = \text{diag}(\mathbf y)$ and $\mathbf A = (\mathbf X \mathbf Y)^T \mathbf X \mathbf Y$, the dual formulation of the hard-margin SVM optimization problem can be stated as:
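(As a quick sanity check on this notation, here is a minimal NumPy sketch, with a function name of my own choosing and purely illustrative, that builds $\mathbf A$ from a feature matrix and a label vector:

```python
import numpy as np

def build_A(X, y):
    """X: (m, n) array whose columns are the n observations; y: length-n array of +/-1 labels."""
    Y = np.diag(y)      # Y = diag(y)
    XY = X @ Y          # scales column i of X by y_i
    return XY.T @ XY    # A = (XY)^T (XY), an n x n positive semidefinite Gram-type matrix
```
)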

$$ \max_{\boldsymbol \alpha} \ {\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha}$$ $$ \text{subject to} \quad {\boldsymbol \alpha}^T \mathbf y = 0,$$ $$ \boldsymbol{\alpha} \succeq \mathbf 0.$$

The first constraint forces the solution to lie in the hyperplane through the origin with normal $\mathbf y$. Since the objective function is concave everywhere, its maximum over that hyperplane is attained where the gradient has no component orthogonal to $\mathbf y$, i.e., where the gradient is parallel to $\mathbf y$:

$$\nabla \left({\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha}\right) = s\mathbf y,$$

where $s$ is some real number. Since that gradient equals $\mathbf 1 - \mathbf A \boldsymbol\alpha$, this implies:

$$ \mathbf A \boldsymbol\alpha + s\mathbf y = \mathbf 1,$$

which, together with the first constraint, can be written in block matrix form:

$$ \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix}\begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix}=\begin{bmatrix}\mathbf 1\\0\end{bmatrix}. $$

(Incidentally, this equation automatically excludes $\boldsymbol\alpha = \mathbf 0$ as a solution, because it would imply $s\mathbf y = \mathbf 1$, which doesn't hold if we have observations of both classes.)

If the block matrix on the left-hand side is invertible (I don't know whether it generally is; this is an acknowledged gap in the argument), then:

$$ \begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix} = \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix} ^{-1} \begin{bmatrix}\mathbf 1\\0\end{bmatrix}.$$

Can this result be considered a closed-form solution to the hard-margin SVM problem? I would like to emphasize that I'm not concerned with issues like performance or computational cost. The question is merely theoretical.
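To make the construction concrete, here is a small numerical sketch of the proposed system, assuming the block matrix is indeed invertible (the function and variable names are mine, purely illustrative):

```python
import numpy as np

def candidate_solution(X, y):
    """Solve [[A, y], [y^T, 0]] [alpha; s] = [1; 0] and recover a candidate weight vector."""
    n = len(y)
    Y = np.diag(y)
    A = (X @ Y).T @ (X @ Y)
    K = np.block([[A, y.reshape(-1, 1)],
                  [y.reshape(1, -1), np.zeros((1, 1))]])
    rhs = np.concatenate([np.ones(n), [0.0]])
    sol = np.linalg.solve(K, rhs)       # raises LinAlgError if the block matrix is singular
    alpha, s = sol[:n], sol[n]
    w = X @ Y @ alpha                   # candidate primal weights w = X Y alpha
    return alpha, s, w
```

(Of course, this only produces a candidate stationary point; whether it deserves to be called a solution is exactly what I am asking above.)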

Thank you very much for any comment.


1 Answer


Your approach does not account for the non-negativity requirement on $\alpha$. In other words, if the $\alpha$ you find from the optimality conditions you have formulated happens to be non-negative, then you are done. Otherwise, it will not even be a feasible solution, let alone an optimal one.
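For what it is worth, that check is easy to state once you have the $\alpha$ from the linear system in your question; a minimal sketch, assuming `alpha` is the NumPy array produced by that system:

```python
import numpy as np

def dual_feasible(alpha, tol=1e-10):
    """The linear system enforces alpha^T y = 0 by construction; the candidate is a valid
    dual point only if it is also componentwise non-negative (up to numerical tolerance)."""
    return bool(np.all(alpha >= -tol))
```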

