
I have been searching, without much success, for a discussion of the possibility (or impossibility) of a general closed-form analytical solution to the hard-margin (only) support vector machine problem. The texts on the subject that I have come across usually stop at its formulation as an optimization problem, in primal or dual form, and delegate the solution to numerical methods.

So I decided to explore the subject on my own, and what follows is the (highly uncertain) result of that exercise.

I would appreciate any comment or suggestion (including references) on the validity of the following argument.

Defining $\mathbf X \in \mathbb R^{m\times n}$ as the matrix whose columns are $n$ observations of $m$ features, $\mathbf y \in \{-1,1\}^n$ as the vector of labels, $\mathbf Y = \text{diag}(\mathbf y)$ and $\mathbf A = (\mathbf X \mathbf Y)^T \mathbf X \mathbf Y$, the dual formulation of the hard-margin SVM optimization problem can be stated as:
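(As a quick sanity check on this notation, here is a minimal NumPy sketch, with a function name of my own choosing and purely illustrative, that builds $\mathbf A$ from a feature matrix and a label vector:

```python
import numpy as np

def build_A(X, y):
    """X: (m, n) array whose columns are the n observations; y: length-n array of +/-1 labels."""
    Y = np.diag(y)      # Y = diag(y)
    XY = X @ Y          # scales column i of X by y_i
    return XY.T @ XY    # A = (XY)^T (XY), an n x n positive semidefinite Gram-type matrix
```
)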

$$ \max_{\boldsymbol \alpha} \ {\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha}$$ $$ \text{subject to} \quad {\boldsymbol \alpha}^T \mathbf y = 0,$$ $$ \boldsymbol{\alpha} \succeq \mathbf 0.$$

The first constraint forces the solution to lie in the hyperplane through the origin with normal $\mathbf y$. Since the objective function is concave everywhere, its maximum over that hyperplane is attained where the gradient has no component orthogonal to $\mathbf y$, i.e., where the gradient is parallel to $\mathbf y$:

$$\nabla \left({\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha}\right) = s\mathbf y,$$

where $s$ is some real number. Since that gradient equals $\mathbf 1 - \mathbf A \boldsymbol\alpha$, this implies:

$$ \mathbf A \boldsymbol\alpha + s\mathbf y = \mathbf 1,$$

which, together with the first constraint, can be written in block matrix form:

$$ \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix}\begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix}=\begin{bmatrix}\mathbf 1\\0\end{bmatrix}. $$

(Incidentally, this equation automatically excludes $\boldsymbol\alpha = \mathbf 0$ as a solution, because it would imply $s\mathbf y = \mathbf 1$, which doesn't hold if we have observations of both classes.)

If the block matrix on the left-hand side is invertible (I don't know whether it generally is; this is an acknowledged gap in the argument), then:

$$ \begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix} = \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix} ^{-1} \begin{bmatrix}\mathbf 1\\0\end{bmatrix}.$$

Can this result be considered a closed-form solution to the hard-margin SVM problem? I would like to emphasize that I'm not concerned with issues like performance or computational cost. The question is merely theoretical.
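To make the construction concrete, here is a small numerical sketch of the proposed system, assuming the block matrix is indeed invertible (the function and variable names are mine, purely illustrative):

```python
import numpy as np

def candidate_solution(X, y):
    """Solve [[A, y], [y^T, 0]] [alpha; s] = [1; 0] and recover a candidate weight vector."""
    n = len(y)
    Y = np.diag(y)
    A = (X @ Y).T @ (X @ Y)
    K = np.block([[A, y.reshape(-1, 1)],
                  [y.reshape(1, -1), np.zeros((1, 1))]])
    rhs = np.concatenate([np.ones(n), [0.0]])
    sol = np.linalg.solve(K, rhs)       # raises LinAlgError if the block matrix is singular
    alpha, s = sol[:n], sol[n]
    w = X @ Y @ alpha                   # candidate primal weights w = X Y alpha
    return alpha, s, w
```

(Of course, this only produces a candidate stationary point; whether it deserves to be called a solution is exactly what I am asking above.)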

Thank you very much for any comment.


1 Answer


Your approach does not account for the non-negativity requirement on $\alpha$. In other words, if the $\alpha$ you find from the optimality conditions you have formulated happens to be non-negative, then you are done. Otherwise, it will not even be a feasible solution, let alone an optimal one.
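For what it is worth, that check is easy to state once you have the $\alpha$ from the linear system in your question; a minimal sketch, assuming `alpha` is the NumPy array produced by that system:

```python
import numpy as np

def dual_feasible(alpha, tol=1e-10):
    """The linear system enforces alpha^T y = 0 by construction; the candidate is a valid
    dual point only if it is also componentwise non-negative (up to numerical tolerance)."""
    return bool(np.all(alpha >= -tol))
```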

