I have been searching, without much success, for some discussion of the possibility (or impossibility) of a general closed-form analytical solution for the hard-margin (only) support vector machine problem. The texts on the subject that I have come across usually stop at its formulation as an optimization problem, in primal or dual form, and delegate the solution to numerical methods.
So I decided to explore the subject on my own, and what follows is the (highly uncertain) result of that exercise.
I would appreciate any comment or suggestion (including references) on the validity of the following argument.
Defining $\mathbf X \in \mathbb R^{m\times n}$ as the matrix whose columns are $n$ observations of $m$ features, $\mathbf y \in \{-1,1\}^n$ as the vector of labels, $\mathbf Y = \text{diag}(\mathbf y)$ and $\mathbf A = (\mathbf X \mathbf Y)^T \mathbf X \mathbf Y$, the dual formulation of the hard-margin SVM optimization problem can be stated as:
$$ \max_{\boldsymbol \alpha} {\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha},$$ $$ \text{subject to } {\boldsymbol \alpha}^T \mathbf y = 0,$$ $$ \boldsymbol{\alpha} \succeq \mathbf 0.$$
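To make the notation concrete, here is a minimal numpy sketch of these ingredients on a hypothetical toy dataset (the data and names below are my own, purely for illustration):

```python
import numpy as np

# Hypothetical toy data: m = 2 features, n = 3 observations stored
# as columns of X, with labels in {-1, 1}.
X = np.array([[1.0, 2.0, -1.0],
              [1.0, 0.0, -1.0]])   # shape (m, n)
y = np.array([1.0, 1.0, -1.0])

Y = np.diag(y)                     # Y = diag(y)
A = (X @ Y).T @ (X @ Y)            # A = (XY)^T (XY)

def dual_objective(alpha):
    """Value of alpha^T 1 - (1/2) alpha^T A alpha."""
    return alpha.sum() - 0.5 * alpha @ A @ alpha
```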
The first constraint forces the solution to lie in the hyperplane through the origin that is orthogonal to $\mathbf y$. Since the objective function is concave everywhere, its maximum on that hyperplane (setting aside the nonnegativity constraint for the moment) is attained where the gradient has no component orthogonal to $\mathbf y$, i.e., where it is parallel to $\mathbf y$:
$$\nabla ({\boldsymbol \alpha}^T \mathbf 1 - \frac{1}{2} {\boldsymbol \alpha}^T \mathbf A {\boldsymbol \alpha}) = s\mathbf y ,$$
where $s$ is a real scalar. Since $\mathbf A$ is symmetric, the gradient equals $\mathbf 1 - \mathbf A \boldsymbol\alpha$, so this implies:
$$ \mathbf A \boldsymbol\alpha + s\mathbf y = \mathbf 1,$$
which, together with the first constraint, can be put in block matrix form:
$$ \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix}\begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix}=\begin{bmatrix}\mathbf 1\\0\end{bmatrix}. $$
(Incidentally, this equation automatically excludes $\boldsymbol\alpha = \mathbf 0$ as a solution, because it would imply $s\mathbf y = \mathbf 1$, which cannot hold when observations of both classes are present.)
If the leftmost matrix in the equation above is invertible (I don't know whether it generally is; this is an acknowledged gap in the argument), then:
$$ \begin{bmatrix}\boldsymbol\alpha\\s\end{bmatrix} = \begin{bmatrix} \mathbf A&\mathbf y\\\mathbf y^T&0\end{bmatrix} ^{-1} \begin{bmatrix}\mathbf 1\\0\end{bmatrix}.$$
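Continuing the toy sketch from above, the block system can be assembled and solved directly; note that both the invertibility of the matrix and the nonnegativity of the resulting $\boldsymbol\alpha$ have to be verified after the fact, since the derivation guarantees neither:

```python
# Continuing the sketch above: assemble the block matrix
#   [ A    y ] [alpha]   [1]
#   [ y^T  0 ] [  s  ] = [0]
n = y.size
K = np.block([[A,          y[:, None]],
              [y[None, :], np.zeros((1, 1))]])
rhs = np.append(np.ones(n), 0.0)

sol = np.linalg.solve(K, rhs)   # raises LinAlgError if K is singular
alpha, s = sol[:n], sol[n]

# The derivation above set aside alpha >= 0, so it must be checked here.
assert np.all(alpha >= -1e-9)

# Standard dual-to-primal recovery: w = X Y alpha, and b from
# y_i (w^T x_i + b) = 1 at a support vector (largest multiplier).
w = X @ Y @ alpha
i = int(np.argmax(alpha))
b = y[i] - w @ X[:, i]
```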
Can this result be considered a closed-form solution to the hard-margin SVM problem? I would like to emphasize that I'm not concerned with issues such as performance or computational cost; the question is purely theoretical.
Thank you very much for any comment.