Semagle.MachineLearning.SVM Library
Semagle.MachineLearning.SVM implements training and prediction functions for two-class classification, one-class classification and regression. The library is based on a generalization of the Sequential Minimal Optimization (SMO) algorithm.
Performance
The main idea of the library is to provide an extensible SVM implementation for solving many machine learning problems. However, high-level abstractions and the .NET runtime take their toll on performance. Check the performance tables for details.
Kernels
The library implements the following popular kernels:
- Kernel.linear: \(\mathbf{x}_i \cdot \mathbf{x}_j\)
- Kernel.polynomial: \((\gamma(\mathbf{x}_i \cdot \mathbf{x}_j) + \mu)^n\)
- Kernel.rbf: \(e^{-\gamma\|\mathbf{x}_i - \mathbf{x}_j\|^2}\)
- Kernel.sigmoid: \(\tanh(\gamma(\mathbf{x}_i \cdot \mathbf{x}_j) + \mu)\)
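For reference, these formulas translate directly into F#. The sketch below uses plain float[] vectors and stand-alone functions; it is illustrative only, and the library's actual Kernel.* values and vector representation may differ.

```fsharp
// Illustrative stand-alone definitions of the four kernels over dense
// float[] vectors; the library's own Kernel.* representation may differ.
let dot (x : float[]) (y : float[]) =
    Array.fold2 (fun acc xi yi -> acc + xi * yi) 0.0 x y

let linear x y = dot x y

let polynomial gamma mu (n : float) x y = (gamma * dot x y + mu) ** n

let rbf gamma (x : float[]) (y : float[]) =
    // squared Euclidean distance ||x - y||^2
    let d2 = Array.fold2 (fun acc xi yi -> acc + (xi - yi) * (xi - yi)) 0.0 x y
    exp (-gamma * d2)

let sigmoid gamma mu x y = tanh (gamma * dot x y + mu)
```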
Two Class Classification
Training
Function SMO.C_SVC finds a solution of the following optimization problem:
\[\begin{array}{ll}
\mathop{min}_{\mathbf{w},b,\boldsymbol{\xi}} & \quad \frac{1}{2} \mathbf{w}^T\mathbf{w} + C \sum_{i=1}^l \xi_i \\
\text{subject to} & \quad y_i(\mathbf{w}^T\phi(\mathbf{x}_i) + b) \geq 1 - \xi_i \\
& \quad \xi_i \geq 0, \quad i=1, \dots, l
\end{array}\]
in the dual form:
\[\begin{array}{ll}
\mathop{min}_{\boldsymbol{\alpha}} & \quad \frac{1}{2} \boldsymbol{\alpha}^TQ\boldsymbol{\alpha} - \mathbf{e}^T\boldsymbol{\alpha} \\
\text{subject to} & \quad \mathbf{y}^T\boldsymbol{\alpha} = 0 \\
& \quad 0 \leq \alpha_i \leq C, \quad i=1, \dots, l
\end{array}\]
where \(Q_{ij}=y_iy_jK(\mathbf{x}_i, \mathbf{x}_j)\).
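As an illustration of what the solver works with, the sketch below builds the matrix \(Q\) for a toy data set, reusing the rbf function sketched in the Kernels section. The dense float[] sample representation and the explicit construction of \(Q\) are assumptions for illustration, not necessarily how SMO.C_SVC is implemented internally.

```fsharp
// Build Q_ij = y_i * y_j * K(x_i, x_j) from the C-SVC dual above.
let buildQ (kernel : float[] -> float[] -> float) (xs : float[][]) (ys : float[]) =
    let l = xs.Length
    Array2D.init l l (fun i j -> ys.[i] * ys.[j] * kernel xs.[i] xs.[j])

// Toy two-class data set with labels +1.0 / -1.0.
let xs = [| [| 0.0; 0.0 |]; [| 1.0; 1.0 |]; [| 1.0; 0.0 |]; [| 0.0; 1.0 |] |]
let ys = [| -1.0; -1.0; 1.0; 1.0 |]
let Q = buildQ (rbf 0.5) xs ys   // rbf as sketched in the Kernels section
```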
Prediction
The two-class prediction function implements the decision rule \(\mathrm{sign}(\sum_{i=1}^l y_i\alpha_i K(\mathbf{x}_i, \mathbf{x}) + b)\), where \(K\) is the kernel function, \(\alpha_i\) is the \(i\)-th component of the solution of the dual optimization problem, \(y_i\) is the label of the \(i\)-th support vector, \(\mathbf{x}_i\) is the \(i\)-th support vector, and \(b\) is the bias value.
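The decision rule translates directly into code. The sketch below assumes the support vectors, labels, dual solution and bias are available as plain arrays; this is an illustrative assumption, and the library's actual model type is likely different.

```fsharp
// sign(sum_i y_i * alpha_i * K(x_i, x) + b); returns +1, -1 or 0.
let predictTwoClass (kernel : float[] -> float[] -> float)
                    (svs : float[][]) (ys : float[]) (alphas : float[])
                    (b : float) (x : float[]) =
    let score =
        svs
        |> Array.mapi (fun i sv -> ys.[i] * alphas.[i] * kernel sv x)
        |> Array.sum
    sign (score + b)
```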
One Class Classification
Training
Function SMO.OneClass finds a solution of the following optimization problem:
\[\begin{array}{ll}
\mathop{min}_{\mathbf{w},\rho,\boldsymbol{\xi}} & \quad \frac{1}{2} \mathbf{w}^T\mathbf{w} - \rho + \frac{1}{\nu l} \sum_{i=1}^l \xi_i \\
\text{subject to} & \quad \mathbf{w}^T\phi(\mathbf{x}_i) \geq \rho - \xi_i \\
& \quad \xi_i \geq 0, \quad i=1, \dots, l
\end{array}\]
in the dual form:
\[\begin{array}{ll}
\mathop{min}_{\boldsymbol{\alpha}} & \quad \frac{1}{2} \boldsymbol{\alpha}^TQ\boldsymbol{\alpha} \\
\text{subject to} & \quad \mathbf{e}^T\boldsymbol{\alpha} = 1 \\
& \quad 0 \leq \alpha_i \leq \frac{1}{\nu l}, \quad i=1, \dots, l
\end{array}\]
where \(Q_{ij}=K(\mathbf{x}_i, \mathbf{x}_j)\).
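Compared with C-SVC, the one-class dual drops the labels from \(Q\) and replaces the upper bound \(C\) with \(\frac{1}{\nu l}\). A sketch under the same illustrative array representation:

```fsharp
// One-class dual ingredients: Q_ij = K(x_i, x_j), box bound 1/(nu * l).
let buildQOneClass (kernel : float[] -> float[] -> float) (xs : float[][]) =
    let l = xs.Length
    Array2D.init l l (fun i j -> kernel xs.[i] xs.[j])

// Upper bound of the box constraint: 0 <= alpha_i <= 1/(nu * l).
let upperBound (nu : float) (l : int) = 1.0 / (nu * float l)
```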
Prediction
The one-class prediction function implements the decision rule \(\mathrm{sign}(\sum_{i=1}^l \alpha_i K(\mathbf{x}_i, \mathbf{x}) - \rho)\), where \(K\) is the kernel function, \(\alpha_i\) is the \(i\)-th component of the solution of the dual optimization problem, \(\mathbf{x}_i\) is the \(i\)-th support vector, and \(\rho\) is the bias value.
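Again the rule can be written down directly; the array-based model representation below is an illustrative assumption rather than the library's actual model type.

```fsharp
// sign(sum_i alpha_i * K(x_i, x) - rho); returns +1, -1 or 0.
let predictOneClass (kernel : float[] -> float[] -> float)
                    (svs : float[][]) (alphas : float[])
                    (rho : float) (x : float[]) =
    let score =
        svs
        |> Array.mapi (fun i sv -> alphas.[i] * kernel sv x)
        |> Array.sum
    sign (score - rho)
```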
Regression
Training
Function SMO.C_SVR finds a solution of the following optimization problem:
\[\begin{array}{ll}
\mathop{min}_{\mathbf{w},b,\boldsymbol{\xi},\boldsymbol{\xi}^*} & \quad \frac{1}{2} \mathbf{w}^T\mathbf{w} + C \sum_{i=1}^l \xi_i + C \sum_{i=1}^l \xi_i^* \\
\text{subject to} & \quad \mathbf{w}^T\phi(\mathbf{x}_i) + b - z_i \leq \eta + \xi_i \\
& \quad z_i - \mathbf{w}^T\phi(\mathbf{x}_i) - b \leq \eta + \xi_i^* \\
& \quad \xi_i, \xi_i^* \geq 0, \quad i=1, \dots, l
\end{array}\]
in the dual form:
\[\begin{array}{ll}
\mathop{min}_{\boldsymbol{\alpha},\boldsymbol{\alpha}^*} & \quad \frac{1}{2} (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*)^T Q (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*) +
\eta \sum_{i=1}^l (\alpha_i + \alpha_i^*) + \sum_{i=1}^l z_i (\alpha_i - \alpha_i^*) \\
\text{subject to} & \quad \mathbf{e}^T(\boldsymbol{\alpha} - \boldsymbol{\alpha}^*) = 0 \\
& \quad 0 \leq \alpha_i, \alpha_i^* \leq C, \quad i=1, \dots, l
\end{array}\]
where \(Q_{ij}=K(\mathbf{x}_i, \mathbf{x}_j)\).
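As a concrete reading of the dual above, the sketch below evaluates the dual objective for given \(\boldsymbol{\alpha}\) and \(\boldsymbol{\alpha}^*\). The explicit dense \(Q\) and array representation are assumptions for illustration, not how the solver is implemented.

```fsharp
// Epsilon-SVR dual objective:
// 1/2 (a - a*)^T Q (a - a*) + eta * sum_i (a_i + a*_i) + sum_i z_i (a_i - a*_i)
let svrDualObjective (Q : float[,]) (z : float[]) (eta : float)
                     (a : float[]) (aStar : float[]) =
    let l = z.Length
    let d = Array.init l (fun i -> a.[i] - aStar.[i])
    // quadratic term: (a - a*)^T Q (a - a*)
    let mutable quad = 0.0
    for i in 0 .. l - 1 do
        for j in 0 .. l - 1 do
            quad <- quad + d.[i] * Q.[i, j] * d.[j]
    // linear terms of the objective
    let linear = Array.init l (fun i -> eta * (a.[i] + aStar.[i]) + z.[i] * d.[i])
    0.5 * quad + Array.sum linear
```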
Prediction
The regression function computes the approximation \(\sum_{i=1}^l (-\alpha_i + \alpha_i^*) K(\mathbf{x}_i, \mathbf{x}) + b\), where \(K\) is the kernel function, \(\alpha_i\) and \(\alpha_i^*\) are the \(i\)-th components of the solution of the dual optimization problem, \(\mathbf{x}_i\) is the \(i\)-th support vector, and \(b\) is the bias value.
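The approximation function also translates directly; the array-based representation is again an illustrative assumption.

```fsharp
// sum_i (-alpha_i + alpha*_i) * K(x_i, x) + b.
let predictRegression (kernel : float[] -> float[] -> float)
                      (svs : float[][]) (a : float[]) (aStar : float[])
                      (b : float) (x : float[]) =
    let score =
        svs
        |> Array.mapi (fun i sv -> (aStar.[i] - a.[i]) * kernel sv x)
        |> Array.sum
    score + b
```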