K-Means Clusturing
K-means Error
- Error estimation of K-means can be defined as sum of distances between clusters and each allocated samples
- Error function below can be optimalized by gradient descent
- However, Lloyd’s algorithm converges faster than gradient descent (Bottou and Bengio, 1995)
$J(Z,U) = \displaystyle\sum_{i=1}^{N}\displaystyle\sum_{j=1}^{k}\boldsymbol{u}_{ji}|\boldsymbol{x_i}-\boldsymbol{z_j}|^2$
$N$: Number of samples $k$: Number of clusters $x_i$: $i$th sample $u_{ji}$: 1 if $\boldsymbol{x_i}$ in $C_j$(cluster of $\boldsymbol{z}_j$) else 0, $\boldsymbol{u} \in \boldsymbol{Z}$ $\boldsymbol{z}_j$: Center of $j$th cluster, $\boldsymbol{z} \in \boldsymbol{Z}$ $Z$: Set of cluster centers $U$: Set of membership
Limitations of K-means
- Sensitive to initial values
- Can fall into local minima
- Sensitive to outlier
- Variety of K-means
- Multi start K-means
- K-medoids
Leave a comment