How K-Means Clustering Works
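K-Means partitions data into K clusters by alternating two steps: it assigns each point to its nearest centroid, then recomputes each centroid as the mean of the points assigned to it, repeating until the assignments stop changing. It is fast, scales well to medium and large datasets, and works best when clusters are roughly spherical.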
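The assign-then-update loop can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `kmeans`, its parameters, and the toy points are all hypothetical, and the starting centroids are passed in explicitly as a simplifying assumption (libraries usually choose them randomly or with a smarter seeding scheme).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centroids, max_iter=100):
    """Plain assign/update iteration; returns (centroids, labels)."""
    # Assumption: the caller supplies the starting centroids.
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Made-up 2-D points forming two obvious groups.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
toy_centroids, toy_labels = kmeans(toy_points, [toy_points[0], toy_points[3]])
print(toy_labels)  # [0, 0, 0, 1, 1, 1]
```

With one starting centroid in each group, the loop settles after a single update; poorer starting points can need more iterations or land in a worse local optimum.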
About the Wine Quality Dataset
178 wine samples, each described by 13 numeric chemical properties. Well suited to discovering natural groupings among the samples or to predicting wine class.
- Samples: 178
- Features: 13
- Type: Numeric
- Category: Partition-based
Key Metrics to Watch
Silhouette Score
Measures how close each point is to its own cluster compared with the nearest other cluster. Ranges from −1 to +1; higher is better, and values near 0 indicate overlapping clusters.
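As a rough sketch of how the score is usually computed: for each point, let a be the mean distance to the other members of its own cluster and b the mean distance to the members of the nearest other cluster; the point's coefficient is (b − a) / max(a, b), and the score is the average over all points. The function name `silhouette_score` and the four toy points below are assumptions made for illustration, and Euclidean distance is assumed.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance, needs >= 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself is in the sum, hence len - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Made-up points: two tight pairs, far apart.
sil_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
sil_good = silhouette_score(sil_points, [0, 0, 1, 1])  # pairs kept together
sil_bad = silhouette_score(sil_points, [0, 1, 0, 1])   # pairs split apart
```

On this toy data the sensible labelling scores close to +1, while the labelling that splits each pair scores below 0.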
Calinski-Harabasz Index
Ratio of between-cluster dispersion to within-cluster dispersion. Higher values indicate denser, better-separated clusters.
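A minimal sketch of the usual definition, assuming n points and k clusters: the between-cluster sum of squares (each centroid's squared distance to the overall mean, weighted by cluster size) divided by k − 1, over the within-cluster sum of squares divided by n − k. The function name `calinski_harabasz` and the toy data are hypothetical.

```python
def calinski_harabasz(points, labels):
    """Between/within dispersion ratio; needs at least 2 clusters and n > k."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    overall_mean = [sum(col) / n for col in zip(*points)]
    between = 0.0
    within = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        # Size-weighted squared distance from this centroid to the overall mean.
        between += len(members) * sum(
            (c - m) ** 2 for c, m in zip(centroid, overall_mean)
        )
        # Squared distances from the members to their own centroid.
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    return (between / (k - 1)) / (within / (n - k))

# Made-up points: two tight pairs, far apart.
ch_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
ch_good = calinski_harabasz(ch_points, [0, 0, 1, 1])  # pairs kept together
ch_bad = calinski_harabasz(ch_points, [0, 1, 0, 1])   # pairs split apart
```

The index has no fixed upper bound, so it is most useful for comparing different clusterings of the same data rather than as an absolute score.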
Davies-Bouldin Index
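Average similarity between each cluster and its most similar cluster, where similarity compares the clusters' internal scatter with the distance between their centroids. Lower is better; the minimum is 0.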
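One way to make this concrete, sketched under the common definition: take each cluster's scatter as the mean distance from its members to its centroid, score every pair of clusters as (scatter_i + scatter_j) / distance between their centroids, keep the worst pairing for each cluster, and average. The function name `davies_bouldin` and the toy points are assumptions for illustration.

```python
import math

def davies_bouldin(points, labels):
    """Average worst-case cluster similarity; needs at least 2 clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centroids = {}
    scatter = {}
    for lab, members in clusters.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        centroids[lab] = centroid
        # Scatter: mean distance from the cluster's members to its centroid.
        scatter[lab] = sum(math.dist(p, centroid) for p in members) / len(members)
    total = 0.0
    for i in clusters:
        # Keep the most similar (worst) partner for cluster i.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)

# Made-up points: two tight pairs, far apart.
db_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
db_good = davies_bouldin(db_points, [0, 0, 1, 1])  # pairs kept together
db_bad = davies_bouldin(db_points, [0, 1, 0, 1])   # pairs split apart
```

Here the sensible labelling gives 0.1 and the split labelling gives 10, matching the "lower is better" reading.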
Inertia (Within-Cluster SSE)
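Sum of squared distances from each point to its assigned centroid. Lower indicates tighter clusters, but inertia generally decreases as K grows, so compare it across several values of K rather than reading a single value in isolation.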
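Inertia is simple enough to compute directly. The sketch below uses a hypothetical `inertia` helper that takes each cluster's centroid to be the mean of its members, and evaluates three hand-picked labellings of made-up data to show how the value shrinks as the number of clusters rises.

```python
def inertia(points, labels):
    """Within-cluster sum of squared distances to each cluster's mean."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    return total

# Made-up points: two tight pairs, far apart.
sse_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
sse_k1 = inertia(sse_points, [0, 0, 0, 0])  # one cluster holding everything
sse_k2 = inertia(sse_points, [0, 0, 1, 1])  # one cluster per pair
sse_k4 = inertia(sse_points, [0, 1, 2, 3])  # every point its own cluster
print(sse_k1, sse_k2, sse_k4)  # 101.0 1.0 0.0
```

The large drop from one cluster to two, followed by a much smaller drop to four, is the pattern an elbow plot looks for; inertia reaching 0 when every point is its own cluster shows why the lowest value alone is not a useful target.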
When to Use K-Means Clustering
K-Means Clustering belongs to the partition-based family of clustering algorithms. These methods divide data into non-overlapping subsets, so every point belongs to exactly one cluster. They work best when clusters are roughly spherical and similar in size.
Related Examples
K-Means Clustering on Iris Flowers
See K-Means Clustering applied to the Iris Flowers dataset (150 samples, 4 features). Interactive visualization, metrics, and analysis.
K-Means Clustering on Customer Segments
See K-Means Clustering applied to the Customer Segments dataset (200 samples, 5 features). Interactive visualization, metrics, and analysis.
K-Medoids Clustering on Wine Quality
See K-Medoids Clustering applied to the Wine Quality dataset (178 samples, 13 features). Interactive visualization, metrics, and analysis.
DBSCAN Clustering on Wine Quality
See DBSCAN Clustering applied to the Wine Quality dataset (178 samples, 13 features). Interactive visualization, metrics, and analysis.