K-Means Clustering on Iris Flowers

See K-Means Clustering applied to the Iris Flowers dataset (150 samples, 4 features). Interactive visualization, metrics, and analysis.

How K-Means Clustering Works

K-Means partitions data into K clusters by iteratively assigning points to the nearest centroid. It's fast, scalable, and ideal for spherical clusters in medium-to-large datasets.
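The assign-then-update loop described above (Lloyd's algorithm) can be sketched in a few lines of plain Python. The two-blob toy points and starting centroids below are illustrative, not the actual Iris measurements.

```python
# Minimal sketch of the K-Means loop (Lloyd's algorithm):
# alternate between assigning points to the nearest centroid and
# moving each centroid to the mean of its assigned points.

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each centroid moves to its cluster's mean
        # (empty clusters keep their old centroid).
        centroids = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Two well-separated toy blobs; initial centroids deliberately poor.
pts = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
       (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
cents, groups = kmeans(pts, centroids=[(0.0, 0.0), (1.0, 0.0)])
```

Even from a poor start, a few iterations pull one centroid onto each blob, which is why K-Means converges quickly on compact, spherical clusters.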

Tags: K-Means, clustering, centroid, elbow method, silhouette score, unsupervised learning, data partitioning


About the Iris Flowers Dataset

Classic 150-sample dataset with 4 petal/sepal measurements across 3 species. The gold standard for clustering & classification demos.

Samples: 150
Features: 4
Type: Numeric
Category: Partition-based

Key Metrics to Watch

Silhouette Score

Measures how similar a point is to its own cluster vs. other clusters. Ranges from −1 to +1; higher is better.
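The definition above can be computed by hand for a single point: s = (b − a) / max(a, b), where a is the mean distance to the point's own cluster and b the mean distance to the nearest other cluster. The coordinates below are toy values, not Iris data.

```python
from math import dist

# Hand-computed silhouette for one point: s = (b - a) / max(a, b).
own = [(0.0, 0.0), (0.0, 1.0)]      # rest of the point's own cluster
other = [(5.0, 5.0), (5.0, 6.0)]    # nearest neighbouring cluster
p = (0.0, 0.5)

a = sum(dist(p, q) for q in own) / len(own)     # mean intra-cluster distance
b = sum(dist(p, q) for q in other) / len(other)  # mean distance to other cluster
s = (b - a) / max(a, b)
```

Because the point sits tightly in its own cluster and far from the other one, s lands near +1; averaging s over all points gives the dataset-level silhouette score.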

Calinski-Harabasz Index

Ratio of between-cluster to within-cluster variance. Higher values indicate denser, well-separated clusters.
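The ratio can be written out directly: CH = (B / (k − 1)) / (W / (n − k)), with B the between-cluster and W the within-cluster dispersion. A sketch on two made-up clusters (not the Iris data):

```python
from math import fsum

# Calinski-Harabasz index on two toy clusters.
clusters = [
    [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)],
    [(8.0, 8.0), (8.2, 7.9), (7.8, 8.1)],
]
n = sum(len(c) for c in clusters)
k = len(clusters)

def sq(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

allpts = [p for c in clusters for p in c]
grand = tuple(fsum(x) / n for x in zip(*allpts))                 # overall mean
cents = [tuple(fsum(x) / len(c) for x in zip(*c)) for c in clusters]

B = sum(len(c) * sq(m, grand) for c, m in zip(clusters, cents))  # between
W = sum(sq(p, m) for c, m in zip(clusters, cents) for p in c)    # within
ch = (B / (k - 1)) / (W / (n - k))
```

With two tight, distant blobs, B dwarfs W and the index is in the thousands; overlapping clusters would drag it down.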

Davies-Bouldin Index

Average similarity between each cluster and its most similar cluster. Lower is better.
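Concretely, DB averages over clusters the worst-case ratio (s_i + s_j) / d(c_i, c_j), where s_i is the mean point-to-centroid distance in cluster i and d is the distance between centroids. A sketch on two made-up clusters (not the Iris data):

```python
from math import dist, fsum

# Davies-Bouldin index on two toy clusters.
clusters = [
    [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)],
    [(8.0, 8.0), (8.2, 7.9), (7.8, 8.1)],
]
k = len(clusters)
cents = [tuple(fsum(x) / len(c) for x in zip(*c)) for c in clusters]
# s_i: mean distance from each point to its own centroid.
scatter = [fsum(dist(p, m) for p in c) / len(c)
           for c, m in zip(clusters, cents)]
db = fsum(
    max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
        for j in range(k) if j != i)
    for i in range(k)
) / k
```

Tight clusters with distant centroids give small scatters over a large centroid distance, hence a value near zero, which is the "lower is better" direction.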

Inertia (Within-Cluster SSE)

Sum of squared distances from each point to its assigned centroid. Lower indicates tighter clusters.
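Inertia is the simplest of the four metrics to compute: square each point's distance to its assigned centroid and add everything up. The centroid-to-points mapping below is a toy example, not the Iris clustering.

```python
# Within-cluster SSE (inertia): sum of squared point-to-centroid distances.
clusters = {
    (1.0, 1.0): [(0.9, 1.1), (1.1, 0.9)],   # centroid -> assigned points
    (8.0, 8.0): [(7.9, 8.1), (8.1, 7.9)],
}
inertia = sum(
    sum((a - b) ** 2 for a, b in zip(p, c))
    for c, pts in clusters.items()
    for p in pts
)
```

Note that inertia always shrinks as K grows (more centroids can only sit closer to the data), so it is useful for comparing runs at the same K, not for picking K by minimization alone.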

When to Use K-Means Clustering

K-Means Clustering belongs to the partition-based family of clustering algorithms. These methods divide the data into non-overlapping subsets, and they work best when clusters are roughly spherical and similar in size.
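Since K-Means needs K up front, the elbow method mentioned in the tags is the usual way to choose it: run K-Means for several K and look for the bend where inertia stops falling sharply. A self-contained sketch with a tiny pure-Python K-Means (deterministic farthest-point seeding, synthetic three-blob data rather than Iris):

```python
# Elbow method sketch: inertia as a function of K.

def sq(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_inertia(points, k, iters=20):
    # Farthest-point seeding keeps the sketch deterministic.
    cents = [points[0]]
    while len(cents) < k:
        cents.append(max(points, key=lambda p: min(sq(p, c) for c in cents)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            d = [sq(p, c) for c in cents]
            groups[d.index(min(d))].append(p)
        cents = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                 for g, c in zip(groups, cents)]
    return sum(min(sq(p, c) for c in cents) for p in points)

pts = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1),   # blob A
       (8.0, 8.0), (8.1, 7.9), (7.9, 8.1),   # blob B
       (1.0, 8.0), (1.1, 8.1), (0.9, 7.9)]   # blob C
inertias = {k: kmeans_inertia(pts, k) for k in (1, 2, 3, 4)}
```

On these three blobs, inertia drops sharply up to K = 3 (the true blob count) and nearly flattens afterwards; that bend is the "elbow".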
