How BIRCH Clustering Works

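BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) clusters very large datasets in a single pass: it first summarizes the points into a compact CF-tree (clustering feature tree), then applies a secondary clustering step to the tree's leaf entries rather than to the raw points.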
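
To make the summary idea concrete, here is a minimal pure-Python sketch of a clustering feature (CF), the per-subcluster summary a CF-tree stores: a point count, a linear sum, and a sum of squared norms. This is an illustration under assumptions, not a full CF-tree: the names `ClusteringFeature` and `absorb_or_create` are hypothetical, a flat list stands in for the tree, and the threshold of 0.5 is an arbitrary choice for the toy data.

```python
import math


class ClusteringFeature:
    """Summary of one subcluster: count, linear sum, sum of squared norms."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim  # linear sum of the points, per dimension
        self.ss = 0.0          # sum of squared norms of the points

    def add(self, point):
        # CFs are additive, so absorbing a point is a constant-time update.
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # RMS distance of members to the centroid: sqrt(SS/N - ||LS/N||^2).
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))


def absorb_or_create(entries, point, threshold):
    """Add the point to the closest entry if the merged radius stays within
    the threshold; otherwise start a new entry (hypothetical helper)."""
    if entries:
        closest = min(entries, key=lambda cf: math.dist(cf.centroid(), point))
        trial = ClusteringFeature(len(point))
        trial.n, trial.ls, trial.ss = closest.n, list(closest.ls), closest.ss
        trial.add(point)
        if trial.radius() <= threshold:
            closest.add(point)
            return
    new_cf = ClusteringFeature(len(point))
    new_cf.add(point)
    entries.append(new_cf)


# Toy data (assumed for illustration): two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (0.9, 1.1)]
entries = []
for p in points:
    absorb_or_create(entries, p, threshold=0.5)

counts = [cf.n for cf in entries]
```

Each point is seen once and only the summaries are kept, which is why the approach scales to data that does not fit in memory. A smaller threshold yields more, tighter entries; a larger one yields fewer, coarser entries.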

Tags: BIRCH, incremental clustering, CF-tree, large datasets, online clustering, threshold

About the Iris Flowers Dataset

A classic 150-sample dataset with four measurements per flower (sepal length, sepal width, petal length, petal width) across three species, 50 samples each. It is a standard benchmark for clustering and classification demos.

Samples: 150
Features: 4
Type: Numeric
Category: Hierarchical

Key Metrics to Watch

Silhouette Score

Measures how similar a point is to its own cluster vs. other clusters. Ranges from −1 to +1; higher is better.
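
As a rough sketch of the definition above, the silhouette of a point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The function below is a minimal pure-Python version assuming Euclidean distance and at least two clusters; the toy points and labels are made up for illustration.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (assumed): two compact groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups match the labels
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
```

With labels that match the two groups the score is close to +1; with labels that cut across them it turns negative.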

Calinski-Harabasz Index

Ratio of between-cluster to within-cluster variance. Higher values indicate denser, well-separated clusters.
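
One common way to write this ratio is [B / (k - 1)] / [W / (n - k)], where B is the size-weighted squared distance of cluster centroids from the overall mean, W is the within-cluster sum of squares, k is the number of clusters, and n the number of points. The sketch below follows that form in pure Python; the toy data is assumed for illustration.

```python
def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion, each normalized by
    its degrees of freedom. Requires 2 <= k < n."""
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)

    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Size-weighted squared distance of this centroid from the overall mean
        between += len(members) * sum((c - o) ** 2 for c, o in zip(center, overall))
        # Squared distances of members to their own centroid
        within += sum(sum((x - c) ** 2 for x, c in zip(p, center)) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Toy data (assumed): two compact groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
ch = calinski_harabasz(points, [0, 0, 1, 1])
```

For this toy data B = 100 and W = 1, so the index is (100 / 1) / (1 / 2) = 200.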

Davies-Bouldin Index

Average similarity between each cluster and its most similar cluster. Lower is better.
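
Here "similarity" between two clusters is usually taken as (s_i + s_j) / d_ij, where s_i is the mean distance of cluster i's members to its centroid and d_ij is the distance between the two centroids. The following pure-Python sketch assumes that formulation with Euclidean distance; the toy data is made up for illustration.

```python
import math


def davies_bouldin(points, labels):
    """Mean, over clusters, of the worst-case (s_i + s_j) / d_ij ratio."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    centers, scatters = [], []
    for members in clusters.values():
        dim = len(members[0])
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        centers.append(center)
        # s_i: mean distance of members to their centroid
        scatters.append(sum(math.dist(p, center) for p in members) / len(members))

    k = len(centers)
    worst = []
    for i in range(k):
        # Most similar (worst-separated) other cluster for cluster i
        worst.append(
            max(
                (scatters[i] + scatters[j]) / math.dist(centers[i], centers[j])
                for j in range(k)
                if j != i
            )
        )
    return sum(worst) / k


# Toy data (assumed): two compact groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
db = davies_bouldin(points, [0, 0, 1, 1])
```

Each toy cluster has scatter 0.5 and the centroids are 10 apart, so the index is (0.5 + 0.5) / 10 = 0.1, a low value that reflects compact, well-separated clusters.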

Inertia (Within-Cluster SSE)

Sum of squared distances from each point to its assigned centroid. Lower indicates tighter clusters.
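
A minimal sketch of that sum, assuming each cluster's centroid is the mean of its members and using made-up toy data:

```python
def inertia(points, labels):
    """Sum of squared Euclidean distances from each point to the mean of
    the cluster it is assigned to."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(sum((x - c) ** 2 for x, c in zip(p, center)) for p in members)
    return total


# Toy data (assumed): two compact groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
tight = inertia(points, [0, 0, 1, 1])  # labels match the groups
loose = inertia(points, [0, 1, 0, 1])  # labels cut across the groups
```

Inertia always decreases as the number of clusters grows, so it is most useful for comparing runs with the same cluster count.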

When to Use BIRCH Clustering

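BIRCH Clustering belongs to the Hierarchical family of clustering algorithms. These methods build a tree of clusters, either by merging (agglomerative) or splitting (divisive), and so reveal multi-scale structure in data. BIRCH is a good fit when the dataset is very large or arrives incrementally, because the CF-tree summary is built online and the threshold controls how coarse that summary is.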

BIRCH Clustering on Iris Flowers

Related Examples

K-Means Clustering on Iris Flowers

K-Medoids Clustering on Iris Flowers

DBSCAN Clustering on Iris Flowers

HDBSCAN Clustering on Iris Flowers