HDBSCAN Clustering on Iris Flowers

See HDBSCAN Clustering applied to the Iris Flowers dataset (150 samples, 4 features). Interactive visualization, metrics, and analysis.

How HDBSCAN Clustering Works

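HDBSCAN extends DBSCAN to handle clusters of varying density. Instead of fixing a single neighborhood radius, it builds a hierarchy of density-based clusterings across density levels and automatically extracts the clusters that remain most stable over that hierarchy. Points that fall in no stable cluster are labeled as noise.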

Tags: HDBSCAN, hierarchical density, varying density, cluster stability, soft clustering, noise robust

About the Iris Flowers Dataset

Classic 150-sample dataset with 4 measurements (sepal length, sepal width, petal length, petal width) across 3 species, 50 samples each. A standard benchmark for clustering and classification demos.

Samples: 150
Features: 4
Type: Numeric
Category: Density-based

Key Metrics to Watch

Silhouette Score

Measures how close a point is to the other points in its own cluster compared with the points in the nearest other cluster. Ranges from −1 to +1; higher is better.

Calinski-Harabasz Index

Ratio of between-cluster to within-cluster variance. Higher values indicate denser, well-separated clusters.

Davies-Bouldin Index

Average similarity between each cluster and its most similar cluster. Lower is better.

Inertia (Within-Cluster SSE)

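Sum of squared distances from each point to the centroid of its assigned cluster. Lower indicates tighter clusters. HDBSCAN does not fit centroids, so for this algorithm the value has to be computed after the fact from per-cluster means, and it is less informative when clusters are not roughly spherical.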

When to Use HDBSCAN Clustering

HDBSCAN Clustering belongs to the Density-based family of clustering algorithms. These methods identify clusters as regions of high point density separated by regions of low density. They can discover clusters of arbitrary shape.
