How HDBSCAN Clustering Works

HDBSCAN extends DBSCAN to handle clusters of varying density. Instead of relying on a single distance threshold, it builds a hierarchy of clusters across density levels and automatically extracts the most stable ones.

HDBSCAN, hierarchical density, varying density, cluster stability, soft clustering, noise robust
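The hierarchy described above is built on a density-aware distance called mutual reachability: the distance between two points is raised to at least the core distance of each, where a point's core distance is the distance to its k-th nearest neighbor. The sketch below shows only that first step, in plain Python. The function names, the toy points, and the choice to exclude the point itself when counting neighbors are assumptions made for illustration (libraries differ on that last convention).

```python
import math

def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbor.

    Assumption: the point itself is not counted as a neighbor.
    """
    cores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        cores.append(dists[k - 1])
    return cores

def mutual_reachability(points, k):
    """Pairwise matrix of max(core(a), core(b), d(a, b)); diagonal is 0."""
    cores = core_distances(points, k)
    n = len(points)
    return [
        [
            0.0 if i == j
            else max(cores[i], cores[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Toy data (hypothetical): three nearby points and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
cores = core_distances(points, k=2)
mreach = mutual_reachability(points, k=2)
```

The isolated point gets a large core distance, so every mutual reachability distance involving it is large. A full HDBSCAN implementation then builds a minimum spanning tree over these distances and condenses it into the cluster hierarchy; those steps are not shown here.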

About the Customer Segments Dataset

200 synthetic customer records with spending, income, and loyalty data. Well suited to practicing market segmentation with clustering.

Samples: 200
Features: 5
Type: Numeric
Category: Density-based

Key Metrics to Watch

Silhouette Score

Measures how similar a point is to its own cluster vs. other clusters. Ranges from −1 to +1; higher is better.
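As a rough illustration, the silhouette of one point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The sketch below averages this over all clustered points. Skipping points labeled -1 (the usual noise label in density-based output) is an assumption of this example, as are the function name and the toy data.

```python
import math

def silhouette_score(points, labels, noise_label=-1):
    """Mean silhouette over clustered points; noise points are skipped."""
    data = [(p, l) for p, l in zip(points, labels) if l != noise_label]
    clusters = {}
    for p, l in data:
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, l in data:
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # The point's distance to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (hypothetical): two tight pairs plus one noise point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (50, 50)]
labels = [0, 0, 1, 1, -1]
score = silhouette_score(points, labels)
```

For these two compact, well-separated pairs the score comes out close to 0.9.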

Calinski-Harabasz Index

Ratio of between-cluster to within-cluster variance. Higher values indicate denser, well-separated clusters.
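A minimal sketch of the ratio, assuming the standard definition: between-cluster dispersion divided by (k - 1), over within-cluster dispersion divided by (n - k), for n clustered points in k clusters. The function names and toy data are hypothetical, and dropping points labeled -1 as noise is an assumption of this example.

```python
def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(members) for c in zip(*members))

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def calinski_harabasz(points, labels, noise_label=-1):
    clusters = {}
    for p, l in zip(points, labels):
        if l != noise_label:
            clusters.setdefault(l, []).append(p)
    kept = [p for members in clusters.values() for p in members]
    n, k = len(kept), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least two clusters and n > k")

    overall = centroid(kept)
    # Between-cluster dispersion: size-weighted spread of the centroids.
    between = sum(
        len(m) * sq_dist(centroid(m), overall) for m in clusters.values()
    )
    # Within-cluster dispersion: spread of points around their own centroid.
    within = sum(
        sq_dist(p, centroid(m)) for m in clusters.values() for p in m
    )
    return (between / (k - 1)) / (within / (n - k))

# Toy data (hypothetical): two tight pairs plus one noise point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (50, 50)]
labels = [0, 0, 1, 1, -1]
ch = calinski_harabasz(points, labels)
```

Here the between-cluster dispersion is 100 and the within-cluster dispersion is 1, giving an index of 200.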

Davies-Bouldin Index

Average similarity between each cluster and its most similar cluster. Lower is better.
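One way to compute this, assuming the usual definition: for each pair of clusters, similarity is the sum of their scatters (mean distance of members to their centroid) divided by the distance between the two centroids, and the index averages each cluster's worst-case similarity. The names and toy data below are hypothetical, and points labeled -1 are treated as noise and excluded.

```python
import math

def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(members) for c in zip(*members))

def davies_bouldin(points, labels, noise_label=-1):
    clusters = {}
    for p, l in zip(points, labels):
        if l != noise_label:
            clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("Davies-Bouldin needs at least two clusters")

    cents = {l: centroid(m) for l, m in clusters.items()}
    # Scatter: mean distance from a cluster's members to its centroid.
    scatter = {
        l: sum(math.dist(p, cents[l]) for p in m) / len(m)
        for l, m in clusters.items()
    }
    worst = []
    for i in clusters:
        # Similarity to the most similar other cluster.
        # Assumes no two centroids coincide (would divide by zero).
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters if j != i
        ))
    return sum(worst) / len(worst)

# Toy data (hypothetical): two tight pairs plus one noise point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (50, 50)]
labels = [0, 0, 1, 1, -1]
db = davies_bouldin(points, labels)
```

Each cluster has scatter 0.5 and the centroids are 10 apart, so the index is (0.5 + 0.5) / 10 = 0.1.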

Inertia (Within-Cluster SSE)

Sum of squared distances from each point to its assigned centroid. Lower indicates tighter clusters.
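HDBSCAN does not fit centroids itself, so a sketch of this metric has to pick a stand-in. The example below assumes each cluster's centroid is simply the mean of its members and that points labeled -1 are noise and contribute nothing; the function names and toy data are hypothetical.

```python
def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(members) for c in zip(*members))

def inertia(points, labels, noise_label=-1):
    """Sum of squared distances from each clustered point to its cluster mean."""
    clusters = {}
    for p, l in zip(points, labels):
        if l != noise_label:
            clusters.setdefault(l, []).append(p)
    total = 0.0
    for members in clusters.values():
        c = centroid(members)
        total += sum(
            sum((a - b) ** 2 for a, b in zip(p, c)) for p in members
        )
    return total

# Toy data (hypothetical): two tight pairs plus one noise point.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (50, 50)]
labels = [0, 0, 1, 1, -1]
sse = inertia(points, labels)
```

Each of the four clustered points sits 0.5 from its cluster mean, so the total is 4 × 0.25 = 1.0. Because noise points are excluded, labeling more points as noise lowers this value, which is worth keeping in mind when comparing runs.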

When to Use HDBSCAN Clustering

HDBSCAN belongs to the density-based family of clustering algorithms. These methods identify clusters as regions of high point density separated by regions of low density, so they can discover clusters of arbitrary shape.

HDBSCAN Clustering on Customer Segments

Related Examples

K-Means Clustering on Customer Segments

K-Medoids Clustering on Customer Segments

DBSCAN Clustering on Customer Segments

HDBSCAN Clustering on Iris Flowers