DBSCAN (Density-Based Spatial Clustering of Applications with Noise) discovers clusters of arbitrary shape without fixing the number of clusters k. Two user parameters control density: a distance ε defining local neighborhoods, and minPts, the minimum number of points (including the query point itself) required inside an ε-ball for that point to be a core object. The algorithm grows clusters by transitively expanding from cores: any unvisited point in the ε-neighborhood of a core is pulled into the same cluster; non-core points reached this way are border points; anything never absorbed is labeled noise. Unlike k-means, DBSCAN can reject sparse outliers and separate nearby blobs when ε is small enough—while large ε tends to bridge distinct groups. This page recomputes labels instantly on the plane as you drag ε and minPts, colors clusters, outlines noise distinctly, and optionally draws ε-circles around cores in screen space (a teaching overlay, not a second metric). A built-in demo mixes four tight Gaussian blobs with uniformly scattered background points to make the noise class visually obvious.
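The core-object test described above can be sketched in a few lines of plain Python. This is a teaching sketch, not the page's actual implementation; the coordinates, `eps`, and `min_pts` values are illustrative, and Euclidean distance is assumed:

```python
from math import dist

def is_core(p, points, eps, min_pts):
    """True if the eps-ball around p contains at least min_pts points.
    The count includes p itself, matching the definition above."""
    return sum(dist(p, q) <= eps for q in points) >= min_pts

# Hypothetical data: three points packed near the origin, one far away.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
print(is_core((0.0, 0.0), pts, eps=0.5, min_pts=3))  # dense point -> True
print(is_core((5.0, 5.0), pts, eps=0.5, min_pts=3))  # isolated point -> False
```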
Who it's for: Introductory machine-learning or spatial-data students comparing partition-based k-means with density-based clustering; pairs naturally with the Lloyd k-means lab on this site.
Key terms
DBSCAN
ε-neighborhood
minPts
Core point
Border point
Noise
Density reachability
Arbitrary-shaped clusters
How it works
DBSCAN finds density-connected clusters without fixing k: a point is a core point if its ε-neighborhood (counting the point itself) contains at least minPts points. From each unvisited core, the algorithm expands by unioning the ε-neighborhoods of the cores it reaches; non-core points reached this way become border points; points never absorbed remain noise. Drag ε and minPts to split or merge blobs and watch noise points appear in sparse regions.
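The expansion loop above can be written as a short brute-force implementation. This is a hypothetical sketch assuming Euclidean distance and O(n²) neighbor search, which is fine for teaching-sized data; optimized libraries use spatial indexes instead:

```python
from math import dist

NOISE = -1
UNVISITED = None

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (cluster id, or -1 for noise)."""
    labels = [UNVISITED] * len(points)
    cluster = 0

    def neighbors(i):
        # Indices of all points within eps of points[i], including i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # tentative: may later become a border point
            continue
        labels[i] = cluster            # start a new cluster from this core
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reached from a core becomes border
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:     # j is itself a core: keep expanding
                queue.extend(jn)
        cluster += 1
    return labels
```

On two tight blobs plus one stray point, `dbscan(pts, eps=0.3, min_pts=3)` assigns each blob its own label and marks the stray point -1.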
Frequently asked questions
Why does a tiny change in ε sometimes merge or split clusters dramatically?
DBSCAN’s decisions are thresholded on counts within a fixed-radius ball. Near critical densities, a small increase in ε can suddenly connect two dense regions through a sparse “bridge” of points, merging clusters; shrinking ε can sever that bridge and split them again. This non-smooth behavior is intrinsic to hard density thresholds.
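To see the threshold concretely: for a chain of points, the largest gap between consecutive points is the critical ε at which the two ends become density-connected (hypothetical coordinates; this assumes minPts is small enough that the chain points qualify as cores):

```python
from math import dist

# Two dense ends (spacing 1) joined by a sparse bridge (spacing 5).
chain = [(0, 0), (1, 0), (2, 0), (7, 0), (12, 0), (13, 0), (14, 0)]
gaps = [dist(chain[k], chain[k + 1]) for k in range(len(chain) - 1)]
print(max(gaps))  # 5.0: eps >= 5 merges the two ends; eps < 5 severs the bridge
```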
How should I pick minPts in 2D?
A common rule of thumb is minPts ≈ 2 × dim for low-dimensional spatial data (so minPts = 4 is a frequent baseline in the plane), then tune for noise tolerance: larger minPts demands denser cores and tends to label more border/sparse points as noise.
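A tiny illustration of that trade-off, counting how many points qualify as cores at different minPts values (hypothetical points; counts include the query point itself, per the definition above):

```python
from math import dist

pts = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (3, 3)]  # tight blob + one outlier

def n_cores(points, eps, min_pts):
    """Number of core points at the given density parameters."""
    return sum(
        sum(dist(p, q) <= eps for q in points) >= min_pts
        for p in points)

print(n_cores(pts, 0.3, 2), n_cores(pts, 0.3, 5))  # 4 0
# minPts = 2: the four blob points are cores, the outlier is not.
# minPts = 5: no point has a dense enough neighborhood, so nothing clusters.
```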
Does this implementation exactly match sklearn’s DBSCAN?
It follows the same core / border / noise logic and ε-neighborhood expansion, but omits advanced indexing (kd-trees) and edge-case policies used in optimized libraries. The goal is visual correctness for teaching, not bit-identical parity with production code.