This lab is an interactive simple linear regression playground on the plane. You build a small dataset by clicking to add points, dragging to move them, and Shift+clicking to delete them, so the geometry of leverage and outliers is immediate. The model is y = β₀ + β₁x, with an intercept and a single slope. Ordinary least squares (OLS) minimizes the sum of squared vertical residuals. Ridge adds an L2 penalty on the slope only (the intercept is not shrunk), corresponding to the normal equations with a single diagonal regularizer on the slope parameter; this pulls the slope toward zero, reducing variance at the cost of bias. Lasso uses an L1 penalty on the slope only and is solved here with a short coordinate-descent loop; a sufficiently large penalty zeroes the slope exactly, a hard form of complexity control.

A dedicated Δy spike is applied only to the point with the largest |x| (a high-leverage location for a line), mimicking a classic vertical-outlier experiment: OLS often tilts dramatically to reduce squared error on that one point, while the penalized fits frequently stay closer to the bulk trend.

Readouts include SSE and R² = 1 − SSE/SST, with SST measured around ȳ on the currently plotted y-values (spike included). When viewing Ridge/Lasso, you can overlay a faint OLS line to compare slopes directly.
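For concreteness, here is a minimal NumPy sketch of the three estimators as described above. The function names and the λ scaling are assumptions for illustration, not the lab's actual code; the ridge regularizer D = diag(0, 1) is the "single diagonal regularizer on the slope parameter" mentioned above.

```python
import numpy as np

def fit_ols(x, y):
    """Closed-form OLS for y = b0 + b1*x via the normal equations."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X, X.T @ y)

def fit_ridge(x, y, lam):
    """Ridge with an L2 penalty on the slope only: solve (X'X + lam*D) b = X'y
    with D = diag(0, 1), so the intercept is left unshrunk."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X + lam * np.diag([0.0, 1.0]), X.T @ y)

def fit_lasso(x, y, lam, iters=200):
    """Coordinate descent for sum((y - b0 - b1*x)^2) + lam*|b1|:
    an exact unpenalized update for b0, a soft-thresholded update for b1."""
    b0, b1 = float(np.mean(y)), 0.0
    for _ in range(iters):
        b0 = float(np.mean(y - b1 * x))   # minimizer in b0 alone
        rho = x @ (y - b0)                # gradient term driving the slope
        b1 = np.sign(rho) * max(abs(rho) - lam / 2, 0.0) / (x @ x)
    return np.array([b0, b1])
```

The λ/2 in the soft threshold follows from differentiating the squared-error term written as a plain sum; other conventions (e.g., a ½ factor on the loss) shift where the threshold sits.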
Who it's for: Intro statistics / machine-learning students learning OLS vs penalized regression, R², and outlier sensitivity; pairs well with matrix-form normal-equation lectures.
Key terms
Ordinary least squares
Ridge regression
Lasso regression
L2 and L1 penalties
R-squared
Sum of squared errors
Outliers and leverage
Coordinate descent
How it works
Interactive scatter in the plane: fit y = β₀ + β₁ x with ordinary least squares (OLS), Ridge (L2 on the slope), or Lasso (L1 on the slope, intercept not penalized). A vertical spike on the largest |x| point mimics an outlier in y; compare how OLS tilts while penalized fits often stay closer to the bulk trend. Readouts include SSE and R²; optionally overlay a faint OLS line while viewing Ridge/Lasso.
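A hedged sketch of the spike experiment and the readouts, with illustrative data and an arbitrarily chosen λ; `fit` is a compact stand-in for the OLS/ridge helpers sketched earlier:

```python
import numpy as np

def fit(x, y, lam=0.0):
    # OLS when lam == 0; ridge on the slope only otherwise (D = diag(0, 1)).
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X + lam * np.diag([0.0, 1.0]), X.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 15)
y = 1.0 + 0.5 * x + rng.normal(scale=0.4, size=x.size)
y[np.argmax(np.abs(x))] += 8.0          # the delta-y spike at the largest-|x| point

for name, lam in [("OLS", 0.0), ("Ridge", 50.0)]:
    b0, b1 = fit(x, y, lam)
    sse = np.sum((y - (b0 + b1 * x)) ** 2)
    sst = np.sum((y - y.mean()) ** 2)   # SST around ybar on the plotted y, spike included
    print(f"{name}: slope={b1:.3f}  SSE={sse:.2f}  R^2={1 - sse / sst:.3f}")
```

Running this shows the qualitative effect the lab demonstrates: the OLS slope tilts toward the spiked high-leverage point, while the ridge slope stays nearer the 0.5 bulk trend.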
Frequently asked questions
Why is only the slope penalized, not the intercept?
Penalizing the intercept would make the fit depend on an arbitrary shift of y; most textbook ridge/lasso formulations either center the responses/features or leave the intercept unpenalized so the model can match the overall level of the data. This simulator follows that teaching convention.
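A quick numeric check of the shift argument, under the slope-only ridge formulation used here (the data and λ are made up): adding a constant to every y leaves the slope unchanged and moves the intercept by exactly that constant, which would not hold if β₀ were penalized.

```python
import numpy as np

def ridge_slope_only(x, y, lam):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X + lam * np.diag([0.0, 1.0]), X.T @ y)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 1.9, 3.1, 3.8, 5.2])

print(ridge_slope_only(x, y, lam=5.0))          # some (intercept, slope)
print(ridge_slope_only(x, y + 100.0, lam=5.0))  # same slope; intercept up by 100
```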
Does a large Ridge λ always give a better model?
No—λ trades off bias and variance. Too large a penalty shrinks the slope toward zero even when a steep slope is warranted, underfitting the signal. Cross-validation (not shown here) is the standard way to pick λ in practice.
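To see the shrinkage side of the tradeoff concretely, a small λ sweep (values illustrative) shows the slope-only ridge estimate decaying toward zero even when the underlying trend is genuinely steep:

```python
import numpy as np

def ridge_slope_only(x, y, lam):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X + lam * np.diag([0.0, 1.0]), X.T @ y)

x = np.linspace(0.0, 10.0, 20)
y = 2.0 + 1.5 * x                      # a genuinely steep, noise-free trend

for lam in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    b0, b1 = ridge_slope_only(x, y, lam)
    print(f"lam={lam:>7.1f}  slope={b1:.3f}")   # shrinks toward 0: underfitting
```

With one slope, the algebra reduces to slope = Sxy / (Sxx + λ) in centered sums, so the shrinkage is monotone in λ.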
Why does my Lasso slope hit exactly zero sometimes?
The L1 penalty can drive coefficients to exact zeros (sparse solutions). In this one-slope setup, a sufficiently large λ makes the optimal slope 0, leaving a constant model y ≈ β₀.
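In this one-slope setting the zero has a closed form: profiling out the unpenalized intercept leaves a single soft-thresholding step, so the slope is exactly 0 once λ/2 ≥ |Σ x̃ᵢỹᵢ| (tildes denote centered values). A sketch, assuming the objective Σ(yᵢ − β₀ − β₁xᵢ)² + λ|β₁| and illustrative data:

```python
import numpy as np

x = np.linspace(0.0, 10.0, 20)
y = 2.0 + 0.8 * x

xc, yc = x - x.mean(), y - y.mean()   # centering absorbs the unpenalized intercept
rho = xc @ yc                          # the term the soft threshold acts on

for lam in [1.0, 50.0, 2 * abs(rho) + 1.0]:
    b1 = np.sign(rho) * max(abs(rho) - lam / 2, 0.0) / (xc @ xc)
    print(f"lam={lam:.1f}  slope={b1:.4f}")   # exactly 0.0 once lam >= 2*|rho|
```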