- Why do we square the residuals instead of just using their absolute values?
- Squaring the residuals penalizes larger errors more heavily, which actually makes the fit *more* sensitive to extreme outliers than absolute values would. The real advantage is mathematical: the squared error function is smooth and differentiable everywhere, so calculus yields a unique closed-form solution (via the normal equations) for the best-fit parameters. Using absolute values (least absolute deviations) is valid and more robust to outliers, but its objective is not differentiable where a residual is zero, so it generally requires iterative methods (e.g., linear programming) rather than a direct formula.
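A minimal sketch of both points: the closed-form slope/intercept that calculus gives for squared error, and how a single outlier pulls that fit. The helper name `least_squares_fit` and the small data sets are illustrative choices, not from any particular library.

```python
def least_squares_fit(xs, ys):
    """Closed-form simple linear regression from the normal equations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    slope = num / den
    return slope, y_mean - slope * x_mean

# clean data on the line y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = least_squares_fit(xs, ys)  # recovers slope 2.0, intercept 1.0

# one extreme outlier drags the squared-error fit far from slope 2
slope_out, _ = least_squares_fit(xs + [5], ys + [100])
```

Running this, `slope_out` lands well above the true slope of 2, illustrating the sensitivity that a least-absolute-deviations fit would largely avoid.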
- Does the least squares line always pass through the average point (x̄, ȳ) of the data?
- Yes, for a simple linear least squares fit, the best-fit line always passes through the centroid of the data, the point defined by the mean of the x-values and the mean of the y-values. This is a direct mathematical consequence of the normal equations used to derive the slope and intercept.
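The centroid property is easy to check numerically; in fact it falls straight out of the intercept formula b₀ = ȳ − b₁x̄. A quick sketch with made-up, non-collinear data (the helper `fit` and the numbers are illustrative):

```python
def fit(xs, ys):
    """Simple linear least squares: returns (slope, intercept)."""
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    slope = sum((x - xm) * (y - ym) for x, y in zip(xs, ys)) \
            / sum((x - xm) ** 2 for x in xs)
    return slope, ym - slope * xm  # intercept chosen so line hits (x̄, ȳ)

xs = [1, 2, 4, 5, 7]
ys = [2, 6, 7, 13, 14]  # noisy, not on any single line
slope, intercept = fit(xs, ys)

xm, ym = sum(xs) / len(xs), sum(ys) / len(ys)
# the fitted line evaluated at x̄ gives exactly ȳ
assert abs((slope * xm + intercept) - ym) < 1e-12
```

The assertion holds for any data set, since the intercept is defined precisely to make it true.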
- What is a real-world example where least squares fitting is used?
- Least squares is ubiquitous in science and engineering. For instance, in physics, it's used to determine the acceleration due to gravity from noisy position-time data in a free-fall experiment. In economics, it might model the relationship between consumer income and spending. Any time you see a 'line of best fit' in a scatter plot, it's likely calculated via least squares.
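The free-fall example can be sketched directly: distance fallen obeys y = ½gt², which is linear in u = t², so regressing y on t² (through the origin) gives g as twice the slope. The data below are synthetic and noiseless purely for illustration; real measurements would scatter around the line.

```python
# synthetic free-fall distances (meters) at sample times (seconds),
# generated here with g = 9.81 m/s^2 and no noise, for illustration only
ts = [0.1, 0.2, 0.3, 0.4, 0.5]
ys = [0.5 * 9.81 * t ** 2 for t in ts]

# regress y on u = t² through the origin: slope = Σuy / Σu² = g/2
us = [t ** 2 for t in ts]
slope = sum(u * y for u, y in zip(us, ys)) / sum(u * u for u in us)
g_est = 2 * slope  # recovers 9.81 on this clean data
```

With noisy lab data the same formula averages out the measurement error, which is exactly why least squares is the standard tool here.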
- What is a key limitation of the simple linear model shown here?
- This model assumes the relationship is strictly linear. It will produce a misleading fit if the true underlying trend is curved (e.g., quadratic or exponential). It also assumes the scatter (noise) is consistent across all x-values (homoscedasticity) and that data points are independent. Real data often violate these assumptions, requiring more advanced models.
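A quick way to see the linearity assumption fail: fit a line to data that is truly quadratic and look at the residuals. They form a systematic U-shape (positive at the ends, negative in the middle) instead of random scatter, which is the classic diagnostic for a misspecified linear model. The helper and data here are illustrative.

```python
def fit(xs, ys):
    """Simple linear least squares: returns (slope, intercept)."""
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    slope = sum((x - xm) * (y - ym) for x, y in zip(xs, ys)) \
            / sum((x - xm) ** 2 for x in xs)
    return slope, ym - slope * xm

xs = list(range(-3, 4))
ys = [x * x for x in xs]          # true relationship is quadratic
slope, intercept = fit(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
# residuals run positive at both ends and negative in the middle:
# a U-shaped pattern, not random noise, so the linear model is misleading
```

On this symmetric data the best-fit line is flat, and the structured residuals reveal the curvature the line cannot capture.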