Mean Absolute Error (MAE) measures the average magnitude of the differences between predicted values and actual values in a dataset. It shows how far predictions are from the true values without considering the direction of the error.
- Calculated using absolute differences
- Simple to compute and interpret
- Treats all errors equally
- Less sensitive to large errors than MSE
- Commonly used to evaluate regression models
The Mathematical Formula for MAE is:
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
Where:
- y_i: Actual value for the i-th observation
- \hat{y}_i: Predicted (calculated) value for the i-th observation
- n: Total number of observations
Calculating Mean Absolute Error in Python
Method 1: Manual Calculation of MAE
Mean Absolute Error (MAE) is calculated by summing the absolute differences between the actual and calculated values of every observation, then dividing that sum by the number of observations.
Example:
```python
actual = [2, 3, 5, 5, 9]
calculated = [3, 3, 8, 7, 6]
n = len(actual)

# Accumulate absolute differences (avoid shadowing the built-in sum())
total = 0
for i in range(n):
    total += abs(actual[i] - calculated[i])

error = total / n
print("Mean absolute error:", error)
```
Output:
Mean absolute error: 1.8
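The same loop can be written as a one-line vectorized computation with NumPy, which is a common sketch when the data is already in arrays (this variant is not in the original article; it assumes NumPy is installed):

```python
import numpy as np

# Same data as the manual example above
actual = np.array([2, 3, 5, 5, 9])
calculated = np.array([3, 3, 8, 7, 6])

# MAE = mean of the element-wise absolute differences
error = np.mean(np.abs(actual - calculated))
print("Mean absolute error:", error)  # 1.8
```

This produces the same result (1.8) as the explicit loop.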
Method 2: Calculating MAE Using sklearn.metrics
The sklearn.metrics module in Python provides various tools to evaluate the performance of machine learning models. One of the methods available is mean_absolute_error(), which simplifies the calculation of MAE by handling all the necessary steps internally.
Syntax:
```python
mean_absolute_error(actual, calculated)
```
Where:
- actual: Array of actual values (first argument)
- calculated: Array of predicted/calculated values (second argument)
It returns the mean absolute error of the given arrays.
Example:
```python
from sklearn.metrics import mean_absolute_error as mae

actual = [2, 3, 5, 5, 9]
calculated = [3, 3, 8, 7, 6]

error = mae(actual, calculated)
print("Mean absolute error:", error)
```
Output:
Mean absolute error: 1.8
Why Choose Mean Absolute Error?
- Interpretability: Since MAE is in the same unit as the target variable, it's easy to understand. For instance, an MAE of 5 in a house price model whose prices are measured in thousands of dollars indicates an average error of $5,000.
- Robustness to Outliers: Unlike metrics that square the errors (like MSE), MAE doesn't disproportionately penalize larger errors, making it less sensitive to outliers.
- Simplicity: MAE provides a straightforward measure of average error, facilitating quick assessments of model performance.
MAE vs. Other Error Metrics
Understanding how MAE compares to other error metrics is crucial for selecting the appropriate evaluation measure.
1. Mean Squared Error (MSE)
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
- Squares the errors, penalizing larger errors more heavily.
- More sensitive to outliers compared to MAE.
- Useful when large errors are particularly undesirable.
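The different outlier sensitivity of MAE and MSE is easy to demonstrate numerically. The sketch below (illustrative data, not from the original article) compares both metrics on the same predictions with and without one large error:

```python
import numpy as np

actual = np.array([2.0, 3.0, 5.0, 5.0, 9.0])
good = np.array([3.0, 3.0, 8.0, 7.0, 6.0])      # moderate errors
outlier = np.array([3.0, 3.0, 8.0, 7.0, 20.0])  # one large error on the last point

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def mse(y, yhat):
    return np.mean((y - yhat) ** 2)

print(mae(actual, good), mse(actual, good))        # 1.8, 4.6
print(mae(actual, outlier), mse(actual, outlier))  # 3.4, 27.0
```

A single large error roughly doubles MAE (1.8 to 3.4) but inflates MSE almost sixfold (4.6 to 27.0), because MSE squares each residual.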
2. Root Mean Squared Error (RMSE)
\text{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 }
- Provides error in the same units as the target variable.
- Like MSE, it penalizes larger errors more than MAE.
3. Mean Absolute Percentage Error (MAPE)
\text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
- Expresses error as a percentage, making it scale-independent.
- Can be problematic when actual values are close to zero.
Error Metrics Comparison Table
| Metric | Penalizes Large Errors | Sensitive to Outliers | Interpretability |
|---|---|---|---|
| MAE | No | Less | High |
| MSE | Yes | More | Moderate |
| RMSE | Yes | More | High |
| MAPE | No | Varies | High |