Model Assessment
What is Model Assessment?
- Evaluating a model to check that it fits our purpose
- Metric
- train/validation/test sets
Metric
- Indicators that estimate a model's performance on a dataset
- Cost, Error
- Accuracy
- Precision
- Recall
Cost, Error
- The output of the cost function
- The difference between the model's predictions and the dataset, as calculated by the cost function
Accuracy
- Ratio of correct predictions to the total number of samples in the dataset
$Accuracy = \frac{\sum_i \operatorname{equals}(y^{(i)},\widehat{y}^{(i)})}{N}=\frac{Number\ of\ correct}{Number\ of\ correct\ +\ Number\ of\ incorrect}$
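A minimal NumPy sketch of the formula above, where `equals` becomes an element-wise comparison; the toy labels are made up for illustration:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(y_true == y_pred)

# hypothetical labels and predictions
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0])
print(accuracy(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```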
Precision and Recall
- True positives + false negatives = real positives
- True negatives + false positives = real negatives
Precision
- The ratio of real positives among the model's predicted positives
$precision = \frac{TP}{TP + FP}$
Recall
- The ratio of correctly predicted positives among the real positives
$recall = \frac{TP}{TP + FN}$
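Both ratios follow directly from the TP/FP/FN counts; a small sketch with made-up binary labels (1 = positive):

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Precision and recall from binary labels (1 = positive)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    return tp / (tp + fp), tp / (tp + fn)

y_true = np.array([1, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0])
print(precision_recall(y_true, y_pred))  # (0.666..., 0.666...)
```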
Precision Recall Trade-off
- The model's decision boundary can be moved by adjusting the threshold
- Changing the threshold trades precision off against recall
- Setting a high threshold results in high precision and low recall
- Setting a low threshold results in low precision and high recall
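The trade-off can be seen by sweeping the threshold over hypothetical predicted probabilities; the scores and labels below are invented for illustration:

```python
import numpy as np

y_true  = np.array([0, 0, 1, 0, 1, 0, 1, 1])
y_score = np.array([0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])

for threshold in (0.3, 0.5, 0.8):
    y_pred = (y_score >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    print(f"threshold={threshold}: "
          f"precision={tp / (tp + fp):.2f}, recall={tp / (tp + fn):.2f}")

# threshold=0.3: precision=0.57, recall=1.00
# threshold=0.5: precision=0.75, recall=0.75
# threshold=0.8: precision=1.00, recall=0.50
```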
F1 score
- Harmonic mean of precision and recall
- The F1 score is used to evaluate a model's performance while taking both precision and recall into account
- A single F1 score summarizes the model even as the threshold trades precision off against recall
$precision = \frac{TP}{TP+FP}$
$recall = \frac{TP}{TP+FN}$
$f1\ score = \frac{2\cdot precision\cdot recall}{precision + recall}$
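A direct transcription of the F1 formula in plain Python; the example precision/recall values are made up to show how the harmonic mean penalizes imbalance:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.75, 0.75))  # 0.75 (balanced)
print(f1_score(1.00, 0.50))  # 0.666... (imbalance is penalized)
```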
Datasets
- A dataset is divided into a train set, a validation set and a test set
- Train set
- Dataset that is directly used to train the model's parameters
- Validation set
- Dataset that is used to find the hypothesis $h_\theta$ with the highest performance from the hypothesis set $\mathcal{H} = \{h_{\theta^1}, h_{\theta^2}, \dots, h_{\theta^I}\}$ produced by the model's learning process
- $\hat{h}_{\boldsymbol\theta} = \displaystyle\operatorname*{argmin}_{h_{\boldsymbol\theta}\in\mathcal{H}}\sum_{i=1}^{N}metric(h_{\boldsymbol\theta}(\boldsymbol{x}^{(i)}),y^{(i)})$
- Test set
- Dataset that is used to evaluate the model's final performance
- Things to keep in mind when dividing the dataset
- Divide it to fit the purpose
- Keep a similar data distribution across the train, validation and test sets in most cases, as in the sketch below
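A minimal sketch of such a split; the 70/15/15 ratios and the seed are arbitrary assumptions, and shuffling once before slicing helps keep the three distributions similar (strongly imbalanced labels would call for a stratified split instead):

```python
import numpy as np

def split_dataset(X, y, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle once, then slice into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(ratios[0] * len(X))
    n_valid = int(ratios[1] * len(X))
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test  = idx[n_train + n_valid:]
    return (X[train], y[train]), (X[valid], y[valid]), (X[test], y[test])
```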
Model Selection
- Model selection is the process of selecting the best model from the hypothesis set $\mathcal{H}$: $\hat{h}_{\mathcal{H}} = \displaystyle\operatorname*{argmin}_{h_{\boldsymbol\theta}\in\mathcal{H}}\sum_{(x,y)\in D_{valid}} cost(h_{\boldsymbol\theta}(x),y)$
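The argmin above maps to a one-liner; `hypotheses` is assumed to be a list of already-trained candidate models (callables) and `cost` a per-example cost function, both hypothetical names:

```python
def select_model(hypotheses, cost, X_valid, y_valid):
    """Pick the hypothesis with the lowest total cost on the validation set."""
    return min(hypotheses,
               key=lambda h: sum(cost(h(x), y) for x, y in zip(X_valid, y_valid)))
```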
Cross Validation
- How do we assess the final performance after model selection?
- Cross Validation
- A methodology for model assessment
- Popular cross validation methods
- Holdout cross validation
- K-fold cross validation
- Holdout cross validation
- Fix the dataset after dividing it into train, validation and test sets
- Decide the final model using only the validation set, then record its performance on the test set: $\hat{h}_\theta = \displaystyle\operatorname*{argmin}_{h_\theta\in\mathcal{H}}\sum_{(x,y)\in D_{valid}}cost(h_\theta(x),y)$, $Score_{test} = \frac{1}{N}\displaystyle\sum_{(x,y)\in D_{test}}metric(\hat{h}_\theta(x),y)$
- K-fold cross validation
- Divide the train set into K folds, use each fold in turn as the held-out test set to calculate performance, and average the scores
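A sketch of K-fold scoring under stated assumptions: `train_fn` fits a model on the given data and returns a callable, and `metric` scores predictions against labels; both names are hypothetical.

```python
import numpy as np

def k_fold_score(train_fn, metric, X, y, k=5, seed=0):
    """Hold each of the k folds out once, score it, and average the results."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        held_out = folds[i]
        rest = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[rest], y[rest])  # fit on the other k-1 folds
        scores.append(metric(model(X[held_out]), y[held_out]))
    return np.mean(scores)
```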
Model Diagnosis
- Bias vs Variance
- Learning curves
Bias vs Variance
- Bias? Variance?
- High variance = model is too complicated = overfitting = poor generalization
- High bias = model is too simple = underfitting = low performance
- Need to find the optimal model complexity
Learning Curves
- A model can be diagnosed as having high bias or high variance through its learning curve
- In case of high bias
- Performance converges quickly even for a small train set
- High error
- Adding more data won't help
- In case of high variance
- Big gap between validation cost and training cost
- Possibility of improving performance by adding more data
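A learning curve can be sketched by training on growing subsets and recording both costs; `train_fn` and `cost_fn` are the same hypothetical helpers assumed above:

```python
import numpy as np

def learning_curve(train_fn, cost_fn, X_train, y_train, X_valid, y_valid, steps=10):
    """Train on growing subsets; record train and validation cost at each size."""
    sizes = np.linspace(len(X_train) // steps, len(X_train), steps, dtype=int)
    curve = []
    for n in sizes:
        model = train_fn(X_train[:n], y_train[:n])
        curve.append((n,
                      cost_fn(model(X_train[:n]), y_train[:n]),  # training cost
                      cost_fn(model(X_valid), y_valid)))         # validation cost
    return curve

# Reading the curve: both costs high and close together -> high bias;
# a large, persistent gap between them -> high variance.
```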