Random Forests in R: A Hands-On Interactive Course
Random forests are among the strongest models you can train with almost no tuning. This three-lesson interactive course builds one from the ground up: what a decision tree is, why averaging many of them works, and how to train, tune and read a real forest in R.
Most tutorials start at "call this function." This course starts one level deeper, with the single decision tree a forest is made of, and builds up until you can train and tune a forest with confidence and know exactly why every piece is there.
Each lesson is a guided, interactive experience: you drive live models in the browser, answer checkpoints, and write R as you go.
The three lessons
Lesson 1: The building block, Decision Trees
How a tree splits data, how to grow one in R, and a live demo where you raise a tree's depth and watch it overfit before your eyes. This is the flaw the whole forest exists to fix.
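As a taste of that demo, here is a minimal sketch in R, not the lesson's own code. It assumes the rpart package (shipped with standard R installations) and a simulated two-feature dataset with 15% of labels flipped; the names `d`, `accuracy` and `results` are made up for illustration. It grows the same tree at several depth limits and records training and test accuracy.

```r
library(rpart)

set.seed(1)
n  <- 400
x1 <- runif(n)
x2 <- runif(n)

# Assumed toy problem: the true boundary is x1 + x2 > 1,
# and 15% of the labels are flipped to act as noise.
above <- x1 + x2 > 1
flip  <- runif(n) < 0.15
d <- data.frame(x1 = x1, x2 = x2,
                y = factor(ifelse(xor(above, flip), "a", "b")))

train <- d[1:300, ]
test  <- d[301:400, ]

# Share of rows whose predicted class matches the true class.
accuracy <- function(fit, data) {
  mean(predict(fit, data, type = "class") == data$y)
}

depths  <- c(1, 2, 4, 8, 16)
results <- data.frame(depth = depths, train = NA_real_, test = NA_real_)

for (i in seq_along(depths)) {
  # cp = 0, minsplit = 2 and minbucket = 1 switch off rpart's usual
  # stopping rules, so maxdepth is the only brake on tree growth.
  fit <- rpart(y ~ x1 + x2, data = train, method = "class",
               control = rpart.control(maxdepth = depths[i], cp = 0,
                                       minsplit = 2, minbucket = 1))
  results$train[i] <- accuracy(fit, train)
  results$test[i]  <- accuracy(fit, test)
}

print(results)
```

Training accuracy can only rise as the depth limit grows, because each deeper tree extends the shallower one. Test accuracy is the column to watch: on noisy data like this it typically stops improving, or falls, once the tree starts memorising flipped labels.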
Start Lesson 1: Decision Trees
Lesson 2: From one tree to a forest
Why averaging many noisy trees cancels much of their error, how bootstrap samples make trees differ, and the random-feature trick that usually lets a random forest beat plain bagging. You will drag a slider and watch a jagged boundary smooth into an accurate one.
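The argument for decorrelation can be written in one line. The symbols here are assumptions introduced for illustration, not the lesson's notation: suppose the forest averages B trees whose predictions T_b(x) at a point x each have variance sigma^2 and share a pairwise correlation rho. Then the variance of the average is:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

Adding trees shrinks only the second term. The first term stays put however large B gets, which is why lowering rho through bootstrap samples and random feature selection matters more than simply growing more trees.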
Start Lesson 2: Bagging and decorrelation
Lesson 3: Train, tune and read a forest in R
Out-of-bag error (a held-out error estimate you get without setting aside a test set), tuning mtry and the number of trees on a live forest, reading variable importance, and the ranger code to do it for real. Ends with the path to your Machine Learning certificate.
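A rough sketch of what that workflow can look like, assuming the ranger package is installed and using the built-in iris data as a stand-in for your own; the 500 trees and the names `results`, `best` and `final` are illustrative choices rather than the lesson's code. It tries every possible mtry, compares out-of-bag error, then refits with importance switched on.

```r
library(ranger)

p <- ncol(iris) - 1  # number of predictors (everything except Species)
results <- data.frame(mtry = seq_len(p), oob_error = NA_real_)

for (i in seq_len(p)) {
  fit <- ranger(Species ~ ., data = iris,
                num.trees = 500,         # illustrative; raise it if the error is unstable
                mtry = results$mtry[i],  # features tried at each split
                seed = 1)
  # For classification, prediction.error is the out-of-bag
  # misclassification rate: no separate test set is involved.
  results$oob_error[i] <- fit$prediction.error
}

print(results)

# Refit with the best mtry and ask ranger to record impurity importance.
best  <- results$mtry[which.min(results$oob_error)]
final <- ranger(Species ~ ., data = iris, num.trees = 500, mtry = best,
                importance = "impurity", seed = 1)

print(sort(final$variable.importance, decreasing = TRUE))
```

On a dataset this small the out-of-bag errors for different mtry values may differ by only a row or two, so treat the winner as a tie-break rather than a finding.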
Start Lesson 3: Training and tuning in R
Who this is for
You are comfortable running R and know what training and test sets are. You do not need any prior machine learning experience. By the end you will understand random forests well enough to use them on your own data and to explain, precisely, why they work.
What you will be able to do
- Explain how a decision tree chooses its splits and why a single deep tree overfits
- Describe how bootstrap sampling and random feature selection decorrelate the trees
- Train a random forest in R, read its out-of-bag error, and tune mtry and the number of trees
- Read variable importance and recognise where random forests are the wrong tool
Ready? Begin with Lesson 1.