The idea here is to properly index my posts. However, I haven’t had time to finish it yet.
This page is and will always be under construction!
Data sets
German Credit Data
Data pre-processing
Unsupervised data pre-processing: individual predictors
Near-zero variance predictors. Should we remove them?
Unsupervised Learning
PCA
Introduction to Principal Component Analysis (PCA)
Computing and visualizing PCA in R
Supervised Learning
Classification
Discriminant Analysis
Discriminant Analysis
Reduced-rank discriminant analysis
Computing and visualizing LDA in R
Regression
Linear Regression
Linear regression (according to Coursera’s ML course)
Logistic Regression
Logistic regression (according to Coursera’s ML course)
Spatial Modeling
Auto-logistic model
Latent Gaussian Models
Fast Bayesian Inference with INLA
Latent Gaussian Models and INLA
Web Scraping
The basics of XML for web-scraping
Decision theory
Declining marginal utility and the logarithmic utility function
R Software
Software development
devtools and testthat R packages, definitely worth using
Optimizing R with Multi-threaded OpenBLAS
Profiling R code
R scripts
Data handling
Reshape and aggregate data with the R package reshape2
Visualization
Plot matrix with the R package GGally
Text Analysis
Character strings in R
Statistical Models
- Auto-logistic model (18/01/2013)
- First two weeks of Coursera’s Machine Learning (linear regression) (08/05/2013)
- Third week of Coursera’s Machine Learning (logistic regression) (15/05/2013)
- 4th and 5th week of Coursera’s Machine Learning (neural networks) (05/06/2013)
- Bayesian linear regression model – simple, yet useful results (07/08/2013)
- Latent Gaussian Models and INLA (16/10/2013)
Approximate Methods for Statistical Inference
- INLA group (04/02/2013)
- Introduction to Variational Bayes (31/07/2013)
- Fast Bayesian Inference with INLA (09/10/2013)
- Latent Gaussian Models and INLA (16/10/2013)
Model Selection and Model Assessment
- Bias-variance trade-off in model selection (10/04/2013)
- Overview of Supervised Learning according to (Hastie et al., 2009) (24/04/2013)
- AIC, Kullback-Leibler and a more general Information Criterion (01/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [1/3] (22/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [2/3] (29/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [3/3] (19/06/2013)
- How to properly assess point forecast (03/07/2013)
- 6th week of Coursera’s Machine Learning (advice on applying machine learning) (17/07/2013)
- Posterior predictive checks (28/08/2013)
- 6th week of Coursera’s Machine Learning (Error analysis) (18/09/2013)
Numerics
- Fast, simple and useful numerical integration methods (02/10/2013)
- Numerical computation of quantiles (23/10/2013)
Other Statistical Concepts
- Kullback-Leibler divergence (10/07/2013)
Tools/Software
– R
- ODB R package (02/04/2013)
- devtools and testthat R packages, definitely worth using (26/06/2013)
- Optimizing R with Multi-threaded OpenBLAS (21/08/2013)
- Profiling R code (25/09/2013)
- Latent Gaussian Models and INLA (16/10/2013)
- Numerical computation of quantiles (23/10/2013)
- Reshape and aggregate data with the R package reshape2 (31/10/2013)
- Unsupervised data pre-processing for predictive modeling (07/11/2013)
– UNIX-like
- Run long computations remotely with screen (22/01/2013)
- Scheduling R scripts to run on a regular basis (11/09/2013)
– Others
- How to draw neural network diagrams using Graphviz (12/06/2013)
- Using Dropbox as a private git repository (24/07/2013)
- LaTeX and WordPress.com (14/08/2013)
- The basics of XML for web-scraping (04/09/2013)
Book summaries and/or comments
– The elements of statistical learning: data mining, inference and prediction, by Trevor Hastie, Robert Tibshirani and Jerome Friedman (book link)
- Overview of Supervised Learning according to (Hastie et al., 2009) (24/04/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [1/3] (22/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [2/3] (29/05/2013)
- Model selection and model assessment according to (Hastie and Tibshirani, 2009) – Part [3/3] (19/06/2013)
– Coursera’s Machine Learning course, by Andrew Ng (course link)
- First two weeks of Coursera’s Machine Learning (linear regression) (08/05/2013)
- Third week of Coursera’s Machine Learning (logistic regression) (15/05/2013)
- 4th and 5th week of Coursera’s Machine Learning (neural networks) (05/06/2013)
- 6th week of Coursera’s Machine Learning (advice on applying machine learning) (17/07/2013)
- 6th week of Coursera’s Machine Learning (Error analysis) (18/09/2013)
Uncategorized (yet)
- How much gold is there in the world? (02/04/2013)
- College education. Is it for everyone? (05/04/2013)
- The $100 Startup – Guidelines to set your own microbusiness (09/04/2013)
- Productivity Paradox and Prediction Failures (21/04/2013)