
Machine Learning Models & Experiments: a collection of machine learning projects and experiments implemented from scratch and using scikit-learn. This repository showcases my learning journey in data preprocessing, feature engineering, model training, evaluation, and visualization using real-world datasets.


girishbetter/Machine_Learning_Models


Machine Learning Models & Projects

This repository showcases my hands-on learning and implementation of Machine Learning models using real-world datasets.
It demonstrates my ability to build end-to-end ML pipelines — from data ingestion and preprocessing to model training, evaluation, and visualization — using Python and industry-standard libraries.

Focused on learning by implementing real ML workflows, not just theory.


🎯 What This Repository Demonstrates

  • Practical understanding of Machine Learning fundamentals
  • Ability to work with real-world datasets
  • Experience with data preprocessing and feature engineering
  • Model training, evaluation, and interpretation
  • Clean, readable, and modular ML code

This repository is intended as a portfolio for internships and entry-level roles in Data Science / Machine Learning.


🛠️ Tech Stack & Tools

  • Python
  • Pandas – data manipulation
  • NumPy – numerical computation
  • Matplotlib – data visualization
  • scikit-learn – ML models & evaluation
  • KaggleHub – dataset integration

📊 Implemented Models

  • Linear Regression (Regression problems)
  • More models will be added as learning progresses

Each project includes:

  • Dataset loading (CSV / Kaggle)
  • Feature–target separation
  • Categorical data encoding (One-Hot Encoding)
  • Train–test split
  • Model training
  • Performance evaluation
  • Visualization of results
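The preprocessing and training steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the data is synthetic, and the column names (`age`, `bmi`, `smoker`, `region`, `charges`) and the coefficients used to generate the target are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; column names and values are assumptions,
# not the real dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30, 5, size=n).round(1),
    "smoker": rng.choice(["yes", "no"], size=n),
    "region": rng.choice(
        ["northeast", "northwest", "southeast", "southwest"], size=n
    ),
})
# Made-up target so the model has something to fit.
df["charges"] = (
    250 * df["age"]
    + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes")
    + rng.normal(0, 2000, size=n)
)

# Feature-target separation.
X = df.drop(columns=["charges"])
y = df["charges"]

# One-hot encode the categorical columns. drop_first=True removes one
# dummy per category to avoid perfectly collinear columns.
X_encoded = pd.get_dummies(
    X, columns=["smoker", "region"], drop_first=True, dtype=float
)

# Train-test split (80% train, 20% test).
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, test_size=0.2, random_state=42
)

# Model training and a quick check on the held-out data.
model = LinearRegression()
model.fit(X_train, y_train)
print(f"Test R^2: {model.score(X_test, y_test):.3f}")
```

With a real CSV, the synthetic `DataFrame` would be replaced by `pd.read_csv(...)` on the downloaded file; the remaining steps stay the same.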

📁 Featured Project: Insurance Cost Prediction

Problem Statement:
Predict medical insurance charges based on personal and lifestyle attributes.

Dataset: US Health Insurance Dataset (Kaggle)
Target Variable: charges (continuous numerical value)

Key Steps:

  1. Loaded dataset using KaggleHub
  2. Performed data preprocessing and encoding of categorical variables
  3. Split dataset into training and testing sets
  4. Trained a Linear Regression model
  5. Evaluated model performance using regression metrics
  6. Visualized actual vs predicted values
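For the final step, one common way to visualize actual vs predicted values is a scatter plot with a diagonal reference line. The sketch below is an assumption about how such a plot might look, not the project's own plotting code; the helper name `plot_actual_vs_predicted` is hypothetical and the numbers are illustrative, not project results.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; works without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_true, y_pred):
    """Scatter predicted against actual values with a y = x reference line."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(y_true, y_pred, alpha=0.6)

    # Points on this diagonal are predicted exactly right.
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax.plot([lo, hi], [lo, hi], color="red", linestyle="--",
            label="perfect prediction")

    ax.set_xlabel("Actual charges")
    ax.set_ylabel("Predicted charges")
    ax.set_title("Actual vs predicted")
    ax.legend()
    return fig, ax


# Illustrative values only.
actual = [1000.0, 5000.0, 12000.0, 30000.0]
predicted = [1500.0, 4500.0, 13000.0, 27000.0]
fig, ax = plot_actual_vs_predicted(actual, predicted)
```

Calling `fig.savefig("actual_vs_predicted.png")` afterwards writes the figure to disk. Points far from the dashed line are the cases where the model is most wrong.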

📈 Model Evaluation Metrics

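  • Mean Absolute Error (MAE) – average absolute difference between predicted and actual charges
  • Mean Squared Error (MSE) – average squared difference; penalizes large errors more heavily
  • Root Mean Squared Error (RMSE) – square root of MSE, expressed in the same units as charges
  • R² Score – proportion of the variance in charges that the model explains

Together, MAE and RMSE show how large a typical prediction error is, while R² summarizes how well the model fits overall.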
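To make the definitions concrete, the sketch below computes each metric by hand with NumPy and cross-checks the result against scikit-learn. The four value pairs are made up for illustration and are not results from this project.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative actual and predicted charges (not project results).
y_true = np.array([3000.0, 8000.0, 12000.0, 20000.0])
y_pred = np.array([3500.0, 7000.0, 13000.0, 18000.0])

errors = y_true - y_pred

mae = np.mean(np.abs(errors))   # average absolute error
mse = np.mean(errors ** 2)      # average squared error
rmse = np.sqrt(mse)             # back in the units of the target

# R^2 = 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# The hand-computed values should agree with scikit-learn.
assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
assert np.isclose(r2, r2_score(y_true, y_pred))

print(f"MAE:  {mae:.1f}")   # 1125.0
print(f"RMSE: {rmse:.1f}")  # 1250.0
print(f"R^2:  {r2:.3f}")
```

RMSE comes out larger than MAE here because the single 2000-unit miss is squared before averaging, which is the sense in which MSE and RMSE penalize large errors.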


▶️ How to Run the Project

  1. Clone the repository:

git clone https://github.com/girishbetter/Machine_Learning_Models.git
cd Machine_Learning_Models

  2. Install dependencies:

pip install pandas numpy matplotlib scikit-learn kagglehub

  3. Run the model script or notebook:

python model_file.py

👤 Author

Girish Chapekar

Aspiring Machine Learning Engineer with a strong interest in applied ML and data-driven problem solving.

⭐ Acknowledgements

  • Kaggle for datasets
  • scikit-learn documentation
  • Open-source ML community
