Welcome to the Exploratory Data Analysis repository! This repo contains a collection of Python Jupyter Notebooks that demonstrate how to explore and gain insights from various real-world datasets using common data science tools such as Pandas, NumPy, Matplotlib, and Seaborn.
Exploratory Data Analysis (EDA) is the process of analyzing datasets to summarize their main characteristics, often with visual methods, before applying modeling techniques. EDA helps uncover patterns, detect anomalies, test hypotheses, and check underlying assumptions. This repository includes EDA studies on a variety of datasets, such as ecommerce purchases, salaries, the Titanic dataset, YouTube channels, and more.
- Data_csv_files/ – Contains all the CSV datasets used by the notebooks.
- 01_EDA_Ecommerce_Purchases_Dataset.ipynb – Exploratory analysis on consumer purchase data.
- 02_EDA_Salaries Dataset.ipynb – Insights into salary distributions and trends.
- 03_EDA_Adult_dataset.ipynb – Exploring demographic and income data.
- 04_EDA_Titanic_dataset.ipynb – Visualization of survival trends in the Titanic dataset.
- 05_EDA_Googleplaystore_dataset.ipynb – Analysis of app metadata from Google Play Store.
- 06_EDA_Udemy_Dataset.ipynb – Examining trends within Udemy course data.
- 07_EDA_Supermarket_Sales_Dataset.ipynb – Sales and revenue insights from supermarket data.
- 08_EDA_Top_Youtube_Channel_Dataset.ipynb – Exploration of top YouTube channels.
- 09_EDA_IMDB_Movies_Dataset.ipynb – Movie metadata analysis from IMDb.
- 10_Flight_Price_dataset.ipynb – Investigating patterns in flight pricing.

The notebooks illustrate how to:

- Load, inspect, and clean datasets
- Explore data structure and feature distributions
- Visualize trends with charts (histograms, scatter plots, box plots, etc.)
- Interpret findings and generate insights
- Compare numerical and categorical variables
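
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not code taken from any of the notebooks: the inline CSV text, the column names (`category`, `price`, `quantity`), and the helper names `load_and_clean` and `plot_price` are all hypothetical, and the cleaning choices (median fill, dropping incomplete rows) are just one reasonable option.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data standing in for a file in Data_csv_files/.
RAW_CSV = """category,price,quantity
books,12.5,2
books,,1
toys,30.0,3
toys,30.0,3
games,45.0,
"""


def load_and_clean(csv_text: str) -> pd.DataFrame:
    """Load CSV text, drop duplicate rows, and handle missing values."""
    df = pd.read_csv(io.StringIO(csv_text))
    df = df.drop_duplicates()
    # Assumption: fill missing prices with the median price.
    df["price"] = df["price"].fillna(df["price"].median())
    # Assumption: rows without a quantity are not usable and are dropped.
    df = df.dropna(subset=["quantity"])
    return df.reset_index(drop=True)


def plot_price(df: pd.DataFrame) -> plt.Figure:
    """Draw a histogram of price and a box plot of price per category."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    df["price"].plot.hist(ax=axes[0], bins=5, title="Price distribution")
    df.boxplot(column="price", by="category", ax=axes[1])
    return fig


df = load_and_clean(RAW_CSV)

# Inspect structure and summary statistics.
df.info()
print(df.describe())

# Compare a numerical variable across a categorical one.
mean_price = df.groupby("category")["price"].mean()
print(mean_price)

# Visualize the distributions.
fig = plot_price(df)
```

In a notebook the figure renders inline; in a script, something like `fig.savefig("price_overview.png")` (file name is a placeholder) writes it to disk. For a real dataset, `pd.read_csv` would be pointed at a file path instead of the in-memory text used here.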