An end-to-end data analytics project that processes raw Spotify track data and visualizes key music insights through an interactive Power BI dashboard.
The dashboard provides a comprehensive view of the Spotify dataset with the following KPIs and visualizations:
| Metric | Value |
|---|---|
| Total Artists | 3,918 |
| Total Songs | 9,727 |
| Average Popularity | 36.53 |
| Avg Duration (min) | 3.42 |
| Most Popular Song | i'm good (blue) |
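
The KPIs above could be recomputed from the cleaned data with pandas, which is a useful cross-check on the dashboard cards. The following is a minimal sketch, not the notebook's actual code: the column names (`artists`, `track_name`, `popularity`, `duration_min`) and the helper `compute_kpis` are assumptions, and the sample rows are made up for illustration.

```python
import pandas as pd


def compute_kpis(df: pd.DataFrame) -> dict:
    """Compute dashboard-style KPIs (hypothetical helper; column names are assumed)."""
    # Row holding the highest popularity score
    top_row = df.loc[df["popularity"].idxmax()]
    return {
        "total_artists": df["artists"].nunique(),
        "total_songs": len(df),
        "avg_popularity": round(df["popularity"].mean(), 2),
        "avg_duration_min": round(df["duration_min"].mean(), 2),
        "most_popular_song": top_row["track_name"],
    }


# Tiny made-up sample, only to show the shape of the result
sample = pd.DataFrame(
    {
        "artists": ["A", "B", "A"],
        "track_name": ["x", "y", "z"],
        "popularity": [10, 50, 30],
        "duration_min": [3.0, 4.0, 3.5],
    }
)
kpis = compute_kpis(sample)
print(kpis)
```

Here "Total Songs" is taken to be the row count; if the dashboard counts distinct track names instead, `df["track_name"].nunique()` would be the closer equivalent.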
- Top Artists – Bar chart ranking the most prolific artists (e.g. Jack Harlow, Jhayco, Marilyn Manson, Daddy Yankee, Vybz Kartel, Feid)
- Avg Popularity By Genre – Horizontal bar chart comparing genres (Hard-Rock leads, followed by Chill, Acoustic, Afrobeat, Hardstyle, Goth, Pop, Alternative)
- Total Artists By Explicit Songs – Donut chart showing 548 (100%) explicit tracks
- Explicit Content Scatter Plot – Bubble chart mapping explicit vs. non-explicit song distributions
- Artist Slicer – Interactive filter panel for drilling down by individual artists
- Explicit Toggle – Switch between `False`/`True` to filter explicit songs
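
The genre ranking and the explicit toggle can be approximated outside Power BI with a pandas group-by. This is only a sketch under assumptions: the columns `track_genre`, `popularity`, and a boolean `explicit` are assumed names, `avg_popularity_by_genre` is a hypothetical helper, and the sample values are invented.

```python
import pandas as pd


def avg_popularity_by_genre(df: pd.DataFrame, explicit=None) -> pd.Series:
    """Mean popularity per genre, highest first (hypothetical helper).

    If `explicit` is True or False, keep only rows with that flag first,
    mimicking the dashboard's explicit toggle.
    """
    if explicit is not None:
        df = df[df["explicit"] == explicit]
    return (
        df.groupby("track_genre")["popularity"]
        .mean()
        .sort_values(ascending=False)
    )


# Made-up rows for illustration only
sample = pd.DataFrame(
    {
        "track_genre": ["hard-rock", "hard-rock", "pop", "pop"],
        "popularity": [60, 40, 30, 50],
        "explicit": [False, True, False, True],
    }
)
all_tracks = avg_popularity_by_genre(sample)
explicit_only = avg_popularity_by_genre(sample, explicit=True)
print(all_tracks)
print(explicit_only)
```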
| File | Description |
|---|---|
| `app.ipynb` | Jupyter notebook for data cleaning and normalization |
| `dataset.csv` | Raw Spotify dataset (input) |
| `cleaned_spotify.csv` | Cleaned and processed dataset (output) |
| `Spotify.pbix` | Power BI dashboard file |
| `README.md` | Project documentation |
- Load and preview the raw Spotify dataset
- Remove unnecessary columns and duplicates
- Handle missing values and standardize text fields
- Correct data types and split multi-artist fields
- Convert duration to minutes and standardize genres
- Filter out invalid records
- Export the cleaned dataset as `cleaned_spotify.csv`
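
The cleaning steps above might look roughly like the following in pandas. This is an illustrative sketch rather than the contents of `app.ipynb`: the column names (`Unnamed: 0`, `artists`, `track_name`, `track_genre`, `popularity`, `duration_ms`), the `;` separator for multi-artist fields, the validity rules, and the helper `clean_spotify` are all assumptions.

```python
import pandas as pd


def clean_spotify(raw: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the cleaning steps (hypothetical helper; column names assumed)."""
    # Remove an unnecessary index column, exact duplicates, and rows
    # missing key text fields
    df = (
        raw.drop(columns=["Unnamed: 0"], errors="ignore")
        .drop_duplicates()
        .dropna(subset=["artists", "track_name"])
        .copy()
    )

    # Standardize text fields: trim whitespace, lowercase genres
    for col in ["artists", "track_name", "track_genre"]:
        df[col] = df[col].astype(str).str.strip()
    df["track_genre"] = df["track_genre"].str.lower()

    # Correct data types; unparseable values become NaN
    df["popularity"] = pd.to_numeric(df["popularity"], errors="coerce")
    df["duration_ms"] = pd.to_numeric(df["duration_ms"], errors="coerce")

    # Split multi-artist fields (assumed ';' separator) and keep the first name
    df["primary_artist"] = df["artists"].str.split(";").str[0].str.strip()

    # Convert duration from milliseconds to minutes
    df["duration_min"] = (df["duration_ms"] / 60000).round(2)

    # Filter out invalid records (assumed rules: positive duration,
    # popularity within 0-100; NaN values fail both checks)
    valid = (df["duration_min"] > 0) & df["popularity"].between(0, 100)
    return df[valid].reset_index(drop=True)


# Made-up rows: one duplicate, one missing artist, one zero-length track
raw = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2, 3, 4],
        "artists": ["A;B", "A;B", " C ", None, "D"],
        "track_name": ["t1", "t1", "t2", "t3", "t4"],
        "track_genre": ["Pop", "Pop", " Rock", "pop", "pop"],
        "popularity": [50, 50, "40", 10, 20],
        "duration_ms": [180000, 180000, 240000, 200000, 0],
    }
)
cleaned = clean_spotify(raw)
print(cleaned)

# On the real files, the export step would then be something like:
# clean_spotify(pd.read_csv("dataset.csv")).to_csv("cleaned_spotify.csv", index=False)
```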
- Open `app.ipynb` in Jupyter or VS Code.
- Run all cells to clean and process the data.
- Load `cleaned_spotify.csv` into Power BI to explore the dashboard.
- Python 3.x
- pandas

Install pandas with `pip install pandas`.

spotify-data-analytics/
├── app.ipynb
├── dataset.csv
├── cleaned_spotify.csv
├── Spotify.pbix
└── README.md