A clean and reproducible workflow for exploring correlations in datasets using Python.
Ideal for data analysis, visualization, and statistical exploration.
The files cover the following analysis topics:
- Distribution characteristics: fat tails, skewness, kurtosis
- Multidimensional correlation: Pearson/Spearman/mutual information
- Missing-value patterns
- PCA dimensionality reduction and feature clustering
- Time-series-aware EDA methods
- Weighted statistical analysis
- (cont'd)
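
As a rough illustration of the first three topics, the sketch below computes distribution-shape statistics, Pearson and Spearman correlation matrices, and per-column missing-value fractions with pandas. It runs on synthetic data with hypothetical column names (`x`, `y_monotone`, `heavy`), not on `data.csv`, and it is not necessarily how the notebooks do it. Mutual information is left out because it needs a separate estimator, for example `mutual_info_regression` from scikit-learn.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are hypothetical, not taken from data.csv.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
heavy = rng.standard_t(df=3, size=n)  # Student-t draws: heavier tails than a normal
heavy[::10] = np.nan                  # inject a missing value in every 10th row

df = pd.DataFrame({
    "x": x,
    "y_monotone": np.exp(x),  # nonlinear but strictly increasing in x
    "heavy": heavy,
})

# Distribution shape: pandas reports sample skewness and excess kurtosis,
# skipping NaN values by default.
shape = pd.DataFrame({"skew": df.skew(), "excess_kurtosis": df.kurt()})

# Pearson measures linear association; Spearman measures monotone association.
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Fraction of missing values per column.
missing_fraction = df.isna().mean()

print(shape)
print(pearson.round(3))
print(spearman.round(3))
print(missing_fraction)
```

Because `y_monotone` is a strictly increasing function of `x`, the Spearman coefficient between them is 1 while the Pearson coefficient is lower, which is the usual reason for reporting both.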
Data used:
- data_cp.csv (too large to upload)
- data.csv
Notebooks:
- DataVisualization.ipynb
- FatTail&DriftAwareness.ipynb
- MissingDataAnalysis.ipynb
- PCA.ipynb
- Time-sliced EDA.ipynb
- CorrelationAnalysis_basic.ipynb
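
For readers new to the PCA topic, here is a minimal sketch of PCA on standardized features using NumPy's SVD. The data is synthetic and every variable name is hypothetical; `PCA.ipynb` may well use a different approach, such as scikit-learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 features; the third is almost a copy of the first.
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])

# Standardize each feature (zero mean, unit sample variance).
Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the standardized matrix: rows of Vt are the principal directions,
# and singular values come back sorted in descending order.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Variance captured by each component, and its share of the total.
explained_variance = S**2 / (len(Xc) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Project the samples onto the first two principal components.
scores = Xc @ Vt[:2].T

print(explained_ratio.round(4))
print(scores.shape)
```

Since two of the three features are nearly identical, the last component carries almost no variance, so two components are enough to describe this toy dataset.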