A clean and modular Python toolkit for feature engineering, including data preprocessing, transformation, encoding, scaling and feature selection. Suitable for both exploratory data analysis and production workflows.

paramedick/FeatureEngineering

 
 

🔧 Feature Engineering Toolkit (Python)

A modular and reusable toolkit for performing feature engineering on structured datasets.
This repository provides essential utilities for preprocessing, transforming, and optimizing features for quant finance and machine learning workflows.


📦 Overview

Feature engineering is one of the most critical steps in quant projects.
This repository provides a clean pipeline and practical examples for:

  • 🧹 Data preprocessing: missing value handling, outlier detection
  • 📊 Exploratory Feature Analysis: time series analysis, classical technical indices, correlation, visual comparison
  • 🔣 Feature Transformation: temporal dimension, cross-sectional features, interaction, contextual dimension, dimensionality reduction
  • 📐 Advanced Engineering: improved engineering methods based on the previous results and comparisons
  • ✂️ Feature Selection: selection based on correlation changes, SHAP values from CatBoost models
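
To make the preprocessing and transformation steps above more concrete, here is a minimal sketch using pandas. It is not taken from the notebooks: the column names `date`, `ticker`, and `close`, the helper names `clip_outliers_iqr` and `build_features`, and the choice of Tukey (IQR) fences for outliers are all assumptions made for illustration, since the schema of `data_cp.csv` is not documented here.

```python
import pandas as pd


def clip_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative pipeline; expects hypothetical columns date, ticker, close."""
    out = df.sort_values(["ticker", "date"]).copy()

    # Missing values: forward-fill within each ticker only, so that no value
    # leaks across assets or backwards in time.
    out["close"] = out.groupby("ticker")["close"].ffill()

    # Temporal feature: one-period simple return per ticker.
    prev_close = out.groupby("ticker")["close"].shift(1)
    out["ret_1d"] = out["close"] / prev_close - 1.0

    # Outlier handling: clip extreme returns (fences computed over all rows).
    out["ret_1d"] = clip_outliers_iqr(out["ret_1d"])

    # Temporal feature: 3-period moving average of the close per ticker.
    out["ma_3"] = out.groupby("ticker")["close"].transform(
        lambda s: s.rolling(3).mean()
    )

    # Cross-sectional feature: percentile rank of the return on each date.
    out["ret_rank"] = out.groupby("date")["ret_1d"].rank(pct=True)
    return out
```

Calling `build_features(df)` on a long-format price table returns the same rows with the added columns `ret_1d`, `ma_3`, and `ret_rank`. The first row of each ticker has no previous close, so its return and rank stay `NaN`.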

This repository can serve both as a reference and a reusable feature engineering module.
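
As one possible reading of the correlation-based selection listed in the overview, the sketch below applies a greedy pairwise-correlation filter: a feature is kept only if its absolute Pearson correlation with every previously kept feature stays at or below a threshold. The function name `drop_correlated` and the default threshold of 0.9 are assumptions, and the notebooks may use a different criterion (the SHAP-based selection with CatBoost is not shown here).

```python
import pandas as pd


def drop_correlated(features: pd.DataFrame, threshold: float = 0.9) -> list[str]:
    """Return the columns kept by a greedy pairwise-correlation filter."""
    corr = features.corr().abs()
    keep: list[str] = []
    for col in corr.columns:
        # Drop the column if it is too correlated with any column kept so far.
        if not any(corr.loc[col, k] > threshold for k in keep):
            keep.append(col)
    return keep
```

Because the filter is greedy, the result depends on column order: earlier columns win ties, so it can help to order features by prior importance before filtering.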


📁 Project Structure

  • Data used: data_cp.csv (too large to upload, so it is not included in this repository)
  • Notebooks:
    1. Technical Indices.ipynb
    2. TimeSeriesAnalysis.ipynb
    3. FeatureEngeering_basic.ipynb
