Simple stream processing pipeline
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
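As a rough picture of what such a pipeline does, the sketch below chunks a document, embeds each chunk, and upserts the vectors; the file path, model name, and the upsert_vectors() helper are assumptions for illustration, not this project's API.

```python
# A minimal chunk -> embed -> upsert sketch for a vector database pipeline.
# The file path, model name, and upsert_vectors() helper are illustrative assumptions.
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size character chunking.
    return [text[i:i + size] for i in range(0, len(text), size)]

def upsert_vectors(records):
    # Hypothetical helper: replace with your vector database client's upsert call.
    for record in records:
        print(record["id"], len(record["vector"]))

model = SentenceTransformer("all-MiniLM-L6-v2")
document = open("docs/handbook.txt", encoding="utf-8").read()  # assumed input file
chunks = chunk(document)
vectors = model.encode(chunks)  # one embedding per chunk

upsert_vectors(
    [{"id": f"handbook-{i}", "vector": v.tolist(), "text": c}
     for i, (c, v) in enumerate(zip(chunks, vectors))]
)
```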
High-performance TensorFlow data pipeline with state-of-the-art augmentations and low-level optimizations.
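For a sense of the technique, here is a minimal tf.data sketch with parallel mapping and prefetching; the file pattern, image size, and augmentation choices are illustrative assumptions, not this repository's code.

```python
# A minimal high-throughput tf.data input pipeline sketch.
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

def load_and_augment(path):
    # Decode an image file and apply a couple of cheap augmentations.
    image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.resize(image, (224, 224))
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    return image

dataset = (
    tf.data.Dataset.list_files("data/train/*.jpg")        # assumed file layout
    .map(load_and_augment, num_parallel_calls=AUTOTUNE)   # parallel decode/augment
    .batch(32)
    .prefetch(AUTOTUNE)                                    # overlap input with training
)
```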
kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.
A modeling tool, similar to dbt, that uses SQLAlchemy Core behind a DataFrame-like interface.
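As a rough illustration of combining SQLAlchemy Core with a DataFrame interface (not this tool's actual API), a query built with Core can be handed directly to pandas:

```python
# A minimal sketch: build a query with SQLAlchemy Core, read it as a DataFrame.
# The table definition and SQLite URL are illustrative assumptions.
import pandas as pd
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select

engine = create_engine("sqlite:///example.db")
metadata = MetaData()

orders = Table(
    "orders", metadata,
    Column("id", Integer, primary_key=True),
    Column("status", String),
)
metadata.create_all(engine)

# Seed one row so the query returns something.
with engine.begin() as conn:
    conn.execute(orders.insert(), [{"id": 1, "status": "shipped"}])

# Compose the query with Core and let pandas execute it.
query = select(orders).where(orders.c.status == "shipped")
df = pd.read_sql(query, engine)
print(df.head())
```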
Extract-transform-load (ETL) CLI tool for moving small-to-medium data volumes from sources (databases, CSV files, XLS files, Google Sheets) to targets (databases, CSV files, XLS files, Google Sheets) in any combination.
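A minimal sketch of the source-to-target idea using pandas and SQLAlchemy; the argument names and format detection below are assumptions, not this tool's CLI:

```python
# Copy a CSV/XLS source into a database table: extract, light transform, load.
import argparse

import pandas as pd
from sqlalchemy import create_engine

READERS = {"csv": pd.read_csv, "xls": pd.read_excel}

def main():
    parser = argparse.ArgumentParser(description="Copy a table between formats")
    parser.add_argument("source")        # e.g. input.csv or input.xlsx
    parser.add_argument("target_db")     # e.g. sqlite:///warehouse.db
    parser.add_argument("--table", default="staging")
    args = parser.parse_args()

    kind = "xls" if args.source.endswith((".xls", ".xlsx")) else "csv"
    df = READERS[kind](args.source)                        # extract
    df.columns = [c.strip().lower() for c in df.columns]   # light transform
    engine = create_engine(args.target_db)
    df.to_sql(args.table, engine, if_exists="replace", index=False)  # load

if __name__ == "__main__":
    main()
```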
Materials for the course "Introduction to Data Engineering": data pipelines.
Simple Airflow on Kubernetes (GKE)
dbt and ClickHouse test project with Dagster.
This is an ETL project: extracting data from an e-commerce transactional database on RDS, transforming it with an AWS Glue job, loading it into a Redshift data warehouse, and connecting it to Tableau for BI.
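A rough sketch of what such a Glue job can look like (it only runs inside the AWS Glue runtime); the database, table, connection, and bucket names are illustrative assumptions, not this project's configuration:

```python
# Read a catalogued RDS table, remap columns, and load into Redshift.
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Source table that a Glue crawler has catalogued from the RDS database.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="ecommerce_raw", table_name="orders"
)

# Light transform: rename and retype columns on the way into the warehouse.
mapped = ApplyMapping.apply(
    frame=orders,
    mappings=[
        ("order_id", "int", "order_id", "int"),
        ("order_total", "double", "revenue", "double"),
        ("created_at", "string", "created_at", "timestamp"),
    ],
)

# Load into Redshift through a pre-configured Glue connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=mapped,
    catalog_connection="redshift-connection",
    connection_options={"dbtable": "analytics.orders", "database": "dev"},
    redshift_tmp_dir="s3://example-glue-temp/",
)
```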
An ETL data pipeline that extracts data from a source and loads it into a destination, automated with mage.ai.
A project that demonstrates building a data pipeline by scraping data with the Twitter API and creating a Kinesis Firehose delivery stream to ingest it into Amazon S3.
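A minimal sketch of the ingestion side with boto3; the delivery stream name and the fetch_tweets() helper standing in for the Twitter API call are hypothetical:

```python
# Push JSON records into a Kinesis Firehose delivery stream that lands in S3.
import json

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

def fetch_tweets():
    # Hypothetical placeholder for the Twitter API call; yields tweet dicts.
    yield {"id": 1, "text": "example tweet", "lang": "en"}

for tweet in fetch_tweets():
    firehose.put_record(
        DeliveryStreamName="twitter-to-s3",  # assumed stream name
        Record={"Data": (json.dumps(tweet) + "\n").encode("utf-8")},
    )
```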
A Dagster tutorial to get you started as an absolute beginner. It covers Dagster installation, assets, jobs, schedules, ops, and more. It is completely free on YouTube and requires no prerequisites.
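For orientation, here is a minimal sketch of the pieces the tutorial names (assets, a job, a schedule); all names in it are illustrative assumptions:

```python
# Two dependent assets, a job that materializes them, and a daily schedule.
from dagster import AssetSelection, Definitions, ScheduleDefinition, asset, define_asset_job

@asset
def raw_numbers():
    # An asset is a named piece of data Dagster knows how to materialize.
    return list(range(10))

@asset
def doubled_numbers(raw_numbers):
    # Downstream asset: depends on raw_numbers via the parameter name.
    return [n * 2 for n in raw_numbers]

daily_job = define_asset_job("daily_asset_job", selection=AssetSelection.all())
daily_schedule = ScheduleDefinition(job=daily_job, cron_schedule="0 6 * * *")

defs = Definitions(
    assets=[raw_numbers, doubled_numbers],
    jobs=[daily_job],
    schedules=[daily_schedule],
)
```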
An automated data pipeline using Apache Airflow that performs ETL on raw data with the Pandas library, stages the data in PostgreSQL, processes it in parallel on a distributed cluster with Spark, and loads the final, useful data into an Elasticsearch NoSQL warehouse.
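A minimal Airflow 2.x-style sketch of that flow; the connection string, file path, and task bodies are illustrative assumptions, with the Spark and Elasticsearch steps left as a placeholder:

```python
# Two-task DAG: pandas cleanup staged to PostgreSQL, then downstream processing.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from sqlalchemy import create_engine

def extract_and_stage():
    # Clean the raw file with pandas and stage it in PostgreSQL (assumed DSN).
    df = pd.read_csv("/data/raw/events.csv")
    df = df.dropna(subset=["user_id"])
    engine = create_engine("postgresql://etl:etl@postgres/warehouse")
    df.to_sql("staging_events", engine, if_exists="replace", index=False)

def process_and_load():
    # Placeholder: Spark would process the staged data in parallel and
    # write the result to Elasticsearch.
    pass

with DAG(
    dag_id="raw_to_elasticsearch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    stage = PythonOperator(task_id="extract_and_stage", python_callable=extract_and_stage)
    load = PythonOperator(task_id="process_and_load", python_callable=process_and_load)
    stage >> load
```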
A data pipeline that pulls public transport data daily from the opentransportdata.swiss portal. The pipeline has three tasks: pull the right data from opentransportdata.swiss, push it to S3 for storage, and transform and load the transformed data into a database. Hopefully this repository helps explain ETL / batch data pipelines.
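A minimal sketch of those three tasks (pull, push to S3, transform and load); the dataset URL, bucket, connection string, and table names are illustrative assumptions:

```python
# Pull a daily file, archive it in S3, then load a cleaned copy into Postgres.
import datetime
import io

import boto3
import pandas as pd
import requests
from sqlalchemy import create_engine

DATASET_URL = "https://opentransportdata.swiss/dataset/example.csv"  # assumed
BUCKET = "transport-raw"                                              # assumed

def pull() -> bytes:
    response = requests.get(DATASET_URL, timeout=60)
    response.raise_for_status()
    return response.content

def push_to_s3(payload: bytes) -> str:
    key = f"raw/{datetime.date.today():%Y-%m-%d}.csv"
    boto3.client("s3").put_object(Bucket=BUCKET, Key=key, Body=payload)
    return key

def transform_and_load(payload: bytes) -> None:
    df = pd.read_csv(io.BytesIO(payload), sep=";")
    df.columns = [c.lower() for c in df.columns]
    engine = create_engine("postgresql://etl:etl@localhost/transport")
    df.to_sql("daily_departures", engine, if_exists="append", index=False)

if __name__ == "__main__":
    data = pull()
    push_to_s3(data)
    transform_and_load(data)
```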