ABC
5,786 posts
ABC
@Ubunta
Data & AI Infrastructure for Healthcare | DhanvantriAI | HotTechStack | ChatWithDatabase 🇩🇪Berlin & 🇮🇳Kolkata
Berlin, Germany
abhishekchoudhary.net
Joined August 2009
3,120
Following
5,002
Followers
  • Pinned
    ABC
    @Ubunta
    Oct 11, 2025
    Using Postgres as a Data Warehouse
    - Start with Postgres 18+: asynchronous I/O makes table scans 2-3x faster than Postgres 15
    - One command runs everything: `docker-compose up`. If partitioning breaks on localhost, it'll break in prod, so test the real structure first
    - Async
    51K
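
The post's advice to test the real partition structure locally can be made concrete with a small generator for Postgres declarative partitioning DDL. This is a minimal sketch, not the author's setup: the table name `events`, the column `event_time`, the column list, and the monthly granularity are all assumptions chosen for illustration.

```python
from datetime import date


def month_starts(first: date, count: int) -> list[date]:
    """Return count + 1 consecutive month-start dates, beginning at `first`."""
    starts = []
    year, month = first.year, first.month
    for _ in range(count + 1):
        starts.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return starts


def partition_ddl(parent: str, column: str, first: date, count: int) -> list[str]:
    """Build DDL for a range-partitioned parent table plus `count` monthly partitions."""
    # Hypothetical column list; replace it with your real schema.
    statements = [
        f"CREATE TABLE {parent} (id bigint, {column} timestamptz NOT NULL, payload jsonb) "
        f"PARTITION BY RANGE ({column});"
    ]
    bounds = month_starts(first, count)
    # Range bounds are inclusive at the lower end and exclusive at the upper end.
    for lower, upper in zip(bounds, bounds[1:]):
        name = f"{parent}_{lower:%Y_%m}"
        statements.append(
            f"CREATE TABLE {name} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{lower}') TO ('{upper}');"
        )
    return statements


if __name__ == "__main__":
    for stmt in partition_ddl("events", "event_time", date(2025, 11, 1), 3):
        print(stmt)
```

Running the generated statements against the local Postgres container is one way to confirm that the same structure will hold up before it reaches production.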
  • ABC
    @Ubunta
    Nov 26, 2022
    Lazydocker: a very useful terminal UI application for managing Docker. It is a brilliant tool for simplifying Docker management.
  • ABC
    @Ubunta
    Mar 4, 2025
    "Hello World" in modern Data Engineering
    - Create a Dockerfile or set up a dev environment with Python, SQLAlchemy, DuckDB, Polars, and Daft installed.
    - Read a CSV/Excel file and convert it to Parquet.
    - Load the Parquet file into DuckDB.
    - Connect to DuckDB using Polars / Daft.
    - Make
    36K
  • ABC
    @Ubunta
    Aug 9, 2025
    Replying to @LundukeJournal
    It's way more polite than many stackoverflow comments
    38K
  • ABC
    @Ubunta
    Nov 29, 2023
    As a Senior Staff Data Engineer, my top five tasks over the past 2 years include: 1. Simplifying Kubernetes for Data Scientists/Engineers: Developed user-friendly libraries and containers, enabling Data Scientists to utilize Kubernetes effortlessly. Achieved a complete
    54K
  • ABC
    @Ubunta
    Sep 23, 2022
    People are debating Snowflake vs Databricks, and I am rebuilding my Data/ML stack on @duckdb, Apache Arrow, @IbisData and @flyteorg
  • ABC
    @Ubunta
    Aug 8, 2024
    DrawDB is an excellent tool for database design and ER modeling. I found it very user-friendly, and it also allows you to upload existing schemas. 📌You can check it out here: (github.com/drawdb-io/draw…). I used the generated SQL for PostgreSQL!
    21K
  • ABC
    @Ubunta
    Sep 24, 2025
    The Current Shift in Data Engineering
    - CSV, Excel, and JSON will outlive most tools: formats persist because they're human-friendly
    - Postgres is still the first "data warehouse" most teams touch before scaling up
    - "Data pipeline" will remain a vague term nobody fully agrees
    17K
  • ABC
    @Ubunta
    Oct 8, 2025
    Building a Data Engineering Pipeline for Production in 2025/2026
    - Local first: docker-compose.yml with Postgres, Redis, DuckDB, Marimo, and Airflow
    - One command runs your entire data stack: `docker-compose up`
    - If it doesn't work on localhost, it won't work in prod
    - Python
    14K
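
One way to act on "if it doesn't work on localhost, it won't work in prod" is a quick smoke check run after `docker-compose up`. The sketch below only tests whether each service accepts TCP connections. The port numbers are the usual defaults for these tools and are assumptions here, not taken from the post; adjust them to match your compose file.

```python
import socket

# Assumed default ports; the post does not give a port mapping.
# DuckDB is omitted because it runs in-process and does not listen on a port.
DEFAULT_PORTS = {"postgres": 5432, "redis": 6379, "airflow": 8080, "marimo": 2718}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_stack(ports: dict[str, int], host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in check_stack(DEFAULT_PORTS).items():
        print(f"{name:10s} {'up' if up else 'DOWN'}")
```

An open port only shows that the container is reachable, so a fuller check would also run a trivial query against each service.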
  • ABC
    @Ubunta
    Aug 20, 2024
    Data Engineering and Machine Learning are currently in one of their most exciting phases:
    - Single-node data stacks, like @DataPolars and Apache Arrow, are now capable of handling 80% of data use cases, even with terabytes of data.
    - @duckdb is rapidly gaining traction, with
    28K
  • ABC
    @Ubunta
    Oct 12, 2023
    Data Engineering offers good pay if you're skilled in several technologies:
    - Streaming engines: Flink & Kafka
    - DWH: Spark, Snowflake, Trino, ClickHouse
    - Distributed DB: HBase, CockroachDB, YugabyteDB
    - Infrastructure: ELK stack, Docker
    + Python & SQL
    "Ability to explain ☝️ these"
    30K
  • ABC
    @Ubunta
    Jan 19, 2023
    Apache Arrow is on fire 🔥🔥🔥 🙏 DataFusion 🔥 @duckdb ⚡ Polars. To me, @ApacheArrow is now the most important component in the data and ML community.
    25K
  • ABC
    @Ubunta
    Oct 19, 2025
    Designing Postgres for Large Data Engineering Workloads (it works)
    - Postgres 18's async I/O made old queries feel new: sequential scans that crawled at 40s now finish in 12s, no tuning required
    - Batch writes are non-negotiable: COPY and execute_values turned 40-second ingests
    15K
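
The two batch-write paths the post names, COPY and `execute_values`, can be sketched with psycopg2 as below. This is an illustrative outline rather than the author's code: the helper names, the table and column arguments, and the page size are assumptions. `psycopg2` is imported lazily so the buffer-building helper can be tried without a database.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_copy_buffer(rows: Iterable[Sequence]) -> io.StringIO:
    """Serialize rows as CSV into an in-memory buffer suitable for COPY ... FROM STDIN."""
    buf = io.StringIO()
    # None is written as an empty field, which COPY in CSV format reads as NULL.
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    return buf


def copy_rows(conn, table: str, columns: Sequence[str], rows: Iterable[Sequence]) -> None:
    """Bulk-load rows in one COPY statement using a psycopg2 connection."""
    sql = f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"
    with conn.cursor() as cur:
        cur.copy_expert(sql, rows_to_copy_buffer(rows))
    conn.commit()


def insert_rows(conn, table: str, columns: Sequence[str], rows: Iterable[Sequence],
                page_size: int = 1000) -> None:
    """Insert rows in multi-row batches with psycopg2.extras.execute_values."""
    from psycopg2.extras import execute_values  # third-party: pip install psycopg2-binary

    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s"
    with conn.cursor() as cur:
        # Sends up to `page_size` rows per INSERT statement instead of one row each.
        execute_values(cur, sql, list(rows), page_size=page_size)
    conn.commit()
```

Both helpers interpolate the table and column names directly into SQL, so they should only be called with trusted identifiers. COPY is usually the faster of the two for large loads, while `execute_values` is convenient when the statement needs `ON CONFLICT` or `RETURNING` clauses.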
  • ABC
    @Ubunta
    Nov 9, 2025
    How to Keep DuckDB in Sync with Postgres: the Easy Local CDC Way
    - You have live data in Postgres and want it in DuckDB for analytics; this should take 10 minutes to set up, not 10 days
    - No need to install Kafka, ZooKeeper, and Debezium like you're building super Data
    16K
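
The post is cut off before it describes its method, so the following is only one plausible lightweight approach: polling with an `updated_at` watermark and upserting the changed rows. It is not log-based CDC and it does not capture deletes. To stay runnable without any services, `sqlite3` in-memory databases stand in for both Postgres (source) and DuckDB (target); the `customers` table and its columns are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in database with a hypothetical customers table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    return con


def sync_changes(source: sqlite3.Connection, target: sqlite3.Connection,
                 last_seen: str) -> str:
    """Copy rows changed after `last_seen` into the target; return the new watermark."""
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Upsert by primary key so updated rows overwrite their earlier copies.
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # ISO-8601 timestamps sort correctly as text, so the last row holds the max.
    return rows[-1][2] if rows else last_seen
```

A scheduler would call `sync_changes` every few seconds or minutes and persist the returned watermark between runs. With the real systems, the source query would run against Postgres and the upsert against DuckDB.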
