risingwavelabs/awesome-stream-processing

🏫 Awesome Stream Processing 🏫

The term "stream processing" might sound intimidating to many people. We often hear statements like:

  • "Stream processing is too difficult to learn and use!" 😱
  • "Stream processing is very expensive!" 😱
  • "I don’t see any business use cases for stream processing!" 😱

However, we believe none of these statements is true. ❌

Streaming data is everywhere, generated from operational databases, messaging queues, IoT devices, and many other sources. People can leverage modern stream processing technology to easily address classic real-world problems, using SQL as the programming language.

In this repository, we provide a series of executable demos that show how stream processing can be applied in practical scenarios:

  1. Getting started ✅

    • Install Kafka, PostgreSQL, and RisingWave, and run minimal toy examples on your device.
    • Integrate RisingWave with other data platforms.
  2. Basic stream processing examples ✅

    Learn the fundamentals of ingesting, processing, transforming, and offloading data from streaming systems.

    1. Querying and processing event streaming data (👈 Kafka users, you may start here! 💡)
      • Directly query data stored in event streaming systems (e.g., Kafka, Redpanda).
      • Continuously ingest and analyze data from event streaming systems.
    2. Bringing analytics closer to operational databases (👈 Postgres users, you may start here! 💡)
      • Offload event-driven queries (e.g., materialized views and triggers) from operational databases (e.g., MySQL, PostgreSQL).
    3. Real-time ETL (Extract, Transform, Load)
      • Perform ETL continuously and incrementally.
  3. Simple demonstrations ✅

    • A collection of simple, self-contained demos showcasing how stream processing can be applied in specific industry use cases.
  4. Solution demonstrations ✅

    • A collection of comprehensive demos showcasing how to build a stream processing pipeline for real-world applications.
  5. RAG & Metrics Comparisons ✅

    • RisingWave RAG Demo
      • Build a Retrieval-Augmented Generation system using RisingWave. The pipeline stores documentation chunks and their embeddings, retrieves the most similar documents for a user query, and calls an LLM to generate grounded answers.
    • Compare Metrics (RisingWave vs. Flink)
      • Run the same workloads on both systems using the same message queues and queries to observe and compare performance metrics side by side.
  6. Agent Demo ✅
    • Use AI agents to analyze data ingested into RisingWave. This client app connects RisingWave’s MCP with Anthropic’s LLM to parse natural-language questions, discover relevant tables/schemas, call data tools, and iteratively return clean results (e.g., formatted tables).
  7. Data Engineering Agent Swarm ✅
    • A multi-agent system for common data engineering tasks with RisingWave and Kafka integration. Includes a planner that delegates to specialized agents for database ops, stream processing, and pipeline orchestration; supports automatic schema inference and an interactive chat loop.
  8. RisingWave + Apache Iceberg: End-to-End Streaming Lakehouse Demos ✅
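
To make the retrieval step of the RAG demo more concrete, here is a minimal, self-contained Python sketch of the idea: score stored chunks against a query embedding, keep the most similar ones, and assemble a prompt for an LLM. This is not the demo's actual code. The function names (`cosine_similarity`, `top_k_chunks`, `build_prompt`), the toy chunks, and the hand-written three-dimensional embeddings are all hypothetical; in the demo, chunks and embeddings are stored in RisingWave and a real LLM is called.

```python
import math

# Hypothetical toy data: (chunk text, embedding). Real embeddings would come
# from an embedding model and be stored alongside the chunks.
DOC_CHUNKS = [
    ("chunk about materialized views", [0.9, 0.1, 0.0]),
    ("chunk about Kafka sources", [0.1, 0.9, 0.0]),
    ("chunk about sinks", [0.0, 0.2, 0.9]),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunks, k=2):
    """Return the texts of the k chunks most similar to the query embedding."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, context_chunks):
    """Assemble a grounded prompt; sending it to an LLM is left out of this sketch."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Assumed query embedding, chosen to sit close to the first toy chunk.
    retrieved = top_k_chunks([1.0, 0.0, 0.0], DOC_CHUNKS, k=2)
    print(build_prompt("How are materialized views kept up to date?", retrieved))
```

The sketch uses a brute-force scan for clarity; how the demo itself performs similarity search is best checked in its own directory.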

We use RisingWave as the default stream processing system to run these demos. We also assume that you have Kafka and/or PostgreSQL installed and know the basics of using them. These demos have been verified on Ubuntu and macOS.

All you need is a laptop 💻; no cluster is required.

Any comments are welcome. Happy streaming!

Join our Slack community to engage in discussions with thousands of stream processing enthusiasts!
