  • Xiaomi
  • Beijing

wuxuedaifu/README.md

Hi, I'm Fu Dai (wuxuedaifu)

Data scientist and AI/ML engineer building production multimodal systems: voice, vision, and LLM-driven applications.

  • 🔭 Currently working on: Face Recognition · Voice STT & TTS · LLM-based news intelligence · Fault diagnosis
  • 🧠 Focus: Large Language Models, multi-agent systems, and multimodal AI (voice + vision), deployed at production scale on Kubernetes
  • 🌱 Exploring: real-time streaming TTS and agentic workflows

🛠️ Tech Stack

  • Languages: Python · Java · SQL · Shell
  • AI/ML: LLMs · multi-agent systems · ASR/TTS · multimodal (voice + vision)
  • Backend: FastAPI · Spring Boot · Prefect
  • Data: ClickHouse · PostgreSQL
  • Infra: Docker · Kubernetes · GitLab CI/CD

📫 Connect

Pinned

  1. vllm-surya-ocr Public

    OpenAI-compatible, vLLM-served OCR API for the Surya-OCR-2 model — multilingual document OCR (layout + text recognition) with request batching, a local CLI, and Docker packaging.

    Python 21

  2. xttsv2-vllm-streaming-server Public

    Coqui XTTS deployment on vLLM with real-time streaming; time to first byte (TTFB) 0.5 s.

    Python 24 2

  3. vllm-chatterbox-stream Public

    OpenAI-compatible multilingual TTS server — Chatterbox on vLLM with real-time PCM audio streaming, low time-to-first-byte (~0.7 s), voice cloning, and 23 languages.

    Python 24 2

  4. pipecat-plugin-tenvad Public

    pipecat-plugin-local-test

    Python 3

  5. insightface Public

    Forked from deepinsight/insightface

    State-of-the-art 2D and 3D Face Analysis Project

    Python 1

  6. deepfilter-stream Public

    Real-time streaming noise cancellation with DeepFilterNet3 on ONNX Runtime

    Python 1