A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 14,259 1,476 Updated Jan 1, 2026

benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.

Python 10,386 1,812 Updated Dec 1, 2025

bytedance / USO

🔥🔥 Open-sourced unified customization model

Python 1,199 73 Updated Sep 12, 2025

hynek / structlog

Simple, powerful, and fast logging for Python.

Python 4,476 262 Updated Jan 1, 2026

JoeanAmier / KS-Downloader

KuaiShou watermark-free video/image download tool; data collection tool

Python 663 165 Updated Dec 26, 2025

cornerfarmer / ctc_segmentation

Segment a given audio into utterances using a trained end-to-end ASR model.

Python 74 9 Updated Oct 9, 2020

lfos / calcurse

A text-based calendar and scheduling application

C 1,196 114 Updated Sep 24, 2025

pyutils / line_profiler

Line-by-line profiling for Python

Python 3,175 131 Updated Nov 20, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 8,913 985 Updated Dec 13, 2025

google-deepmind / gemma

Gemma open-weight LLM library, from Google DeepMind

Python 3,918 617 Updated Nov 18, 2025

openai / openai-python

The official Python library for the OpenAI API

Python 29,595 4,490 Updated Dec 19, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,964 2,939 Updated Jan 1, 2026

cmatsuoka / figlet

Claudio's FIGlet tree

C 1,561 134 Updated Sep 13, 2023

epidemian / snake

A silly snake game on the browser URL

JavaScript 1,364 109 Updated Sep 30, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,861 304 Updated Jun 12, 2025