dwillis

Organizations

@unitedstates

Starred repositories

A feature-rich Python text case conversion library

Python 159 Updated Apr 4, 2025

CLI that queries multiple language models in parallel using prompts from a CSV file

Python 22 Updated Mar 28, 2025

Official Implementation of "KBLaM: Knowledge Base augmented Language Model"

Jupyter Notebook 1,144 89 Updated Mar 31, 2025

Cutting-edge web scraping techniques workshop at NICAR 2025

329 28 Updated Mar 10, 2025

Vision infrastructure to turn complex documents into RAG/LLM-ready data

Rust 2,093 117 Updated Apr 4, 2025

Export any Kindle book you own as text, PDF, EPUB, or as a custom, AI-narrated audiobook. 🔥

TypeScript 79 8 Updated Oct 12, 2024

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 7,187 863 Updated Mar 31, 2025

Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-level concepts to analyze unstructured text.

Python 95 18 Updated Feb 26, 2025

Greg's

R 13 2 Updated Mar 8, 2025

A pytest plugin for running and analyzing LLM evaluation tests.

Jupyter Notebook 117 3 Updated Feb 5, 2025

A text-to-speech (TTS) and Speech-to-Speech (STS) library built on Apple's MLX framework, providing efficient speech synthesis on Apple Silicon.

Python 408 47 Updated Apr 2, 2025

Python CLI for interacting with Unix tools, built for people who haven't committed man pages to memory

Python 3 Updated Mar 10, 2025

The repository for the NICAR 2024 class, SELECT * FROM interesting

Jupyter Notebook 17 1 Updated Feb 2, 2024
Jupyter Notebook 21 6 Updated Mar 7, 2025

Tip sheet and activities for a hands-on session about using the command line for the 2025 NICAR conference

8 Updated Mar 7, 2025

semantic search for your spreadsheets

Jupyter Notebook 25 1 Updated Apr 3, 2025

Multi-tool for semantic search

Python 2,592 153 Updated Aug 27, 2024

📝 python package to calculate readability statistics of a text object - paragraphs, sentences, articles.

Python 1,257 170 Updated Mar 8, 2025

A repository for collecting several simple datasets that track the impact of the Trump 47 regime

HTML 31 2 Updated Apr 5, 2025
Ruby 1 Updated Feb 15, 2025

Codec is a collaborative tool for managing video evidence.

Svelte 66 9 Updated Apr 16, 2024

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

Python 1,180 115 Updated Apr 2, 2025
Python 1,339 102 Updated Feb 15, 2025

potato: portable text annotation tool

Jupyter Notebook 325 55 Updated Apr 3, 2025

An LLM plugin to efficiently pose questions to LLMs, cache the answers, and quickly retrieve answers to questions that you've already posed.

Python 9 1 Updated Feb 9, 2025

Format and Complete Few-Shot LLM Prompts

R 16 1 Updated Jan 14, 2025

A collection of rosters of forms maintained by policing organizations

Rich Text Format 13 Updated Jan 26, 2025

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

C 4,216 432 Updated Feb 10, 2025

This repository contains code and explanations for how to use large language models and a variety of other natural language processing techniques to analyze congressional hearings.

Python 2 Updated Feb 9, 2025