- Seattle, WA
- kyleclo.com
- @kylelostat
- @kylelo.bsky.social
Stars
Code for collecting, processing, and preparing datasets for the Common Pile
Modeling, training, eval, and inference code for OLMo
A collection of scripts that build docker images for various use-cases.
Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OFLbgUP04nC
Apache PDFBox extension for precisely extracting character/symbol locations and identities from born-digital PDF files.
Replication code for "With Little Power Comes Great Responsibility"
Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)
A large (>5k) collection of search questions asked about Coronavirus
Unsupervised text tokenizer for Neural Network-based text generation.
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Code for Defending Against Neural Fake News, https://rowanzellers.com/grover/
A full spaCy pipeline and models for scientific/biomedical documents.
An Interactive Tool for Scalable and Reproducible Error Analysis.
Debugging, monitoring and visualization for Python Machine Learning and Data Science
Acceptance rates for the major AI conferences
Library to scrape and clean web pages to create massive datasets.





