Hey folks! I'd like to more formally introduce a few of the projects we're working on to improve large-scale bioinformatics analyses (one of which was developed during St. Jude's KIDS24 Biohackathon)!
If you're not aware, my team started a project called St. Jude Rust Labs (https://lnkd.in/eXY_9dJ8), where we're rewriting a foundation for bioinformatics analysis in Rust. This cuts across more than just workflow execution but, recently, we've been primarily focused on improving the experience of working with the Workflow Definition Language (WDL) at scale. To that end, we've released a number of projects, including:
- We've written a complete lexer/parser for the WDL v1.2 language (https://lnkd.in/eGWHSM2g). Anyone can use this foundation to build tools for WDL.
- We've written a VSCode plugin that includes a language server, built on the Language Server Protocol (LSP), for linting and validation directly in your editor (https://lnkd.in/euApG9tF).
- And, announcing today, we've started down the road of writing our own WDL execution engine spread across the Crankshaft (https://lnkd.in/et_RFkA3) and Sprocket (https://lnkd.in/eUTuPE68) projects.
Crankshaft was prototyped during the St. Jude Biohackathon last week. It's a _headless_ workflow execution engine, meaning that, in theory, others could come along and write drivers for Crankshaft built on Nextflow, CWL, Snakemake, etc. As I said earlier, our team is really focused on WDL specifically, so we're going to continue building out a "head unit" for Crankshaft using WDL. That being said, I would love to see other community projects popping up and using the core machinery for other workflow languages.
Thanks to all of the individuals who participated on our Biohackathon team (Kevin Benton, Suchitra Chavan, Braden Everson, Andrew Frantz, Michael Gattas, Peter Huene, and John McGuigan)!