Inspiration

Duchenne Muscular Dystrophy affects 1 in 3,500 male births. When parents hear the diagnosis, they're often told: "Go home and love your child. He's going to stop walking at 10 and he probably won't make it past 18." But exon skipping therapy is changing that. Four FDA-approved drugs can convert severe Duchenne into milder Becker muscular dystrophy by restoring a partially functional protein. The problem is that determining whether a specific patient's mutation is amenable to exon skipping requires specialized frame math, domain impact analysis, and cross-referencing of scattered clinical trial databases.

What it does

Becker is a clinical decision-support tool for exon skipping therapy in rare genetic diseases. A user inputs a patient's gene and mutation (e.g., a DMD deletion of exons 45–50), and the tool immediately detects whether the mutation causes a frameshift and searches for exon skip strategies that restore the reading frame. For each strategy, it predicts the resulting protein length, assesses which critical domains are lost, and scores functionality, clinical severity, and therapeutic feasibility. It then matches the suggested skip to FDA-approved drugs and active clinical trials, and visualizes the exon map and 3D protein structure. It currently supports three diseases: Duchenne/Becker Muscular Dystrophy (DMD), Limb-Girdle Muscular Dystrophy 2B (DYSF), and Usher Syndrome Type 2A (USH2A). A built-in validation page checks model outputs against published clinical cases.

How we built it

We built Becker as a Next.js 16 application in TypeScript. The analysis engine runs fully client-side, so no backend or database is required. The core frame math engine calculates coding base pairs from mRNA/CDS coordinates sourced from NCBI RefSeq and Ensembl, then searches for skip strategies that make the total removed coding length divisible by 3, which restores the reading frame. Exon tables are structured JSON files containing phase data, skippability flags, and critical domain annotations from UniProt and the Leiden Open Variation Database. The therapy matching system is a curated database of 9+ drugs and trials. We used shadcn/ui and Tailwind CSS for the interface, and NGL Viewer with Three.js for interactive 3D protein structure rendering using PDB and AlphaFold data. The skip strategy sorting algorithm was validated against FDA-approved therapies and then stress-tested across 78 DMD deletion scenarios using Python scripts to check for overfitting.
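The frame math described above can be illustrated with a minimal sketch. This is not Becker's actual engine: the `Exon` shape, the function names, the `maxSkip` limit, and the exon lengths are all assumptions made up for illustration, and the sketch only considers contiguous exons flanking a deletion.

```typescript
// Hypothetical exon record; the field names are illustrative, not Becker's schema.
interface Exon {
  number: number;   // exon number within the transcript
  codingBp: number; // coding base pairs contributed by this exon (UTRs excluded)
}

// Removing a block of coding sequence preserves the reading frame only when
// the number of removed bases is a multiple of 3.
function isInFrame(removedBp: number): boolean {
  return removedBp % 3 === 0;
}

// Total coding bases in exons numbered from..to (inclusive).
function sumCodingBp(exons: Exon[], from: number, to: number): number {
  return exons
    .filter((e) => e.number >= from && e.number <= to)
    .reduce((total, e) => total + e.codingBp, 0);
}

// For an out-of-frame deletion of exons delStart..delEnd, try skipping
// 1..maxSkip contiguous exons on either flank and keep the candidates
// whose combined removal is a multiple of 3.
function findSkipStrategies(
  exons: Exon[],
  delStart: number,
  delEnd: number,
  maxSkip: number = 2
): number[][] {
  const deletedBp = sumCodingBp(exons, delStart, delEnd);
  if (isInFrame(deletedBp)) return []; // already in frame, nothing to restore

  const numbers = exons.map((e) => e.number);
  const first = Math.min(...numbers);
  const last = Math.max(...numbers);
  const strategies: number[][] = [];

  for (let n = 1; n <= maxSkip; n++) {
    const flanks: Array<[number, number]> = [
      [delStart - n, delStart - 1], // n exons upstream of the deletion
      [delEnd + 1, delEnd + n],     // n exons downstream of the deletion
    ];
    for (const [from, to] of flanks) {
      if (from < first || to > last) continue; // flank runs off the transcript
      if (isInFrame(deletedBp + sumCodingBp(exons, from, to))) {
        const skipped: number[] = [];
        for (let i = from; i <= to; i++) skipped.push(i);
        strategies.push(skipped);
      }
    }
  }
  return strategies;
}

// Made-up exon lengths for a toy six-exon gene (not real DMD data).
const toyExons: Exon[] = [
  { number: 1, codingBp: 90 },
  { number: 2, codingBp: 100 },
  { number: 3, codingBp: 110 },
  { number: 4, codingBp: 62 },
  { number: 5, codingBp: 121 },
  { number: 6, codingBp: 93 },
];

console.log(findSkipStrategies(toyExons, 3, 3));
```

In this toy gene, deleting exon 3 removes 110 bp (not a multiple of 3). Additionally skipping exon 2 brings the total to 210 bp, and skipping exons 1 and 2 brings it to 300 bp, so both candidates are returned; skipping exon 4 (172 bp total) is rejected.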

Challenges we ran into

The biggest challenge was getting the biology right. Early versions showed predicted proteins longer than wild type because exon length data from RefSeq includes untranslated regions (UTRs), while protein lengths from UniProt are coding-only. We had to build a getCodingBp function that dynamically calculates the true coding base pairs for each exon by intersecting mRNA coordinates with CDS boundaries. Another challenge was the skip strategy ranking algorithm. Initially, it picked distant exons with marginally higher protein retention, but clinically, all four FDA-approved drugs target the exon immediately adjacent to the deletion. We had to rethink our sorting to prioritize proximity, then verify across dozens of mutation scenarios that this was a genuine improvement and not overfitting to our validation cases. Handling duplications also required rethinking the entire analysis pipeline: unlike deleted exons, duplicated exons are still present in the mRNA, so the skip strategy is fundamentally different.
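The UTR fix amounts to an interval intersection. The sketch below shows one way a function like getCodingBp could work; only the function name comes from the project, while the signature, the `Span` type, the 1-based inclusive coordinate convention, and the sample coordinates are assumptions for illustration.

```typescript
// Hypothetical span type: 1-based, inclusive transcript (mRNA) coordinates.
interface Span {
  start: number;
  end: number;
}

// Coding base pairs of an exon = length of its overlap with the CDS.
// Exons lying entirely in a UTR have no overlap and contribute 0.
function getCodingBp(exon: Span, cds: Span): number {
  const overlapStart = Math.max(exon.start, cds.start);
  const overlapEnd = Math.min(exon.end, cds.end);
  return Math.max(0, overlapEnd - overlapStart + 1);
}

// Made-up coordinates: the CDS starts partway through the first exon
// and ends partway through the third.
const cds: Span = { start: 245, end: 5000 };

console.log(getCodingBp({ start: 1, end: 300 }, cds));     // partly 5' UTR
console.log(getCodingBp({ start: 301, end: 450 }, cds));   // fully coding
console.log(getCodingBp({ start: 4900, end: 5600 }, cds)); // partly 3' UTR
console.log(getCodingBp({ start: 5601, end: 6000 }, cds)); // entirely 3' UTR
```

With these sample spans, the first exon is 300 bp long on the mRNA but contributes only 56 coding bp, which is the kind of discrepancy that made early predicted proteins come out longer than wild type.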

What we learned

We learned that the gap between raw genomic data and clinical decision-making is wider than we expected. Data such as exon lengths, phases, and protein domain boundaries are spread across multiple databases (RefSeq, Ensembl, UniProt, Leiden) that don't always agree. We also learned that "best" in computational biology doesn't always mean highest score: clinical feasibility, ASO delivery practicality, and proximity to the mutation breakpoint matter more than a 1–2% difference in predicted protein retention. Perhaps most importantly, we learned how much rare disease families depend on therapies like exon skipping, and how meaningful it is to build tools that could help accelerate access to them.
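One way to encode "proximity before score" is a lexicographic sort. The sketch below is an assumption about how such a ranking could look, not Becker's actual algorithm; the `SkipStrategy` fields, the tie-break order, and the sample values are hypothetical.

```typescript
// Hypothetical strategy record; field names are illustrative.
interface SkipStrategy {
  exons: number[];            // exons to skip
  distanceToDeletion: number; // 0 = immediately adjacent to the deleted exons
  proteinRetention: number;   // predicted fraction of wild-type length, 0..1
}

// Rank by proximity first, then by fewer skipped exons, and only then
// by predicted protein retention (higher is better).
function rankStrategies(strategies: SkipStrategy[]): SkipStrategy[] {
  return strategies.slice().sort(
    (a, b) =>
      a.distanceToDeletion - b.distanceToDeletion ||
      a.exons.length - b.exons.length ||
      b.proteinRetention - a.proteinRetention
  );
}

// Made-up candidates: the distant option retains marginally more protein.
const candidates: SkipStrategy[] = [
  { exons: [7], distanceToDeletion: 3, proteinRetention: 0.95 },
  { exons: [4], distanceToDeletion: 0, proteinRetention: 0.94 },
];

console.log(rankStrategies(candidates)[0].exons);
```

Here the adjacent exon wins despite its slightly lower predicted retention. A strict lexicographic order is used instead of a "within 2%" tolerance because a tolerance-based comparator is not transitive and can make the sort order depend on input order.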

What's next for Becker

Point mutation support: Currently Becker handles whole-exon deletions and duplications. Supporting intra-exonic mutations (nonsense, splice-site, small indels) would cover the remaining ~30% of DMD patients.

More genes: The modular architecture is ready for expansion. Spinal Muscular Atrophy (SMN2), Epidermolysis Bullosa (COL7A1), and other exon-skipping-amenable diseases are natural next targets.

Live trial integration: Connecting to ClinicalTrials.gov and FDA APIs for real-time therapy matching instead of a static curated database.

Clinician-facing reports: Generating downloadable PDF reports summarizing the analysis, suitable for inclusion in a patient's medical record or genetic counseling session.

Patient registry integration: Connecting to rare disease registries to help match patients with eligible clinical trials based on their specific mutation.
