Ph.D. student @ Dartmouth College
Formal Methods, Benchmark
- Dartmouth College
- Hanover
- www.shenghz.org
Highlights
- Pro
Pinned
- github/codeql (Public)
  CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
- harbor-framework/harbor (Public)
  Framework for evaluating and improving agents
- benchflow-ai/skillsbench (Public)
  SkillsBench evaluates how well skills work and how effective agents are at using them.
- harbor-framework/terminal-bench-3 (Public)
  Measuring agents' ability to get work done on a computer
- benchflow-ai/benchflow (Public)
  Research infra for creating RL environments, post-training, and evals

