liushz

@shanghai AI Lab / FuDan NLP

15 followers · 10 following

Shanghai

Pinned

open-compass/opencompass Public

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python · 6.5k stars · 716 forks

open-compass/MathBench Public

[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset

108 stars · 1 fork

open-compass/GPassK Public

[ACL 2025] Are Your LLMs Capable of Stable Reasoning?

Python · 32 stars · 2 forks

open-compass/CompassVerifier Public

[EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Jupyter Notebook · 61 stars · 2 forks