Research Engineer / Applied Scientist
Integrated M.S./Ph.D. Candidate in Computer Science and Engineering, Seoul National University
Expected Graduation: August 2026
I build multimodal AI systems that reason more reliably over images, video, and language. My work centers on multimodal reasoning, video understanding, and evaluation reliability, and it repeatedly draws on structured intermediate representations, confidence-aware inference, internal knowledge graphs, temporal grounding, and LLM-as-a-Judge protocol design.
- Reliable multimodal reasoning for vision-language systems
- Video understanding and structured video-language pipelines
- Evaluation reliability for LLM-based assessment systems
- C2R: Confidence-guided Refinement Reasoning for Zero-shot Question Answering
  First-author paper at EMNLP 2025. I built a training-free inference pipeline that curates sub-question reasoning traces and selects answers with confidence-guided refinement for zero-shot multimodal QA.
- INQUIRER: Harnessing internal knowledge graphs for video question generation
  Second-author paper in Knowledge-Based Systems 2025. I contributed to a structured video question-generation pipeline that uses internal knowledge graphs to produce more useful supervision for downstream video reasoning tasks.
- The Impact of Likert Scale Design on Judgment Reliability in Korean and English LLM-as-a-Judge
  First-author paper in KIISE Transactions on Computing Practices 2026. I analyzed how score direction, label semantics, and evaluator choice materially change direct-scoring reliability across Korean and English settings.
- Instruction-tuned Self-Questioning Framework for Multimodal Reasoning
  First-author workshop paper. I developed a multimodal reasoning framework that decomposes image question answering into instruction-tuned auxiliary questioning steps.
I also contributed to earlier foundational work on character-centered video story understanding through DramaQA.
- 4 patent families in multimodal reasoning and video-story understanding
- Includes a self-questioning-based visual question answering patent family
- Includes DramaQA-related question answering and character-centered video story understanding patent families with confirmed Korean and international records
- Development of Uncertainty-Aware Agents Learning by Asking Questions (2022-present)
  Student responsible researcher on a long-horizon project about uncertainty-aware agents that improve by asking questions.
- LA4IRA@RO-MAN 2023
  Workshop organizer for research on learning by asking for intelligent robots and agents.
- DramaQA / Video Turing Test activities
  Organizer across workshops, challenges, and competition activities related to video story understanding and benchmark building.
- BioIntelligence Lab, Seoul National University - Integrated M.S./Ph.D. Researcher
  Mar. 2019 - Expected Aug. 2026
  Research on multimodal reasoning, video understanding, and LLM evaluation.
- KT Corporation - Research Intern
  Jul. 2023 - Aug. 2023
  Worked on instruction-tuned self-questioning for multimodal reasoning.
- NAVER Corp - Research Intern
  Jul. 2018 - Aug. 2018
  Worked on hierarchical category classification and large-scale category structure.
- Seoul National University - Integrated M.S. and Ph.D. in Computer Science and Engineering
  Mar. 2019 - Expected Aug. 2026
- Seoul National University - B.S. in Computer Science and Engineering
  Mar. 2015 - Feb. 2019
- Seoul Science High School (SSHS) - High School
  Mar. 2012 - Feb. 2015
- Programming: Python, C++
- Frameworks & Tools: PyTorch, Hugging Face, Git, Linux, LaTeX
- Research Areas: Reliable multimodal reasoning, video understanding, vision-language models, LLM evaluation, temporal grounding
I am particularly interested in Research Engineer and Applied Scientist roles in multimodal AI, video understanding, and evaluation reliability, especially at industry research labs that value both research depth and system-building ability.