Sumin Park

M.S. student at KAIST

I am a Master’s student at the School of Computing, Korea Advanced Institute of Science and Technology (KAIST), advised by Professor Noseong Park. My research focuses on how architectural inductive biases and training dynamics give rise to structured internal representations and functional specialization in large-scale neural networks. More specifically, my current interests include:

Research Interests

  • Mechanistic interpretability of LLMs
  • Efficient sequence modeling with state space models and linear attention
  • Representation learning for input-network functional specialization

Building on my undergraduate background in neuroscience, my longer-term research direction is to introduce brain-inspired inductive biases as a conceptual framework for understanding and designing universal learning principles shared by both brains and machines.

Ongoing Projects

Understanding attention failures through spectral regimes

My current project investigates whether different modes of model failure correspond to distinct spectral regimes of attention, rather than to a single pathology. Using a Gaussian-equivalent null model for attention, we analyze diffuse versus structured failure patterns through random matrix theory (RMT).

Selected Publications

  1. 2026
    In Review
    Q-Delta: Beyond Key–Value Associative State Evolution
    Sumin Park, Seojin Kim, Noseong Park
    Query-aware delta rule for linear attention that uses mixed key–query prediction errors, enabling richer, jointly corrective state evolution dynamics
    Linear Attention SSMs LLMs
  2. 2026
    In Review
    STAR: Rethinking MoE Routing as Structure-Aware Subspace Learning
    Sumin Park, Noseong Park
    Input-aware MoE routing based on incremental subspace learning for evolving input representation
    MoE Representation LLMs
  3. 2026
    AAAI
    How Many Experts Are Enough? Towards Optimal Semantic Specialization for Mixture-of-Experts
    Sumin Park, Noseong Park
    Adaptive MoE expansion mechanism based on gradient-guided semantic drift signals
    MoE
  4. 2024
    ICML
    PANDA: Expanded Width-Aware Message Passing Beyond Rewiring
    Jeongwhan Choi, Sumin Park, Hyowon Wi, Sung-Bae Cho, Noseong Park
    Expanded width-aware message passing for GNNs to address the over-squashing problem
    GNNs