Publications

Zejia Lin, Hongxin Xu, Guanyi Chen, Zhiguang Chen, Yutong Lu, and Xianwei Zhang.
Bullet: Boosting GPU Utilization for LLM Serving via Dynamic Spatial-Temporal Orchestration (CCF-A)
The 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2026. [PDF][Code].
Wenxuan Pan, Zejia Lin, Jiangsu Du, and Xianwei Zhang.
HuntKTm: Hybrid Scheduling and Automatic Management for Efficient Kernel Execution on Modern GPUs (CCF-A)
ACM Transactions on Architecture and Code Optimization (TACO), Volume 22, Issue 4, Article 161. [PDF][Code].
Kan Wu, Zejia Lin, Mengyue Xi, Zhongchun Zheng, Wenxuan Pan, Xianwei Zhang, and Yutong Lu.
GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving (CCF-A)
The 62nd ACM/IEEE Design Automation Conference (DAC), 2025. [PDF][Slides][Code].
Zejia Lin, Aoyuan Sun, Xianwei Zhang, and Yutong Lu.
MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores (CCF-B)
The 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2024. [PDF][Slides].
Zejia Lin^#, Zewei Mo^#, Xuanteng Huang, Xianwei Zhang, and Yutong Lu.
KeSCo: Compiler-based Kernel Scheduling for Multi-task GPU Applications (CCF-B)
The IEEE 41st International Conference on Computer Design (ICCD), 2023. [PDF][Slides].