About me

Assistant professor, PhD advisor
The School of Computer Science,
Peking University,
Beijing, China

I work on the research and design of storage systems and specialized processors. From the perspective of computer architecture, my research addresses the demand for high-performance storage systems in the era of big data and artificial intelligence. I am dedicated to breaking through the data migration bottleneck and the memory wall of the von Neumann architecture.


Important:

I am actively seeking talented and self-motivated students. There are two openings per year for prospective PhD students and multiple positions for interns. You are always welcome to contact me by email.


News:

  • February 2025: Invited to serve as PC of MICRO.
  • December 2024: One paper is accepted to EuroSys’25.
  • November 2024: One paper is accepted to NSDI’25.
  • November 2024: Three papers are accepted to HPCA’25. Congratulations to Xiurui and Endian!
  • October 2024: Invited to serve as PC of USENIX ATC, ISCA and ISPASS.
  • August 2024: BIZA is accepted to SOSP’24. Congratulations to Shushu, Shaocong and Li Peng!
  • July 2024: Two papers are accepted to MICRO’24.
  • July 2024: Two papers are accepted to TC and ToS.
  • May 2024: Invited to serve as PC of HPCA.
  • May 2024: Two papers are accepted to USENIX ATC’24. Congratulations to Shushu Yi, Li Peng, Xiurui and Yuda!
  • April 2024: One paper is accepted to TACO.
  • April 2024: Flagger is accepted to ISCA’24. Congratulations to Xiurui and Yuda!
  • March 2024: Invited to serve as ERC of MICRO.
  • November 2023: One paper is accepted to ASPLOS’24.
  • October 2023: Invited to serve as ERC of ISCA.
  • October 2023: Four papers are accepted to HPCA’24. Congratulations to Yuda and Yuyue!
  • August 2023: Awarded Intel Young Faculty Researcher Program.
  • July 2023: Invited to serve as TPC of HPCA.
  • April 2023: Awarded 1st prize in national storage technology competition. Congrats to Shushu Yi!
  • January 2023: One paper is accepted to NVMW.
  • January 2023: One paper is accepted to CAL.
  • December 2022: One paper is accepted to SAC.
  • October 2022: Invited to serve as TPC of USENIX ATC and SAC.
  • September 2022: Awarded ACM SIGCSE Rising Star!
  • July 2022: One paper is accepted to THPC.
  • May 2022: Our paper "ScalaRAID" is accepted to HotStorage'22. Congrats to Shushu Yi!
  • April 2022: Two papers are accepted to NVMW'22.
  • December 2021: Awarded NSFC Excellent Young Scientists Fund Overseas Program (国家自然科学基金优秀青年科学基金海外项目)!
  • Sep 2021: Our work "HAMS" is selected as [KAIST breakthroughs 50 years](http://breakthroughs.kaist.ac.kr/?post_no=1954).
  • August 2021: Our work "OhmGPU" is reported by [Naver headline + 26](https://search.naver.com/search.naver?where=news&sm=tab_tnw&query=ohm%20GPU&sort=0&photo=0&field=0&pd=0&ds=&de=&mynews=0&office_type=0&office_section_code=0&news_office_checked=&related=1&docid=2770004947353&nso=so:r,p:all,a:all) and [Press](chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/viewer.html?pdfurl=http%3A%2F%2Fcamelab.org%2Fuploads%2FMain%2Fohgpu.pdf&clen=447529&chunk=true).
  • July 2021: One paper is accepted to MICRO'21.
  • April 2021: Three papers are accepted to NVMW'21.
  • March 2021: Our work "HAMS" is reported by [Naver headline + 39](https://search.naver.com/search.naver?where=news&sm=tab_tnw&query=%ED%85%8C%EB%9D%BC%EB%B0%94%EC%9D%B4%ED%8A%B8&sort=0&photo=0&field=0&pd=0&ds=&de=&mynews=0&office_type=0&office_section_code=0&news_office_checked=&related=1&docid=50360000134412&nso=so:r,p:all,a:all), [KBS](https://news.kbs.co.kr/news/view.do?ncd=5140379&ref=A) and [Press](chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/viewer.html?pdfurl=http%3A%2F%2Fcamelab.org%2Fuploads%2FMain%2Fterabyte.pdf&clen=866764&chunk=true).
  • Feb 2021: One paper is accepted to ISCA'21.
  • June 2020: One paper is accepted to ISCA'20.
  • Feb 2020: One paper is accepted to HPCA'20.
  • Feb 2020: One paper is accepted to FAST'20.
  • Feb 2020: Joined KAIST as a postdoctoral researcher.
  • Dec 2019: Successfully defended PhD thesis.

Selected Publications:

  • (EuroSys’25) Daredevil: Rescue Your Flash Storage from Inflexible Kernel Storage Stack
  • (NSDI’25) Beehive: A Scalable Disaggregated Memory Runtime Exploiting Asynchrony of Multithreaded Programs
  • (HPCA’25) InstAttention: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference
  • (HPCA’25) Criticality-Aware Instruction-Centric Bandwidth Partitioning for Data Center Applications
  • (HPCA’25) NeuVSA: A Unified and Efficient Accelerator for Neural Vector Search
  • (SOSP’24) BIZA: Design of Self-Governing Block-Interface ZNS AFA for Endurance and Performance
  • (MICRO’24) FlashLLM: A Chiplet-Based In-Flash Computing Architecture to Enable On-Device Inference of 70B LLM
  • (MICRO’24) NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering
  • (USENIX ATC’24) ScalaCache: Scalable User-Space Page Cache Management with Software-Hardware Coordination
  • (USENIX ATC’24) ScalaAFA: Constructing User-Space All-Flash Array Engine with Holistic Designs
  • (ISCA’24) Flagger: Cooperative Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation
  • (ASPLOS’24) Achieving Near-Zero Read Retry for 3D NAND Flash Memory
  • (HPCA’24) BeaconGNN: Large-Scale GNN Acceleration with Asynchronous In-Storage Computing
  • (HPCA’24) StreamPIM: Streaming Matrix Computation in Racetrack Memory
  • (HPCA’24) LearnedFTL: A Learning-based Page-level FTL for Reducing Double Reads in Flash-based SSDs
  • (HPCA’24) Midas Touch: Invalid-Data Assisted Reliability and Performance Boost for 3D High-Density Flash
  • (MICRO’21) Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors
  • (ISCA’21) Revamping Storage Class Memory With Hardware Automated Memory-Over-Storage Solution
  • (ISCA’20) ZnG: Architecting GPU Multi-Processors with New Flash for Scalable Data Analysis
  • (USENIX FAST’20) Scalable Parallel Flash Firmware for Many-core Architectures
  • (HPCA’20) DRAM-less: Hardware Acceleration of Data Processing with New Memory
  • (DAC’19) FlashGPU: Placing New Flash Next to GPU Cores
  • (HPCA’19) FUSE: Fusing STT-MRAM into GPUs to Alleviate Off-Chip Memory Access Overheads
  • (OSDI’18) FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs
  • (MICRO’18) Amber: Enabling Precise Full-System Simulation with Detailed Modeling of All SSD Resources
  • (EuroSys’18) FlashAbacus: A Self-governing Flash-based Accelerator for Low-power Systems
  • (HPCA’16) DUANG: Fast and Lightweight Page Migration in Asymmetric Memory Systems
  • (PACT’15) NVMMU: A Non-Volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures
  • (HotStorage’14) Power, Energy and Thermal Considerations in SSD-Based I/O Acceleration