Day 1: Monday, April 4, 2022

08:45 — 09:00 EDT
21:45 — 22:00 KST
09:00 — 10:00 EDT
22:00 — 23:00 KST
Keynote I (PPoPP)
Many Real-World Challenges for Effective Programming of Heterogeneous Systems
James Reinders (Intel Corporation)
10:00 — 10:20 EDT
23:00 — 23:20 KST
10:20 — 11:20 EDT
23:20 — Tue 00:20 KST
Session 1A: Accelerators I
Session 1B: Security I
Session 1C: At Scale
11:20 — 11:40 EDT
Tue 00:20 — 00:40 KST
11:40 — 12:25 EDT
Tue 00:40 — 01:25 KST
Session 2A: Accelerators II
Session 2B: Security II
Session 2C: Quantum I
12:25 — 12:50 EDT
Tue 01:25 — 01:50 KST
12:50 — 13:35 EDT
Tue 01:50 — 02:35 KST
Session 3A: Accelerators III
Session 3B: Security III
Session 3C: Quantum II
13:35 — 13:45 EDT
Tue 02:35 — 02:45 KST
13:45 — 14:45 EDT
Tue 02:45 — 03:45 KST
Business Meeting

Day 2: Tuesday, April 5, 2022

09:00 — 10:00 EDT
22:00 — 23:00 KST
Keynote II (CGO)
Compiler 2.0
Saman Amarasinghe (Massachusetts Institute of Technology)
10:00 — 10:20 EDT
23:00 — 23:20 KST
10:20 — 11:20 EDT
23:20 — Wed 00:20 KST
Session 4A: Accelerators IV
Session 4B: Storage, Scheduling, Interfaces
Session 4C: Best Paper Candidates
11:20 — 11:40 EDT
Wed 00:20 — 00:40 KST
11:40 — 12:25 EDT
Wed 00:40 — 01:25 KST
Session 5A: Simulation
Session 5B: Cache Hierarchy
Session 5C: Quantum III
12:25 — 12:50 EDT
Wed 01:25 — 01:50 KST
12:50 — 13:35 EDT
Wed 01:50 — 02:35 KST
Session 6A: Synthesis
Session 6B: Traditional Architecture
Session 6C: Best of CAL

Day 3: Wednesday, April 6, 2022

09:00 — 10:00 EDT
22:00 — 23:00 KST
Keynote III (HPCA)
Integration, Specialization and Approximation: the “ISA” of Post-Moore Servers
Babak Falsafi (EcoCloud, EPFL)
10:00 — 10:20 EDT
23:00 — 23:20 KST
10:20 — 11:35 EDT
23:20 — Thu 00:35 KST
Session 7A: Accelerators V
Session 7B: Non-Volatile Memory
Session 7C: Network On Chip
11:35 — 12:00 EDT
Thu 00:35 — 01:00 KST
12:00 — 13:15 EDT
Thu 01:00 — 02:15 KST
Session 8A: Accelerators VI
Session 8B: Memory
Session 8C: Industrial Session
13:15 — 13:35 EDT
Thu 02:15 — 02:35 KST
Awards Ceremony & Closing Remarks

Session Details

Session 1A: Accelerators I
Session Chair: James C. Hoe (Carnegie Mellon University)
Direct Spatial Implementation of Sparse Matrix Multipliers for Reservoir Computing
Matthew Denton, Herman Schmit (Google Brain)
uSystolic: Byte-Crawling Unary Systolic Array
Di Wu, Joshua San Miguel (University of Wisconsin-Madison)
CAMA: Energy and Memory Efficient Automata Processing in Content-Addressable Memories
Yi Huang, Zhiyu Chen, Dai Li, Kaiyuan Yang (Rice University)
CoopMC: Algorithm-Architecture Co-Optimization for Markov Chain Monte Carlo Accelerators
Yuji Chai, Glenn G. Ko, Wei-Te Mark Ting, Luke Bailey (Harvard University / Stochastic); David Brooks, Gu-Yeon Wei (Harvard University)
Session 1B: Security I
Session Chair: Wenjie Xiong (Virginia Tech)
Leaky Frontends: Security Vulnerabilities in Processor Frontends
Shuwen Deng, Bowen Huang, Jakub Szefer (Yale University)
DPrime+DAbort: A High-Precision and Timer-Free Directory-Based Side-Channel Attack in Non-Inclusive Cache Hierarchies using Intel TSX
Sowoong Kim, Myeonggyun Han, Woongki Baek (UNIST)
Abusing Cache Line Dirty States to Leak Information in Commercial Processors
Yujie Cui, Chun Yang, Xu Cheng (Peking University)
unXpec: Breaking Undo-Based Safe Speculation
Mengming Li, Chenlu Miao, Yilong Yang, Kai Bu (Zhejiang University)
Session 1C: At Scale
Session Chair: Arkaprava Basu (Indian Institute of Science)
Cottage: Coordinated Time Budget Assignment for Latency, Quality and Power Optimization in Web Search
Liang Zhou, Laxmi N. Bhuyan, K. K. Ramakrishnan (University of California, Riverside)
Enabling Efficient Large-Scale Deep Learning Training with Cache Coherent Disaggregated Memory Systems
Zixuan Wang (University of California, San Diego); Joonseop Sim, Euicheol Lim (SK Hynix); Jishen Zhao (University of California, San Diego)
Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation
Liu Ke (Meta / Washington University in St. Louis); Udit Gupta (Meta / Harvard University); Mark Hempstead (Tufts University); Carole-Jean Wu, Hsien-Hsin Sean Lee (Meta); Xuan Zhang (Washington University in St. Louis)
ReTail: Opting for Learning Simplicity to Enable QoS-Aware Power Management in the Cloud
Shuang Chen, Angela Jin, Christina Delimitrou, José F. Martínez (Cornell University)
Session 2A: Accelerators II
Session Chair: Ioannis Sourdis (Chalmers University of Technology)
ANNA: Specialized Architecture for Approximate Nearest Neighbor Search
Yejin Lee, Hyunji Choi, Sunhong Min, Hyunseung Lee, Sangwon Beak, Dawoon Jeong, Jae W. Lee, Tae Jun Ham (Seoul National University)
Hardware-Accelerated Hypergraph Processing with Chain-Driven Scheduling
Qinggang Wang, Long Zheng, Jingrui Yuan, Yu Huang, Pengcheng Yao, Chuangyi Gui, Ao Hu, Xiaofei Liao, Hai Jin (Huazhong University of Science and Technology)
ScalaGraph: A Scalable Accelerator for Massively Parallel Graph Processing
Pengcheng Yao, Long Zheng, Yu Huang, Qinggang Wang, Chuangyi Gui, Zhen Zeng, Xiaofei Liao, Hai Jin (Huazhong University of Science and Technology); Jingling Xue (UNSW Sydney)
Session 2B: Security II
Session Chair: Jakub Szefer (Yale University)
Adaptive Security Support for Heterogeneous Memory on GPUs
Shougang Yuan, Amro Awad (North Carolina State University); Ardhi Wiratama Baskara Yudha, Yan Solihin (University of Central Florida); Huiyang Zhou (North Carolina State University)
TNPU: Supporting Trusted Execution with Tree-Less Integrity Protection for Neural Processing Unit
Sunho Lee, Jungwoo Kim, Seonjin Na, Jongse Park, Jaehyuk Huh (KAIST)
SecNDP: Secure Near-Data Processing with Untrusted Memory
Wenjie Xiong (Meta); Liu Ke (Washington University in St. Louis); Dimitrije Jankov (Rice University); Michael Kounavis, Xiaochen Wang, Eric Northup, Jie Amy Yang, Bilge Acun, Carole-Jean Wu, Ping Tak Peter Tang (Meta); G. Edward Suh (Cornell University / Meta); Xuan Zhang (Washington University in St. Louis); Hsien-Hsin S. Lee (Meta)
Session 2C: Quantum I
Session Chair: Devesh Tiwari (Northeastern University)
AFS: Accurate, Fast, and Scalable Error-Decoding for Fault-Tolerant Quantum Computers
Poulami Das (Georgia Institute of Technology); Christopher Pattison (California Institute of Technology); Srilatha Manne, Douglas M. Carmean (Meta); Krysta M. Svore (Microsoft Research); Moinuddin Qureshi (Georgia Institute of Technology); Nicolas Delfosse (Microsoft)
QULATIS: A Quantum Error Correction Methodology Toward Lattice Surgery
Yosuke Ueno (University of Tokyo); Masaaki Kondo (Keio University / RIKEN Center for Computational Science); Masamitsu Tanaka (Nagoya University); Yasunari Suzuki (NTT Computer and Data Science Laboratories); Yutaka Tabuchi (RIKEN Center for Quantum Computing)
VAQEM: A Variational Approach to Quantum Error Mitigation
Gokul Subramanian Ravi, Kaitlin N. Smith (University of Chicago); Pranav Gokhale (; Andrea Mari (Unitary Fund); Nathan Earnest, Ali Javadi-Abhari (IBM); Frederic T. Chong (University of Chicago)
Session 3A: Accelerators III
Session Chair: Tushar Krishna (Georgia Institute of Technology)
DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs
Cheng Tan, Nicolas Bohm Agostini, Tong Geng (Pacific Northwest National Laboratory); Chenhao Xie (University of Houston); Jiajia Li, Ang Li, Kevin Barker, Antonino Tumeo (Pacific Northwest National Laboratory)
Parallel Time Batching: Systolic-Array Acceleration of Sparse Spiking Neural Computation
Jeong-Jun Lee, Wenrui Zhang, Peng Li (University of California, Santa Barbara)
Near-Stream Computing: General and Transparent Near-Cache Acceleration
Zhengrong Wang, Jian Weng, Sihao Liu, Tony Nowatzki (University of California, Los Angeles)
Session 3B: Security III
Session Chair: Huiyang Zhou (North Carolina State University)
HyBP: Hybrid Isolation-Randomization Secure Branch Predictor
Lutan Zhao, Peinan Li, Rui Hou (Institute of Information Engineering, Chinese Academy of Sciences); Michael C. Huang (University of Rochester); Xuehai Qian (University of Southern California); Lixin Zhang (Freelance); Dan Meng (Institute of Information Engineering, Chinese Academy of Sciences)
IR-ORAM: Path Access Type Based Memory Intensity Reduction for Path-ORAM
Mehrnoosh Raoufi, Youtao Zhang, Jun Yang (University of Pittsburgh)
SafeGuard: Reducing the Security Risk from Row-Hammer via Low-Cost Integrity Protection
Ali Fakhrzadehgan, Yale N. Patt (University of Texas at Austin); Prashant J. Nair (University of British Columbia); Moinuddin K. Qureshi (Georgia Institute of Technology)
Session 3C: Quantum II
Session Chair: Yipeng Huang (Rutgers University)
Detecting Qubit-Coupling Faults in Ion-Trap Quantum Computers
Andrii Maksymov, Jason Nguyen, Vandiver Chaplin (IonQ, Inc.); Yunseong Nam (IonQ, Inc. / University of Maryland); Igor Markov (IonQ, Inc.)
DigiQ: A Scalable Digital Controller for Quantum Computers using SFQ Logic
Mohammad Reza Jokar, Richard Rines (University of Chicago); Ghasem Pasandi (University of Southern California / NVIDIA); Haolin Cong (University of Southern California); Adam Holmes (University of Chicago / HRL Laboratories); Yunong Shi (University of Chicago / Amazon Braket); Massoud Pedram (University of Southern California); Frederic T. Chong (University of Chicago)
HiPerRF: A Dual-Bit Dense Storage SFQ Register File
Haipeng Zha (University of Southern California); Naveen Kumar Katam (SeeQC, Inc.); Massoud Pedram, Murali Annavaram (University of Southern California)
Session 4A: Accelerators IV
Session Chair: Yang Hu (The University of Texas at Dallas)
ReGNN: A Redundancy-Eliminated Graph Neural Networks Accelerator
Cen Chen, Kenli Li, Yangfan Li, Xiaofeng Zou (Hunan University)
LISA: Graph Neural Network Based Portable Mapping on Spatial Accelerators
Zhaoying Li, Dan Wu, Dhananjaya Wijerathne, Tulika Mitra (National University of Singapore)
GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design
Haoran You (Rice University); Tong Geng (Pacific Northwest National Laboratory); Yongan Zhang (Rice University); Ang Li (Pacific Northwest National Laboratory); Yingyan Lin (Rice University)
Atomic Dataflow Based Graph-Level Workload Orchestration for Scalable DNN Accelerators
Shixuan Zheng, Xianjue Zhang, Leibo Liu, Shaojun Wei, Shouyi Yin (Tsinghua University)
Session 4B: Storage, Scheduling, Interfaces
Session Chair: Sandhya Dwarkadas (University of Rochester)
Filesystem Encryption or Direct-Access for NVM Filesystems? Let’s Have Both!
Kazi Abu Zubair (North Carolina State University); David Mohaisen (University of Central Florida); Amro Awad (North Carolina State University)
Efficient Bad Block Management with Cluster Similarity
Jui-Nan Yen, Yao-Ching Hsieh, Cheng-Yu Chen (National Taiwan University); Tseng-Yi Chen (National Central University); Chia-Lin Yang (National Taiwan University); Hsiang-Yun Cheng (Academia Sinica); Yixin Luo (Carnegie Mellon University)
Using Psychophysics to Guide Power Adaptation for Input Methods on Mobile Architectures
Xueliang Li, Shicong Hong, Junyang Chen (Shenzhen University); Guihai Yan (State Key Laboratory of Computer Architecture, Chinese Academy of Sciences); Kaishun Wu (Shenzhen University)
HD-CPS: Hardware-Assisted Drift-Aware Concurrent Priority Scheduler for Shared Memory Multicores
Mohsin Shan, Omer Khan (University of Connecticut)
Session 4C: Best Paper Candidates
Session Chair: Alaa Alameldeen (Simon Fraser University)
Improving Locality of Irregular Updates with Hardware Assisted Propagation Blocking
Vignesh Balaji (NVIDIA); Brandon Lucia (Carnegie Mellon University)
Effective Mimicry of Belady’s MIN Policy
Ishan Shah (University of Texas at Austin); Akanksha Jain (Google); Calvin Lin (University of Texas at Austin)
S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration
Zhi-Gang Liu (Arm ML Research Lab); Paul Whatmough (Arm ML Research Lab / Harvard University); Yuhao Zhu (University of Rochester); Matthew Mattina (Tenstorrent)
SupermarQ: A Scalable Quantum Benchmark Suite
Teague Tomesh (Princeton University); Pranav Gokhale, Victory Omole (; Gokul Subramanian Ravi, Kaitlin N. Smith, Joshua Viszlai (University of Chicago); Xin-Chuan Wu (Intel); Nikos Hardavellas (Northwestern University); Margaret Martonosi (Princeton University); Frederic T. Chong (University of Chicago)
Session 5A: Simulation
Session Chair: Magnus Själander (Norwegian University of Science and Technology)
LoopPoint: Checkpoint-Driven Sampled Simulation for Multi-Threaded Applications
Alen Sabu (National University of Singapore); Harish Patil, Wim Heirman (Intel Corporation); Trevor E. Carlson (National University of Singapore)
Compiler-Driven Simulation of Reconfigurable Hardware Accelerators
Zhijing Li, Yuwei Ye (Cornell University); Stephen Neuendorffer (Xilinx Inc.); Adrian Sampson (Cornell University)
NeuroSync: A Scalable and Accurate Brain Simulator Using Safe and Efficient Speculation
Hunjun Lee, Chanmyeong Kim, Minseop Kim, Yujin Chung, Jangwoo Kim (Seoul National University)
Session 5B: Cache Hierarchy
Session Chair: Samantika S. Sury (Intel Corporation)
Reducing Load Latency with Cache Level Prediction
Majid Jalili, Mattan Erez (University of Texas at Austin)
TCOR: A Tile Cache with Optimal Replacement
Diya Joseph (Universitat Politècnica de Catalunya); Juan L. Aragón (University of Murcia); Joan-Manuel Parcerisa, Antonio González (Universitat Politècnica de Catalunya)
Only Buffer When You Need To: Reducing On-Chip GPU Traffic with Reconfigurable Local Atomic Buffers
Preyesh Dalmia (University of Wisconsin-Madison); Rohan Mahapatra (University of California, San Diego); Matthew D. Sinclair (University of Wisconsin-Madison / AMD Research)
Session 5C: Quantum III
Session Chair: Koen Bertels (
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Hanrui Wang (Massachusetts Institute of Technology); Yongshan Ding (Yale University); Jiaqi Gu (University of Texas at Austin); Yujun Lin (Massachusetts Institute of Technology); David Z. Pan (University of Texas at Austin); Frederic T. Chong (University of Chicago); Song Han (Massachusetts Institute of Technology)
Not All SWAPs Have the Same Cost: A Case for Optimization-Aware Qubit Routing
Ji Liu, Peiyi Li, Huiyang Zhou (North Carolina State University)
Q-GPU: A Recipe of Optimizations for Quantum Circuit Simulation using GPUs
Yilun Zhao, Yanan Guo, Yuan Yao, Amanda Dumi, Devin Mulvey, Shiv Upadhyay, Youtao Zhang, Kenneth Jordan, Jun Yang, Xulong Tang (University of Pittsburgh)
Session 6A: Synthesis
Session Chair: Magnus Själander (Norwegian University of Science and Technology)
ScaleHLS: A New Scalable High-Level Synthesis Framework on Multi-Level Intermediate Representation
Hanchen Ye (University of Illinois at Urbana-Champaign); Cong Hao (Georgia Institute of Technology); Jianyi Cheng (Imperial College London); Hyunmin Jeong, Jack Huang (University of Illinois at Urbana-Champaign); Stephen Neuendorffer (Xilinx Inc.); Deming Chen (University of Illinois at Urbana-Champaign)
HeteroGen: Automatic Synthesis of Heterogeneous Cache Coherence Protocols
Nicolai Oswald, Vijay Nagarajan (University of Edinburgh); Daniel J. Sorin (Duke University); Vasilis Gavrielatos, Theo Olausson, Reece Carr (University of Edinburgh)
Session 6B: Traditional Architecture
Session Chair: Daniel Wong (University of California, Riverside)
Reliability-Aware Runahead
Ajeya Naithani, Lieven Eeckhout (Ghent University)
Adaptable Register File Organization for Vector Processors
Cristóbal Ramírez Lazo, Enrico Reggiani (Polytechnic University of Catalonia / Barcelona Supercomputing Center); Carlos Rojas Morales, Roger Figueras Bagué (Barcelona Supercomputing Center); Luis Alfonso Villa Vargas, Marco Antonio Ramírez Salinas (Instituto Politécnico Nacional); Mateo Valero Cortés (Polytechnic University of Catalonia / Barcelona Supercomputing Center); Osman Sabri Ünsal (Barcelona Supercomputing Center); Adrián Cristal (Polytechnic University of Catalonia / Barcelona Supercomputing Center)
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization While Ensuring QoS
Han Zhao, Weihao Cui, Quan Chen (Shanghai Jiao Tong University); Youtao Zhang (University of Pittsburgh); Yanchao Lu (NVIDIA); Chao Li, Jingwen Leng, Minyi Guo (Shanghai Jiao Tong University)
Session 6C: Best of CAL
Session Chair: Chia-Lin Yang (National Taiwan University)
A Case for Speculative Strength Reduction
Arthur Perais (CNRS)
Reorder Buffer Contention: A Forward Speculative Interference Attack for Speculation Invariant Instructions
Pavlos Aimoniotis, Christos Sakalis (Uppsala University); Magnus Själander (Norwegian University of Science and Technology); Stefanos Kaxiras (Uppsala University)
Chopping Off the Tail: Bounded Non-Determinism for Real-Time Accelerators
Alexander Rucker (Stanford University); Muhammad Shahbaz (Purdue University); Kunle Olukotun (Stanford University)
Session 7A: Accelerators V
Session Chair: Sai Manoj Pudukotai Dinakarrao (George Mason University)
MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores
Sheng-Chun Kao, Tushar Krishna (Georgia Institute of Technology)
SPACX: Silicon Photonics-Based Scalable Chiplet Accelerator for DNN Inference
Yuan Li, Ahmed Louri (George Washington University); Avinash Karanth (Ohio University)
FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding
Sai Qian Zhang (Harvard University); Bradley McDanel (Franklin & Marshall College); H.T. Kung (Harvard University)
Griffin: Rethinking Sparse Optimization for Deep Learning Architectures
Jong Hoon Shin, Ali Shafiee, Ardavan Pedram, Hamzah Abdel-Aziz, Ling Li, Joseph Hassoun (Samsung Semiconductor Inc.)
CANDLES: Channel-Aware Novel Dataflow-Microarchitecture Co-Design for Low Energy Sparse Neural Network Acceleration
Sumanth Gudaparthi, Sarabjeet Singh, Surya Narayanan, Rajeev Balasubramonian (University of Utah); Visvesh Sathe (University of Washington)
Session 7B: Non-Volatile Memory
Session Chair: Amro Awad (North Carolina State University)
ASAP: A Speculative Approach to Persistence
Sujay Yadalam, Nisarg Shah, Xiangyao Yu, Michael Swift (University of Wisconsin-Madison)
Temporal Exposure Reduction Protection for Persistent Memory
Yuanchao Xu (North Carolina State University); Chencheng Ye (Huazhong University of Science and Technology); Xipeng Shen (North Carolina State University / Meta); Yan Solihin (University of Central Florida)
MULTI-CLOCK: Dynamic Tiering for Hybrid Memory Systems
Adnan Maruf, Ashikee Ghosh, Janki Bhimani (Florida International University); Daniel Campello (Google); Andy Rudoff (Intel Corporation); Raju Rangaswami (Florida International University)
NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories
Lillian Pentecost (Harvard University); Alexander Hankin, Marco Donato, Mark Hempstead (Tufts University); Gu-Yeon Wei, David Brooks (Harvard University)
Session 7C: Network On Chip
Session Chair: Dionisios N. Pnevmatikatos (National Technical University of Athens)
Stay in your Lane: A NoC with Low-Overhead Multi-Packet Bypassing
Hossein Farrokhbakht (University of Toronto); Paul V. Gratz (Texas A&M University); Tushar Krishna (Georgia Institute of Technology); Joshua San Miguel (University of Wisconsin-Madison); Natalie Enright Jerger (University of Toronto)
FastTrackNoC: A NoC with FastTrack Router Datapaths
Ahsen Ejaz, Ioannis Sourdis (Chalmers University of Technology)
Upward Packet Popup for Deadlock Freedom in Modular Chiplet-Based Systems
Yibo Wu (Tsinghua University); Liang Wang (Beihang University); Xiaohang Wang (South China University of Technology); Jie Han (University of Alberta); Jianfeng Zhu, Honglan Jiang, Shouyi Yin, Shaojun Wei, Leibo Liu (Tsinghua University)
Saving PAM4 Bus Energy with SMOREs: Sparse Multi-Level Opportunistic Restricted Encodings
Mike O’Connor, Donghyuk Lee, Niladrish Chatterjee, Michael B. Sullivan, Stephen W. Keckler (NVIDIA)
Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures
Xia Zhao (Artificial Intelligence Research Center); Lieven Eeckhout (Ghent University); Magnus Jahre (Norwegian University of Science and Technology)
Session 8A: Accelerators VI
Session Chair: Dimitrios Skarlatos (Carnegie Mellon University)
Accelerating Graph Convolutional Networks Using Crossbar-based Processing-In-Memory Architectures
Yu Huang, Long Zheng, Pengcheng Yao, Qinggang Wang, Xiaofei Liao, Hai Jin (Huazhong University of Science and Technology); Jingling Xue (UNSW Sydney)
Enabling High-Quality Uncertainty Quantification in a PIM Designed for Bayesian Neural Network
Xingchen Li, Bingzhe Wu, Guangyu Sun, Zhe Zhang, Zhihang Yuan, Runsheng Wang, Ru Huang (Peking University); Dimin Niu, Hongzhong Zheng (Alibaba Group Inc.); Zhichao Lu, Liang Zhao (Hefei Reliance Memory Ltd.); Meng-Fan Chang (National Tsing Hua University); Tianchan Guan (Alibaba Group Inc.); Xin Si (National Tsing Hua University)
RM-SSD: In-Storage Computing for Large-Scale Recommendation Inference
Xuan Sun, Hu Wan, Qiao Li (City University of Hong Kong); Chia-Lin Yang (National Taiwan University); Tei-Wei Kuo, Chun Jason Xue (City University of Hong Kong)
TransPIM: A Memory-Based Acceleration via Software-Hardware Co-Design for Transformer
Minxuan Zhou, Weihong Xu, Jaeyoung Kang, Tajana Rosing (University of California, San Diego)
PIMCloud: QoS-Aware Resource Management of Latency-Critical Applications in Clouds with Processing-in-Memory
Shuang Chen, Yi Jiang, Christina Delimitrou, José F. Martínez (Cornell University)
Session 8B: Memory
Session Chair: Xun Jian (Virginia Tech)
Exploiting Inter-Block Entropy to Enhance the Compressibility of Blocks with Diverse Data
Jinkwon Kim, Mincheol Kang (KAIST); Jeongkyu Hong (Yeungnam University); Soontae Kim (KAIST)
GBDI: Going Beyond Base-Delta-Immediate Compression with Global Bases
Alexandra Angerd, Angelos Arelakis, Vasilis Spiliopoulos (ZeroPoint Technologies); Erik Sintorn (Chalmers University of Technology); Per Stenstrom (ZeroPoint Technologies / Chalmers University of Technology)
Virtual Coset Coding for Encrypted Non-Volatile Memories with Multi-Level Cells
Stephen Longofono (University of Pittsburgh); Mohammad Seyedzadeh (AMD Research); Alex K. Jones (University of Pittsburgh)
DR-STRaNGe: End-to-End System Design for DRAM-Based True Random Number Generators
F. Nisa Bostancı, Ataberk Olgun (TOBB University of Economics and Technology); Lois Orosa, A. Giray Yağlıkçı, Jeremie S. Kim, Hasan Hassan (ETH Zurich); Oğuz Ergin (TOBB University of Economics and Technology); Onur Mutlu (ETH Zurich)
Mithril: Cooperative Row Hammer Protection on Commodity DRAM Leveraging Managed Refresh
Michael Jaemin Kim, Jaehyun Park, Yeonhong Park, Wanju Doh, Namhoon Kim, Tae Jun Ham, Jae W. Lee, Jung Ho Ahn (Seoul National University)
Session 8C: Industrial Session
Session Chair: Dan Lustig (NVIDIA)
DarkGates: A Hybrid Power-Gating Architecture to Mitigate the Performance Impact of Dark-Silicon in High Performance Processors
Jawad Haj-Yahya, Giray Yaglikci, Jisung Park, Jeremie Kim (ETH Zurich); Efraim Rotem (Intel); Yanos Sazeides (University of Cyprus); Onur Mutlu (ETH Zurich)
GPU Subwarp Interleaving
Sana Damani, Mark Stephenson, Ram Rangan, Daniel R. Johnson, Rishkul Kulkarni, Stephen W. Keckler (NVIDIA)
Application Defined On-Chip Networks for Heterogeneous Chiplets: An Implementation Perspective
Tianqi Wang, Fan Feng, Shaolin Xiang, Qi Li, Xia Jing (Huawei)
The Specialized High-Performance Network on Anton 3
Keun Sup Shim, Brian Greskamp, Brian Towles, Bruce Edwards, J.P. Grossman, David E. Shaw (D. E. Shaw Research)
AI-Enabling Workloads on Large-Scale GPU-Accelerated System: Characterization, Opportunities, and Implications
Baolin Li, Rohin Arora (Northeastern University); Siddharth Samsi, William Arcand, David Bestor (Massachusetts Institute of Technology); Devesh Tiwari (Northeastern University); Chansup Byun (Massachusetts Institute of Technology); Tirthak Patel, Rohan Basu Roy (Northeastern University); Vijay Gadepally, Bill Bergeron, John Holodnak, Michael Houle, Matthew Hubbell, Michael Jones, Jeremy Kepner, Anna Klein, Peter Michaleas, Joseph McDonald, Lauren Milechin, Julie Mullen, Andrew Prout, Benjamin Price, Albert Reuther, Antonio Rosa, Matthew Weiss, Charles Yee, Daniel Edelman (Massachusetts Institute of Technology); Allan Vanterpool, Anson Cheng (US Air Force)