Note: This program shows the CGO keynote talk on Monday and the HPCA keynote on Wednesday, a swap from the preliminary program posted earlier, due to travel issues.
To access the final proceedings, please scan the QR code below

Monday, 27 February 2023
07:45 – 08:15 | Coffee/Tea/Juice Social (Food is not provided) | ||
08:30 – 09:30 | CGO Keynote | PyTorch 2.0 — the Journey to Bringing Compiler Technologies to the Core of PyTorch | By Dr. Peng Wu |
09:30 – 10:00 | Coffee Break | ||
10:00 – 12:00 | Session 1A: Neural Networks and Accelerators 1 Session Chair: Xue Lin SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators Mingi Yoo (Yonsei University), Jaeyong Song (Yonsei University), Jounghoo Lee (Yonsei University), Namhyung Kim (Samsung Electronics), Youngsok Kim (Yonsei University), Jinho Lee (Seoul National University) PhotoFourier: A Photonic Joint Transform Correlator-Based Neural Network Accelerator Shurui Li (UCLA), Hangbo Yang (UCLA), Chee Wei Wong (UCLA), Volker Sorger (The George Washington University), Puneet Gupta (UCLA) INCA: Input-stationary Dataflow at Outside-the-box Thinking about Deep Learning Accelerators Bokyung Kim (Duke University), Shiyu Li (Duke University), Hai “Helen” Li (Duke University) GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks Ranggi Hwang (KAIST), Minhoo Kang (KAIST), Jiwon Lee (KAIST), Dongyun Kam (POSTECH), Youngjoo Lee (POSTECH), Minsoo Rhu (KAIST) Logical/Physical Topology-Aware Collective Communication in Deep Learning Training Jo Sanghoon (KAIST), Hyojun Son (KAIST), John Kim (KAIST) Sibia: Signed Bit-slice Architecture for Dense DNN Acceleration with Slice-level Sparsity Exploitation Dongseok Im (KAIST), Gwangtae Park (KAIST), Zhiyong Li (KAIST), Junha Ryu (KAIST), Hoi-Jun Yoo (KAIST) | Session 1B: NVRAM and Hybrid Memory Session Chair: Prashant Nair AstriFlash: A Flash-Based System for Online Services Siddharth Gupta (EPFL), Yunho Oh (Korea University), Lei Yan (EPFL), Mark Sutherland (EPFL), Abhishek Bhattacharjee (Yale University), Babak Falsafi (EPFL), Peter Hsu (Peter Hsu & Associates) Thoth: Bridging the Gap Between Persistently Secure Memories and Memory Interfaces of Emerging NVMs Xijing Han (North Carolina State University), James Tuck (North Carolina State University), Amro Awad (North Carolina State University) Multi-Granularity Shadow Paging with NVM Write Optimization for Crash-Consistent 
Memory-Mapped I/O Hongchao Du (City University of Hong Kong), Qiao Li (Xiamen University), Riwei Pan (City University of Hong Kong), Tei-Wei Kuo (National Taiwan University), Chun Jason Xue (City University of Hong Kong) MGC: Multiple-Gray-Code for 3D NAND Flash based High-Density SSDs Yina Lv (East China Normal University), Liang Shi (East China Normal University), Qiao Li (Xiamen University), Congming Gao (Xiamen University), Yunpeng Song (East China Normal University), Longfei Luo (East China Normal University), Youtao Zhang (University of Pittsburgh) Baryon: Efficient Hybrid Memory Management with Compression and Sub-Blocking Yiwei Li (Tsinghua University), Mingyu Gao (Tsinghua University) Root Crash Consistency of SGX-style Integrity Trees in Secure Non-Volatile Memory Systems Jianming Huang (Huazhong University of Science and Technology), Yu Hua (Huazhong University of Science and Technology) | Session 1C: Caching and Memory Management Session Chair: Minsoo Rhu ACIC: Admission-Controlled Instruction Cache Yunjin Wang (Pennsylvania State University), Chia-Hao Chang (Pennsylvania State University), Anand Sivasubramaniam (Pennsylvania State University), Niranjan K Soundararajan (Intel Labs) Compression-Aware and Performance-Efficient Insertion Policies for Long-Lasting Hybrid LLCs Carlos Escuin (University of Zaragoza), Asif Ali Khan (TU Dresden), Pablo Ibáñez Marín (Universidad de Zaragoza), Teresa Monreal (Universitat Politècnica de Catalunya), Jeronimo Castrillon (Center for Advancing Electronics Dresden, TU Dresden), Víctor Viñals Yúfera (Universidad de Zaragoza) NOMAD: Enabling Non-blocking OS-managed DRAM Cache via Tag-Data Decoupling Youngin Kim (Yonsei University), Hyeonjin Kim (Yonsei University), William Song (Yonsei University) Safety Hints for HTM Capacity Abort Mitigation Anirudh Jain (Georgia Tech), Divya Kiran Kadiyala (Georgia Tech), Alexandros Daglis (Georgia Tech) iCACHE: An Importance-Sampling-Informed Cache for Accelerating I/O-Bound DNN 
Model Training Weijian Chen (Zhejiang University), Shuibing He (Zhejiang University), Yaowen Xu (Zhejiang University), Xuechen Zhang (Washington State University Vancouver), Siling Yang (Zhejiang University), Shuang Hu (Zhejiang University), Sun Xian-He (Illinois Institute of Technology), Gang Chen (Zhejiang University) Are Randomized Caches Truly Random? Formal Analysis of Randomized-Partitioned Caches Anirban Chakraborty (Indian Institute of Technology, Kharagpur), Sarani Bhattacharya (Imec, Belgium), Sayandeep Saha (Nanyang Technological University, Singapore), Debdeep Mukhopadhyay (Indian Institute of Technology, Kharagpur) |
12:00 – 13:30 | Lunch | ||
13:30 – 15:10 | Session 2A: Accelerators Session Chair: Christopher Batten HIRAC: A Hierarchical Accelerator with Sorting-based Packing for SpGEMMs in DNN Applications Hesam Shabani (Lehigh University), Abhishek Singh (Lehigh University), Bishoy Youhana (Lehigh University), Xiaochen Guo (Lehigh University) VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs Geonhwa Jeong (Georgia Tech), Sana Damani (Georgia Tech), Abhimanyu Bambhaniya (Georgia Tech), Eric Qin (Georgia Tech), Christopher J. Hughes (Intel Labs), Sreenivas Subramoney (Intel Labs), Hyesoon Kim (Georgia Tech), Tushar Krishna (Georgia Tech) ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design Haoran You (Georgia Tech), Zhanyi Sun (Rice University), Huihong Shi (Nanjing University), Zhongzhi Yu (Georgia Tech), Yang Zhao (Rice University), Yongan Zhang (Georgia Tech), Chaojian Li (Georgia Tech), Baopu Li (Oracle Health and AI), Yingyan (Celine) Lin (Georgia Tech) Leveraging Domain Information for the Efficient Automated Design of Deep Learning Accelerators Chirag Sakhuja (The University of Texas at Austin), Zhan Shi (The University of Texas at Austin), Calvin Lin (The University of Texas at Austin) DIMM-Link: Enabling Efficient Inter-DIMM Communication for Near-Memory Processing Zhe Zhou (Peking University), Cong Li (Peking University), Fan Yang (Nankai University), Guangyu Sun (Peking University) **HPCA23 Best Paper** | Session 2B: Security Session Chair: Magnus Själander AutoCAT: Reinforcement Learning for Automated Exploration of Cache-Timing Attacks Mulong Luo* (Cornell University), Wenjie Xiong* (Meta AI/Virginia Tech), Geunbae Lee (Virginia Tech), Yueying Li (Cornell University), Xiaomeng Yang (Meta AI), Amy Zhang (Meta AI), Yuandong Tian (Meta AI), Hsien-Hsin S. 
Lee (Intel), Edward Suh (Meta AI/Cornell University) (* indicates equal contributions) SHADOW: Preventing Row Hammer in DRAM with Intra-Subarray Row Shuffling Minbok Wi (Seoul National University), Jaehyun Park (Seoul National University), Seoyoung Ko (Seoul National University), Michael Jaemin Kim (Seoul National University), Nam Sung Kim (UIUC), Eojin Lee (Inha University), Jung Ho Ahn (Seoul National University) Efficient Distributed Secure Memory with Migratable Merkle Tree Erhu Feng (Shanghai Jiao Tong University), Dong Du (Shanghai Jiao Tong University), Yubin Xia (Shanghai Jiao Tong University), Haibo Chen (Shanghai Jiao Tong University) AB-ORAM: Constructing Adjustable Buckets for Space Reduction in Ring ORAM Mehrnoosh Raoufi (University of Pittsburgh), Jun Yang (University of Pittsburgh), Xulong Tang (University of Pittsburgh), Youtao Zhang (University of Pittsburgh) Scalable and Secure Row-Swap: Efficient and Safe Row Hammer Mitigation in Memory Systems Jeonghyun Woo (University of British Columbia), Gururaj Saileshwar (Georgia Institute of Technology), Prashant J. 
Nair (University of British Columbia) **HPCA23 Best Paper** | Session 2C: Applications 1 Session Chair: Tim Rogers Post0-VR: Enabling Universal Realistic Rendering for Modern VR via Exploiting Architectural Similarity and Data Sharing Yu Wen (University of Houston), Chenhao Xie (Beihang University), Shuaiwen Leon Song (University of Sydney), Xin Fu (University of Houston) ParallelNN: A Parallel Octree-based Nearest Neighbor Search Accelerator for 3D Point Clouds Faquan Chen (Shanghai Jiao Tong University), Rendong Ying (Shanghai Jiao Tong University), Jianwei Xue (Shanghai Jiao Tong University), Fei Wen (Shanghai Jiao Tong University), Peilin Liu (Shanghai Jiao Tong University) ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with Linear Taylor Attention Jyotikrishna Dass (Rice University), Shang Wu (Rice University), Huihong Shi (Georgia Institute of Technology), Chaojian Li (Georgia Institute of Technology), Zhifan Ye (Rice University), Zhongfeng Wang (Nanjing University), Yingyan (Celine) Lin (Georgia Institute of Technology) CTA: Hardware-Software Co-design for Compressed Token Attention Mechanism Haoran Wang (Institute of Computing Technology, Chinese Academy of Sciences), Haobo Xu (Institute of Computing Technology, Chinese Academy of Sciences), Ying Wang (Institute of Computing Technology, Chinese Academy of Sciences), Yinhe Han (ICT, Chinese Academy of Sciences) HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers Peiyan Dong (Northeastern University), Mengshu Sun (Northeastern University), Alec Lu (Simon Fraser University), Yanyue Xie (Northeastern University), Kenneth Liu (Simon Fraser University), Zhenglun Kong (Northeastern University), Xin Meng (Peking University), Zhengang Li (Northeastern University), Xue Lin (Northeastern University), Zhenman Fang (Simon Fraser University), Yanzhi Wang (Northeastern University) |
15:10 – 15:40 | Coffee Break | ||
15:40 – 17:00 | Session 3A: Best of CAL Session Chair: Chia-Lin Yang A First-Order Model to Assess Computer Architecture Sustainability Lieven Eeckhout (Ghent University) A Pre-Silicon Approach to Discovering Microarchitectural Vulnerabilities in Security Critical Applications Kristin Barber (The Ohio State University), Moein Ghaniyoun (The Ohio State University), Yinqian Zhang (Southern University of Science and Technology), Radu Teodorescu (The Ohio State University) GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks Sungmin Yun (Seoul National University), Byeongho Kim (Seoul National University), Jaehyun Park (Seoul National University), Hwayong Nam (Seoul National University), Jung Ho Ahn (Seoul National University), Eojin Lee (Inha University) | Session 3B: Datacenters and HPC Session Chair: Osman Unsal Trans-FW: Short Circuiting Page Table Walk in Multi-GPU Systems via Remote Forwarding Bingyao Li (University of Pittsburgh), Jieming Yin (Lehigh University), Anup Holey (NVIDIA), Youtao Zhang (University of Pittsburgh), Jun Yang (University of Pittsburgh), Xulong Tang (University of Pittsburgh) Ah-Q: Quantifying and Handling the Interference within a Datacenter from a System Perspective Yu-Hang Liu (ICT, CAS), Xin Deng (ICT, CAS), Jiapeng Zhou (ICT, CAS), Mingyu Chen (ICT, CAS), Yungang Bao (ICT, CAS) Market Mechanism-Based User-in-the-Loop Scalable Power Oversubscription for HPC Systems Md Rajib Hossen (UT Arlington), Kishwar Ahmed (The University of Toledo), Mohammad A. 
Islam (UT Arlington) RAMBDA: RDMA-driven Acceleration Framework for Memory-intensive us-scale Datacenter Applications Yifan Yuan (UIUC/Intel Labs), Jinghan Huang (UIUC), Yan Sun (UIUC), Tianchen Wang (UIUC), Jacob Nelson (Microsoft Research), Dan Ports (Microsoft Research), Yipeng Wang (Intel Labs), Ren Wang (Intel Labs), Charlie Tai (Intel Labs), Nam Sung Kim (UIUC) | Session 3C: GPUs Session Chair: Daniel Wong FinePack: Transparently Improving the Efficiency of Fine-Grained Transfers in Multi-GPU Systems Harini Muthukrishnan (NVIDIA), Daniel Lustig (NVIDIA), Oreste Villa (NVIDIA), Thomas Wenisch (University of Michigan), David Nellans (NVIDIA) Mitigating GPU Core Partitioning Performance Effects Aaron Barnes (Purdue University), Fangjia Shen (Purdue University), Timothy G. Rogers (Purdue University) Plutus: Bandwidth-Efficient Memory Security for GPUs Rahaf Abdullah (North Carolina State University), Huiyang Zhou (North Carolina State University), Amro Awad (North Carolina State University) MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism Quan Zhou (University of Science and Technology of China), Haiquan Wang (University of Science and Technology of China), Xiaoyan Yu (University of Science and Technology of China), Cheng Li (University of Science and Technology of China), Youhui Bai (University of Science and Technology of China), Feng Yan (University of Houston), Yinlong Xu (University of Science and Technology of China) |
17:00 – 17:15 | Refreshments for Award Ceremony | ||
17:15 – 17:45 | Best Paper Awards and HPCA Test-of-Time Awards | ||
17:45 – 18:45 | Business Meeting |
Tuesday, 28 February 2023
08:00 – 08:30 | Coffee/Tea/Juice Social (Food is not provided) | ||
08:30 – 09:30 | PPoPP Keynote | ||
09:30 – 10:00 | Coffee Break | ||
10:00 – 12:00 | Session 4A: Neural Networks and Accelerators 2 Session Chair: Dimitrios Soudris DeFiNES: Enabling Fast Exploration of the Depth-first Scheduling Space for DNN Accelerators through Analytical Modeling Linyan Mei (KU Leuven), Koen Goetschalckx (KU Leuven), Arne Symons (KU Leuven), Marian Verhelst (KU Leuven) CEGMA: Coordinated Elastic Graph Matching Acceleration for Graph Matching Networks Yue Dai (University of Pittsburgh), Youtao Zhang (University of Pittsburgh), Xulong Tang (University of Pittsburgh) ISOSceles: Accelerating Sparse CNNs through Inter-Layer Pipelining Yifan Yang (MIT), Joel Emer (MIT/NVIDIA), Daniel Sanchez (MIT) OptimStore: In-Storage Optimization of Large Scale DNNs with On-Die Processing Junkyum Kim (SAIT), Myeonggu Kang (KAIST), Yunki Han (KAIST), Yang-gon Kim (KAIST), Lee-sup Kim (KAIST) KRISP: Enabling Kernel-wise Right-sizing for Spatial Partitioned GPU Inference Servers Marcus Chow (University of California, Riverside), Ali Jahanshahi (University of California, Riverside), Daniel Wong (University of California, Riverside) MERCURY: Accelerating DNN Training By Exploiting Input Similarity Vahid Janfaza (Texas A&M University), Kevin Weston (Texas A&M University), Moein Razavi Ghods (Texas A&M University), Shantanu Mandal (Texas A&M University), Farabi Mahmud (Texas A&M University), Alex Hilty (Texas A&M University), Abdullah Muzahid (Texas A&M University) | Session 4B: PIMs and Persistent Memory Session Chair: Josep Torrellas Silo: Speculative Hardware Logging for Atomic Durability in Persistent Memory Ming Zhang (Huazhong University of Science and Technology), Yu Hua (Huazhong University of Science and Technology) Reconciling Selective Logging and Hardware Persistent Memory Transaction Chencheng Ye (Huazhong University of Science and Technology), Yuanchao Xu (North Carolina State University), Xipeng Shen (North Carolina State University), Yan Sha (Huazhong University of Science and Technology), Xiaofei Liao (Huazhong University 
of Science and Technology), Hai Jin (Huazhong University of Science and Technology), Yan Solihin (University of Central Florida) SecPB: Architectures for Secure Non-Volatile Memory with Battery-Backed Persist Buffers Alexander Freij (North Carolina State University), Huiyang Zhou (North Carolina State University), Yan Solihin (University of Central Florida) EVE: Ephemeral Vector Engines Khalid Al-Hawaj (Cornell University), Tuan Ta (Cornell University), Nick Cebry (Cornell University), Shady Agwa (Cornell University), Olalekan Afuye (Cornell University), Eric Hall (Cornell University), Courtney Golden (Cornell University), Alyssa B. Apsel (Cornell University), Christopher Batten (Cornell University) On Consistency for Bulk-Bitwise Processing-in-Memory Ben Perach (Technion – Israel Institute of Technology), Ronny Ronen (Technion – Israel Institute of Technology), Shahar Kvatinsky (Technion – Israel Institute of Technology) Dalorex: A Data-Local Program Execution and Architecture for Memory-bound Applications Marcelo Orenes-Vera (Princeton University), Esin Tureci (Princeton University), David Wentzlaff (Princeton University), Margaret Martonosi (Princeton University) | Session 4C: Quantum and FPGAs Session Chair: Eddy Zhang HyQSAT: A Hybrid Approach for 3-SAT Problems by Integrating Quantum Annealer with CDCL Siwei Tan (Zhejiang University), Mingqian Yu (Zhejiang University), Andre Python (Zhejiang University), Yongheng Shang (Zhejiang University), Tingting Li (Zhejiang University), Liqiang Lu (Zhejiang University), Jianwei Yin (Zhejiang University) Duet: Creating Harmony between Processors and Embedded FPGAs Ang Li (Princeton University), August Ning (Princeton University), David Wentzlaff (Princeton University) Co-Designed Architectures for Modular Superconducting Quantum Computers Evan McKinney (University of Pittsburgh), Mingkang Xia (University of Pittsburgh), Chao Zhou (University of Pittsburgh), Pinlei Lu (University of Pittsburgh), Michael Hatridge 
(University of Pittsburgh), Alex K. Jones (University of Pittsburgh) A Pulse Generation Framework with Augmented Program-aware Basis Gates and Criticality Analysis Yanhao Chen (Rutgers University), Yuwei Jin (Rutgers University), Fei Hua (Rutgers University), Ari Hayes (Rutgers University), Ang Li (Pacific Northwest National Laboratory), Yunong Shi (Amazon Web Services), Eddy Z. Zhang (Rutgers University) The Imitation Game: Leveraging CopyCats for Robust Native Gate Selection in NISQ Programs Poulami Das (Georgia Tech), Eric Kessler (Amazon Web Services), Yunong Shi (Amazon Web Services) |
12:00 – 13:30 | Lunch | ||
13:30 – 15:10 | Session 5A: Cloud and Edge Computing Session Chair: Alex Daglis eNODE: Energy-Efficient and Low-Latency Edge Inference and Training of Neural ODEs Junkang Zhu (University of Michigan, Ann Arbor), Yaoyu Tao (University of Michigan, Ann Arbor), Zhengya Zhang (University of Michigan, Ann Arbor) SpecFaaS: Accelerating Serverless Applications with Speculative Function Execution Jovan Stojkovic (University of Illinois at Urbana-Champaign), Tianyin Xu (University of Illinois at Urbana-Champaign), Hubertus Franke (IBM Research), Josep Torrellas (University of Illinois Urbana-Champaign) MoCA: Memory-Centric, Adaptive Execution for Multi-Tenant Deep Neural Networks Seah Kim (University of California, Berkeley), Hasan Genc (University of California, Berkeley), Vadim Vadimovich Nikiforov (University of California, Berkeley), Krste Asanovic (University of California Berkeley), Borivoje Nikolic (University of California, Berkeley), Yakun Sophia Shao (University of California, Berkeley) Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving Jun Yeol Ryu (Sungkyunkwan University), Jongseok Kim (Sungkyunkwan University), Euiseong Seo (Sungkyunkwan University) Adrias: Interference-Aware Memory Orchestration for Disaggregated Cloud Infrastructures Dimosthenis Masouros (National Technical University of Athens), Christian Pinto (IBM Research Europe), Michele Gazzetti (IBM Research), Sotirios Xydis (Harokopio University of Athens, Greece), Dimitrios Soudris (National Technical University of Athens) | Session 5B: Encryption and SGX Session Chair: John Kim Poseidon: Practical Homomorphic Encryption Accelerator Yinghao Yang (Institute of Computing Technology, Chinese Academy of Sciences), Huaizhi Zhang (Zhejiang Ocean University), Shengyu Fan (Institute of Information Engineering, Chinese Academy of Sciences), Hang Lu (Institute of Computing Technology, Chinese Academy of Sciences), Mingzhe Zhang (Institute of Information 
Engineering, Chinese Academy of Sciences), Xiaowei Li (Institute of Computing Technology, Chinese Academy of Sciences) FAB: An FPGA-based Accelerator for Bootstrappable Fully Homomorphic Encryption Rashmi Agrawal (Boston University), Leo de Castro (MIT CSAIL), Guowei Yang (Boston University), Chiraag Juvekar (Analog Devices), Rabia Yazicigil (Boston University), Anantha Chandrakasan (MIT), Vinod Vaikuntanathan (MIT CSAIL), Ajay Joshi (Boston University) FxHENN: FPGA-based acceleration framework for homomorphic encrypted CNN inference Yilan Zhu (Shandong University), Xinyao Wang (Shandong University), Lei Ju (Shandong University), Shanqing Guo (Shandong University) D-Shield: Enabling Processor-Side Encryption and Integrity Verification for Secure NVMe Drives Md Hafizul Islam Chowdhuryy (University of Central Florida), Myoungsoo Jung (KAIST), Fan Yao (University of Central Florida), Amro Awad (North Carolina State University) TensorFHE: Achieving Practical Computation on Encrypted Data Using GPGPU Shengyu Fan (Institute of Information Engineering, Chinese Academy of Sciences), Zhiwei Wang (Institute of Information Engineering, Chinese Academy of Sciences), Weizhi Xu (Shandong Normal University), Rui Hou (Institute of Information Engineering, Chinese Academy of Sciences), Dan Meng (Institute of Information Engineering, Chinese Academy of Sciences), Mingzhe Zhang (Institute of Information Engineering, Chinese Academy of Sciences) | Session 5C: Reliability Session Chair: Vilas Sridharan AVGI: Microarchitecture-Driven, Fast and Accurate Vulnerability Assessment George Papadimitriou (University of Athens), Dimitris Gizopoulos (University of Athens) Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators Abhishek Tyagi (University of Rochester), Yiming Gan (University of Rochester), Shaoshan Liu (PerceptIn), Bo Yu (PerceptIn), Paul Whatmough (Arm), Yuhao Zhu (University of Rochester) Realizing Extreme Endurance Through Fault-aware Wear 
Leveling and Improved Tolerance Jiangwei Zhang (Tsinghua University), Chong Wang (Tsinghua University), Zhenhua Zhu (Tsinghua University), Donald Kline, Jr (Intel Corporation), Alex K. Jones (University of Pittsburgh), Huazhong Yang (Tsinghua University), Yu Wang (Tsinghua University) ESD: An ECC-assisted and Selective Deduplication for Non-Volatile Main Memory Chunfeng Du (Xiamen University), Suzhen Wu (Xiamen University), Jiapeng Wu (Xiamen University), Bo Mao (Xiamen University), Shengzhe Wang (Xiamen University) |
15:10 – 15:40 | Coffee Break | ||
15:40 – 17:00 | Session 6A: Industry Track Session Session Chair: Ravi Iyer A Systematic Study of DDR4 DRAM Faults in the Field Majed Valad Beigi (AMD), Yi Cao (Google), Sudhanva Gurumurthi (AMD), Charles Recchia (AMD), Andrew Walton (Google), Vilas Sridharan (AMD) High Performance and Power Efficient Accelerator for Cloud Inference Jianguo Yao (SJTU/Enflame-Tech Inc.), Hao Zhou (Enflame-Tech Inc.), Yalin Zhang (Enflame-Tech Inc.), Ying Li (Enflame-Tech Inc.), Chuang Feng (Enflame-Tech Inc.), Shi Chen (Enflame-Tech Inc.), Jiaoyan Chen (Enflame-Tech Inc.), Yongdong Wang (Enflame-Tech Inc.), Qiaojuan Hu (Enflame-Tech Inc.) LightTrader: A Standalone High-Frequency Trading System with Deep Learning Inference Accelerators and Proactive Scheduler Sungyeob Yoo (KAIST), Hyunsung Kim (Rebellions Inc.), Jinseok Kim (Rebellions Inc.), Sunghyun Park (Rebellions Inc.), Joo-Young Kim (KAIST), Jinwook Oh (Rebellions Inc.) BM-Store: A Transparent and High-performance Local Storage Architecture for Bare-metal Clouds Enabling Large-scale Deployment Yiquan Chen (Alibaba Group and Zhejiang University), Jiexiong Xu (Zhejiang University), Chengkun Wei (Zhejiang University), Yijing Wang (Alibaba Group), Xin Yuan (Alibaba Group), Yangming Zhang (Alibaba Group), Xulin Yu (Alibaba Group), Yi Chen (Zhejiang University), Zeke Wang (Zhejiang University), Shuibing He (Zhejiang University), Wenzhi Chen (Zhejiang University) | Session 6B: NICs and Networks Session Chair: Ajay Joshi Turbo: SmartNIC-enabled Dynamic Load Balancing of μs-scale RPCs Hamed Seyedroudbari (Georgia Tech), Srikar Vanavasam (Georgia Tech), Alexandros Daglis (Georgia Tech) A Scalable Methodology for Designing Efficient Interconnection Network of Chiplets Yinxiao Feng (Tsinghua University), Dong Xiang (Tsinghua University), Kaisheng Ma (Tsinghua University) VVQ: Virtualizing Virtual Channel for Cost-Efficient Protocol Deadlock Avoidance Hans Kasan (KAIST), John Kim (KAIST) | |
17:00 – 18:00 | Break | ||
18:00 – 22:00 | Banquet |
Wednesday, 1 March 2023
08:00 – 08:30 | Coffee/Tea/Juice Social (Food is not provided) | ||
08:30 – 09:30 | HPCA Keynote | Addressing Challenges of Core Microarchitecture Research | by Daniel A. Jiménez |
09:30 – 10:00 | Coffee Break | ||
10:00 – 12:00 | Session 7A: Neural Network and Accelerators 3 Session Chair: Abdullah Muzahid Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices Enrico Reggiani (Barcelona Supercomputing Center), Alessandro Pappalardo (AMD AECG Research Labs), Max Doblas Font (Barcelona Supercomputing Center), Miquel Moreto (BSC and UPC), Mauro Olivieri (Sapienza University of Rome), Osman Sabri Unsal (Barcelona Supercomputing Center), Adrian Cristal (Barcelona Supercomputing Center) FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference Rishov Sarkar (Georgia Institute of Technology), Stefan Abi-Karam (Georgia Institute of Technology), Yuqi He (Georgia Institute of Technology), Lakshmi Sathidevi (Georgia Institute of Technology), Cong Hao (Georgia Institute of Technology), Callie Hao (Georgia Institute of Technology) Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion Size Zheng (Peking University), Siyuan Chen (Peking University), Peidi Song (Peking University), Renze Chen (Peking University), Xiuhong Li (Sensetime Research & Shanghai AI Lab), Shengen Yan (SenseTime Research), Dahua Lin (The Chinese University of Hong Kong & Shanghai AI Lab), Jingwen Leng (Shanghai Jiao Tong University), Yun Liang (Peking University) Securator: A Fast and Secure Neural Processing Unit Nivedita Shrivastava (IIT Delhi), Smruti R. 
Sarangi (IIT Delhi) Tensor Movement Orchestration In Multi-GPU Training Systems Shao-Fu Lin (National Taiwan University), Chia-Lin Yang (National Taiwan University), Yi-Jung Chen (National Chi Nan University), Hsiang-Yun Cheng (Academia Sinica) | Session 7B: Microarchitecture and Memory Systems Session Chair: Abhishek Bhattacharjee A Storage-Effective BTB Organization for Servers Truls Asheim (Norwegian University of Science and Technology), Boris Grot (University of Edinburgh), Rakesh Kumar (Norwegian University of Science and Technology (NTNU)) HoPP: Hardware-Software Co-Designed Page Prefetching for Disaggregated Memory Haifeng Li (Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Ke Liu (Institute of Computing Technology, Chinese Academy of Sciences), Ting Liang (Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences), Zuojun Li (Institute of Computing Technology, Chinese Academy of Sciences), Tianyue Lu (Institute of Computing Technology, Chinese Academy of Sciences), Hui Yuan (Huawei), Yinben Xia (Huawei), Yungang Bao (ICT, CAS), Mingyu Chen (Institute of Computing Technology, Chinese Academy of Sciences), Yizhou Shan (Huawei Cloud) Speculative Register Reclamation Sanyam Mehta (HPE) SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPU Jiwon Lee (Yonsei University), Ju Min Lee (Yonsei University), Yunho Oh (Korea University), William Song (Yonsei University), Won Woo Ro (Yonsei University) CARE: A Concurrency-Aware Enhanced Lightweight Cache Management Framework Xiaoyang Lu (Illinois Institute of Technology), Rujia Wang (Illinois Institute of Technology), Xian-He Sun (Illinois Institute of Technology) Memory-Efficient Hashed Page Tables Jovan Stojkovic (University of Illinois at Urbana-Champaign), Namrata Mantri (UIUC), Dimitrios Skarlatos (Carnegie Mellon University), Tianyin Xu (University of Illinois at Urbana-Champaign), Josep 
Torrellas (University of Illinois Urbana-Champaign) | Session 7C: Applications 2 & Potpourri Session Chair: David Kaeli NvWa: Enhancing Sequence Alignment Accelerator Throughput via Hardware Scheduling Yewen Li (Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences), Xueqi Li (Institute of Computing Technology, Chinese Academy of Sciences), Ruihao Gao (Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences), Wanqi Liu (Institute of Computing Technology, Chinese Academy of Sciences / University of Chinese Academy of Sciences), Guangming Tan (Institute of Computing Technology, Chinese Academy of Sciences) Efficient Supernet Training Using Path Parallelism Ying Xu (Institute of Computing Technology, Chinese Academy of Sciences), Long Cheng (School of Control and Computer Engineering, North China Electric Power University), Xuyi Cai (Institute of Computing Technology, Chinese Academy of Sciences), Xiaohan Ma (Institute of Computing Technology, Chinese Academy of Sciences), Weiwei Chen (Institute of Computer Technology, Chinese Academy of Sciences), Lei Zhang (Institute of Computing Technology, Chinese Academy of Sciences), Ying Wang (Institute of Computing Technology, Chinese Academy of Sciences) Phloem: Automatic Acceleration of Irregular Applications with Fine-Grain Pipeline Parallelism Quan Nguyen (MIT), Daniel Sanchez (MIT) CHOPPER: A Compiler Infrastructure for Programmable Bit-serial SIMD Processing Using Memory in DRAM Xiangjun Peng (The Chinese University of Hong Kong), Yaohua Wang (National University of Defense Technology), Ming-Chang Yang (The Chinese University of Hong Kong) VAQUERO: Vector Acceleration for Query Processing Julian Pavon Rivera (Barcelona Supercomputing Center), Ivan Vargas Valdivieso (Barcelona Supercomputing Center), Joan Marimon (Barcelona Supercomputing Center), Roger Figueras Bagué (Barcelona Supercomputing Center), Francesc 
Moll Echeto (Universidad Politecnica de Catalunya), Osman Unsal (Barcelona Supercomputing Center), Mateo Valero (Barcelona Supercomputing Center), Adrian Cristal (Barcelona Supercomputing Center) |
12:00 – 12:20 | Closing |
