Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design: A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach
Authors: Xiaowei Li, Guihai Yan, Cheng Liu | Language: English | Hardback – 2 Mar 2023
With the end of Dennard scaling and Moore’s law, IC chips, especially large-scale ones, now face more reliability challenges, and reliability has become one of the mainstay merits of VLSI designs. In this context, this book presents a built-in on-chip fault-tolerant computing paradigm that seeks to combine fault detection, fault diagnosis, and error recovery in large-scale VLSI design in a unified manner so as to minimize resource overhead and performance penalties. Following this computing paradigm, we propose a holistic solution based on three key components: self-test, self-diagnosis and self-repair, or “3S” for short. We then explore the use of 3S for general IC designs, general-purpose processors, network-on-chip (NoC) and deep learning accelerators, and present prototypes to demonstrate how 3S responds to in-field silicon degradation and recovery under various runtime faults caused by aging, process variations, or radical particles. Moreover, we demonstrate that 3S not only offers a powerful backbone for various on-chip fault-tolerant designs and implementations, but also has farther-reaching implications such as maintaining graceful performance degradation, mitigating the impact of verification blind spots, and improving chip yield.
This book is the outcome of extensive fault-tolerant computing research pursued at the State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences over the past decade. The proposed built-in on-chip fault-tolerant computing paradigm has been verified in a broad range of scenarios, from small processors in satellite computers to large processors in HPCs. Hopefully, it will provide an alternative yet effective solution to the growing reliability challenges for large-scale VLSI designs.

All formats and editions | Price | Express |
---|---|---|
Paperback (1) | 1047.32 lei, 38-44 days | |
Springer Nature Singapore – 3 Mar 2024 | 1047.32 lei, 38-44 days | |
Hardback (1) | 1242.59 lei, 6-8 weeks | |
Springer Nature Singapore – 2 Mar 2023 | 1242.59 lei, 6-8 weeks | |
Price: 1242.59 lei
Old price: 1553.23 lei
-20% New
Express points: 1864
Estimated price in foreign currency:
237.80€ • 250.10$ • 198.08£
Print-on-demand book
Economy delivery: 04-18 January 25
Order line: 021 569.72.76
Specifications
ISBN-13: 9789811985508
ISBN-10: 9811985502
Pages: 304
Illustrations: XVIII, 304 p. 1 illus.
Dimensions: 155 x 235 mm
Weight: 0.63 kg
Edition: 2023
Publisher: Springer Nature Singapore
Series: Springer
Place of publication: Singapore, Singapore
Contents
Chapter 1: Introduction.- Chapter 2: Fault-tolerant general circuits with 3S.- Chapter 3: Fault-tolerant general purposed processors with 3S.- Chapter 4: Fault-tolerant network-on-chip with 3S.- Chapter 5: Fault-tolerant deep learning processors with 3S.- Chapter 6: Conclusion.
Biographical note
Dr. Xiaowei Li is a Professor and Deputy (Executive) Director at the State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. and M.Eng. degrees from Hefei University of Technology in 1985 and 1988, and his Ph.D. from ICT, CAS in 1991. He joined Peking University as a post-doc in 1991. From 1993 to 2000, he was an associate professor with the Department of Computer Science at Peking University. From 1997 to 1999, he was a Visiting Research Fellow at The University of Hong Kong and at Nara Institute of Science and Technology, Japan. His research interests include VLSI testing, fault-tolerant computing, multi-core processor design & verification, and hardware security. He has led more than 20 national research projects and helped to develop many systems and software tools in these areas. He holds more than 90 patents and more than 50 software copyrights. He has co-published over 400 peer-reviewed journal and conference papers. He has received many honors and awards, including the China National Technology Innovation Award (2012) and the China National Science and Technology Progress Award (2015). Dr. Li has served on a number of program committees of IEEE/ACM-sponsored conferences and symposia, including DAC, ICCAD and DATE, and is currently Vice-Chair of TTTC of the IEEE Computer Society. He also serves as an Associate Editor of IEEE TCAD, IEEE TCAS II, and ACM TODAES.
Dr. Guihai Yan is a professor at the State Key Laboratory of Processors (SKLP), Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. degree from Peking University in 2005 and his Ph.D. from ICT, CAS in 2011. His primary research interest is in computer architecture with an emphasis on domain-specific architectures for machine learning and financial computing. He has published more than 40 peer-reviewed papers in leading conference proceedings and journals including ISCA, HPCA, TC and TVLSI. His research work on fault-tolerant VLSI design has been deployed in numerous projects, including 973 high-throughput computing systems and self-repair computer systems.
Dr. Cheng Liu is an associate professor at the State Key Laboratory of Processors (SKLP), Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. and M.Eng. degrees from Harbin Institute of Technology in 2007 and 2009, and his Ph.D. from The University of Hong Kong in 2016. He also worked as a research fellow at National University of Singapore from 2016 to 2018. His research interests include fault-tolerant computing, reconfigurable computing, and customized computing, particularly for deep learning and large graph processing. He has published more than 50 peer-reviewed papers in leading conference proceedings and journals for computer architecture and EDA.
Features
Presents a built-in on-chip fault-tolerant computing paradigm that can be applied to a variety of VLSI designs
Provides a holistic fault-tolerant solution that enables self-test, self-diagnosis and self-repair (3S)
Verified in a broad range of VLSI designs, from processors and NoCs to deep learning accelerators