Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design de Xiaowei Li

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design: A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach

Xiaowei Li

Guihai Yan

Cheng Liu

With the end of Dennard scaling and Moore’s law, IC chips, especially large-scale ones, now face more reliability challenges, and reliability has become one of the mainstay merits of VLSI designs. In this context, this book presents a built-in on-chip fault-tolerant computing paradigm that seeks to combine fault detection, fault diagnosis, and error recovery in large-scale VLSI design in a unified manner so as to minimize resource overhead and performance penalties. Following this computing paradigm, we propose a holistic solution based on three key components: self-test, self-diagnosis and self-repair, or “3S” for short. We then explore the use of 3S for general IC designs, general-purpose processors, network-on-chip (NoC) and deep learning accelerators, and present prototypes to demonstrate how 3S responds to in-field silicon degradation and recovery under various runtime faults caused by aging, process variations, or radical particles. Moreover, we demonstrate that 3S not onlyoffers a powerful backbone for various on-chip fault-tolerant designs and implementations, but also has farther-reaching implications such as maintaining graceful performance degradation, mitigating the impact of verification blind spots, and improving chip yield.

This book is the outcome of extensive fault-tolerant computing research pursued at the State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences over the past decade. The proposed built-in on-chip fault-tolerant computing paradigm has been verified in a broad range of scenarios, from small processors in satellite computers to large processors in HPCs. Hopefully, it will provide an alternative yet effective solution to the growing reliability challenges for large-scale VLSI designs.

Citește tot Restrânge

	Preț	Express
Paperback (1)	1047^.36 lei 38-44 zile
Springer Nature Singapore – 3 mar 2024	1047^.36 lei 38-44 zile
Hardback (1)	1279^.34 lei 43-57 zile
Springer Nature Singapore – 2 mar 2023	1279^.34 lei 43-57 zile

Preț

Express

Paperback (1)

1047^.36 lei 38-44 zile

Springer Nature Singapore – 3 mar 2024

1047^.36 lei 38-44 zile

Hardback (1)

1279^.34 lei 43-57 zile

Springer Nature Singapore – 2 mar 2023

1279^.34 lei 43-57 zile

Notă biografică

Dr. Xiaowei Li is a Professor and Deputy (Executive) Director at State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. degree and M.Eng. degree from Hefei University of Technology in 1985 and 1988, and his Ph.D. from ICT, CAS in 1991. He joined Peking University as a post-doc in 1991. From 1993 to 2000, he was an associate professor with the Department of Computer Science at Peking University. From 1997 to 1999, he was a Visiting Research Fellow at The University of Hong Kong and at Nara Institute of Science and Technology, Japan. His research interests include VLSI testing, fault-tolerant computing, multi-core processor design & verification, and hardware security. He has led more than 20 national research projects and helped to develop many systems and software tools in these areas. He holds more than 90 patents and more than 50 software copyrights. He has co-published over 400 peer-reviewed journal and conference papers. He has received many honors and awards, including China National Technology Innovation Award (2012), and China National Science and Technology Progress Award (2015). Dr. Li served on a number of program committees of IEEE/ACM-sponsored conferences and symposia including DAC, ICCAD and DATE, and is currently Vice-Chair of TTTC of the IEEE Computer Society. He also serves as Associate Editors of IEEE TCAD, IEEE TCAS II, and ACM TODAES.

Dr. Guihai Yan is a professor at the State Key Laboratory of Processors (SKLP), Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. degree from Peking University in 2005 and his Ph.D. from ICT, CAS in 2011, respectively. His primary research interest is in computer architecture with an emphasis on domain-specific architectures for machine learning and financial computing. He has published more than 40 peer-reviewed papers in leading conference proceedings andjournals including ISCA, HPCA, TC and TVLSI. His research work on fault-tolerant VLSI design has been deployed in countless projects, including 973 high-throughput computing systems and self-repair computer systems.

Dr. Cheng Liu is an associate professor at the State Key Laboratory of Processors (SKLP), Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). He received his B.Eng. degree and M.Eng. degree from Harbin Institute of Technology in 2007 and 2009, and his Ph.D. from The University of Hong Kong in 2016. He also worked as a research fellow at National University of Singapore from 2016 to 2018. His research interests include fault-tolerant computing, reconfigurable computing, and customized computing particularly for deep learning and large graph processing. He has published more than 50 peer-reviewed papers in leading conference proceedings and journals for computer architecture and EDA.

Textul de pe ultima copertă

Caracteristici

Presents a built-in on-chip fault-tolerant computing paradigm that can be applied to a variety of VLSI designs Provides a holistic fault-tolerant solution that enables self-test, self-diagnosis and self-repair (3S) Verified in a broad range of VLSI designs, from processors and NoCs to deep learning accelerators

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design: A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach

Toate formatele și edițiile

Preț: 1279^.34 lei

Carte tipărită la comandă

Specificații

Cuprins

Notă biografică

Textul de pe ultima copertă

Caracteristici

Papetărie, jocuri, reviste

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design: A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach

Toate formatele și edițiile

Preț: 1279.34 lei

Specificații

V-ar putea interesa

Cuprins

Notă biografică

Textul de pe ultima copertă

Caracteristici

Preț: 1279^.34 lei