With the growing demand for smaller, less power-hungry electronic devices, low-power chip design has taken on an essential role. Artificial intelligence, increasingly present in embedded systems, is pushing low-power chip designers to adopt more intensive and innovative manufacturing and engineering processes. Meeting the requirements of AI chips, such as functionality, manufacturability, cost, and reliability, calls for appropriate power analysis techniques and tools.

Low-Power Design

The goal of low-power design is to reduce the overall dynamic and static power consumption of an integrated circuit (IC), which is critical to enabling next-generation applications. Dynamic power includes switching power and short-circuit power, while static power consists mainly of leakage power. The power equation, which includes these three contributions, is illustrated in Figure 1.

Figure 1: Power equation and its components (Source: Synopsys)

In the years when IC manufacturing relied on 90-nm to 16-nm technologies, designers focused on reducing leakage power, which carried more weight (85% to 95%) than dynamic power (10% to 15%). With the subsequent transition from 16 nm to 14 nm, the power equation changed: leakage power came well under control, while dynamic power became the more important issue. This was due to the move from the planar transistor to the FinFET architecture, a multigate device built on a substrate in which the gate is placed on two, three, or four sides of the channel or wrapped around it, forming a double-gate or multi-gate three-dimensional structure. Over the next few years, continued advances in fabrication will bring processes at 7, 5, or even 3 nm, once again highlighting the importance of leakage power.

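As a rough numerical illustration of the three contributions named above, the sketch below uses the common textbook form of the CMOS power equation, which may differ in notation from Figure 1. The function name, parameter names, and example values are assumptions chosen for illustration and are not figures from Synopsys.

```python
def cmos_power(alpha, c_load, vdd, freq, t_sc, i_peak, i_leak):
    """Estimate CMOS power in watts using the textbook three-term model.

    alpha   : switching activity factor (0..1), assumed value
    c_load  : total switched capacitance in farads
    vdd     : supply voltage in volts
    freq    : clock frequency in hertz
    t_sc    : short-circuit conduction time per transition in seconds
    i_peak  : peak short-circuit current in amperes
    i_leak  : total leakage current in amperes
    """
    switching = alpha * c_load * vdd ** 2 * freq   # dynamic: charging and discharging loads
    short_circuit = t_sc * vdd * i_peak * freq     # dynamic: momentary supply-to-ground path
    leakage = vdd * i_leak                         # static: current drawn even when idle
    total = switching + short_circuit + leakage
    return switching, short_circuit, leakage, total


# Illustrative values only; they do not describe any real chip.
sw, sc, lk, total = cmos_power(
    alpha=0.2, c_load=1e-9, vdd=1.0, freq=1e9,
    t_sc=1e-11, i_peak=1e-3, i_leak=0.01,
)
print(f"switching={sw:.4g} W, short-circuit={sc:.4g} W, "
      f"leakage={lk:.4g} W, total={total:.4g} W")
```

Because the switching term scales with the square of the supply voltage and linearly with activity, it is the term most sensitive to workload, which is why realistic activity data matters for dynamic power analysis.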
New Challenges for Artificial Intelligence

The widespread use of artificial intelligence in electronic applications presents new kinds of power challenges. Performance, power, and area (PPA) remain the targets that designers must meet. The difference is that with AI chips, it becomes difficult to maximize performance without sacrificing power. Today, performance is already limited by power, and it is very difficult to deliver power reliably to every part of the chip without worrying about heat dissipation and thermal management.

Vector quality, defined as the realistic activity seen when an SoC is operating in a real system, is critical for dynamic power analysis and optimization.

“The biggest problem is estimating the workload, especially when the SoC is working in the field, on a real system,” said Godwin Maben, a low-power architect and fellow at Synopsys Design Group. “We need to know the workload to measure and improve dynamic power. When it comes to AI, there are no pre-set parameters. We need to define these workloads, make sure they are captured, and optimize the power up front.”

Designing for low power means understanding the implications of power across software development, hardware design, and manufacturing. It is not a one-step activity; it must run throughout the entire chip design process, with the goal of reducing overall dynamic and static power consumption. As shown in Figure 2, the design and verification methodology is divided into five main phases, including static power verification, exploration, and dynamic power verification and analysis.

Emulation

Estimating the power consumption of an SoC is a challenging task that requires designers to set up test benches able to reproduce real operating conditions as faithfully as possible. The system best able to meet these requirements is emulation.

Performing a power analysis of an AI chip requires tools capable of acquiring and handling hundreds of gigabytes of data covering billions or trillions of clock cycles. Profiling power within an emulation system helps solve this problem, because it can narrow the power analysis down to just the windows of interest.

“With AI chips, two new concepts have emerged,” Maben said. “The first is that debugging validation errors is challenging, because it takes a long time. The second is how to develop application software that can be ready by the time the chip is finished. This is where the concept of emulation and modeling came into the picture.”

With a unique Fast Emulation architecture, the most advanced commercial FPGAs, and innovations in FPGA-based emulation software, Synopsys’ ZeBu Server is the fastest emulation system in the industry, delivering twice the performance of legacy emulation solutions. ZeBu provides users with valuable tools such as a fast compiler, advanced debugging (including native integration with Verdi), simulation acceleration, hybrid emulation, and power analysis.

When an application runs on an emulator, it is ultimately translated into vectors for the SoC. These vectors can then be used to run a simulation, validating the functionality of the chip in the emulator. Emulation is the appropriate platform for running the workload, as it generates the target vectors used for power analysis and optimization. As shown in Figure 3, the ZeBu EmPower vectors are used by PrimePower RTL to provide useful information to designers.

Figure 3: Software-Driven SoC Activity (Source: Synopsys)

AI chips use many mathematical functions, mainly multiplication and matrix operations, that are implemented in custom, optimized combinational logic.

“The moment we enter these computing-intensive applications, the new concept that designers are concerned about is glitch power at lower geometries,” said Maben.
“Glitch power is more than 25% of total power, and we know that glitch power means wasted power.”

The amount of glitching is proportional to the number of operations the SoC performs, which makes glitches an important issue to address in AI accelerators. There are two types of glitches: inertial glitches and transport glitches. Inertial glitches can be addressed architecturally, while transport glitches are caused by delays through logic cells, which lead to different arrival times at the inputs of logic gates. Glitches have become a major topic, and they are very difficult both to optimize and to measure.

Synopsys offers an end-to-end, RTL-to-gate solution for glitch power analysis. At RTL, PrimePower RTL (see Figure 4) can compute and report glitches for each hierarchy, and it can also point to the RTL source line that generates the most glitches. The PrimePower solution also provides delay- and glitch-aware vector generation using RTL simulation, and it can perform glitch power analysis using either zero-delay gate-level simulation or timing-aware simulation that correlates closely with SPICE power numbers.

“Glitches are becoming dominant, especially in AI chips and at lower geometries,” Maben said. “There are tools like PrimePower RTL, which can tell the designer which blocks are the most glitchy and rank them. Architects can then change the structure to make it less glitchy.”

Figure 4: PrimePower RTL glitch power analysis (Source: Synopsys)
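To make the arrival-time mechanism behind glitches concrete, here is a minimal sketch of an idealized two-input XOR gate whose inputs change at different times within one clock cycle. It is a toy model, not how PrimePower or any Synopsys tool computes glitch power; the function names, the event format, and the time units are all assumptions made for illustration.

```python
from itertools import groupby


def xor_output_changes(a_events, b_events, a0=0, b0=0):
    """Return the (time, value) changes at the output of an ideal XOR gate.

    a_events and b_events are lists of (time, new_value) input changes.
    Events that share a timestamp are applied together before the output
    is re-evaluated, so perfectly aligned inputs cause no spurious toggle.
    """
    state = {"a": a0, "b": b0}
    out = a0 ^ b0
    merged = sorted(
        [(t, "a", v) for t, v in a_events] + [(t, "b", v) for t, v in b_events]
    )
    changes = []
    for t, group in groupby(merged, key=lambda e: e[0]):
        for _, name, value in group:
            state[name] = value
        new_out = state["a"] ^ state["b"]
        if new_out != out:
            changes.append((t, new_out))
            out = new_out
    return changes


def count_glitch_toggles(changes, initial, period, n_cycles):
    """Count output toggles beyond the one needed to reach each cycle's settled value."""
    glitches = 0
    value = initial
    for cycle in range(n_cycles):
        start, end = cycle * period, (cycle + 1) * period
        in_cycle = [v for t, v in changes if start <= t < end]
        settled = in_cycle[-1] if in_cycle else value
        functional = 1 if settled != value else 0  # at most one useful toggle per cycle
        glitches += len(in_cycle) - functional
        value = settled
    return glitches


# Both inputs rise in the same cycle, but input b arrives two time units late.
skewed = xor_output_changes(a_events=[(1, 1)], b_events=[(3, 1)])
aligned = xor_output_changes(a_events=[(1, 1)], b_events=[(1, 1)])
print(count_glitch_toggles(skewed, initial=0, period=10, n_cycles=1))
print(count_glitch_toggles(aligned, initial=0, period=10, n_cycles=1))
```

In the skewed case the output pulses high and returns low even though its settled value never changes, so both toggles consume switching energy without doing useful work. With aligned arrivals the pulse disappears, which is the intuition behind balancing path delays or restructuring logic to reduce glitches.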