System-on-chip (SoC) integration is behind the semiconductor industry’s success in continuing to deliver better, smaller, and faster chips. Various tools are used to design and verify electronic systems. Validation is one of the most important steps, as it demonstrates the functional correctness of the design. Using FPGAs to validate SoC designs is a powerful technique and has become a very important part of semiconductor design.

Conventional methodologies, however, are not sufficient to fully validate the system, and there is a compelling reason to perform dynamic timing analysis. EDA vendors provide basic simulation solutions that adequately support low-density devices. These tools are not powerful enough for the debugging and validation that today’s designers need in order to meet tight schedules and debug large-scale FPGAs effectively.

Some architecture explorers solve validation-related problems by reusing the system model to test timing, power, and functionality under real-world workloads. This approach addresses many of the flaws in existing validation solutions. We will discuss each particular issue and how it has been resolved.

Problems underlying current SoC validation

With the increases in the size and complexity of SoCs, there is a growing need for more effective validation tools. Intense competition shortens the time available to reach the market. This makes it difficult for designers to use the traditional approach of implementing the design in hardware and then testing it.

Functional simulation only tests the functional capabilities of an RTL design. It applies a general set of inputs and determines whether those scenarios work. It fails to provide timing, power consumption, or responses to workloads from the rest of the system. Static analysis fails to find issues that appear only when the design runs dynamically.
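The kind of issue that appears only when a design runs dynamically can be illustrated with a toy model of frames sharing one link in fixed time slots. This is a minimal sketch under stated assumptions, not a model from any tool: the `Frame` fields, the `simulate` helper, the 100-microsecond slot length, and the frame durations are all hypothetical. It shows a frame that starts late in its slot overrunning into the next slot and delaying the frame scheduled there.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    slot: int            # index of the time slot the frame is scheduled in
    ready_us: float      # time at which the frame is ready to transmit
    duration_us: float   # time the frame occupies the shared link


def simulate(frames, slot_us):
    """Send frames over one shared link in ready-time order.

    Returns, per frame, its start and end time, how long it waited for the
    link, and how far it ran past the end of its own slot.
    """
    link_free_us = 0.0
    report = []
    for f in sorted(frames, key=lambda fr: fr.ready_us):
        slot_start = f.slot * slot_us
        slot_end = slot_start + slot_us
        earliest = max(f.ready_us, slot_start)   # when it could start on an idle link
        start = max(earliest, link_free_us)      # when the link is actually free
        end = start + f.duration_us
        link_free_us = end
        report.append({
            "name": f.name,
            "start_us": start,
            "end_us": end,
            "delay_us": start - earliest,
            "overrun_us": max(0.0, end - slot_end),
        })
    return report


# Hypothetical workload: a class A frame starts 10 us before its slot ends.
frames = [
    Frame("class_A", slot=0, ready_us=90.0, duration_us=30.0),
    Frame("control", slot=1, ready_us=100.0, duration_us=10.0),
]
report = simulate(frames, slot_us=100.0)
for entry in report:
    print(entry)
```

Each frame meets its own duration budget in isolation; the delay on the control frame exists only because of when the class A frame happened to start, which is why it takes a run with a concrete workload to expose it.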
Static timing analysis methodology has various drawbacks. It can tell the user whether the design meets setup requirements and whether the applicable timing constraints are met, but in a real system, dynamic factors can cause timing violations on the SoC.

An example is the design of time-sensitive network routers, where care must be taken to determine which priority levels can use which time slots. Care is also needed to ensure that packets of one priority level do not use resources allocated to another time period. For example, control data frames are of priority level 3. In the current time period, class A packets start using resources. While the class A frames are being transferred, the next time period starts, and the packets scheduled for this new time period (the control data frames) must wait for the current transfer to complete. Static analysis tools will never be able to find this problem.

Similarly, when packets are routed through a common crossbar in the network, packets can end up being dropped, so a suitable flow control mechanism should be put in place. Static timing analysis will not be able to find this issue either.

Another verification methodology is in-system testing. If the design runs on the board and passes the test suites, it is considered ready for release. But some issues, such as timing violations, may not appear immediately, and by the time they do, the design is already in the hands of the customer.

Getting Accuracy Using an Architecture Simulator

To describe this new system validation solution, we use a commercial architecture simulator called VisualSim Architect from Mirabilis Design. Architecture simulators have traditionally been used to validate system specifications, trade-offs, and architecture coverage. The new trend is to expand the role of these simulators by integrating them with FPGA boards and emulators. SystemC and Verilog simulators provide a similar approach, but their system models are either very detailed or very abstract.
Neither extreme captures system scenarios accurately at a simulation speed that enables large-scale testing.

In an architecture simulator, the logical function or IP is a block in the graphical architecture model. Most architecture simulators contain a diverse library of components that cover ASIC design blocks implemented in RTL. The environment captures the entire architecture, allowing the user to see what the entire system is doing. The system can be inside a chip, a full box, or a network. The architecture contains the details needed to generate buffer occupancy, timing, and power consumption data. It also provides information about the overall system response after a block in the design is replaced. For example, if the memory block is replaced by an emulator or FPGA, what is the effect on the rest of the system?

Meeting the timing deadline is critical to the success of any design. For example, a block is expected to complete within 20 microseconds but is observed to take 20 milliseconds. As a result, the rest of the system suffers. Such details are captured, and the user gets insight into the timing of each application on the FPGA relative to the rest of the system. Failure to meet deadlines may also cause the design to fail the testing required to qualify it for the customer’s product environment.

The second interesting feature is the reduction in cost and time spent testing each block or IP with a test chip. A test chip may cost $200,000 in NRE, about $200 to $300 for packaging and other support activities, and six to nine months for the test. With VisualSim, the user can load the RTL for the chosen IP block and replace the existing model block with the specified FPGA block. A C++ API communicates with the FPGA block. This lets the user keep the same architecture environment and retain an overview of the activity of a segment or of the entire system, including performance, timing, and response time.
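The idea of swapping one block while keeping the rest of the system model unchanged can be sketched as follows. This is only an illustration under stated assumptions and not the VisualSim API: the article describes a C++ API that talks to the FPGA block, whereas here `fpga_memory_stub` is a hypothetical stand-in that returns a hard-coded latency. The block names, the latency formulas, and the workload are likewise invented; only the 20-microsecond deadline and the 20-millisecond observation come from the example in the text.

```python
from typing import Callable, Dict, List, Tuple

DEADLINE_US = 20.0  # per-block deadline taken from the example in the text

# A block maps a transaction size in bytes to a latency in microseconds.
Block = Callable[[int], float]


def model_memory(n_bytes: int) -> float:
    """Abstract model block: hypothetical fixed cost plus a per-byte cost."""
    return 2.0 + 0.01 * n_bytes


def fpga_memory_stub(n_bytes: int) -> float:
    """Stand-in for the same block running on an FPGA board.

    A real setup would query the board; this stub just reports 20 ms.
    """
    return 20_000.0  # 20 milliseconds expressed in microseconds


def run_system(blocks: Dict[str, Block],
               workload: List[int]) -> Tuple[float, List[Tuple[str, int, float]]]:
    """Pass every transaction through every block.

    Returns the total latency and the list of (block, size, latency)
    entries that exceeded the per-block deadline.
    """
    total_us = 0.0
    misses = []
    for n_bytes in workload:
        for name, block in blocks.items():
            latency = block(n_bytes)
            total_us += latency
            if latency > DEADLINE_US:
                misses.append((name, n_bytes, latency))
    return total_us, misses


# Hypothetical three-block system and workload.
system: Dict[str, Block] = {
    "cpu": lambda n: 5.0,
    "memory": model_memory,
    "dma": lambda n: 3.0,
}
workload = [64, 256, 1024]

base_total, base_misses = run_system(system, workload)

# Replace only the memory block; the rest of the model is untouched.
system["memory"] = fpga_memory_stub
swap_total, swap_misses = run_system(system, workload)

print("model only:", base_total, base_misses)
print("with FPGA stub:", swap_total, swap_misses)
```

Because both variants sit behind the same callable interface, the surrounding system and its workload stay identical, so any change in total latency or in the list of deadline misses can be attributed to the replaced block.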
The IP placed on the FPGA will eventually go into the product. This means the user tests the IP in the context of the real architecture and checks whether the IP will work in the system. Additional information, such as performance or power consumption, is obtained as well. So it is not just a check of the single IP, but a check of the entire SoC with the IP in its block or on the FPGA.

Solutions that solve your dilemmas

Most engineers have good reasons for refusing to do a timing simulation. Some of the main concerns are:

Concern: It is time consuming.

Time is one of the most important factors for the success of any design. Building the timing model from scratch does take a long time. But the idea here is to reuse the architecture model in the VisualSim environment. This serves two purposes. The architecture model can be refined for greater accuracy and for better evaluation of existing, internal, or purchased IP. It also enables timing analysis of any IP for which the code is available. This reuse is what keeps the time short.

Concern: It takes quite a bit of memory and processor power to be able to check.

Designers prefer component-based simulation to simulating a single large design. This divide-and-conquer approach was introduced because a single FPGA board could hold only a small piece of the chip. Emulating a single large design consumes a lot of processor power, memory, and FPGA capacity. For example, to emulate a full SoC, a user might need 2000 FPGAs. It is very difficult to put that many FPGAs on a board and get them to work. So most designers have welcomed the concept of divide and conquer, in the hope that every part will work after assembly. Existing tools also carry a lot of low-level detail that is not useful for verification, which slows down the simulation of the entire design. Second, the “keep hierarchy” solution allows the design to maintain its hierarchy even after it goes through implementation.
It takes each part of the processor, for example, and makes it a level of hierarchy. But most current tools provide only one or two levels of hierarchy.

Architectural modeling environments such as VisualSim can model the entire SoC or the board. The environment tests all functions in a very short period of time (1 or 2 hours), because it abstracts clocks and signals and reuses architectural blocks, which makes simulation much faster. The FPGA board needs to contain only the specific IP to be tested. The simulator runs at 80 million events per second and over 40,000 instructions (not cycles) per second for the entire SoC. Moreover, it can be run on a shared Linux server, so the cost can be kept low by using off-the-shelf systems. It also supports 30 to 40 levels of hierarchy. Each hierarchical component can be a reusable component, and the model is built from these small hierarchical components.

Concern: There is no way to reuse the test bench from the functional simulation. New test benches must be created.

The environment reuses the architecture model, which contains timing, function, and power in the same model. It provides all the required statistics, such as latency, throughput, power consumption, efficiency, utilization, and waveforms. Since it has all these details, it can easily be reused for both timing and function.

Concern: Debugging design errors turns out to be a chore, as the entire netlist is flattened and there is no way to identify the problem in time.

The environment assists in debugging efforts. It provides many probes on the test environment side. Also, the architecture model can be used as a reference against which to compare the output from an FPGA board. Viewing the sequence that contains the error therefore becomes a lot easier, and it is possible to narrow down the defect.

Concern: Timing simulations show worst-case numbers, and the design has enough slack not to worry.
Commercial tools like VisualSim run cycle-accurate simulations, which have been shown to predict timing with 90% to 95% accuracy and power with 85% to 98% accuracy. So even designs with very demanding throughput and timing requirements can be tested against their deadlines.

Concern: Not all submodules are coded at the same location. There is no way to split up the parts coded at each site, as the designers of those parts are the ones who understand the design well enough to check it.

The entire architecture is captured in VisualSim. Each remote team can substitute its own design into the full system model in VisualSim. This way, each team can test independently, and multiple teams can also combine for joint testing.

Some distinct benefits: reduced cost and time in testing; a multi-level hierarchy; reuse of the architecture model; the entire SoC or the entire board can be modeled; teams can test independently; and help with debugging and exploration.

Reusing the architectural model saves a significant amount of time. Component-based simulation is an outdated concept now. Full SoC verification in very short periods of time can save time and cost. Tools like VisualSim can capture the entire architecture, allowing the user to see what the entire system is doing. Additional information such as performance, timing, and latency gives the user an idea of how the final design will work. The simulations are very fast and may take 1 to 2 hours, as opposed to traditional methods, which may take days to verify.

By Anupurba Mukherjee, Product Market Engineer, Mirabilis Design Inc.