Why High-Level Synthesis (HLS)?

A common pain point in product development at traditional semiconductor companies is the dependency between groups whenever the initial definition is pending or unclear, or when a feature change is required. For example, the high-level architecture specification drives the entire microarchitecture design proposal, while the microarchitecture design depends on input from algorithmic modeling and on coordination with verification. When these dependencies are not streamlined, development stalls: staff are used inefficiently, and continual alterations to upper-level planning eventually break the project schedule and cause chaos. Another example: during an otherwise stable development cycle, a feature change causes micro-dependencies to build up between teams. Eventually everyone is overwhelmed by communication overhead, and individual bottlenecks lead to deadlocks.

High-level synthesis (HLS) is a flow that folds the modeling, design, and verification tasks of the conventional ASIC/FPGA development flow into one unified effort. HLS not only provides a flow for architecture exploration, but also enables rapid turnaround for application prototyping and customer-driven deliverables (e.g. an initial cycle-accurate C-model). It translates descriptions written in high-level languages (e.g. C, C++, SystemC) directly into low-level hardware description languages (e.g. SystemVerilog, Verilog, VHDL) for ASIC and FPGA devices.
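To make the kind of input an HLS flow works from more concrete, here is a minimal sketch of a 4-tap FIR filter in plain C++. The names `Fir4`, `kTaps`, and `step` are hypothetical and not tied to any particular tool. The sketch assumes the restrictions HLS tools commonly place on their input: fixed-size storage, loop bounds known at compile time, and no dynamic allocation. A real flow would usually add tool-specific directives and bit-accurate data types, which are omitted here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical example: number of filter taps, fixed at compile time.
constexpr std::size_t kTaps = 4;

// A 4-tap FIR filter written in a restricted, hardware-friendly style:
// fixed-size arrays, compile-time loop bounds, no heap allocation.
struct Fir4 {
    std::array<std::int16_t, kTaps> coeff{};  // filter coefficients
    std::array<std::int16_t, kTaps> delay{};  // delay line of past samples

    // Consume one input sample and return one output sample.
    std::int32_t step(std::int16_t sample) {
        // Shift the delay line by one position (fixed trip count).
        for (std::size_t i = kTaps - 1; i > 0; --i) {
            delay[i] = delay[i - 1];
        }
        delay[0] = sample;

        // Multiply-accumulate across all taps in a wider accumulator.
        std::int32_t acc = 0;
        for (std::size_t i = 0; i < kTaps; ++i) {
            acc += static_cast<std::int32_t>(coeff[i]) * delay[i];
        }
        return acc;
    }
};
```

Because the same source compiles with an ordinary C++ compiler, it can double as the software model: feeding a single unit impulse into a filter with coefficients {1, 2, 3, 4} returns those coefficients one per call to `step`.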

From a business perspective, HLS potentially reduces the non-recurring engineering (NRE) cost of application prototyping and development (e.g. Nvidia stated in a Mentor Graphics white paper that the HLS flow reduced overall development time by more than 40% and human resources by more than 50%). In addition, companies gain a much softer entry point for developing customized hardware for their business applications (e.g. Google's hardware team is believed to have used an HLS flow to develop the initial version of the TPU with fewer than 15 engineers).

The drawback of HLS has always been quality of results (QoR). Empirical reports show that performance QoR drops by 10% to 25% with HLS compared to the traditional ASIC development flow. However, the degradation can typically be reduced by: 1) improving the high-level coding style for low-level RTL synthesis; 2) constructing efficient custom libraries for HLS mapping; 3) exploring HLS tool capabilities; and 4) refining the HLS flow to pinpoint performance bottlenecks. If the QoR drop can be kept within 5%, the benefits of HLS greatly outweigh the performance deficit.
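As an illustration of the first item (coding style), one commonly cited guideline is to give loops a trip count that is known at compile time, since a data-dependent bound tends to make unrolling and latency estimation harder for the tool. The sketch below shows that pattern under stated assumptions: `kMaxLen` and `bounded_sum` are hypothetical names, and the effect on QoR depends on the tool and the target.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed compile-time upper bound on the number of valid entries.
constexpr std::size_t kMaxLen = 8;

// Sum the first `len` entries of `data`.
// The loop always runs kMaxLen iterations and uses a guard to skip
// entries at or beyond `len`, so the trip count does not depend on input.
std::int32_t bounded_sum(const std::int16_t (&data)[kMaxLen], std::size_t len) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < kMaxLen; ++i) {
        if (i < len) {
            acc += data[i];
        }
    }
    return acc;
}
```

The trade-off is that the loop does the full `kMaxLen` iterations even for short inputs, which is usually acceptable in hardware, where the worst case has to be provisioned for anyway.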

Although the HLS flow greatly reduces the resources involved in ASIC/FPGA development, it is intended not to disrupt traditional semiconductor companies but to guide a new organizational change within them. HLS aims to bind the truly interdependent tasks of modeling, verification, design, and architecture definition into a single entity. It does not aim to deliver the best performance, but to cover the largest number of customer requests for prototyping and evaluation in the shortest time. Logically, HLS delivers the first-turnaround results (e.g. ballpark estimates of power, performance, and area (PPA), an FPGA demo, a cycle-accurate C-model). On the basis of these results, customers can give timely feedback on detailed design requirements and specification justifications, which in turn facilitates the traditional semiconductor development process.

Source: Deep Learning on Medium