HAsim Tutorial (ISCA 2013)

Michael Adler edited this page Mar 18, 2015 · 2 revisions

HAsim FPGA-Based Processor Models: Fast, Accurate and Flexible

In the era of frequency scaling, the throughput of timing models improved approximately in proportion to host processor performance. Now, with frequency held relatively constant, we spend our growing transistor budgets on spatial scaling (parallelism): adding both cores and special-purpose system-on-chip devices. A single-threaded timing model therefore takes longer to run with each new generation of target system, in proportion to the increase in parallelism of the simulated system. To make matters worse, growth in simulated cache sizes requires longer runs to model multi-core replacement policies.

To keep pace with growing parallelism, software-based timing modelers have two options: decrease model fidelity or increase model parallelism. Building a parallel processor model is, unfortunately, a difficult problem. The abundance of wires and interdependence in on-chip networks requires significant communication and tight coupling between parallel model threads, which limits potential throughput gains. As a result, we have increasingly come to trust the results of lower-accuracy models in order to reduce simulation time. While this may be fine for some architectural exploration, it leaves us vulnerable to errors introduced by unintentional side effects of modeling accuracy choices.

FPGAs have traditionally been employed for verification of RTL in the late stages of a project. Recently, a number of projects have come to see the value of FPGAs as general-purpose accelerators. What if we could treat an FPGA as just another computation fabric, program it with high-level languages and tools, and take advantage of its abundance of parallelism and communication wires? We have built an environment for writing timing models suitable for pre-RTL pathfinding, using concepts similar to those of software-based timing models. Models are port-based and easily restructured, and they separate functional and timing components. Model timing is decoupled from the FPGA clock, enabling hybrid computation in which difficult but infrequent tasks are managed in software. By multiplexing multiple cores or functional units onto a single timing pipeline, we are able to model orders of magnitude more cores than would be possible in standard, RTL-based emulation.
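The multiplexing idea can be pictured with a small software analogy. The sketch below is a simplified illustration in Python, not HAsim code: `CoreState`, `timing_step`, and `run_multiplexed` are hypothetical names, and the "pipeline" is a toy that only advances a program counter. It assumes a simple round-robin schedule in which one shared copy of the timing logic is applied to each core's saved state in turn.

```python
from dataclasses import dataclass


@dataclass
class CoreState:
    """Per-core context; only this state is replicated (hypothetical toy)."""
    pc: int = 0
    model_cycle: int = 0


def timing_step(state: CoreState) -> None:
    """The single shared copy of the timing logic.

    Toy behavior (an assumption for illustration): each simulated
    cycle advances the program counter by 4.
    """
    state.pc += 4
    state.model_cycle += 1


def run_multiplexed(num_cores: int, model_cycles: int):
    """Simulate num_cores cores with one pipeline, visited round-robin."""
    contexts = [CoreState() for _ in range(num_cores)]
    host_cycles = 0
    for _ in range(model_cycles):
        # One model cycle for the whole system costs num_cores host steps,
        # because the shared logic handles one core context per step.
        for core_id in range(num_cores):
            timing_step(contexts[core_id])
            host_cycles += 1
    return contexts, host_cycles
```

In this analogy, adding cores grows only the stored contexts, not the logic, at the cost of more host steps per simulated cycle.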

HAsim (Hardware-based Architecture Simulator) is not a single model. It is an open-source infrastructure for building hybrid FPGA/software processor timing models, written in Bluespec SystemVerilog. Reference models are provided. HAsim is layered upon a virtual platform that we have written to provide OS services on FPGAs. This virtual platform, named LEAP (Logic-Based Environment for Application Programming), is suitable for a wide variety of applications. LEAP models may be confined to a single FPGA or spread across multiple FPGAs. For multi-FPGA configurations, LEAP automatically generates networks with compressed data streams, allowing scale-out to large models with limited run-time impact.

Key topics in the half-day tutorial will include:

  • Tracking simulated time with HAsim A-Ports, independent of FPGA clocks
  • Flexible, name-based, port networks
  • Split functional / timing models
  • Detailed description of a sample model, including mechanisms for time-multiplexing both core models and on-chip network models
  • The LEAP virtual platform
  • Model configuration using AWB (Architect’s Workbench)
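As a rough software analogy for the first topic, the sketch below shows one way simulated time can be tracked through ports rather than through a host clock. It is a simplified illustration in Python, not HAsim code, and the names `APort`, `Module`, and `simulate` are hypothetical. It assumes the formulation in which a port with latency L starts out holding L empty tokens, and each module consumes one token from its input and produces one token on its output per model cycle.

```python
import random
from collections import deque


class APort:
    """FIFO channel with a fixed latency in model cycles (hypothetical)."""

    def __init__(self, latency: int):
        # Pre-fill with `latency` empty tokens so that a message sent in
        # model cycle n is received in model cycle n + latency.
        self.queue = deque([None] * latency)

    def can_receive(self) -> bool:
        return len(self.queue) > 0

    def send(self, msg) -> None:
        self.queue.append(msg)

    def receive(self):
        return self.queue.popleft()


class Module:
    """A model component that keeps its own local model-cycle counter."""

    def __init__(self, name: str, in_port: APort, out_port: APort):
        self.name = name
        self.in_port = in_port
        self.out_port = out_port
        self.model_cycle = 0
        self.log = []  # (model cycle, message received in that cycle)

    def ready(self) -> bool:
        return self.in_port.can_receive()

    def step(self) -> None:
        # Simulate exactly one model cycle: one token in, one token out.
        msg = self.in_port.receive()
        self.log.append((self.model_cycle, msg))
        self.out_port.send(f"{self.name}@{self.model_cycle}")
        self.model_cycle += 1


def simulate(n_cycles: int, seed: int):
    """Run two modules in a ring under an arbitrary host schedule."""
    a_to_b = APort(latency=1)
    b_to_a = APort(latency=2)
    a = Module("A", in_port=b_to_a, out_port=a_to_b)
    b = Module("B", in_port=a_to_b, out_port=b_to_a)
    rng = random.Random(seed)
    host_steps = 0
    while a.model_cycle < n_cycles or b.model_cycle < n_cycles:
        m = rng.choice([a, b])  # host order is arbitrary
        host_steps += 1
        if m.model_cycle < n_cycles and m.ready():
            m.step()
    return a.log, b.log, host_steps
```

Under these assumptions, the messages each module sees in a given model cycle depend only on the port latencies, not on the order or number of host steps, which is the sense in which simulated time is independent of the host clock.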

Tutorial slides:

Presenters:

  • Michael Adler
  • Kermin E. Fleming
  • Michael Pellauer
  • Joel Emer