Why TASKING Compiler Design Matters for Synopsys ARC-V
Introduction
RISC-V has introduced unprecedented flexibility and openness in processor design, but these advantages also place new demands on compilers. One of the most impactful microarchitectural optimizations is instruction fusion, where hardware recognizes common instruction patterns and executes them as a single, more efficient operation.
Processors such as Synopsys ARC-V integrate advanced instruction fusion capabilities that can deliver substantial gains in performance and energy efficiency. Compilers that generate and preserve patterns the hardware can fuse can harness the full performance potential of the processor.
This article examines the implications of instruction fusion for compiler design and demonstrates why a purpose-built, fusion-aware compiler, such as the TASKING RISC-V compiler, is ideally suited to leverage ARC-V’s capabilities.
Traditional vs. RISC-V Approaches to Instruction Fusion
There is a fundamental difference between conventional approaches to instruction fusion in embedded microcontrollers and RISC-V’s strategy.
Traditional ISAs often introduce new “fused” instructions, such as load-pair or wide-multiply, that encode multiple operations in a single opcode. These instructions need only standard compiler handling rather than specialized fusion logic, but they fragment the ecosystem: binaries using these extensions run only on processors that implement them, limiting software portability.
RISC-V takes a different route: its base ISA remains minimal and clean, leaving fusion entirely to the processor’s microarchitecture. Individual processor implementations can internally fuse common instruction patterns, improving performance without affecting software portability. This strategy allows hardware designers to optimize fusion for specific workloads while keeping all standard binaries compatible across devices.
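To make the portability argument concrete, the following minimal sketch models a toy RV32 core in Python. It is an illustration only: the execute function, the tuple encoding of instructions, and the choice of the lui/addi constant-load pair are assumptions made for this example, and the article does not state which patterns ARC-V actually fuses. The point is that the same instruction stream yields the same architectural result with or without fusion; only the number of internal operations differs.

```python
def sext12(value):
    """Sign-extend a 12-bit immediate, as RISC-V addi does."""
    value &= 0xFFF
    return value - 0x1000 if value & 0x800 else value


def execute(program, fuse):
    """Run a toy RV32 program; return (registers, internal op count).

    Instructions are hypothetical tuples:
      ("lui", rd, imm20) or ("addi", rd, rs1, imm12)
    """
    regs = {}
    ops = 0
    i = 0
    while i < len(program):
        op = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if (fuse and nxt is not None
                and op[0] == "lui" and nxt[0] == "addi"
                and nxt[1] == op[1] and nxt[2] == op[1]):
            # Fused case: one internal op writes the full 32-bit constant.
            regs[op[1]] = ((op[2] << 12) + sext12(nxt[3])) & 0xFFFFFFFF
            i += 2
        elif op[0] == "lui":
            regs[op[1]] = (op[2] << 12) & 0xFFFFFFFF
            i += 1
        elif op[0] == "addi":
            regs[op[1]] = (regs[op[2]] + sext12(op[3])) & 0xFFFFFFFF
            i += 1
        else:
            raise ValueError(f"unsupported opcode: {op[0]}")
        ops += 1
    return regs, ops


program = [("lui", "a0", 0x12345), ("addi", "a0", "a0", 0x678)]
plain_regs, plain_ops = execute(program, fuse=False)
fused_regs, fused_ops = execute(program, fuse=True)
```

Both runs leave the same value in a0, which is why a binary built from standard instructions stays portable across cores that fuse and cores that do not.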
Advanced Instruction Fusion in ARC-V Processors
The ARC-V implementation of instruction fusion enables dual instruction issue on an in-order processor by fusing instructions that target different functional units. Two instructions can be fused if they target different functional units, use at most three source operands, and write no more than two destination registers.
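The three conditions above can be sketched as a simple eligibility check. This is a hedged illustration, not ARC-V's actual rule set: the Instr record and its field names are hypothetical, and the sketch assumes the operand limits apply to the distinct registers read and written by the pair as a whole, which the article does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record; field names are illustrative."""
    opcode: str
    unit: str            # functional unit, e.g. "alu", "lsu", "mul"
    srcs: tuple = ()     # registers read
    dsts: tuple = ()     # registers written


def can_fuse(a, b):
    """Apply the three stated conditions to a candidate pair.

    Assumption: limits are counted over distinct registers of the pair.
    """
    if a.unit == b.unit:
        return False                      # must target different units
    sources = set(a.srcs) | set(b.srcs)
    dests = set(a.dsts) | set(b.dsts)
    return len(sources) <= 3 and len(dests) <= 2


add = Instr("add", "alu", ("a1", "a2"), ("a0",))
load = Instr("lw", "lsu", ("a3",), ("a4",))
print(can_fuse(add, load))
```

A real implementation would layer further constraints on top of this, such as which opcode combinations the hardware recognizes at all.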
These fusion capabilities reduce pipeline pressure through fewer internal micro-ops, lower decode overhead, and increase instruction throughput by completing more useful work per cycle. They also lower energy consumption, and they achieve these gains without affecting interrupt latency.
By carefully selecting fusible instruction patterns, ARC-V maximizes performance while maintaining the simplicity and predictability needed for real-time embedded applications. To realize these benefits in full, compilers must be microarchitecture-aware.

To learn more about Synopsys ARC-V Processor IP, please visit the ARC-V Processor IP webpage.
Compiler Implications of Instruction Fusion
Instruction fusion has profound implications for compiler design, particularly for in-order processors such as ARC-V that must meet real-time requirements. Fusion is inherently pattern-based: hardware recognizes specific sequences, subject to constraints on registers, dependencies, and pipeline availability. Not all instruction sequences are eligible for fusion, and subtle microarchitectural considerations determine which sequences can be paired.
The instruction selector must choose sequences that match fusion-capable patterns. For ARC-V, these patterns extend beyond those of standard RISC-V cores, requiring explicit heuristics or fusion-aware templates. The instruction scheduler must place dependent instructions close together while avoiding hazards and unintended interleaving. Scheduling must occur both before and after register allocation: the first phase forms fusible patterns, and the second optimizes sequences where hardware fusion constraints could not be satisfied after register assignment.
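As a rough illustration of the scheduling idea, the sketch below hoists a later fusible instruction so that it sits directly after its partner, but only when the move crosses no register dependency. Everything here is an assumption made for the example: the Instr record, the pair_up function, and the stand-in fusible predicate are hypothetical, memory dependencies are ignored, and a production scheduler would work on a dependency graph with a pipeline model rather than a greedy linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record for this sketch."""
    name: str
    unit: str
    srcs: frozenset
    dsts: frozenset


def independent(x, y):
    """True if x and y may swap order: no RAW, WAR, or WAW on registers."""
    return not (x.dsts & y.srcs or x.srcs & y.dsts or x.dsts & y.dsts)


def pair_up(block, fusible):
    """Greedy pass over one basic block that forms adjacent fusible pairs."""
    out = list(block)
    i = 0
    while i < len(out) - 1:
        if not fusible(out[i], out[i + 1]):
            for j in range(i + 2, len(out)):
                cand = out[j]
                between = out[i + 1:j]
                # Hoist cand only if it commutes with everything it passes.
                if fusible(out[i], cand) and all(
                        independent(m, cand) for m in between):
                    out.insert(i + 1, out.pop(j))
                    break
        # Skip past a formed pair, otherwise advance by one instruction.
        i += 2 if fusible(out[i], out[i + 1]) else 1
    return out


def different_units(a, b):
    """Stand-in predicate; real fusion rules are more specific."""
    return a.unit != b.unit


block = [
    Instr("add", "alu", frozenset({"a1", "a2"}), frozenset({"a0"})),
    Instr("sub", "alu", frozenset({"a5", "a6"}), frozenset({"a7"})),
    Instr("lw", "lsu", frozenset({"a3"}), frozenset({"a4"})),
]
print([ins.name for ins in pair_up(block, different_units)])
```

The same routine could in principle run again after register allocation, when the actual register assignment may have created new dependencies that break pairs formed earlier.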
Register allocation has a direct impact on fusion. The allocator must ensure that operands for potential fusion candidates do not conflict with other live ranges, while maintaining adjacency requirements. Fusion-friendly allocation strategies include pairing registers for dual-issue opportunities and minimizing live-range interference in critical sequences.
Fusion-aware peephole passes repair or create fusible patterns after post-register-allocation optimizations. These passes ensure that microarchitectural opportunities are not lost during optimization, and they can also adjust instruction sequences to meet pipeline or alignment requirements.
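The live-range side of this can be sketched with a small interference check. This is a simplified illustration under stated assumptions: live ranges are modeled as half-open intervals of instruction indices, and the pick_register helper and its list of preferred registers are hypothetical. The article does not describe how the TASKING allocator expresses fusion preferences.

```python
def overlaps(a, b):
    """Half-open live ranges (start, end) interfere when they overlap."""
    return a[0] < b[1] and b[0] < a[1]


def pick_register(live_range, assigned, preferred, all_regs):
    """Return a register whose existing live ranges do not interfere.

    Fusion-friendly registers (hypothetical preference list) are tried
    first; None means the value would have to be spilled.
    """
    order = list(preferred) + [r for r in all_regs if r not in preferred]
    for reg in order:
        if not any(overlaps(live_range, other)
                   for other in assigned.get(reg, [])):
            return reg
    return None


# Ranges already placed in each physical register.
assigned = {"a0": [(0, 5)], "a1": [(6, 9)]}
choice = pick_register((2, 4), assigned, ["a0", "a1"], ["a0", "a1", "a2"])
```

Trying the preferred registers first keeps a fusion candidate on a fusion-friendly register whenever no interference forces it elsewhere.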
Code relaxation is a common optimization in variable-length ISAs such as RISC-V with compressed instructions. It is a post-code-generation phase, typically performed by the linker, that rewrites instructions into shorter or longer forms based on the final code layout. Because relaxation can increase or decrease instruction size, it can shift the alignment of subsequent instructions, which may degrade cache and instruction fetch and decode performance. The linker must account for this and preserve the alignment of alignment-sensitive instruction patterns.
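The alignment effect is easy to see with a little arithmetic. The sketch below is an assumption-laden illustration: it supposes a hypothetical 8-byte fetch block and supposes that a sensitive pair should sit inside one block, neither of which the article states for ARC-V. It shows how shrinking one earlier instruction from 4 to 2 bytes moves a later pair across a block boundary.

```python
from itertools import accumulate

FETCH_BLOCK = 8  # bytes; hypothetical value chosen for illustration


def offsets(sizes):
    """Byte offset of each instruction given their sizes in bytes."""
    return list(accumulate([0] + sizes[:-1]))


def pair_in_one_block(sizes, k, block=FETCH_BLOCK):
    """True if instructions k and k+1 lie within a single fetch block."""
    offs = offsets(sizes)
    first_byte = offs[k]
    last_byte = offs[k + 1] + sizes[k + 1] - 1
    return first_byte // block == last_byte // block


before = [4, 4, 4, 4]   # all 32-bit encodings
after = [2, 4, 4, 4]    # first instruction relaxed to a 16-bit form

print(pair_in_one_block(before, 2), pair_in_one_block(after, 2))
```

A relaxation pass that cares about such patterns could run a check like this after each size change and either skip the rewrite or insert padding to restore the original alignment.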

Why the TASKING Compiler Excels on ARC-V
General-purpose compilers like GCC and LLVM deliver portable, fusion-aware optimization for ARC-V. However, the TASKING RISC-V compiler, developed in close cooperation with Synopsys, is purpose-built to fully exploit ARC-V’s fusion capabilities and unlock additional performance gains.
Key advantages include:
- Fusion-Centric Backend: Instruction selection, scheduling, and register allocation are all designed to generate and preserve fusible patterns.
- Microarchitecture Awareness: The compiler models ARC-V’s pipeline, functional units, and fusion constraints, aligning optimizations with hardware behavior.
- Co-Development with Synopsys: Close collaboration enables rapid integration of new fusion patterns, validation against cycle-accurate models, and iterative performance tuning.
By combining deep microarchitectural knowledge with fusion-aware compilation strategies, TASKING ensures that ARC-V processors achieve performance gains while maintaining software portability across the RISC-V ecosystem.
Conclusion
Instruction fusion is a defining feature of modern RISC-V microarchitectures. Synopsys ARC-V processors implement an especially rich fusion model, offering gains in both performance and efficiency.
A fusion-aware compiler such as the TASKING RISC-V compiler, engineered specifically for ARC-V and developed in close partnership with Synopsys, surpasses generalized toolchains by shaping instruction streams explicitly around ARC-V’s fusion rules and microarchitectural behavior.
In a RISC-V landscape where microarchitectural innovation is a key differentiator, compiler design is as critical as the hardware itself. The combination of ARC-V processors and a purpose-built, fusion-optimized TASKING compiler unlocks the full potential of instruction fusion and establishes a new bar for RISC-V performance.
Authors
Gerard Vink, Industry Specialist, BDI, TASKING
Revi Ofir, Principal Product Manager, ARC-V™ Processors, Synopsys
