Stage 4 Research Report: C++17 Register Access vs Industry Landscape
Document under review: _posts/2020-08-30-cxx-reg-access.md
Date: 2026-02-24
A. C++ MMIO Register Access Libraries
Template-based approaches
- Kormanyos, Real-Time C++: The book that inspired this post. Uses static template parameters for register address, value, and access type. Compile-time constant evaluation eliminates runtime overhead. The blog post’s critique, that Kormanyos’s approach is verbose and error-prone, is what motivates its own design.
- https://github.com/ckormanyos/real-time-cpp
- Kvasir: Template metaprogramming library for Cortex-M register access. Distinguishes itself with compile-time register field validation and atomic bit manipulation. Uses apply() to batch multiple field writes into a single register access, reducing read-modify-write cycles. Generates from SVD via its own tooling.
- https://github.com/kvasir-io/Kvasir
- cppreg (Sendyne): C++11 register access library. Defines register “packs” (contiguous MMIO regions) with type-safe field access. Enforces access policies (read-only, write-only, read-write) at compile time. Targets Cortex-M but platform-independent. No code generation — register definitions are written manually.
- https://github.com/sendyne/cppreg
- https://sendyne.com/cppreg/
- AllThingsEmbedded: C++17 approach very similar to this blog post. Uses if constexpr for compile-time optimization of field access paths. Blog series walks through the design rationale. Demonstrates the same single-field vs multi-field optimization pattern.
- https://allthingsembedded.com/post/2019-01-03-arm-cortex-m0-register-access/
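Compile-time access-policy enforcement, as described for cppreg above, can be illustrated with a minimal C++17 sketch. This is not cppreg's actual API: the names PolicyField, read_only, write_only, read_write and fake_status_reg are hypothetical, and the register is an ordinary variable so the code runs on a host. On a target the template argument would be a fixed MMIO address instead.

```cpp
#include <cstdint>
#include <type_traits>

// Access policies as tag types. Hypothetical names, not cppreg's API.
struct read_only {};
struct write_only {};
struct read_write {};

// Host-side stand-in for a hardware register (assumption for testability).
inline volatile std::uint32_t fake_status_reg = 0;

template <volatile std::uint32_t& Reg, unsigned Offset, unsigned Width, typename Access>
struct PolicyField {
    static_assert(Width >= 1 && Offset + Width <= 32, "field must fit in 32 bits");

    // Width ones shifted into position; the 64-bit intermediate keeps Width == 32 well defined.
    static constexpr std::uint32_t mask =
        static_cast<std::uint32_t>((std::uint64_t{1} << Width) - 1u) << Offset;

    static std::uint32_t read() {
        // Checked only if read() is actually called for this field.
        static_assert(!std::is_same_v<Access, write_only>, "field is write-only");
        return (Reg & mask) >> Offset;
    }

    static void write(std::uint32_t value) {
        // Checked only if write() is actually called for this field.
        static_assert(!std::is_same_v<Access, read_only>, "field is read-only");
        const std::uint32_t current = Reg;                     // one volatile read
        Reg = (current & ~mask) | ((value << Offset) & mask);  // one volatile write
    }
};

using Ready = PolicyField<fake_status_reg, 0, 1, read_only>;
using Mode  = PolicyField<fake_status_reg, 4, 3, read_write>;

inline std::uint32_t access_policy_demo() {
    Mode::write(5);      // updates bits [6:4] only
    // Ready::write(1);  // would fail to compile: "field is read-only"
    return Mode::read();
}
```

Because member functions of a class template are instantiated only when used, the static_assert inside write() fires only for code that attempts the forbidden access.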
HAL/framework approaches
- modm (Modular Object-oriented Development for Microcontrollers): Full platform HAL generated from SVD and vendor data for STM32 and SAM families. Register access is one layer of a larger framework including GPIO, DMA, UART, SPI, etc. Uses lbuild as the code generation engine rather than Jinja directly.
- https://modm.io/
- https://github.com/modm-io/modm
- Genode OS MMIO framework: Mmio::Register_set base class with declarative bitfield definitions using struct Register : Mmio::Register<offset, width>. Used throughout the Genode microkernel for platform drivers. Different design point: an OS framework rather than a standalone library.
- https://genode.org/documentation/developer-resources/index
- Embedded Template Library (ETL): General-purpose C++ library for embedded systems. Includes some register access utilities but focuses more broadly on containers, algorithms, and utilities for resource-constrained environments.
- https://www.etlcpp.com/
Comparison table
| Library | Standard | Code Gen | SVD Input | Compile-Time Safety | Access Optimization | Scope |
|---|---|---|---|---|---|---|
| Kormanyos | C++11/14 | No | No | Address/value as template params | Compile-time branching | Register access only |
| Kvasir | C++14 | Yes (SVD) | Yes | Field validation, type checking | Batched writes via apply() | Register access + atomic ops |
| cppreg | C++11 | No | No | Access policy enforcement | Single write optimization | Register packs |
| modm | C++20 | Yes (SVD+) | Yes | Full type safety | Platform-specific | Full HAL |
| AllThingsEmbedded | C++17 | No | No | if constexpr branching | Same pattern as blog post | Register access only |
| Blog post | C++17 | Yes (SVD+Jinja) | Yes | if constexpr branching | Single/multi-field optimization | Register access only |
The blog post’s combination of SVD-based Jinja code generation with C++17 if constexpr optimization is uncommon among the libraries surveyed here. Most either require manual register definitions (Kormanyos, cppreg, AllThingsEmbedded) or are part of a larger framework (modm, Genode).
B. SVD-Based Code Generation Landscape
C++ generators
- modm/lbuild: The most mature C++ SVD-based generator. Processes SVD files plus vendor-specific data sheets to produce a complete HAL. Uses a Python-based build system (lbuild) rather than generic templates.
- https://github.com/modm-io/modm
- Kvasir tooling: Generates Kvasir-compatible register definitions from SVD. Tightly coupled to Kvasir’s type system.
- https://github.com/kvasir-io/Kvasir
- svdtools: Python library for modifying/patching SVD files before feeding them to generators. Addresses the common problem of vendor SVD files containing errors or omissions. Used by both Rust and C++ ecosystems.
- https://github.com/rust-embedded/svdtools
Rust generators (the dominant ecosystem)
- svd2rust: The de facto standard for SVD-to-code generation. Produces Peripheral Access Crate (PAC) definitions with ownership semantics and closure-based field writes. 1000+ generated PAC crates published on crates.io covering most ARM and RISC-V vendors.
- https://github.com/rust-embedded/svd2rust
- chiptool (Embassy): Alternative Rust SVD generator focused on async-first embedded. Generates for the Embassy HAL framework.
- https://github.com/embassy-rs/chiptool
The blog post’s Jinja approach
The blog post’s use of generic Jinja2 templates for SVD code generation is architecturally distinctive:
- Flexibility: Jinja templates can produce any output format — C, C++, Rust, documentation, test harnesses. The template is the product, not the generator.
- Transparency: Templates are readable and modifiable without understanding a code generator’s internals.
- Trade-off: Less sophisticated than purpose-built generators. No SVD patching, no cross-peripheral deduplication, no vendor-specific workarounds.
Most generators (modm, svd2rust, chiptool) embed the output format in procedural code. The template-driven approach is closer to how ARM’s own CMSIS tools work.
C. Zero-Cost Abstraction Evidence
The blog post’s proof
The disassembly listing showing template code compiling to movs, ldr, orr.w, str — identical to hand-written C — is the standard proof for zero-cost MMIO abstractions. This is the strongest argument for the C++ approach over C macros or bitfield structs.
Industry consensus
- Kormanyos provides similar disassembly comparisons in Real-Time C++
- cppreg documentation includes code size comparisons
- Kvasir claims zero overhead with benchmarks
- Rust PACs (svd2rust output) make the same claim with equivalent evidence
Caveats not discussed in the post
- Debug builds (-O0): Zero-cost abstractions are only zero-cost with optimization enabled. At -O0, template instantiation produces significantly larger code with actual function calls. This is a known pain point: debugging optimized code is harder, but debug builds lose the “zero cost” property.
- Link-time optimization (LTO): Some optimizations (cross-translation-unit inlining) require LTO. The post’s single-file example doesn’t encounter this, but real projects with registers accessed across files may. Header-only template code is visible in every translation unit and can inline without LTO, so the concern applies mainly to accessors or wrappers defined out of line in separate source files.
- Compiler differences: GCC, Clang, and ARM Compiler (armclang) can produce different output for the same template code. The post shows one compiler’s output.
D. C++20/23 Improvements Since the Post
The post targets C++17. Several newer language features are relevant:
| Feature | Standard | Relevance |
|---|---|---|
| consteval | C++20 | Guarantees compile-time evaluation, which is stronger than constexpr for address/offset calculations |
| Concepts | C++20 | Could replace SFINAE for constraining register/field types (e.g. template<RegisterType R>) |
| std::bit_cast | C++20 | Type-safe reinterpretation; potential replacement for reinterpret_cast in some patterns |
| Volatile compound deprecation | C++20 | volatile compound assignment (\|=, &=) deprecated in C++20, which directly affects read-modify-write patterns; C++23 reversed the deprecation for the bitwise forms (\|=, &=, ^=) |
| constexpr virtual | C++20 | Enables polymorphic register interfaces at compile time |
| static operator() | C++23 | Could simplify functor-based register access patterns |
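The volatile deprecation in C++20 is particularly relevant: the post’s read-modify-write pattern (reg_value = *reinterpret_cast<volatile r_datatype_t*>(...)) uses separate read and write through volatile pointers, which remains valid in every standard. Naive patterns like *reg |= mask were deprecated in C++20, although C++23 later removed the deprecation for bitwise compound assignment, so compilers in C++20 mode may still warn on them.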
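The separate read and write described above can be sketched as a pair of helpers. This is a minimal illustration rather than the post's code; set_bits and clear_bits are hypothetical names, and the register is passed by reference so the functions can be exercised on a host with an ordinary volatile variable.

```cpp
#include <cstdint>

// Set bits with an explicit read followed by an explicit write,
// instead of `reg |= mask` on a volatile lvalue.
inline void set_bits(volatile std::uint32_t& reg, std::uint32_t mask) {
    const std::uint32_t value = reg;  // exactly one volatile read
    reg = value | mask;               // exactly one volatile write
}

// Clear bits using the same read-then-write structure.
inline void clear_bits(volatile std::uint32_t& reg, std::uint32_t mask) {
    const std::uint32_t value = reg;  // exactly one volatile read
    reg = value & ~mask;              // exactly one volatile write
}
```

Spelling out the two accesses also makes it visible in the source that the operation is not atomic: an interrupt can modify the register between the read and the write.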
E. The Rust Comparison
svd2rust / PAC architecture
Rust’s svd2rust has become the benchmark for SVD-based register access. The architecture is parallel to the blog post’s:
| Concept | Blog post (C++) | svd2rust (Rust) |
|---|---|---|
| Device definition | FPGAIO_dev<BASE_ADDR> | Peripherals::take().FPGAIO |
| Register access | FPGAIO_i.LED.write(val) | p.LED.write(\|w\| w.bits(val)) |
| Field access | FPGAIO_i.LED.LED0.write(1) | p.LED.write(\|w\| w.led0().set_bit()) |
| Base address | Template parameter | Singleton with take() |
| Code generation | Jinja2 templates from SVD | Procedural Rust from SVD |
| Safety model | Compile-time optimization | Ownership + closure-based writes |
Key differences
- Ownership: Rust’s Peripherals::take() returns Option<Peripherals>, so only one caller gets the peripheral set. This prevents aliased register access at the type level. The C++ approach has no equivalent protection.
- Closure-based writes: reg.write(|w| w.field1().bits(x).field2().set_bit()) batches multiple field writes into a single register write by construction. The C++ approach requires the developer to use the register-level write() for batching.
- Enumerated values: svd2rust generates enum types for fields with defined values in SVD, providing exhaustive match checking. The blog post’s approach passes raw integers.
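For comparison, the closure-based batching and enumerated values could be approximated in C++ along the following lines. This is a hypothetical sketch, not something the blog post or svd2rust provides: LedWriter, LedState, led_write and fake_led_reg are invented names, the register is a host-side variable, and the writer starts from zero where svd2rust starts from the register's reset value.

```cpp
#include <cstdint>

// Host-side stand-in for the LED register (assumption for testability).
inline volatile std::uint32_t fake_led_reg = 0;

// Enumerated field values instead of raw integers.
enum class LedState : std::uint32_t { Off = 0, On = 1 };

// Accumulates field values in a plain integer; touches no hardware.
class LedWriter {
public:
    LedWriter& led0(LedState s) { return set(0, s); }
    LedWriter& led1(LedState s) { return set(1, s); }
    std::uint32_t bits() const { return bits_; }

private:
    LedWriter& set(unsigned bit, LedState s) {
        bits_ = (bits_ & ~(1u << bit)) | (static_cast<std::uint32_t>(s) << bit);
        return *this;
    }
    std::uint32_t bits_ = 0;  // starts from zero, not from a reset value
};

// Runs the caller's lambda on a writer, then performs one volatile store.
template <typename F>
void led_write(F&& configure) {
    LedWriter w;
    configure(w);
    fake_led_reg = w.bits();  // the only register access
}
```

A call such as led_write([](LedWriter& w) { w.led0(LedState::On).led1(LedState::On); }) sets both fields with a single store. This gives batching by construction, but nothing here reproduces Rust's ownership guarantee against aliased access.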
Why this matters
The Rust embedded ecosystem has grown significantly since the blog post was written (2020). svd2rust PACs exist for most ARM and RISC-V vendors. In discussions of MMIO register access, the Rust approach is now the primary point of comparison. The blog post doesn’t mention Rust, which was reasonable in 2020 but is a notable absence by current standards.
F. Alternative C/C++ Approaches
Bitfield structs
The traditional C approach — and what ARM CMSIS headers partially use:
typedef struct {
uint32_t LED0 : 1;
uint32_t LED1 : 1;
uint32_t : 30;
} LED_t;
Pros: Simple, familiar, good IDE support. Cons: Bitfield layout is implementation-defined (not portable), no control over access width, no read-modify-write optimization.
C11 _Generic
Type-generic macros can provide some of the same dispatch:
#define reg_write(reg, val) _Generic((reg), \
volatile uint32_t*: write32, \
volatile uint16_t*: write16)(reg, val)
Limited compared to C++ templates — no compile-time field optimization, no type-safe field access.
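For contrast, the same width-based dispatch in C++ needs no macro, since overload resolution selects on the pointer type. A minimal sketch, with reg_write as a hypothetical name mirroring the macro above:

```cpp
#include <cstdint>

// Overload chosen by the pointee type of the register pointer.
inline void reg_write(volatile std::uint32_t* reg, std::uint32_t val) {
    *reg = val;  // 32-bit store
}

inline void reg_write(volatile std::uint16_t* reg, std::uint16_t val) {
    *reg = val;  // 16-bit store
}
```

Passing a pointer of any other type is a compile error, which matches what _Generic does when no association fits. Field-level masks and access policies still require templates on top of this.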
Volatile struct overlay (CMSIS pattern)
#define FPGAIO ((FPGAIO_Type *)0x40028000UL)
FPGAIO->LED = 0x01;
The most common pattern in production embedded C. Simple, debugger-friendly, well-understood. The blog post acknowledges this (“ARM do with CMSIS headers”) and positions the C++ approach as enabling “more optimizations, protect some register accesses, and even use custom instructions.”
G. Unique Differentiators
The blog post has several genuine differentiators worth preserving:
- Three-tier generated architecture (param/regs/dev): Separates constants, typed register classes, and device composition. Cleaner than monolithic generation.
- Template base address with honest tradeoff: The advantage (no stored pointer, pure load/store) and disadvantage (template propagation) are stated directly. Most MMIO library authors avoid discussing the template-propagation cost.
- Jinja-based generation: More accessible and modifiable than procedural generators. A developer can read the template and understand the output without learning a generator framework.
- if constexpr optimization paths: The three-way branch (single bit, single field, read-modify-write) is clearly motivated and well-documented in code.
- Disassembly proof: Showing the actual compiler output is the gold standard for zero-cost claims.
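A three-way if constexpr branch of the kind listed above might look like the sketch below. The conditions are one reading of "single bit, single field, read-modify-write" and may differ from the post's actual code; OptField, the OnlyField flag and the fake_* registers are hypothetical, and the registers are host-side variables rather than MMIO addresses.

```cpp
#include <cstdint>

// Host-side stand-ins for two hardware registers (assumption for testability).
inline volatile std::uint32_t fake_solo_reg = 0;
inline volatile std::uint32_t fake_shared_reg = 0;

// OnlyField: true when the register contains no other field (assumed flag).
template <volatile std::uint32_t& Reg, unsigned Offset, unsigned Width, bool OnlyField>
struct OptField {
    static_assert(Width >= 1 && Offset + Width <= 32, "field must fit in 32 bits");

    static constexpr std::uint32_t mask =
        static_cast<std::uint32_t>((std::uint64_t{1} << Width) - 1u) << Offset;

    static void write(std::uint32_t value) {
        if constexpr (OnlyField && Width == 1) {
            // Single bit that owns the register: store the bit or zero, no read.
            Reg = (value != 0u) ? mask : 0u;
        } else if constexpr (OnlyField) {
            // Single multi-bit field: store the shifted value, no read.
            Reg = (value << Offset) & mask;
        } else {
            // Register shared with other fields: read-modify-write.
            const std::uint32_t current = Reg;
            Reg = (current & ~mask) | ((value << Offset) & mask);
        }
    }
};

using SoloBit     = OptField<fake_solo_reg, 0, 1, true>;
using SharedField = OptField<fake_shared_reg, 4, 4, false>;
```

Only the selected branch is instantiated, so a register with a single field compiles to a plain store, while a shared register keeps the load, mask and store sequence.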
Summary
The blog post occupies a specific position: a C++17 template-based MMIO register access library with SVD code generation via Jinja templates. Its closest peers are AllThingsEmbedded (same if constexpr pattern, no code gen), cppreg (similar type safety, no SVD input), and Kvasir (SVD input, more complex type system). The Jinja-based generation approach is distinctive — most generators embed format in code rather than templates. The zero-cost abstraction proof via disassembly is solid, though the -O0 caveat and LTO considerations are unmentioned. The Rust svd2rust ecosystem has become the dominant comparison point for SVD-based register access since 2020, offering ownership semantics and closure-based writes that the C++ approach lacks. C++20/23 bring relevant improvements (concepts, consteval, volatile deprecation) that could inform a future update but don’t diminish the post’s C++17 contribution.