embassy-rp: Executor fails to sleep due to interaction between AtomicPtr.swap() and wfe instruction #4818

@jakoblell

While debugging a CPU usage issue on an RP2350, I discovered that the standard Executor does not actually sleep in its wfe/poll loop, even when there are no pending tasks (or no tasks spawned at all). Some debug instrumentation quickly showed that the device went through more than a million wfe/poll loop iterations per second even after disabling all tasks. This clearly can't be right, so I started debugging, mostly by pointing the dependencies at a local copy and commenting things out until the problem went away.

Turns out that the executor is calling AtomicPtr::swap via the following call flow:

Disabling this AtomicPtr.swap() call (via a [patch.crates-io] in the Cargo.toml) actually changed the behavior and allowed wfe to sleep.

In order to demonstrate the issue, I've written a minimal reproducer to directly run AtomicPtr::swap and wfe in a loop:

#![no_main]
#![no_std]

use core::fmt::Write;
use core::ptr;
use core::sync::atomic::{AtomicPtr, Ordering};

use cortex_m::asm::wfe;
use cortex_m_rt::entry;
use defmt::println;
use embassy_rp::uart::{self, UartTx};
use embassy_time::Instant;
use {defmt_rtt as _, panic_probe as _};


struct UartWriter {
    uart: UartTx<'static, uart::Blocking>,
}

impl Write for UartWriter {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        self.uart.blocking_write(s.as_bytes()).unwrap();
        Ok(())
    }
}

#[entry]
fn main() -> ! {
    println!("main");
    let p = embassy_rp::init(Default::default());
    let uart: UartTx<uart::Blocking> = UartTx::new(p.UART1, p.PIN_4, p.DMA_CH1, embassy_rp::uart::Config::default());
    let mut uart = UartWriter{uart};
    let _ = write!(uart, "Starting main\r\n");

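    let atomic: AtomicPtr<u32> = AtomicPtr::new(ptr::null_mut::<u32>());
    for i in 0u32..1_000_000u32 {
        // Expected to sleep here until the next event.
        wfe();
        // Bail out early if the loop really is sleeping (more than 1_000_000 ticks elapsed).
        if Instant::now().as_ticks() > 1_000_000 {
            panic!("i={} after 1_000_000 ticks", i);
        }
        // Commenting out this swap lets the wfe above sleep.
        let _val = atomic.swap(ptr::null_mut(), Ordering::AcqRel);
    }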
    write!(uart, "1M events took {} ticks\r\n", Instant::now().as_ticks()).unwrap();
    panic!("1M events took {} ticks", Instant::now().as_ticks());
}

And the corresponding Cargo.toml:

[package]
name = "wfe_issue_reproducer"
version = "0.1.0"
edition = "2021"

[dependencies]
cortex-m-rt = "0.7.5"
defmt = "1.0.1"
defmt-rtt = "1.1.0"

embassy-rp = { version = "0.8.0", features = ["defmt", "unstable-pac", "time-driver", "critical-section-impl", "rp235xa", "binary-info"] }
embassy-time = { version = "0.5.0", features = ["defmt", "defmt-timestamp-uptime"]}

panic-probe = { version = "1.0.0", features = ["print-defmt"] }

cortex-m = "0.7.7"

Running this code on an RP2350 will output something like this:

Starting main
1M events took 286988 ticks
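
To put that number in perspective, here is a small host-side calculation of the loop rate. It assumes a 1 MHz embassy-time tick rate, which is not stated in the output itself, so treat the result as an estimate; `iterations_per_second` is a helper name made up for this sketch.

```rust
/// Converts an iteration count and an elapsed tick count into iterations per second.
/// (Hypothetical helper for this sketch, not part of embassy.)
fn iterations_per_second(iterations: u64, elapsed_ticks: u64, tick_hz: u64) -> f64 {
    iterations as f64 * tick_hz as f64 / elapsed_ticks as f64
}

fn main() {
    // ASSUMPTION: the time driver ticks at 1 MHz, so one tick is one microsecond.
    const TICK_HZ: u64 = 1_000_000;

    // Values taken from the UART output above.
    let rate = iterations_per_second(1_000_000, 286_988, TICK_HZ);
    println!("{:.0} wfe/swap iterations per second", rate);
}
```

Under that assumption the loop runs at roughly 3.5 million iterations per second, which means wfe is effectively not sleeping at all.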

With the atomic.swap line commented out, on the other hand, wfe actually sleeps and the loop does not come anywhere near one million iterations in the same amount of time.
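
Both observations fit a simple toy model of wfe: if the core's event register is set, wfe clears it and returns immediately; otherwise the core sleeps. The sketch below runs on the host and only models that bit. The side effect attributed to the swap is the hypothesis of this issue, not a confirmed hardware behavior, and `CoreModel` and `count_sleeps` are names invented for this illustration.

```rust
/// Toy model of a core's one-bit event register (no real hardware access).
struct CoreModel {
    event_register: bool,
    sleeps: u32,
}

impl CoreModel {
    fn new() -> Self {
        CoreModel { event_register: false, sleeps: 0 }
    }

    /// Models `wfe`: if the event register is set, clear it and return
    /// immediately; otherwise count a sleep (real hardware would suspend here).
    fn wfe(&mut self) {
        if self.event_register {
            self.event_register = false;
        } else {
            self.sleeps += 1;
        }
    }

    /// ASSUMPTION (the hypothesis from this issue, not confirmed):
    /// the atomic swap leaves the event register set.
    fn swap_with_side_effect(&mut self) {
        self.event_register = true;
    }
}

/// Runs a loop shaped like the reproducer and returns how often `wfe` slept.
fn count_sleeps(iterations: u32, with_swap: bool) -> u32 {
    let mut core = CoreModel::new();
    for _ in 0..iterations {
        core.wfe();
        if with_swap {
            core.swap_with_side_effect();
        }
    }
    core.sleeps
}

fn main() {
    println!("with swap:    {} sleeps", count_sleeps(1_000, true));
    println!("without swap: {} sleeps", count_sleeps(1_000, false));
}
```

In this model only the very first wfe sleeps when the swap is present, while every wfe sleeps once the swap is removed, which matches the shape of what the reproducer shows.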

Since I initially suspected that the issue might be related to the attached SWD debug probe, I also implemented UART output. That way the issue can be reproduced with SWD disconnected, just by monitoring the UART output.

As of now I do not understand why AtomicPtr.swap() sets the event register (so that the next wfe instruction returns immediately). The AtomicPtr implementation branches out to atomic_xchg from https://github.com/rust-lang/rust/blob/master/library/core/src/intrinsics/mod.rs#L149 and the resulting disassembly listing uses memory synchronization instructions such as strex/ldrex/dmb, which may be related to this issue. Since I'm not an expert in Arm assembly or the low-level details of the Arm Cortex-M architecture, I haven't figured out the root cause of the issue yet.
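
For readers less familiar with these instructions: ldrex/strex are a load-exclusive/store-exclusive pair, and an atomic swap built from them is a retry loop that repeats until the exclusive store succeeds. The host-side sketch below mirrors that loop shape with `compare_exchange_weak`. It is only an illustration of the structure, not the code rustc generates for the RP2350, and `swap_via_retry_loop` is a name made up for this example.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Swap implemented as a retry loop, analogous in shape to an ldrex/strex loop:
/// read the current value, try to store the new one, retry if the store fails.
fn swap_via_retry_loop(cell: &AtomicPtr<u32>, new: *mut u32) -> *mut u32 {
    let mut current = cell.load(Ordering::Relaxed);
    loop {
        // The weak variant may fail spuriously, much like a failed strex.
        match cell.compare_exchange_weak(current, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(previous) => return previous,
            Err(actual) => current = actual,
        }
    }
}

fn main() {
    let mut value = 7u32;
    let raw: *mut u32 = &mut value;
    let cell = AtomicPtr::new(raw);

    // Behaves like AtomicPtr::swap: returns the old pointer, stores the new one.
    let previous = swap_via_retry_loop(&cell, ptr::null_mut());
    assert_eq!(previous, raw);
    assert!(cell.load(Ordering::Acquire).is_null());
}
```

Whether the exclusive accesses in the real loop are what sets the event register is exactly the open question here; the sketch only shows why such instructions appear in the disassembly of a swap.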

Potential root causes of this issue:

  • Bug in embassy, e.g. incorrect usage/combination of synchronization primitives and wfe instruction
  • Bug in rustc (which is implementing the compiler intrinsics used by AtomicPtr), tested version is rustc 1.91.0 (f8297e351 2025-10-28)
  • Hardware bug/errata in the RP2350
