It’s time to investigate the programmable I/O hardware on the Raspberry Pi Pico.

I have read the RP2040 datasheet a few times, but I learn best by doing. The goal: blinking the humble LED.

What I want is something like this:

    gpio_init(PICO_DEFAULT_LED_PIN);
    gpio_set_dir(PICO_DEFAULT_LED_PIN, GPIO_OUT);

    while (true) {
        gpio_put(PICO_DEFAULT_LED_PIN, true);
        sleep_ms(250);
        gpio_put(PICO_DEFAULT_LED_PIN, false);
        sleep_ms(250);
    }

But using the PIO.

Invoking the PIO Assembler

The project is called pio-led. In CMakeLists.txt I have:

add_executable(pio-led pio-led.c blink.pio)

...

pico_generate_pio_header(pio-led
        ${CMAKE_CURRENT_LIST_DIR}/blink.pio)

In blink.pio I currently have:

.program blink
    set pindirs, 1

When I build the project, it creates a file called blink.pio.h with:

static const uint16_t blink_program_instructions[] = {
            //     .wrap_target
    0xe081, //  0: set    pindirs, 1                 
            //     .wrap
};

#if !PICO_NO_HARDWARE
static const struct pio_program blink_program = {
    .instructions = blink_program_instructions,
    .length = 1,
    .origin = -1,
};

static inline pio_sm_config blink_program_get_default_config(uint offset) {
    pio_sm_config c = pio_get_default_sm_config();
    sm_config_set_wrap(&c, offset + blink_wrap_target, offset + blink_wrap);
    return c;
}
#endif

This will be useful later on (I’m sure).
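As a sanity check on the generated header, the 0xe081 word can be decoded by hand. This is a minimal sketch, assuming the SET encoding as I read it in the RP2040 datasheet (opcode in bits 15:13, delay/side-set in bits 12:8, destination in bits 7:5, data in bits 4:0). The set_fields struct and decode_set function are my own names, not part of the SDK.

```c
#include <stdint.h>

/* Hypothetical helper type: the fields of a PIO SET instruction. */
typedef struct {
    unsigned opcode;      /* bits 15:13, 0b111 for SET */
    unsigned delay;       /* bits 12:8, delay / side-set */
    unsigned destination; /* bits 7:5, 0b000 = pins, 0b100 = pindirs */
    unsigned data;        /* bits 4:0, the immediate value */
} set_fields;

/* Split a 16-bit instruction word into the SET fields above. */
set_fields decode_set(uint16_t instr) {
    set_fields f;
    f.opcode      = (instr >> 13) & 0x7u;
    f.delay       = (instr >> 8)  & 0x1fu;
    f.destination = (instr >> 5)  & 0x7u;
    f.data        = instr & 0x1fu;
    return f;
}
```

For 0xe081 this gives opcode 7, no delay, destination 4 (pindirs) and data 1, which matches the set pindirs, 1 in the comment.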

Loading the Program

hardware/pio.h defines the following:

typedef pio_hw_t *PIO;

#define pio0_hw ((pio_hw_t *)PIO0_BASE)
#define pio1_hw ((pio_hw_t *)PIO1_BASE)

#define pio0 pio0_hw
#define pio1 pio1_hw

In other words, this is valid:

PIO pio = pio0;

We can load our program now:

    uint offset = pio_add_program(pio, &blink_program);
    printf("loaded program at offset %u\n", offset);

This outputs:

loaded program at offset 31

Why 31? I’m not sure.
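The offset is consistent with an allocator that searches the 32-slot instruction memory from the top down, which is my understanding of how the SDK places relocatable programs (origin -1). The following is a host-side model of that search under that assumption, not the SDK's code. find_offset and its parameters are made up for illustration.

```c
#include <stdint.h>

#define INSTRUCTION_COUNT 32

/* Model: return the highest offset where `length` consecutive slots are
 * free in `used_mask` (bit i set = slot i occupied), or -1 if none fit. */
int find_offset(uint32_t used_mask, unsigned length) {
    if (length == 0 || length > INSTRUCTION_COUNT) return -1;
    uint32_t program_mask =
        (length == 32) ? 0xffffffffu : ((1u << length) - 1u);
    /* Start at the top of instruction memory and work downwards. */
    for (int i = (int)(INSTRUCTION_COUNT - length); i >= 0; i--) {
        if ((used_mask & (program_mask << i)) == 0) return i;
    }
    return -1;
}
```

With an empty instruction memory, a one-instruction program ends up at offset 31 in this model, and an eight-instruction program at 24.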

To clean up:

    pio_remove_program(pio, &blink_program, offset);

PIO Pins

For this program I want to use pin 25, AKA PICO_DEFAULT_LED_PIN. To put the pin under the control of the PIO block:

    pio_gpio_init(pio, PICO_DEFAULT_LED_PIN);

As far as setting pin direction (input/output), I’m not sure. It is possible to use set pindirs in the PIO program, but it is also possible to use PIO functions such as:

  • pio_sm_set_out_pins
  • pio_sm_set_consecutive_pindirs
  • pio_sm_set_pindirs_with_mask

etc.

I’ll stick with the PIO program for now.

State Machine

The Pico has 2 PIO blocks (pio0 and pio1 described above), each with 4 state machines. To claim an unused state machine:

uint sm = pio_claim_unused_sm(pio, true);

true asks the function to panic if there are no free state machines.

Next we need to configure the state machine. There is a “default config” function available:

pio_sm_config config = blink_program_get_default_config(offset);

There are functions to update this config, for example:

  • sm_config_set_out_pins
  • sm_config_set_fifo_join

etc.

I’m interested in two things:

  1. The PIO program is using set, so I need to set the set pin to PICO_DEFAULT_LED_PIN:

    sm_config_set_set_pins(&config, PICO_DEFAULT_LED_PIN, 1);
    
  2. The clock divider. I want each instruction to take 250ms which is very slow. Instructions can take a delay value between 0 and 31 (inclusive), but that’s clearly not enough. sm_config_set_clkdiv takes a div parameter where 1.0 means “full speed”, 2.0 means “half speed” etc. I want to find the right value to divide “full speed” so that it reaches 250ms. But what is “full speed”?

    We can find the current “speed” (expressed in Hz) by calling:

    clock_get_hz(clk_sys)
    

    If that is 4 (Hz), we can use a divider of 1.0. If it’s 8, we need a divider of 2.0, and if it’s X, we need a divider of X / 4.0. Then, when we divide X by the divider, we get 4.0 - that is, 4 Hz which is our target. However, by default X is 125,000,000 (125 MHz), which means that the divider would have to be 31,250,000. Unfortunately, the maximum divider is 65535.

    float div = 65535.0; // doesn't work: clock_get_hz(clk_sys) / 4.0f
    sm_config_set_clkdiv(&config, div);
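The divider arithmetic can be checked on the host without touching the hardware. This is a rough sketch: the three function names are hypothetical, and MAX_CLKDIV is simply the 65535 limit quoted in the text.

```c
#include <stdint.h>

#define MAX_CLKDIV 65535.0  /* limit used in the text above */

/* Divider needed to slow sys_hz down to target_hz. */
double required_divider(uint32_t sys_hz, double target_hz) {
    return (double)sys_hz / target_hz;
}

/* Frequency the state machine runs at for a given divider. */
double sm_frequency(uint32_t sys_hz, double divider) {
    return (double)sys_hz / divider;
}

/* Nonzero if the divider is within the range assumed above. */
int divider_fits(double divider) {
    return divider >= 1.0 && divider <= MAX_CLKDIV;
}
```

required_divider(125000000, 4.0) comes out at 31,250,000, which divider_fits rejects. With the largest divider the state machine still runs at roughly 1907 Hz.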
    

We can now initialise the state machine, and enable it:

pio_sm_init(pio, sm, offset, &config);
pio_sm_set_enabled(pio, sm, true);

true here means “enabled”.

The PIO Program

But at the moment our program is a bit lame - it just sets the pin direction, and… that’s it.

Let’s write a “blink” loop. The simplest option is

.wrap_target
    set pins, 1
    set pins, 0
.wrap

But that will run at 125 MHz / 65535 = 1907Hz. That’s a bit too fast for me.

What if we add the maximum delay?

.wrap_target
    set pins, 1 [31]
    set pins, 0 [31]
.wrap

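That’ll be 125 MHz / 65535 / 32 = 60Hz (roughly), because an instruction with a delay of 31 takes 32 cycles: one for the instruction itself plus 31 of delay. Still a bit fast - we want 4Hz.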

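But we can have up to 32 instructions, and one slot is already taken by set pindirs. So we can have 15 x set pins, 1 [31] and 15 x set pins, 0 [31]. That’s 15 x 32 = 480 cycles per state, or roughly 252ms. Yes, it’s a weird program: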

.wrap_target
    set pins, 1 [31]
    set pins, 1 [31]
    ...
    set pins, 1 [31]
    set pins, 0 [31]
    set pins, 0 [31]
    ...
    set pins, 0 [31]
.wrap

And that actually works. It’s not precise like sleep_ms(250) but that’s OK. Also, it’s possible to change the system clock’s frequency, but I really don’t feel like I want to go there.
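To see how close this gets to 250ms, the timing can be worked out on the host. This is a small sketch with made-up function names. It assumes an instruction with delay d occupies 1 + d state machine cycles.

```c
/* Cycles one instruction occupies: 1 for the instruction plus its delay. */
unsigned instr_cycles(unsigned delay) {
    return 1u + delay;
}

/* Seconds the pin stays in one state when `count` identical instructions
 * with the same delay run back to back. */
double half_period_s(unsigned count, unsigned delay,
                     double sys_hz, double divider) {
    double sm_hz = sys_hz / divider;  /* state machine clock */
    return (double)(count * instr_cycles(delay)) / sm_hz;
}
```

half_period_s(15, 31, 125e6, 65535.0) works out to about 0.2517 seconds, so the LED is on (or off) for a little under 252ms at a time.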

Interim Notes

Success - wrote a small PIO program which actually worked.

The steps:

  • Set the GPIO pins to be controlled by PIO
  • Load the program
  • Allocate a state machine
  • Configure the state machine
  • Run the program

Further Attempts

The first attempt above is quite naive. The first thing to note is that nop is a reasonable substitute for setting the pin once it’s already set to the value we want.

Regardless, we have scratch registers and more assembly instructions to have another go.

The base frequency is 125 MHz. Using a clock divider of 62500 we can get to 2000Hz (exactly). To get to 250ms (4Hz) we need 500 cycles between pin writes.

  • 500 = 25 x 20
  • 496 = 31 x 16

So we should be able to do something a bit nicer.
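The two factorisations above can be found mechanically. A quick sketch, assuming both the number of loop passes and the cycles per pass are capped at 32 (a 5-bit counter and a 5-bit delay). count_splits is a hypothetical helper.

```c
/* Count the ways to write `total` as iterations x cycles_per_iteration
 * with both factors in 1..32. The first pair found, scanning iterations
 * upwards, is stored in *iters and *cycles. */
int count_splits(unsigned total, unsigned *iters, unsigned *cycles) {
    int found = 0;
    for (unsigned n = 1; n <= 32; n++) {
        if (total % n != 0) continue;      /* n must divide total */
        unsigned c = total / n;
        if (c < 1 || c > 32) continue;     /* cycles per pass out of range */
        if (found == 0) { *iters = n; *cycles = c; }
        found++;
    }
    return found;
}
```

For 500 the only splits are 20 x 25 and 25 x 20. For 496 they are 16 x 31 and 31 x 16, which leaves 4 cycles for the instructions outside the loop.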

First of all, we can drop one statement with:

    pio_sm_set_consecutive_pindirs(pio, sm, PICO_DEFAULT_LED_PIN, 1, true);

Then this should work:

.program blink
.wrap_target
    // 1 + 3 + 31 x 16 cycles = 500 cycles = 250ms
    set pins, 1
    set x, 30 [2]       // jmp x-- runs x + 1 = 31 times
loop1:
    jmp x--, loop1 [15]
    // and now with pin=0
    set pins, 0
    set x, 30 [2]
loop2:
    jmp x--, loop2 [15]
.wrap

And… it does!
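The cycle count is easy to get wrong by one, because jmp x-- tests X before decrementing it, so a counter loaded with N makes the jump execute N + 1 times. Here is a small host-side model of one half of the loop (half_cycle_count is my own name, nothing from the SDK):

```c
/* Model of one half of the blink loop:
 *     set pins, v
 *     set x, x_init [set_delay]
 * loop:
 *     jmp x--, loop [jmp_delay]
 * Returns the number of state machine cycles until the loop falls through. */
unsigned half_cycle_count(unsigned x_init, unsigned set_delay,
                          unsigned jmp_delay) {
    unsigned cycles = 1;               /* set pins, v */
    cycles += 1 + set_delay;           /* set x, x_init [set_delay] */
    unsigned x = x_init;
    for (;;) {
        cycles += 1 + jmp_delay;       /* jmp x--, loop [jmp_delay] */
        int taken = (x != 0);          /* condition uses X before decrement */
        x--;                           /* decrement happens either way */
        if (!taken) break;             /* X was zero: fall through */
    }
    return cycles;
}
```

In this model a start value of 30 gives 1 + 3 + 31 x 16 = 500 cycles, while 31 gives 516 cycles (258ms at 2000Hz).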

More Fun and Games

It’s possible to do this with even fewer instructions, but it does require some support from the CPU. We can put 0x55555555 in the FIFO of our state machine, and use out instead of set. Shifting one bit at a time out of the OSR means that we’ll get alternating bits (1, 0, 1, 0, …), which is what we want to set the pin to.

If we enable autopull, the OSR automatically reads from the FIFO when it needs to. So all we need to do is make sure that the FIFO always has what we need.

The first change is to use out instead of set:

    sm_config_set_out_pins(&config, PICO_DEFAULT_LED_PIN, 1);

The second change is to enable autopull:

    sm_config_set_out_shift(&config, false, true, 32);

We then have to make sure that the FIFO always contains 0x55555555. Initial setup:

    pio_sm_put(pio, sm, 0x55555555);

(I put it after the call to pio_sm_init but before enabling the state machine).

And then, in a loop:

    while (true) {
        pio_sm_put_blocking(pio, sm, 0x55555555);
        sleep_ms(100);
    }

The program:

.program blink
.wrap_target
    out pins, 1
    set x, 30 [2]       // 31 loop passes: 1 + 3 + 31 x 16 = 500 cycles
loop1:
    jmp x--, loop1 [15]
.wrap

And that works as well.
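The order in which the bits of 0x55555555 reach the pin depends on the shift direction. As far as I understand the datasheet, shifting left sends the most significant bit first, so with shift_right set to false the sequence starts with 0 rather than 1. For a blinking LED either order is fine. Here is a host-side model of repeated out pins, 1 (shift_out_bits is a made-up helper):

```c
#include <stdint.h>

/* Model of `out pins, 1` executed n times on an OSR holding `word`.
 * shift_right mirrors the second argument of sm_config_set_out_shift.
 * The bits are stored in out[0..n-1] in the order the pin sees them. */
void shift_out_bits(uint32_t word, int shift_right,
                    unsigned n, unsigned *out) {
    for (unsigned i = 0; i < n && i < 32; i++) {
        if (shift_right) {
            out[i] = word & 1u;          /* LSB leaves first */
            word >>= 1;
        } else {
            out[i] = (word >> 31) & 1u;  /* MSB leaves first */
            word <<= 1;
        }
    }
}
```

Shifting left, the first four bits are 0, 1, 0, 1. Shifting right, they are 1, 0, 1, 0.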

While this is shorter, I do prefer the previous version - it doesn’t need anything from the CPU, so it’s a “fire and forget” version. The LED keeps blinking while the CPU is free to do whatever it needs to do.

I’m going to leave it here for now, but that’s a reasonable start.