Sampling audio with ESP32

The ESP32, a microcontroller with built-in WiFi and Bluetooth, is the successor to Espressif’s widely used ESP8266, which was a game-changer for hobbyists.

The ESP32 is a feature-rich system-on-a-chip (SoC) that practically necessitates an operating system to fully utilize its capabilities.

This tutorial addresses the specific challenge of sampling the analog-to-digital converter (ADC) from a timer interrupt on the ESP32 using the Arduino IDE. While the Arduino IDE may lack advanced features, it offers ease of setup and extensive library support for various hardware modules. However, we’ll prioritize performance by leveraging native ESP-IDF APIs alongside Arduino functions.

ESP32 Audio: Timers and Interrupts

The ESP32 has four hardware timers, grouped in pairs. Each timer has a 16-bit prescaler and a 64-bit counter. The prescaler divides the timer’s clock signal (derived from an 80 MHz internal clock) by letting through only every Nth tick. With a minimum prescale value of 2, the timer can tick, and therefore interrupt, at up to 40 MHz. At that maximum resolution, the handler code would have to execute within 6 clock cycles (240 MHz core / 40 MHz). Timers have several key properties:

  • divider—Frequency prescale value
  • counter_en—Enables/disables the timer’s 64-bit counter (usually enabled)
  • counter_dir—Determines if the counter increments or decrements
  • alarm_en—Enables/disables the counter’s action (“alarm”)
  • auto_reload—Resets the counter upon alarm trigger
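
The prescaler arithmetic described before the list can be sketched in plain C++ that also builds on a desktop compiler. The constant and function names are illustrative and not part of any ESP32 API; the clock values are the ones quoted in the text.

```cpp
#include <cassert>
#include <cstdint>

// Clock values quoted in the text.
const uint32_t kTimerBaseHz = 80000000;   // timer base clock, 80 MHz
const uint32_t kCpuHz       = 240000000;  // core clock, 240 MHz

// Tick rate after the prescaler passes only every `divider`-th base-clock tick.
inline uint32_t timerTickHz(uint32_t divider) {
  assert(divider >= 2 && divider <= 65535);  // 16-bit prescaler, minimum of 2 per the text
  return kTimerBaseHz / divider;
}

// CPU cycles available per interrupt when the alarm fires every `alarmTicks` ticks.
inline uint64_t cyclesPerInterrupt(uint32_t divider, uint64_t alarmTicks) {
  assert(divider >= 2 && alarmTicks >= 1);
  return static_cast<uint64_t>(kCpuHz) * divider / kTimerBaseHz * alarmTicks;
}
```

For example, cyclesPerInterrupt(2, 1) gives the 6-cycle budget mentioned above, while a divider of 80 with an alarm every 45 ticks leaves 10,800 cycles per interrupt.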

Important timer modes include:

  • Disabled: The timer hardware is inactive.
  • Enabled with alarm disabled: The timer runs, optionally incrementing/decrementing the internal counter, but triggers no action.
  • Enabled with alarm enabled: The timer operates as before, but performs an action (counter reset and/or interrupt generation) when the counter reaches a predefined value.
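
The three modes can be illustrated with a small software model of one timer. This is a deliberately simplified sketch, not the hardware's exact behaviour: TimerModel, setAlarm, and runTicks are hypothetical names, and the fields only mirror the properties listed earlier.

```cpp
#include <cassert>
#include <cstdint>

// Simplified software model of one timer; not a hardware-accurate emulation.
struct TimerModel {
  bool counter_en = false;    // false models the "disabled" mode
  bool count_up = true;       // counter_dir: true increments, false decrements
  bool alarm_en = false;
  bool auto_reload = false;
  uint64_t counter = 0;
  uint64_t alarm_value = 0;
  uint32_t alarms_fired = 0;  // stands in for "interrupts generated"

  void setAlarm(uint64_t value, bool reload) {
    assert(value > 0);
    alarm_value = value;
    auto_reload = reload;
    alarm_en = true;
  }

  // Advance the model by one prescaled clock tick.
  void tick() {
    if (!counter_en) return;
    if (count_up) ++counter; else --counter;
    if (alarm_en && counter == alarm_value) {
      ++alarms_fired;
      if (auto_reload) counter = 0;  // start the next period
    }
  }
};

inline void runTicks(TimerModel& t, uint64_t n) {
  for (uint64_t i = 0; i < n; ++i) t.tick();
}
```

With counter_en set and setAlarm(45, true), one million ticks of this model (one second at a 1 MHz tick rate) fire the alarm 22,222 times.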

While code can read timer counters, we usually configure the timer to generate interrupts for periodic actions. We then write code to handle these interrupts.

An interrupt handler must finish before the next interrupt arrives, which limits how complex it can be. Ideally, it is brief and mostly sets flags that non-interrupt code checks. Complex I/O operations are best delegated to a separate handler task.

In ESP-IDF, the vTaskNotifyGiveFromISR() function notifies a task about pending actions from the Interrupt Service Routine (ISR). Example code:

portMUX_TYPE DRAM_ATTR timerMux = portMUX_INITIALIZER_UNLOCKED; 
TaskHandle_t complexHandlerTask;
hw_timer_t * adcTimer = NULL; // our timer

void complexHandler(void *param) {
  while (true) {
    // Sleep until the ISR gives us something to do, or for 1 second
    uint32_t tcount = ulTaskNotifyTake(pdFALSE, pdMS_TO_TICKS(1000));
    if (tcount > 0) { // 0 means the wait timed out without a notification
      // Do something complex and CPU-intensive
    }
  }
}

void IRAM_ATTR onTimer() {
  // A critical section (spinlock) protects shared state; handler reentry shouldn't happen, but just in case
  portENTER_CRITICAL_ISR(&timerMux);

  // Do something, e.g. read a pin.
  
  if (some_condition) { 
    // Notify complexHandlerTask that the buffer is full.
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    vTaskNotifyGiveFromISR(complexHandlerTask, &xHigherPriorityTaskWoken);
    if (xHigherPriorityTaskWoken) {
      portYIELD_FROM_ISR();
    }
  }
  portEXIT_CRITICAL_ISR(&timerMux);
}

void setup() {
  xTaskCreate(complexHandler, "Handler Task", 8192, NULL, 1, &complexHandlerTask);
  adcTimer = timerBegin(3, 80, true); // 80 MHz / 80 = 1 MHz hardware clock for easy figuring
  timerAttachInterrupt(adcTimer, &onTimer, true); // Attaches the handler function to the timer 
  timerAlarmWrite(adcTimer, 45, true); // Interrupts when counter == 45, i.e. about 22,222 times a second
  timerAlarmEnable(adcTimer);
}
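
The alarm value follows from the tick rate and the desired sample rate. The helpers below are a minimal sketch of that calculation in plain C++; alarmTicksFor and actualRateHz are hypothetical names, not Arduino or ESP-IDF functions. Rounding to the nearest tick gives 45 for a 22,050 Hz target at a 1 MHz tick rate; whether that was the intended target for the value used in setup() is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Alarm value (in timer ticks) that comes closest to `sampleRateHz`
// interrupts per second at the given tick rate.
inline uint64_t alarmTicksFor(uint32_t tickHz, uint32_t sampleRateHz) {
  assert(sampleRateHz > 0 && sampleRateHz <= tickHz);
  return (static_cast<uint64_t>(tickHz) + sampleRateHz / 2) / sampleRateHz;
}

// Interrupt rate actually obtained with an integer alarm value.
inline double actualRateHz(uint32_t tickHz, uint64_t alarmTicks) {
  assert(alarmTicks > 0);
  return static_cast<double>(tickHz) / static_cast<double>(alarmTicks);
}
```

Because the alarm value is an integer, the real rate usually differs slightly from the target: actualRateHz(1000000, 45) is about 22,222 Hz rather than 22,050 Hz. A smaller prescaler gives finer steps if the exact rate matters.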

Note: Refer to the ESP-IDF API and ESP32 Arduino core GitHub project for documentation on the code functions used throughout this article.

CPU Caches and the Harvard Architecture

The IRAM_ATTR clause in the onTimer() interrupt handler definition is crucial because CPU cores can only execute instructions and access data from embedded RAM, not flash storage. To address this, a 128 KiB IRAM cache within the 520 KiB RAM transparently loads code from flash. The ESP32’s “Harvard architecture” necessitates separate handling of code and data, extending to memory properties. IRAM, accessible only at 32-bit address boundaries, plays a special role.

ESP32 memory is non-uniform, with various regions serving different purposes. The largest continuous region is about 160 KiB, and user-accessible memory totals roughly 316 KiB.

Loading data from flash is slow and might require SPI bus access. Therefore, speed-critical code needs to fit within the IRAM cache, often less than 100 KiB, as the operating system utilizes a portion. Notably, if interrupt handler code is not cached during an interrupt, the system throws an exception. The IRAM_ATTR specifier on onTimer() instructs the compiler and linker to statically allocate this code in IRAM, preventing it from being swapped out.

However, IRAM_ATTR only affects the function it is applied to, not functions called within it.
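
In practice this means every helper that the handler calls needs its own IRAM_ATTR. The sketch below shows the idea with hypothetical names (configureRing, nextIndex, onTimerSketch). The empty fallback definition of IRAM_ATTR is an assumption added only so the code also compiles on a desktop compiler; on the ESP32 toolchain the real macro is used.

```cpp
#include <cassert>
#include <cstdint>

// Fallback for non-ESP32 builds only; the ESP32 headers define the real macro.
#ifndef IRAM_ATTR
#define IRAM_ATTR
#endif

static uint16_t ringSize = 1;
static volatile uint16_t ringPos = 0;

// Runs at setup time, outside interrupt context, so it needs no IRAM_ATTR.
void configureRing(uint16_t size) {
  assert(size > 0);
  ringSize = size;
  ringPos = 0;
}

// Called from the handler, so it carries its own IRAM_ATTR.
static uint16_t IRAM_ATTR nextIndex(uint16_t pos) {
  return (pos + 1 >= ringSize) ? 0 : static_cast<uint16_t>(pos + 1);
}

// Hypothetical handler: both it and its helper are marked for IRAM.
void IRAM_ATTR onTimerSketch() {
  ringPos = nextIndex(ringPos);
}
```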

Sampling ESP32 Audio Data from a Timer Interrupt

Audio signal sampling typically involves using a memory buffer to store samples collected from an interrupt. A handler task is then notified when data is ready.

The ESP-IDF documentation describes the adc1_get_raw() function, which measures data on a specific ADC channel of the primary ADC peripheral (the second one is used by WiFi). However, utilizing this function within the timer handler leads to instability due to its complexity and reliance on numerous IDF functions (particularly those handling locks). Neither adc1_get_raw() nor its dependencies are IRAM_ATTR marked, making the interrupt handler prone to crashes when code execution (potentially involving WiFi, SPIFFS, or other components) forces the ADC functions out of IRAM.

Note: Certain IRAM_ATTR marked IDF functions, like vTaskNotifyGiveFromISR(), are designed for safe use within interrupt handlers.

The recommended workaround involves the interrupt handler notifying a task to perform ADC sampling. While conceptually sound, this approach introduces significant overhead due to operating system interactions and extensive instruction execution, potentially impacting CPU availability for other tasks.

Digging through IDF Source Code

Given the simplicity of ADC sampling, an alternative approach is to directly replicate the IDF’s implementation without relying on the provided API. Examining the adc1_get_raw() function in the IDF’s rtc_module.c file reveals that out of its eight or so actions, only the call to adc_convert() actually samples the ADC. Fortunately, adc_convert() is straightforward, manipulating peripheral hardware registers through a global SENS structure.

Adapting this code for our program is simple:

int IRAM_ATTR local_adc1_read(int channel) {
    uint16_t adc_value;
    SENS.sar_meas_start1.sar1_en_pad = (1 << channel); // only one channel is selected
    while (SENS.sar_slave_addr1.meas_status != 0);     // wait while the ADC reports a non-idle status
    SENS.sar_meas_start1.meas1_start_sar = 0;          // clear, then set, the start bit to begin a conversion
    SENS.sar_meas_start1.meas1_start_sar = 1;
    while (SENS.sar_meas_start1.meas1_done_sar == 0);  // busy-wait until the conversion is done
    adc_value = SENS.sar_meas_start1.meas1_data_sar;   // read the raw result
    return adc_value;
}

We then include the necessary headers to access the SENS variable:

#include <soc/sens_reg.h>
#include <soc/sens_struct.h>

Finally, we call adc1_get_raw() once, right after ADC setup and before enabling the timer, so that its configuration steps have already run by the time the handler starts sampling.

This approach has a drawback: potential conflicts with other IDF functions that modify ADC configuration. At least WiFi, PWM, I2C, and SPI do not interfere. However, if conflicts arise, calling adc1_get_raw() can restore the correct configuration.

ESP32 Audio Sampling: The Final Code

With the local_adc1_read() function implemented, the timer handler code becomes:

#define ADC_SAMPLES_COUNT 1000
int16_t abuf[ADC_SAMPLES_COUNT];
int16_t abufPos = 0;

void IRAM_ATTR onTimer() {
  portENTER_CRITICAL_ISR(&timerMux);

  abuf[abufPos++] = local_adc1_read(ADC1_CHANNEL_0);
  
  if (abufPos >= ADC_SAMPLES_COUNT) { 
    abufPos = 0;

    // Notify adcTask that the buffer is full.
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    vTaskNotifyGiveFromISR(adcTaskHandle, &xHigherPriorityTaskWoken);
    if (xHigherPriorityTaskWoken) {
      portYIELD_FROM_ISR();
    }
  }
  portEXIT_CRITICAL_ISR(&timerMux);
}

Here, adcTaskHandle represents the FreeRTOS task responsible for processing the audio buffer, similar to the complexHandler function in the earlier snippet. This task would create a local copy of the buffer for further analysis, compression, transmission, or other processing.
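
As a rough sketch of the "local copy" step, the function below copies the shared buffer into task-local storage and computes a few statistics on the copy. BufferStats and copyAndMeasure are hypothetical names, and the statistics are only a placeholder for real processing. The code is plain C++ with no FreeRTOS calls, so it can be tried on a desktop compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct BufferStats {
  int16_t minValue;
  int16_t maxValue;
  int32_t mean;
};

// Copy `count` samples from the shared buffer (abuf in the article) into
// task-local storage, then report min, max, and mean of the copied samples.
inline BufferStats copyAndMeasure(const volatile int16_t* shared,
                                  int16_t* local, size_t count) {
  assert(shared != nullptr && local != nullptr && count > 0);
  int64_t sum = 0;
  int16_t lo = INT16_MAX;
  int16_t hi = INT16_MIN;
  for (size_t i = 0; i < count; ++i) {
    const int16_t s = shared[i];  // read each shared sample exactly once
    local[i] = s;
    if (s < lo) lo = s;
    if (s > hi) hi = s;
    sum += s;
  }
  return BufferStats{lo, hi, static_cast<int32_t>(sum / static_cast<int64_t>(count))};
}
```

In the handler task, a call such as copyAndMeasure(abuf, localBuf, ADC_SAMPLES_COUNT), with localBuf being a task-owned array, would go right after ulTaskNotifyTake() returns and before any slow work, because the ISR keeps writing new samples into abuf in the meantime.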

Interestingly, using Arduino API functions like analogRead() instead of their ESP-IDF counterparts (adc1_get_raw()) would work because Arduino functions are IRAM_ATTR marked. However, their higher level of abstraction results in slower performance compared to the ESP-IDF equivalents. Notably, our custom ADC read function is roughly twice as fast as the ESP-IDF version.

ESP32 Projects: To OS or Not to OS

This exercise highlights the trade-offs of using an operating system. We essentially re-implemented an OS API to circumvent issues stemming from the OS itself.

Direct programming, sometimes in assembly, is common for smaller microcontrollers, giving developers absolute control. However, this becomes cumbersome for complex microcontrollers like the ESP32, with its diverse peripherals, dual CPU cores, and intricate memory layout.

Operating systems, despite imposing limitations, generally offer faster and simpler development. Nevertheless, as demonstrated, occasional workarounds can be beneficial, especially in embedded systems.

Licensed under CC BY-NC-SA 4.0