
ESP-Audio Design Guideline

This document describes known ESP32 issues, memory system constraints, peripheral and Wi-Fi considerations, and hardware design guidelines that users should be aware of in an ESP32 audio project.

Contents

[TOC]

1. I2S Issues

1.1. Bits Restriction

24-bit audio cannot be played directly. It can either be expanded to 32 bits by appending 8 zero bits to each sample, or reduced to 16 bits by dropping the last 8 bits of each sample.
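
The two conversions can be sketched as follows. This is only an illustration: it assumes the source data is packed little-endian 24-bit PCM (3 bytes per sample) and that "the last 8 bits" means the least significant byte. The function names `pcm24_to_pcm32` and `pcm24_to_pcm16` are hypothetical and not part of the I2S driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand packed little-endian 24-bit PCM to 32-bit samples.
 * The 24 audio bits land in the upper three bytes; the low byte is zero. */
static void pcm24_to_pcm32(const uint8_t *src, int32_t *dst, size_t samples)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < samples; i++) {
        uint32_t v = ((uint32_t)src[3 * i]     << 8)  |
                     ((uint32_t)src[3 * i + 1] << 16) |
                     ((uint32_t)src[3 * i + 2] << 24);
        dst[i] = (int32_t)v;
    }
}

/* Reduce packed little-endian 24-bit PCM to 16-bit samples
 * by dropping the least significant byte of each sample. */
static void pcm24_to_pcm16(const uint8_t *src, int16_t *dst, size_t samples)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < samples; i++) {
        uint16_t v = (uint16_t)(src[3 * i + 1] | ((uint16_t)src[3 * i + 2] << 8));
        dst[i] = (int16_t)v;
    }
}
```

Expanding to 32 bits keeps the full resolution at the cost of a third more memory, while reducing to 16 bits saves memory and discards the lowest 8 bits of each sample.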

1.2. Mono

I2S has a bug in mono mode that reverses the order of each pair of adjacent samples.
Add the following code before calling i2s_write_bytes() or after calling i2s_read_bytes().

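// for 16-bit I2S: swap each pair of adjacent samples (i2sBufLen is in bytes)
if (channel == 1) {
    int16_t *tempBuf = (int16_t *)i2sBuffer;
    int16_t tempBox;
    for (int i = 0; i + 1 < i2sBufLen / 2; i += 2) {
        tempBox = tempBuf[i];
        tempBuf[i] = tempBuf[i + 1];
        tempBuf[i + 1] = tempBox;
    }
}

// for 32-bit I2S: swap each pair of adjacent samples (i2sBufLen is in bytes)
if (channel == 1) {
    int32_t *tempBuf = (int32_t *)i2sBuffer;
    int32_t tempBox;
    // step by 2 elements so that every pair is swapped, not every other pair
    for (int i = 0; i + 1 < i2sBufLen / 4; i += 2) {
        tempBox = tempBuf[i];
        tempBuf[i] = tempBuf[i + 1];
        tempBuf[i + 1] = tempBox;
    }
}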

1.3. Slave Mode

When I2S operates in slave mode, the master device's MCLK (or chip clock) should be provided by the ESP32. The I2S MCLK can be output on GPIO0, GPIO1 or GPIO3.

If the master device cannot accept an external clock, the clock mismatch produces noise. In that case, switching the ESP32's I2S to master mode is the only way to solve the issue. For the same reason, two ESP32s cannot communicate with each other over I2S, since the ESP32 cannot accept an external clock.

1.4. Clock

The ESP32's I2S has two different clock sources. Refer to the ESP32 Datasheet and Technical Reference in Espressif Documents for details.

Only one of them is a high-quality clock: the APLL (Audio PLL), which provides a very accurate clock for demanding applications. This also means that although the ESP32 has two I2S peripherals, only one of them can use the APLL.

The other clock is divided down from a 160 MHz clock. Its timing is good enough for most audio scenarios.

The APLL does not support an 8 kHz sample rate; the I2S driver switches to the other clock if an 8 kHz sample rate is selected.

APLL support is already integrated in the I2S driver. Modify the I2S initialization code to enable or disable it.

1.5. I2S Output Mechanism

Several DMA-capable ring buffers are in charge of delivering data to the I2S hardware, which reads them in an endless loop.
This mechanism has two important properties. First, the ring buffers are always being fed to the I2S hardware, so some audio data can loop perpetually. Second, and more subtly, the return of i2s_write() does not mean that all of the audio data has been played: i2s_write() returns as soon as all the data has been written to the I2S hardware, while the hardware still needs some time to play it out.

To avoid these two issues, write zero (dummy) data to the I2S ring buffers when playback stops, and call i2s_zero_dma_buffer() to mute the output instead of calling i2s_stop().

/*
    If the initial i2s buffer size is 3 * 300 * 4 bytes
*/
i2s_config_t i2s_config = {
    ...
    .dma_buf_count = 3,  /*!< number of DMA buffer sectors */
    .dma_buf_len = 300,  /*!< size of each DMA buffer sector (in words, i.e. 4 bytes) */
};

i2s_driver_install(0, &i2s_config, 0, NULL);
...

while (1)
{
    //playing music
    i2s_write_bytes(0, music_data, size, portMAX_DELAY);
    ...
    
    if (music_finished) 
    {
        // fill the whole ring buffer with zeros; music_data must hold at least zero_size bytes
        int zero_size = i2s_config.dma_buf_count * i2s_config.dma_buf_len * 4;
        memset(music_data, 0, zero_size);
        i2s_write_bytes(0, music_data, zero_size, portMAX_DELAY); // I2S then keeps playing zero data
        break;
    }
    
    if (mute)
    {
        i2s_zero_dma_buffer(0); // clears all of the ring buffers immediately
        break;
    }
}
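
How much audio can still be queued after a write returns follows from the buffer configuration. The sketch below is an estimate under stated assumptions: it takes each dma_buf_len unit as 4 bytes, as in the configuration above, and assumes 16-bit stereo so that one 4-byte unit is one audio frame. The helper names `dma_ring_bytes` and `dma_drain_ms` are hypothetical and not part of the I2S driver.

```c
#include <assert.h>
#include <stdint.h>

/* Total DMA ring size in bytes, assuming each dma_buf_len unit is 4 bytes. */
static uint32_t dma_ring_bytes(uint32_t buf_count, uint32_t buf_len)
{
    return buf_count * buf_len * 4u;
}

/* Upper bound (ms, rounded up) on the time the hardware needs to play out
 * a completely full ring. Assumes one 4-byte unit is one audio frame. */
static uint32_t dma_drain_ms(uint32_t buf_count, uint32_t buf_len,
                             uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    uint32_t frames = buf_count * buf_len;
    return (frames * 1000u + sample_rate_hz - 1u) / sample_rate_hz;
}
```

With dma_buf_count = 3 and dma_buf_len = 300, the ring holds 3600 bytes, which is about 21 ms of audio at 44.1 kHz. That is roughly how long old data could keep playing after the last write returns.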

2. RAM system

With the "hello_world" example, about 290 KB of internal Data-RAM is free. This may be insufficient for an audio system, so the ESP32 can use up to 4 MiB of external SPI RAM (i.e. PSRAM). The external memory is incorporated in the memory map and, within certain restrictions, is usable in the same way as internal Data-RAM.

Refer to the PSRAM section of the IDF documentation for details, and pay particular attention to its Restrictions section.

Note: BT and Wi-Fi cannot coexist without PSRAM because there is not enough internal RAM.

2.1. PSRAM

  • When the flash cache is disabled, external RAM becomes inaccessible. A function like flash_read(buffer, 512) therefore crashes if buffer is allocated in PSRAM, and the same crash occurs when PSRAM is accessed from an IRAM Interrupt Service Routine, etc.
  • Task stacks are allocated in internal RAM by default. Users can, however, use xTaskCreateStatic() to create tasks whose stack is in PSRAM (see the options in the PSRAM and FreeRTOS menuconfig), but pay attention to the help information of those options.

Don't use ROM code in an xTaskCreateStatic task: the ROM code itself is linked in via components/esp32/ld/esp32.rom.ld. However, you also need to consider anything that calls ROM functions, as well as other code that isn't recompiled against the patch, such as the Wi-Fi and BT libraries. In general, we advise using this only in threads that do not call any IDF libraries (including libc) and that only do calculations and use FreeRTOS primitives to talk to other threads.

  • Tips for using PSRAM & RAM:
    Call char *buf = heap_caps_malloc(1024 * 10, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT) instead of malloc(1024 * 10) to use PSRAM, and call char *buf = heap_caps_malloc(512, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT) to use internal RAM.
    Not relying on malloc() to use PSRAM automatically gives users full control of the memory. Most importantly, it prevents internal RAM from being used up by other malloc() calls, and it reserves more internal memory for performance-critical use and for task stacks, since PSRAM cannot be used as task stack memory.
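
One way to make the choice of memory explicit at every call site is to wrap the two heap_caps_malloc() calls in small helpers. The sketch below is only an illustration: the helper names `audio_alloc_bulk` and `audio_alloc_fast` are hypothetical, and the `#else` branch is a host-side stand-in (an assumption, not IDF code) so that the snippet also compiles off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
#else
/* Host-side stand-ins so the sketch compiles off-target (not IDF code). */
#define MALLOC_CAP_8BIT     (1u << 2)
#define MALLOC_CAP_SPIRAM   (1u << 10)
#define MALLOC_CAP_INTERNAL (1u << 11)
static void *heap_caps_malloc(size_t size, unsigned caps)
{
    (void)caps;
    return malloc(size);
}
#endif

/* Large, latency-tolerant buffers (e.g. stream buffers) go to PSRAM. */
static void *audio_alloc_bulk(size_t size)
{
    assert(size > 0);
    return heap_caps_malloc(size, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
}

/* Small, performance-critical buffers stay in internal RAM. */
static void *audio_alloc_fast(size_t size)
{
    assert(size > 0);
    return heap_caps_malloc(size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}
```

Both helpers return NULL when the requested region is exhausted, so callers should check the result; the memory can be released with free().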

2.2. Optimization of Internal RAM

Internal RAM is more valuable because PSRAM has some restrictions. Here are some tips for optimizing internal RAM usage.

  • If PSRAM is in use, set all the static buffers to their minimum values in "Component config -> Wi-Fi"; if PSRAM is not used, select dynamic buffers to save memory. Refer to the Wi-Fi Buffer Usage section of the IDF documentation for details.
  • If PSRAM and BT are used, set CONFIG_BT_ALLOCATION_FROM_SPIRAM_FIRST and CONFIG_BT_BLE_DYNAMIC_ENV_MEMORY to "yes" to move more than 40 KB of memory to PSRAM.
  • If PSRAM and Wi-Fi are used, set CONFIG_WIFI_LWIP_ALLOCATION_FROM_SPIRAM_FIRST to "yes" to move some memory to PSRAM.
  • Set CONFIG_WL_SECTOR_SIZE to 512 in "Component config -> Wear Levelling".

Attention: The smaller the sector size, the slower the read/write speed, and vice versa. Only 512 and 4096 are supported.

  • Use the RAM-reduced decoder libraries.
    There are special decoder libraries that need much less memory than their full-featured versions, at the expense of some audio quality. If RAM and efficiency take priority over quality and the quality loss is acceptable, replace all the files in esp-audio-app/components with the ones in the esp-audio-app/tools/Performance_Codec_Library/ directory.

3. Wi-Fi Issues

3.1. Configuration

Once PSRAM is in use, the "Type of WiFi Tx Buffer" option must be set to "STATIC". Be aware that each static Tx and Rx buffer occupies 1.6 KB of internal RAM. If PSRAM is not used, select dynamic buffers to reduce internal RAM usage.

3.2. Performance

The following methods should be adopted to achieve high Wi-Fi performance in an audio project.

If you are not using an official ESP32 module, make sure the board is well designed and the RF is well calibrated.

  • Set the following options in menuconfig.
    • Set Flash SPI mode to QIO
    • Set Flash SPI speed to 80 MHz
    • Set CPU frequency to 240 MHz
    • Set PSRAM clock speed to 80 MHz if PSRAM is used
    • Set "Default receive window size" to 5 times the "Maximum Segment Size" in "Component config > LWIP > TCP"
  • Provide enough buffer for audio input; more than 2 MB is recommended. If PSRAM is not used, a TF-card can be an alternative storage choice.
  • If an external antenna is used, change PHY_RF_CAL_PARTIAL to PHY_RF_CAL_FULL in idf/components/esp32/phy_init.c
  • Call esp_wifi_set_protocol(ESP_IF_WIFI_STA, (WIFI_PROTOCOL_11B|WIFI_PROTOCOL_11G)); after esp_wifi_set_mode(WIFI_MODE_STA); to enable Wi-Fi 11g mode. (This method is optional and will be removed in a future version.)

4. FatFS

To ensure FatFS works well, choose a suitable OEM Code Page and enable long filename support in "Component config -> FAT Filesystem support", along with any other options you need.

5. Hardware Awareness

5.1. GPIO

Not all ESP32 GPIOs can be used as outputs; some of them are input-only. Refer to the ESP32 datasheet in Espressif Documents for details.

5.2. Wi-Fi Noise

Since Wi-Fi draws a large amount of current, it may cause noise issues in an audio project.

If only speakers are connected to the audio output, users can mute this noise by switching the power amplifier (PA) off and on.

This noise is mainly generated while Wi-Fi is connecting, so switching the PA off before the Wi-Fi connection and back on afterwards mutes the noise, provided that users can accept having no audio output while Wi-Fi is connecting.

The following hardware workarounds can be applied if users want to resolve this issue rather than bypass it.

  • Separate the power domains of the different components, for example with four different LDOs for the ESP32 module, TF-card, power amplifier, and external ADC or codec.
  • Add LC smoothing to the power amplifier LDO
    • The LC filter should be placed as close as possible to the power amplifier
    • The capacitor should be a 1000 uF electrolytic capacitor
    • The inductor should be a 10 mH coil inductor, and its wire diameter must be greater than 3 mm, otherwise the current will not be enough for 100% audio volume
  • If a Wi-Fi coverage range within 100 m is acceptable, set Max WiFi TX power to 17 dBm in "Component config > PHY" with the make menuconfig command.
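
As a rough check of the LC smoothing values listed above, the corner frequency of an ideal LC low-pass filter (ignoring the series resistance of the inductor and the ESR of the capacitor, which is an idealization) works out as:

```latex
f_0 = \frac{1}{2\pi\sqrt{LC}}
    = \frac{1}{2\pi\sqrt{(10\times10^{-3}\,\mathrm{H})\,(1000\times10^{-6}\,\mathrm{F})}}
    \approx 50.3\ \mathrm{Hz}
```

Supply ripple well above this frequency is attenuated by such a filter at roughly 40 dB per decade.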

5.3. External ADC Noise

Usually, initializing an external ADC or codec causes a sharp noise, so a GPIO that controls the PA (power amplifier) is recommended in the hardware and software design.

Note: this method only takes effect on speakers.

5.4. TF-card

There are two ways to use a TF-card: SD mode and SPI mode.

SD 1-line mode uses 3 GPIOs while SPI mode uses 4 GPIOs. However, the 3 GPIOs of SD 1-line mode are fixed, while the 4 GPIOs of SPI mode can be chosen freely by the programmer. In addition, JTAG shares some GPIOs with SD 1-line mode, so users have to choose one of the two.
There is also an SD 4-line mode for high-performance TF-card reading and writing, at the expense of 2 more fixed GPIOs.

5.5. ADC Interrupts

The ADC may affect GPIO33 and GPIO39 by triggering a spurious interrupt even if the voltage on GPIO33 or GPIO39 has not changed.

To avoid this issue, avoid using these two GPIOs as far as possible, or at least do not use them for protocol purposes (e.g. SPI, I2S). If they are used, for instance, for input detection, the program should decide whether an interrupt is valid by checking the activity on these two GPIOs (e.g. if a falling-edge interrupt arrives before any rising-edge interrupt has occurred, it is an invalid interrupt).
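
The validity check described above can be sketched as a small state tracker that accepts an edge only if it is consistent with the last accepted pin level. This is a minimal illustration, not driver code: the types and functions `edge_t`, `pin_filter_t`, `pin_filter_init` and `pin_filter_accept` are hypothetical, and the edge type is assumed to be supplied by the application's interrupt handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EDGE_RISING, EDGE_FALLING } edge_t;

/* Tracks the last accepted level of one input pin. */
typedef struct {
    bool level_high;
} pin_filter_t;

static void pin_filter_init(pin_filter_t *f, bool initial_level_high)
{
    assert(f != NULL);
    f->level_high = initial_level_high;
}

/* Returns true and updates the tracked level if the edge can follow the
 * previous state; returns false for an edge that cannot (e.g. a falling
 * edge while the pin is already considered low). */
static bool pin_filter_accept(pin_filter_t *f, edge_t edge)
{
    assert(f != NULL);
    bool valid = (edge == EDGE_RISING) ? !f->level_high : f->level_high;
    if (valid) {
        f->level_high = (edge == EDGE_RISING);
    }
    return valid;
}
```

A handler would call pin_filter_accept() for every interrupt on GPIO33 or GPIO39 and ignore the event when it returns false.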