Audio: SRC: Optimize HiFi4 and HiFi5 version code #9825

Merged: 4 commits, Feb 12, 2025


singalsu
Collaborator

No description provided.

@singalsu singalsu force-pushed the src_hifi5_optimizations branch from bae5764 to b1a4863 Compare February 11, 2025 12:05
@singalsu singalsu force-pushed the src_hifi5_optimizations branch from b1a4863 to 4c8b8f1 Compare February 11, 2025 14:39
@singalsu singalsu marked this pull request as ready for review February 11, 2025 14:40
@lgirdwood
Member

@singalsu can you check CI thanks !

@singalsu
Collaborator Author

> @singalsu can you check CI thanks !

Found it, there's a libs load failure to fix.

The intrinsics "functions" set these in their first use, although
it does not appear so when reading the code as normal C functions.
E.g. this code line loads data from address coefp into coef2 and
advances coefp.

	AE_L32X2F24_IP(coef2, coefp, sizeof(ae_f24x2));
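
A minimal portable sketch of why such a call writes to its arguments. LOAD32_POSTINC and sum_coefs are hypothetical names for illustration only; this is not how the Xtensa intrinsic is implemented, it only mimics the "load, then advance the pointer" behavior with a plain C macro.

```c
#include <stdint.h>

/* Hypothetical stand-in for a post-increment load. It is a macro, so it
 * assigns to both 'dst' and 'ptr' even though the call site looks like a
 * function call that takes them by value.
 */
#define LOAD32_POSTINC(dst, ptr, step_bytes) do { \
		(dst) = *(ptr); \
		(ptr) = (const int32_t *)((const char *)(ptr) + (step_bytes)); \
	} while (0)

/* Sum n coefficients. 'coef' has no explicit initializer; its first
 * write happens inside the macro.
 */
static int64_t sum_coefs(const int32_t *coefp, int n)
{
	int64_t sum = 0;
	int32_t coef;
	int i;

	for (i = 0; i < n; i++) {
		LOAD32_POSTINC(coef, coefp, sizeof(int32_t));
		sum += coef;
	}

	return sum;
}
```

A reader or a static checker that treats the macro as a by-value function call would see coef as used before it is set, which matches the situation the message describes.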

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
The file is copied as such from HiFi4 and added to the build
as preparation for optimizations.

The src_config.h is updated to follow the kconfig of FILTER
(CONFIG_FILTER_HIFI_NONE, CONFIG_FILTER_HIFI_3, etc.). The
gcc build is no longer forced to use the 16 bit SRC mode. That
was a limitation from the HiFi2 platform BYT. The new platforms
are powerful enough to run it.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch contains code optimizations in both generic C and
Xtensa HiFi5 intrinsics code.

- As a fix, the coefficient read alignment register is primed in
  the 16 bit coefficients version. Without priming the load works
  correctly only for aligned addresses.
- The FIR filter functions are split into stereo and any channel
  count versions. This avoids a channel count check for every
  intermediate and final sample value.
- The 32 bit coefficients version is similarly split into stereo
  and any channel count versions.
- In the 32 bit version the coefficient load is changed to a 128 bit
  load with aligning. The data load can't be similarly enhanced due
  to the frame granularity shift of data in the polyphase filter
  matrix computation.
- The dual-MAC in the FIR calculation is changed to quad-MAC.
- The data store in the stereo version is changed to a 64 bit store
  of a stereo frame.
- In src_polyphase_stage_cir() function the filters process loops
  are separated for stereo and for any channels count.
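
The stereo versus any channel count split in the list above can be sketched in generic C as follows. This is only an illustration under assumptions: fir_frame_any and fir_frame_stereo are hypothetical names, the data is assumed to be interleaved, and the real code uses HiFi intrinsics and a circular buffer that are not reproduced here.

```c
#include <stdint.h>

/* Any channel count: the channel loop and the variable stride are
 * evaluated for every output frame.
 */
static void fir_frame_any(const int32_t *x, const int32_t *h, int taps,
			  int channels, int64_t *acc)
{
	int ch, k;

	for (ch = 0; ch < channels; ch++) {
		acc[ch] = 0;
		for (k = 0; k < taps; k++)
			acc[ch] += (int64_t)h[k] * x[k * channels + ch];
	}
}

/* Stereo only: the stride is the constant two and both channels
 * share each coefficient load, so no channel count is checked.
 */
static void fir_frame_stereo(const int32_t *x, const int32_t *h, int taps,
			     int64_t *acc)
{
	int64_t a0 = 0;
	int64_t a1 = 0;
	int k;

	for (k = 0; k < taps; k++) {
		a0 += (int64_t)h[k] * x[2 * k];
		a1 += (int64_t)h[k] * x[2 * k + 1];
	}

	acc[0] = a0;
	acc[1] = a1;
}
```

Both functions give the same result for two channels; the caller would pick the stereo one once, outside the per-sample loops.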

In a HiFi5 build of sof-testbench4, the stereo 32 bit 44.1 kHz
to 48 kHz conversion improves from 18.37 to 16.26 MCPS.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch applies to the HiFi4 version a part of the optimizations
done for the HiFi5 version.

- As a fix, the coefficient read alignment register is primed in
  the 16 bit coefficients version. Without priming the load works
  correctly only for aligned addresses.
- The FIR filter functions are split into stereo and any channel
  count versions. This avoids a channel count check for every
  intermediate and final sample value.
- The 32 bit coefficients version is similarly split into stereo
  and any channel count versions.
- The data loads are changed to more efficient 32 bit loads without
  s24 conversion. The dual-MAC is changed to 32x32 with Q17.47
  format.
- The data store in the stereo version is changed to a 64 bit store
  of a stereo frame.
- In src_polyphase_stage_cir() function the filters process loops
  are separated for stereo and for any channels count.
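
For the Q17.47 format mentioned in the list above, a rough portable sketch of the arithmetic is shown below. It assumes Q1.31 inputs and hypothetical helper names (mul_q31_to_q17_47, q17_47_to_q31); the actual HiFi4 MAC intrinsics and their exact rounding are not reproduced.

```c
#include <stdint.h>

/* Q1.31 x Q1.31 gives a Q2.62 product. Dropping 15 fractional bits
 * leaves Q17.47, which has 16 guard bits for accumulation in 64 bits.
 * The right shift of a negative value is assumed to be arithmetic,
 * as it is on common compilers.
 */
static int64_t mul_q31_to_q17_47(int32_t a, int32_t b)
{
	return ((int64_t)a * b) >> 15;
}

/* Round a Q17.47 accumulator to Q1.31 with saturation */
static int32_t q17_47_to_q31(int64_t acc)
{
	int64_t r = (acc + (1LL << 15)) >> 16;

	if (r > INT32_MAX)
		return INT32_MAX;

	if (r < INT32_MIN)
		return INT32_MIN;

	return (int32_t)r;
}
```

For example, 0.5 times 0.5 (0x40000000 for both inputs) accumulates as 0.25 in Q17.47 and converts back to 0x20000000 in Q1.31.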

The load improves from 18.03 to 16.00 MCPS in stereo 32 bit
44.1 kHz to 48 kHz conversion.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
@singalsu singalsu force-pushed the src_hifi5_optimizations branch from 4c8b8f1 to 3372cfb Compare February 11, 2025 17:14
@singalsu
Collaborator Author

> > @singalsu can you check CI thanks !
>
> Found it, there's a libs load failure to fix.

Some timeout with python libraries, only with MTL. I rebased and pushed the same again.

@lgirdwood lgirdwood merged commit 0401893 into thesofproject:main Feb 12, 2025
45 of 49 checks passed