Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libpng 1.8 hardware support cleanup #583

Closed
wants to merge 9 commits into from
Closed

libpng 1.8 hardware support cleanup #583

wants to merge 9 commits into from

Conversation

jbowler
Copy link
Contributor

@jbowler jbowler commented Sep 7, 2024

  • cmake: Remove architecture-specific checks for hardware optimization
  • Remove unused machine dependent include and check
  • Fix no-read-transforms build
  • Unify hardware handling (1st step)
  • git cp pngpriv.h pnghardware.c
  • HARDWARE: remove build system requirements

@jbowler jbowler changed the base branch from libpng16 to master September 7, 2024 23:57
@jbowler
Copy link
Contributor Author

jbowler commented Sep 8, 2024

I don't have permission to request reviews so I will do it here:

I would like reviews from @ctruta and @csparker247 This is a draft although it could go into an alpha; all my own small tests passed!

This is a big change yet it is small; it deletes very very many lines of code and compresses the remainder into specific files where the individual hardware maintainers are on the hook for keeping it working. It's a one-liner to remove broken code (just edit out the #include in pnghardware.c).

Note that this PR includes Seth's PR; I didn't want to create a merge conflict (most of the #575 changes are still in there) so #575 should be first and this PR will simply rebase on it. #576 is there too but that should not be taken because I didn't rebase on it (to difficult since they are independent and git isn't monotone); the lines in #576 have all been moved.

@ctruta
Copy link
Member

ctruta commented Sep 8, 2024

I would like reviews from @ctruta and @csparker247

Not an actual review, just an informal note that it looks ok to me from a distance.

Here's an idea: rename pnghardware.c to pngsimd.c. The name "hardware optimization" was kinda goofy to begin with, because SIMD code running on a modern CPU is neither "more hardware" nor "less hardware" than traditional code (or SISD code) running on a traditional CPU.

SIMD-enabled CPUs might be modern, but "SIMD" as a concept has been around for ~50 years. Just sayin'...

@ctruta
Copy link
Member

ctruta commented Sep 8, 2024

More on this: about SIMD optimization: I suggest, for the nomenclature, to match the "optimization" nomenclature with the "sanitization" nomenclature. For example: in the CMake file, rename PNG_HARDWARE_OPTIMIZATIONS to PNG_OPTIMIZE_SIMD. (Following on names like SANITIZE_ADDRESS, SANITIZE_MEMORY, SANITIZE_UNDEFINED, etc., used elsewhere.)

Another idea is to simply use the enable/disable nomenclature, like almost everything else in the Autotools world (and others, like Mesonbuild). Consider the following:

./configure --optimize-simd=off
./configure --optimize-simd=on
./configure --optimize-simd=sse2  # x86 only
./configure --optimize-simd=avx2  # x86 only
./configure --optimize-simd=msa   # mips only
./configure --optimize-simd=mmi   # mips or longsoon
./configure --optimize-simd=ext   # longsoon only
./configure --optimize-simd=ext2  # longsoon only

or (for the other idea) the following:

./configure --disable-simd
./configure --enable-simd
./configure --enable-simd=sse2  # x86 only
./configure --enable-simd=avx2  # x86 only
...

Still brainstorming...

@jbowler
Copy link
Contributor Author

jbowler commented Sep 8, 2024

It only does SIMD at present but there are other things out there, like binary16, that may require hardware specific stuff.

There are no configure options: I just haven't got round to removing them yet because I know from prior experience that this can result in irretrievable merge conflicts. On the other hand I just did a rebase on 'master' and this has totally destroyed everything, so it might be a while...

@jbowler
Copy link
Contributor Author

jbowler commented Sep 8, 2024

./configure --enable-simd=sse2 # x86 only
./configure --enable-simd=avx2 # x86 only

We know everything we need to know about the target system from the compiler. Even in the Universal Binary - @csparker247's original build break - the compiler tells us when it runs what instruction set, or in the MIPS case sets, it supports.

The way this is controlled in configure, cmake and makefiles, is by calling the correct compiler with the correct instruction set flags so that what the compiler produces won't crash when it runs on the target system. This is done using the system configuration information; "toolchain files" in cmake, --host in configure (possibly with extra -mfpu= arguments for ARM) and CFLAGS setting -m arguments (for gcc or clang) when makefiles are make'ed

There's no way --enable-hardware-foo will work unless the compiler is set up to support it and setting compiler -m flags from a project CMakeLists.txt or configure is a recipe for bug reports; this is why MongoDB "does not work" on ARM any longer. They set a "-m" flag in their build which causes Mongo to crash on every Raspberry Pi system up to and including RPi4. I have to hack their build configuration to get it to run on my RPi4s (and then it works just fine.)

The compiler configuration has to be correct - libpng is just a library, the user had to compile other stuff too. Say that the target has ARM NEON instructions (as armv7a has) but then run the code on armv7 results in a crash; always build for the highest common factor target!

This "just works" with hosted builds (where the build-host and the target are identical) because the compiler has been built to support at least the build-host.

I'm inclined to completely remove even the "disable" option; it's not necessary if the build system is correctly set up and, in fact, if the build system isn't "disable" can be done in CPPFLAGS.

It's easy to add to a major release yet libpng 1.6 reveals that it is impossible to remove things. Hence the mess.

Why did I quote those two lines of yours? Search for this comment:

/* We are not actually using AVX, but checking for AVX is the best way we can
* detect SSE4.1 and SSSE3 on MSVC.
*/

It's easy to add a rule; note that the above does not even check for MSVC!

csparker247 and others added 8 commits September 9, 2024 08:26
Signed-off-by: John Bowler <jbowler@acm.org>
If an ARM build is attempted with read transforms disabled the hardware
code in arm/palette_neon_intrinsics does not compile.

Signed-off-by: John Bowler <jbowler@acm.org>
This provides support code in png_set_option and the new
PNG_HARDWARE_SUPPORTED to allow removal of the mass of #defines which
handle hardware-specific code from pngpriv.h.

The #defines will be removed in the next step along with actual
implementations of the initialization functions.

Signed-off-by: John Bowler <jbowler@acm.org>
This is done to preserve the history; in effect a large block of
pngpriv.h is being split off.

Signed-off-by: John Bowler <jbowler@acm.org>
This introduces png_hardware.c which is pretty much a copy of pngpriv.h
but then moves the hardware-specific parts to the arch/arch_init.c
files.

The change is extensive by line numbers but quite simple by effect;
since we have established that all the hardware-specific files can be
compiled on all platforms it is no longer necessary to have any hardware
specific tests in the main part of libpng.

This was easy for the filter code but the palette code requires work
within the main body of libpng, for this reason this is a
work-in-progress and therefore a draft pull request.  Some amount of
testing is important at this point as my cross-platform tests are
limited.  Nevertheless this can be included in a 1.8 alpha; it meets all
the requirements of an alpha, it just needs a little review from parties
who can do the cross platform stuff on platforms other than intel and
arm.

Signed-off-by: John Bowler <jbowler@acm.org>
This gives precedence to the MIPS MSA and Loongson MMI hardware
extensions over the newer Loongarch SX extensions; see the comments in
pnghardware.c

Signed-off-by: John Bowler <jbowler@acm.org>
Signed-off-by: John Bowler <jbowler@acm.org>
@csparker247
Copy link

My week has started with a plague of broken projects, so I don't think I'll get to test this today, but overall seems fine to me.

I generally agree about not needing the --enable/--disable flags for specific architecture optimizations. These flags provide convenience methods for behavior that can be controlled in other ways but at the expense of non-trivial complexity in the build system. It seems much more maintainable to let the builder provide -m (or the equivalent) based on their build requirements than it is to construct a complicated system for automatically detecting and controlling optimization support across multiple build systems, particularly when the best way to detect such support (in CMake at least) is to try-compile test code with the very flag the builder could provide.

That said, you can't ignore that the --enable/--disable flags are a convenience for people who maybe need to control the build for some reason but aren't particularly used to working with the compilers directly. I would think this is especially true in the CMake world, where people are used to projects providing convenience flags for everything. I think that if you're going to avoid something like --enable-simd=sse2 in the build system, you should probably add some documentation somewhere that helps these sorts of users know what to do across the various build systems. This doesn't need to be complicated, something like the following in some easily accessible place with a few examples across compilers:

# (CMake) Limit the SIMD level with CFLAGS/CXXFLAGS:
CXXFLAGS=-mno-sse4.1 cmake -S libpng/ -B build/

@jbowler
Copy link
Contributor Author

jbowler commented Sep 9, 2024

My week has started with a plague of broken projects, so I don't think I'll get to test this today, but overall seems fine to me.

It's still a work-in-progress so it's not worth examining the details yet. I will need to squash this PR to remove the multiple commits before any real reviews.

I think that if you're going to avoid something like --enable-simd=sse2 in the build system, you should probably add some documentation somewhere that helps these sorts of users know what to do across the various build systems. This doesn't need to be complicated, something like the following in some easily accessible place with a few examples across compilers

Agreed. There is already documentation of that form in scripts/pnglibconf.dfa for -mfpu=neon but since I'm suggesting replacing all that with a single option to disable target-specific code I had started putting documentation in pngsimd.c. It's convenient to do that during development and it can be split out when it is finalized if there is a better place.

The only practical option is to "disable" something because anything that is possible is enabled by default. I don't think we can rely on all compilers providing flags to switch off particular instruction sets though gcc-mno-sse2 does disable SSE2 (and hence the optimisations) so long as it is the last -m option.

This factors the original pnghardware.c into a header and the actual
implementation.

The header is pngtarget.h.  It sets general purpose macros prefixed with
PNG_TARGET_ to set up general changes to pngstruct.h and the actual
code in png*.c.  None of this depends on the actual target; any target,
new or old, can implement the "optimizations".

The code is in pngsimd.c.  If target-specific support is available it
includes a target-specific body of code then provides the additional
functions (all general purpose prefixed by png_target_) to implement the
target-specific code by calling the functions from the included
"target-specific body of code".

This means that new implementations do not modify the main body of the
libpng code except for a one-line addition of the include of a
macro-only file in pngtarget.h.  If someone objects to the code for some
reason they can just comment out that line and not a single bit of it
will even get parsed by the compiler!

That, however, should never be necessary: the "check" header file just
defines minimal macros and pngtarget.h verifies that they are
consistent.  I believe it is almost impossible for one "check" header to
clobber another; the compiler will produce a diagnostic.

The approach also means that the checks get performed in a consistent
way that allows minimization of the overhead of target specific code
support on targets which have none.

Signed-off-by: John Bowler <jbowler@acm.org>
@jbowler
Copy link
Contributor Author

jbowler commented Sep 10, 2024

@ctruta, @csparker247 I'm going to squash this as soon as @ctruta [apologies to @CTuta] has finished with .44; for the moment there is no point (since this is still a draft). I just reworked it to be closer to libpng16 in terms of its effects on non-supported targets. In fact I think it is now better than libpng16 - it does not force memory alignment where it is not required.

@jbowler
Copy link
Contributor Author

jbowler commented Sep 11, 2024

Merge conflict disaster area.

@jbowler jbowler closed this Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants