Xcode 15 + GCC = No simple build #1

Closed
mathomp4 opened this issue Sep 25, 2023 · 36 comments

@mathomp4
Owner

As found in my group by @bena-nasa, GCC no longer builds with Xcode 15. The build fails both by hand via the branch of @iains and via Homebrew.

We are trying to figure out the best/easiest way to effect a workaround. @iains says one way is back-installing Xcode 14, but on our NASA-maintained laptops this is...difficult.

Another path might be using the gfortran installer from @fxcoudert. I assume they built that with Xcode 14, and since I can run gcc on my Xcode 15 laptop with gcc from Homebrew (built by 14), that might work.

mathomp4 self-assigned this Sep 25, 2023
mathomp4 added the bug label Sep 25, 2023
@iains

iains commented Sep 25, 2023

We can actually build GCC/gfortran using Xcode 15, but it needs somewhat ugly configure and make environment values.

What is your bootstrap compiler?
Do you have a Command Line Install in the usual place (/Library/Developer/CommandLineTools/)?

mathomp4 changed the title from "Xcode 15 + GCC = No build" to "Xcode 15 + GCC = No simple build" Sep 25, 2023
@mathomp4
Owner Author

@iains I guess clang-15? You can see some of my instructions here in my modulefile:

https://github.com/mathomp4/m1modulefiles/blob/main/Core/gcc-gfortran/12.3.0.lua

It's very basic and I don't even have a CC or CXX in the configure line.

@iains

iains commented Sep 25, 2023

OK. Are you able to try an experiment?
If so, please add
--with-ld=/Library/Developer/CommandLineTools/usr/bin/ld-classic and perhaps (for the sake of safety) CC=clang CXX=clang++

This is not enough if you are using a previous version of GCC as the bootstrap compiler [i.e. the one that is used to build stage1 and $build system executables]. It is also not capable of building Ada or D [because those cannot be built by clang].
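
As a rough sketch of the experiment suggested above (not a verified recipe), the extra options could be appended to an otherwise ordinary configure invocation. The --with-ld path and the CC/CXX values are the ones given in this comment; the install prefix and the language list are placeholders assumed purely for illustration. The snippet only prints the command so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: assemble and print a GCC configure command using the workaround.
# PREFIX and LANGS are hypothetical placeholders, not values from this thread.
PREFIX="$HOME/installed/gcc-12"
LANGS="c,c++,fortran"

# Values suggested in the comment above.
LD_CLASSIC="/Library/Developer/CommandLineTools/usr/bin/ld-classic"
WORKAROUND_ARGS="--with-ld=${LD_CLASSIC} CC=clang CXX=clang++"

CONFIGURE_CMD="../configure --prefix=${PREFIX} --enable-languages=${LANGS} ${WORKAROUND_ARGS}"

# Print rather than run, so the line can be inspected first.
echo "${CONFIGURE_CMD}"
```

The command assumes it is run from a separate build directory inside the GCC source tree, which is why configure is addressed as ../configure.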

@iains

iains commented Sep 25, 2023

NOTE: Please DO NOT bake this change in; we are expecting Xcode to be fixed in due course, so that any actions in the short-term should be considered ephemeral work-arounds.

@mathomp4
Owner Author

@iains I'll give that a try. We are also trying the @fxcoudert method; if that works, that might be the "easier" path for our users.

Our admins can make a "press this button" that installs it and for most users, that's good enough. 😄

@fxcoudert

Homebrew already ships GCC 13 with Xcode 15 on macOS Sonoma (both Intel and ARM).
Our Ventura binaries are still with Xcode 14 at this point, and we are considering how to transition nicely (but there is no obvious solution).

@mathomp4
Owner Author

@fxcoudert Well, we have a couple issues.

  1. Our model can't use GCC 13 yet (seems to be a compiler bug that we need to find a workaround for)
  2. On our machines, we don't have sudo or admin rights, so we can't install homebrew in /opt. We have to install in home which means brew install gcc@XX is a fun long "build in the background" process since the bottle doesn't work.

@iains

iains commented Sep 25, 2023

@fxcoudert Well, we have a couple issues.

  1. Our model can't use GCC 13 yet (seems to be a compiler bug that we need to find a workaround for)

Please file a Bugzilla - or, at least an issue here, especially if this is a regression (although we do need to be able to reproduce the problem)

  2. On our machines, we don't have sudo or admin rights, so we can't install homebrew in /opt. We have to install in home which means brew install gcc@XX is a fun long "build in the background" process since the bottle doesn't work.

Yeah - I suppose that Homebrew is rewriting the library paths to absolute values? Unfortunately, that destroys the ability to relocate it.

NOTE: you are allowed to install things into /Users/Shared/ that can be used by everyone on the machine

@mathomp4
Owner Author

Homebrew already ships GCC 13 with Xcode 15 on macOS Sonoma (both Intel and ARM). Our Ventura binaries are still with Xcode 14 at this point, and we are considering how to transition nicely (but there is no obvious solution).

One more question: do you happen to have the Intel version of your Ventura 12.2 somewhere? Our group is mostly on Arm macs now, but a couple are still on Intel waiting for their hardware refresh.

(Of course, if the temp fix from @iains works, we can always build from scratch...though for Intel I guess we can go from "real" GCC tarball?)

@iains

iains commented Sep 25, 2023

Homebrew already ships GCC 13 with Xcode 15 on macOS Sonoma (both Intel and ARM). Our Ventura binaries are still with Xcode 14 at this point, and we are considering how to transition nicely (but there is no obvious solution).

One more question: do you happen to have the Intel version of your Ventura 12.2 somewhere? Our group is mostly on Arm macs now, but a couple are still on Intel waiting for their hardware refresh.

(Of course, if the temp fix from @iains works, we can always build from scratch...though for Intel I guess we can go from "real" GCC tarball?)

The "real" GCC tarball does not support the relocation yet (we are working on upstreaming those patches) - only matters if, say you want one team member to build it and share the result.

@mathomp4
Owner Author

The "real" GCC tarball does not support the relocation yet (we are working on upstreaming those patches) - only matters if, say you want one team member to build it and share the result.

Huh. Didn't know you could do that? Does that mean I don't really have to use your branch? I guess part of me figured "He knows M1, I'll keep using his code base!" 😄

@iains

iains commented Sep 25, 2023

The "real" GCC tarball does not support the relocation yet (we are working on upstreaming those patches) - only matters if, say you want one team member to build it and share the result.

Huh. Didn't know you could do that? Does that mean I don't really have to use your branch? I guess part of me figured "He knows M1, I'll keep using his code base!" 😄

You have to use the branch(es) for Arm64.
You do not have to use them for x86_64 (or i686 or powerpc) - but there are fixes and features on the branches that can make your life easier. (not to mention the idea of being consistent)

@iains

iains commented Sep 25, 2023

For the record, we are actually making a determined effort to upstream for GCC 14, but resources are limited (as always).

@mathomp4
Owner Author

Sigh. I guess we need to buckle down and figure out why GCC 13 hates our code if GCC 14 is coming along! :)

@mathomp4
Owner Author

Oh. I can confirm that using your new --with-ld and CC and CXX, I could build GCC 12. I'm now going through the rest of our stack (Open MPI, Baselibs, GEOS) to see if they are happy. If so, huzzah!

My colleague is trying the @fxcoudert bundle on his laptop and building the same stack.

@fxcoudert

My colleague is trying the @fxcoudert bundle on his laptop and building the same stack.

I'd be surprised if it works with Xcode 15, it was built against older Xcode.

@mathomp4
Owner Author

My colleague is trying the @fxcoudert bundle on his laptop and building the same stack.

I'd be surprised if it works with Xcode 15, it was built against older Xcode.

Interesting. Well, I haven't heard anything yet from him. Probably be tomorrow once the stack is built and tested. Meanwhile, I'm close to testing our model with the by-hand GCC...

That said, I'm still rocking a built-by-xcode-14 gcc@12 from brew that seems to work. We had to move to Open MPI 5 since Open MPI 4 has a clang-15 issue, but, working so far...

@mathomp4
Owner Author

Good news. Things seem to work with the build-by-hand GCC 12 using the @iains instructions (aka use ld-classic).

Of course, bad news for me, for some reason our model dies if we aren't running with debugging options. Grah. Time for some tedious compiler flag exclusion experiments.

@iains

iains commented Sep 26, 2023

Good news. Things seem to work with the build-by-hand GCC 12 using the @iains instructions (aka use ld-classic).

OK that's a workaround then.

Of course, bad news for me, for some reason our model dies if we aren't running with debugging options. Grah. Time for some tedious compiler flag exclusion experiments.

If you can, please try to produce some way in which we can reproduce your issue - and then we can try to fix whatever the underlying problem is.

@mathomp4
Owner Author

@iains More...oddities. A colleague of mine who used the @fxcoudert package for gfortran can get our model to run with Release flags. Now, in his case, he'd be using -march=armv8-a -mtune=generic:

elseif (${proc_description} MATCHES "Apple M")
  if (${CMAKE_HOST_SYSTEM_PROCESSOR} STREQUAL "arm64")
    # Testing show GEOS fails with -march=native on M1 in native mode
    set (GNU_TARGET_ARCH "armv8-a")
    set (GNU_NATIVE_ARCH ${GNU_TARGET_ARCH})
  elseif (${CMAKE_HOST_SYSTEM_PROCESSOR} STREQUAL "x86_64")
    # Rosetta2 flags per tests by @climbfuji
    set (GNU_TARGET_ARCH "westmere")
    set (GNU_NATIVE_ARCH "native")
    set (PREFER_AVX128 "-mprefer-avx128")
    set (NO_FMA "-mno-fma")
  endif ()

which is what I first tested with. Now, I think I have an M2 mac and I want to say his was from an M1 refresh cycle. Do you know of any "M2 Mac optimization was wonky in GCC 12"?

I'm going to try a run on an M1 I have access to, just to see...but I can't imagine armv8-a is bad for M2.

As noted above -march=native didn't seem to work when I tried it last...and it didn't for my tests yesterday, but native might just be armv8-a with gfortran 12. Then again, just -O3 failed for me so...huh.

@fxcoudert

@mathomp4 without specific reproducers or errors, it is very hard to guess. “Fails” is unclear: do you mean compilation error, linking error, runtime error, or simply that the results are different/unexpected/invalid?

A lot of factors come into play in large-scale numerical software. “Our model dies” could be invalid code exposed by a new optimisation, or a bug in the compiler, or many other things.

@mathomp4
Owner Author

@fxcoudert A bit annoying for me as well. This is what I get:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Could not print backtrace: executable file is not an executable
#0  0x10ff4f4f7
#1  0x10ff4e4d3
#2  0x1878aaa23
#3  0x1104bc3df

Now, I mean, it is an executable as it ran for like a minute or two before this crash, so I'm not sure what 'executable file is not an executable' means. 🤷🏼

My usual plan of attack after this is "Let's run with debugging on" to get a good traceback, but, well, it runs with our debugging options!

But, it is a segfault so memory got hosed somehow. I'm going to try and exactly duplicate what my colleague is using on my laptop. Same stack: compilers, MPI, libraries, etc. If I still crash, then that's a data point.

Also, I just looked and I can get a similar error (SIGSEGV or, oddly, SIGBUS) in some of our unit tests. So hopefully that can make tracking this down easier. (Though even this test requires our full stack: MPI, hdf5, netcdf, ESMF, GFE... so not quite at reproducer level yet.)

@mathomp4
Owner Author

mathomp4 commented Sep 27, 2023

New update. All is well on my M1 Max Mac Studio at work. This is...baffling. I mean, M1 and M2 aren't that different. The folks at archspec list -march=armv8.4-a for M1 and -march=armv8.5-a for M2. But we use -march=armv8-a which (from my reading of the GCC options) is just a "less exciting" version of the more specific flags.

And on my M1 and M2, -march=native seems to report the same thing:

$ gfortran-12 -march=native -Q --help=target | grep march
  -march=                     		armv8-a

Though gfortran-13 doesn't seem to report...anything:

$ gfortran-13 -march=native -Q --help=target | grep march
  -march=

@mathomp4
Owner Author

Oh boy. This might be the same GCC 13 error we get with GEOS. I'm thinking I did something bad in my modulefiles and GCC 13 snuck in via "brew's gfortran is gfortran-13".

I'm nuking everything and being careful.

@iains

iains commented Sep 27, 2023

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Could not print backtrace: executable file is not an executable
#0  0x10ff4f4f7
#1  0x10ff4e4d3
#2  0x1878aaa23
#3  0x1104bc3df

Now, I mean, it is an executable as it ran for like a minute or two before this crash, so I'm not sure what 'executable file is not an executable' means. 🤷🏼

not the most helpful diagnostic ever written, it seems ...

libbacktrace needs debug symbols; therefore, it actually needs to find a debug package for the executable (on macOS, unlike Linux, the debug info is not contained in the main executable). So, if you want to have a chance to backtrace, you need to do dsymutil MyExecutable (which will create a MyExecutable.dSYM package). At that point (depending on what debug was enabled) you should see some backtrace.
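
A minimal sketch of that step, assuming an executable named MyExecutable in the current directory (the placeholder name used above) and that dsymutil is available on the PATH:

```shell
#!/bin/sh
# Sketch: generate the .dSYM bundle that libbacktrace looks for on macOS.
# EXE is a placeholder; substitute the path of the real program.
EXE="MyExecutable"
DSYM="${EXE}.dSYM"

if command -v dsymutil >/dev/null 2>&1 && [ -f "${EXE}" ]; then
  # Collects the debug info referenced by the executable into ${DSYM}.
  dsymutil "${EXE}"
else
  echo "skipping: dsymutil or ${EXE} not found" >&2
fi

echo "debug bundle expected at: ${DSYM}"
```

How much the resulting backtrace shows still depends on the debug options the program was compiled with.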

@iains

iains commented Sep 27, 2023

New update. All is well on my M1 Max Mac Studio at work. This is...baffling. I mean, M1 and M2 aren't that different. The folks at archspec list -march=armv8.4-a for M1 and -march=armv8.5-a for M2. But we use -march=armv8-a which (from my reading of the GCC options) is just a "less exciting" version of the more specific flags.

And on my M1 and M2, -march=native seems to report the same thing:

$ gfortran-12 -march=native -Q --help=target | grep march
  -march=                     		armv8-a

Though gfortran-13 doesn't seem to report...anything:

$ gfortran-13 -march=native -Q --help=target | grep march
  -march=

I think that some work was done in the GCC-14 cycle to improve -march=native for aarch64. However, there is no profile in the backend for either M1 or M2. Perhaps @fxcoudert will make a recommendation based on Homebrew's experiences - I would be happy to add the profiles (we collected the necessary data) but, unfortunately, we are already overloaded with getting the port prepared for upstreaming (so this is something for the future) ...

AFAIR, the performance-related changes from M1->M2 are not that significant - the main addition is actually BTI which is more of a security thing + some 8bit vector/matrix ops - but I might misremember. In any event, I'd suggest not using native at the moment, but configuring to the closest base core.

edit : I think I'd stick with -march=armv8.4-a for M2, because AFAIR the M2 is not a complete armv8.5-a, it just has more of the 8.5 features than M1.
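
One way to act on that advice is to replace -march=native with an explicit value: the one the compiler reports for native, if any, and otherwise a fallback. This is only a sketch. The helper names parse_march and pick_march are made up for illustration, and the armv8.4-a fallback in the usage line is simply the suggestion from this comment, not a tested default.

```shell
#!/bin/sh
# Sketch: read the -march value from `-Q --help=target` output on stdin.
# Prints an empty line when the compiler leaves the value blank.
parse_march() {
  awk '$1 == "-march=" { print $2; exit }'
}

# Hypothetical helper: print the reported value, else the fallback in $1.
pick_march() {
  reported=$(parse_march)
  if [ -n "${reported}" ]; then
    echo "${reported}"
  else
    echo "$1"
  fi
}

# Usage (requires the compiler to be installed):
#   gfortran-12 -march=native -Q --help=target | pick_march armv8.4-a
```

Given the output shown earlier in this thread, the gfortran-12 line would yield armv8-a, while the empty gfortran-13 line would yield whatever fallback is passed in.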

@mathomp4
Owner Author

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Could not print backtrace: executable file is not an executable
#0  0x10ff4f4f7
#1  0x10ff4e4d3
#2  0x1878aaa23
#3  0x1104bc3df

Now, I mean, it is an executable as it ran for like a minute or two before this crash, so I'm not sure what 'executable file is not an executable' means. 🤷🏼

not the most helpful diagnostic ever written, it seems ...

libbacktrace needs debug symbols; therefore, actually it needs to find a debug package for the executable (on macOS, unlike Linux, the debug info is not contained in the main executable). So, if you want to have a chance to backtrace you need to do dsymutil MyExecutable (which will create a MyExecutable.dSYM package) At that point (depending on what debug was enabled) you should see some backtrace.

Oh. Huh. Okay. I can try doing that manually. It looks like doing it automatically isn't quite supported by CMake:

https://gitlab.kitware.com/cmake/cmake/-/issues/20256

@mathomp4
Owner Author

Well, more updates. One, a very careful rebuild gives the same error. 😞 Two, using dsymutil does provide a better traceback:

Backtrace for this error:
pe=00002 FAIL at line=00094    ExtDataConfig.F90                        <status=13>
pe=00002 FAIL at line=00084    ExtDataConfig.F90                        <status=13>
pe=00002 FAIL at line=00051    ExtDataOldTypesCreator.F90               <status=13>
pe=00002 FAIL at line=00311    ExtDataGridCompNG.F90                    <status=13>
pe=00002 FAIL at line=01901    MAPL_Generic.F90                         <status=13>
pe=00002 FAIL at line=00805    MAPL_CapGridComp.F90                     <status=13>
pe=00002 FAIL at line=00658    MAPL_CapGridComp.F90                     <status=13>
pe=00002 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=13>
pe=00002 FAIL at line=00316    MAPL_Cap.F90                             <status=13>
pe=00002 FAIL at line=00258    MAPL_Cap.F90                             <status=13>
pe=00002 FAIL at line=00192    MAPL_Cap.F90                             <status=13>
pe=00002 FAIL at line=00169    MAPL_Cap.F90                             <status=13>
pe=00002 FAIL at line=00033    GEOSgcm.F90                              <status=13>

though I also still get:

Could not print backtrace: executable file is not an executable
#0  0x1127c34f7
#1  0x1127c24d3
#2  0x1878aaa23
#3  0x112d303df

Now that traceback doesn't really help too much. It points to code that, if it does fail, should fail for all optimization types:

  94             hconfig_key = ESMF_HConfigAsStringMapKey(hconfigIter,_RC)

since ESMF was built the same no matter what.

So, I guess the good news is that the @fxcoudert package seems to work for me as well, which is surprising since it was built with Xcode 14 and just plopped onto my Xcode 15 laptop.

Meanwhile, GCC 12.3 built with Xcode 15 on the same system has an issue. Baffling... And I'm not even using fancy compiler options. Just ol' -O3, no arch or anything:

Fortran_FLAGS = -O3 -funroll-loops -g -ffree-line-length-none -fno-range-check 
  -Wno-missing-include-dirs -fbacktrace -Wno-unused-dummy-argument 
  -ffpe-trap=zero,overflow -fbacktrace -fallow-argument-mismatch 
  -fallow-invalid-boz -falign-commons 
  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk 
  -mmacosx-version-min=13.6 -J../../../../../include/MAPL.ExtData2G -fPIC

@mathomp4
Owner Author

mathomp4 commented Oct 2, 2023

One more update for myself. I decided to try brew install gcc@12 on my laptop after seeing Homebrew/homebrew-core#143368 from @fxcoudert hoping that that would let it build but:

==> ../configure --prefix=/Users/mathomp4/.homebrew/brew/opt/gcc@12 --libdir=/Users/mathomp4/.homebrew/brew/opt/gcc@12/lib/gcc/12 --disable-nls --enable-checking=release --with-gcc
==> make
Last 15 lines from /Users/mathomp4/Library/Logs/Homebrew/gcc@12/02.make:
4  0x1876cc440  _dispatch_client_callout2 + 20
5  0x1876dff1c  _dispatch_apply_invoke + 224
6  0x1876cc400  _dispatch_client_callout + 20
7  0x1876ddfb8  _dispatch_root_queue_drain + 684
8  0x1876de6c0  _dispatch_worker_thread2 + 164
9  0x187878038  _pthread_wqthread + 228
ld: Assertion failed: (resultIndex < sectData.atoms.size()), function findAtom, file Relocations.cpp, line 1336.
collect2: error: ld returned 1 exit status
make[6]: *** [libstdc++.la] Error 1
make[5]: *** [all-recursive] Error 1
make[4]: *** [all-recursive] Error 1
make[3]: *** [all] Error 2
make[2]: *** [all-stage1-target-libstdc++-v3] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2

Do not report this issue to Homebrew/brew or Homebrew/homebrew-core!

I tried building with -v but I didn't see the ld-classic in the configure line:

==> ../configure --prefix=/Users/mathomp4/.homebrew/brew/opt/gcc@12
  --libdir=/Users/mathomp4/.homebrew/brew/opt/gcc@12/lib/gcc/12
  --disable-nls --enable-checking=release --with-gcc-major-version-only
  --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-12
  --with-gmp=/Users/mathomp4/.homebrew/brew/opt/gmp
  --with-mpfr=/Users/mathomp4/.homebrew/brew/opt/mpfr
  --with-mpc=/Users/mathomp4/.homebrew/brew/opt/libmpc
  --with-isl=/Users/mathomp4/.homebrew/brew/opt/isl
  --with-zstd=/Users/mathomp4/.homebrew/brew/opt/zstd
  --with-pkgversion=Homebrew GCC 12.3.0
  --with-bugurl=https://github.com/Homebrew/homebrew-core/issues
  --with-system-zlib --build=aarch64-apple-darwin22
  --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk

I think the answer is that Homebrew/homebrew-core#143368 with:

      # Work around a bug in Xcode 15's new linker (FB13038083)
      if DevelopmentTools.clang_build_version >= 1500
        toolchain_path = "/Library/Developer/CommandLineTools"
        args << "--with-ld=#{toolchain_path}/usr/bin/ld-classic"
      end

only applies to gcc (gcc@13, as it were).

I'm going to try and remember how to do a self tap to see if I can insert that (and maybe other changes?) into a gcc@12.rb file and try a build from that.

@iains

iains commented Oct 2, 2023

Yes, this is all the same linker bug. The same workaround (--with-ld=/Library/Developer/CommandLineTools/usr/bin/ld-classic) should work on all current GCC versions.
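
Because the workaround is meant to be temporary, a build script might add the option only when ld-classic is actually present rather than hard-coding it. A sketch under that assumption follows; ld_classic_arg is a hypothetical helper name, and the default path is the one quoted above. Note that the presence of ld-classic is only a rough proxy here: it does not say whether the linker bug itself has been fixed.

```shell
#!/bin/sh
# Sketch: emit --with-ld=<path> only if an executable ld-classic exists there.
# With no argument, the Command Line Tools path from this thread is checked.
ld_classic_arg() {
  ld_path="${1:-/Library/Developer/CommandLineTools/usr/bin/ld-classic}"
  if [ -x "${ld_path}" ]; then
    echo "--with-ld=${ld_path}"
  fi
}

# Expands to nothing on systems without ld-classic.
EXTRA_CONFIGURE_ARGS=$(ld_classic_arg)
echo "extra configure args: ${EXTRA_CONFIGURE_ARGS}"
```

The resulting value could then be appended to whatever configure line is already in use, leaving that line unchanged on systems where the file is absent.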

@mathomp4
Owner Author

mathomp4 commented Oct 2, 2023

yes, this is all the same linker bug. The same workaround (--with-ld=/Library/Developer/CommandLineTools/usr/bin/ld-classic) should work on all current GCC versions.

I'll try it out...once I remember how to do self taps in Brew. I do it so infrequently I forget how to do it every time! 😄

@mathomp4
Owner Author

mathomp4 commented Oct 2, 2023

Figured out the self tap and it seems to have built. I guess next up is a full GEOS stack to see if it works. If so, we can go back to using brew's build.

I suppose I should make a PR to brew. I wonder, though, should I make it for all the GCC versions? @fxcoudert do you have advice? I can confidently say gcc@12 built, but I haven't done all the other versions...

@mathomp4
Owner Author

mathomp4 commented Oct 3, 2023

Yep. My self tap can build our full stack! I'll make a PR to brew.

@iains

iains commented Oct 3, 2023

If possible, please consider making some reproducer for your GCC-13 problems and raise an issue here.

Otherwise, should we consider this issue as resolved?

@mathomp4
Owner Author

mathomp4 commented Oct 3, 2023

Yes. I think so. Though I'm confused why the brew gcc@12 build works for me but not my "by hand" build of your 12.3 branch. Must be some magic sauce I'm missing in configure...

@iains

iains commented Oct 3, 2023

Yes. I think so. Though I'm confused why the brew gcc@12 build works for me but not my "by hand" build of your 12.3 branch. Must be some magic sauce I'm missing in configure...

We try very hard to avoid too much 'magic sauce' in configure lines (the intent is that the configure script should choose the correct settings for your environment). However, distributions do need to set paths etc. for their installations (and, I suspect, that the library path re-writing done by HB might be necessary for you to operate in step with an existing HB installation).

So, in summary, either one needs to be operating "stand alone" (which in the case of GCC/gfortran means supplying the dependent libs - or building them along with the compiler) - or as part of a distribution (which means making any changes that the distribution does).

So, if you're happy with the resolution - please close the issue (or let me know and I will) and @fxcoudert and I will very much look forward some day to a bug report for GCC13 :)

mathomp4 closed this as completed Oct 4, 2023