Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux install warning #85

Open
bbolker opened this issue Jan 8, 2022 · 47 comments
Open

Linux install warning #85

bbolker opened this issue Jan 8, 2022 · 47 comments

Comments

@bbolker
Copy link

bbolker commented Jan 8, 2022

I don't know if this is a significant problem, either substantively or from CRAN's point of view, but thought I would report it.

WARNING
Found the following significant warnings:
  .../nloptr.Rcheck/00_pkg_src/nloptr/src/nlopt-src/src/algs/slsqp/slsqp.c:959:12: 
warning: ‘__builtin_memset’ writing between 8 and 34359738368 bytes into a region of size 0
 overflows the destination [-Wstringop-overflow=]
See ‘.../nloptr.Rcheck/00install.out’ for details.

00install.out

R Under development (unstable) (2022-01-05 r81451)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Pop!_OS 21.04
[g++ (Ubuntu 10.3.0-1ubuntu1) 10.3.0]
@eddelbuettel
Copy link
Contributor

All of this is one and the same issue: nloptr decided to embed nlopt, but left it at a version that is outdated and (IIRC) three full releases behind the current nlopt.

nloptr, as an R package, is in fine shape. All the warnings AFAICT are from the embedded nlopt sources, which are outdated (and one could of course throw those out again too and rely on a 'system installation').

Anyway all this is eminently fixable but someone needs to step up and own it. @jyypma said he had a candidate, but said candidate got stuck on another CRAN detail. Is @astamm that candidate? If so, can we help?

@bbolker
Copy link
Author

bbolker commented Jan 8, 2022

I believe that @astamm is indeed the new intended maintainer (that was communicated in an e-mail to me and Martin Mächler): that's why I cross-posted to #4 . It's not clear whether CRAN knows that or not, which may be why communication was failing ...

I have a vague memory that in earlier versions nloptr tried to download sources from up-to-date sources from upstream at build time, if possible - maybe that was too hard to make work properly? (yes, that's right, see here)

@eddelbuettel
Copy link
Contributor

I contributed an extension years ago to use a system library (such as the the libnlopt-dev you can get from your Ubuntu derivative) and I still rely on those. That cleanly avoids all issues with CRAN about nlopt source code.

But it gets us back to the issue that nlopt needs to be present. That is clearly something we can sort out with Uwe and Tomas for Windows / Windows UCRT, and hopefully even with Simon for macOS. It would then push the burden (and occassional need to hand-holding) back to us all to explain to the occassional Unix user lost at bringing a library in. We cannot say if that is "better" than trying to appease the CRAN maintainers on whatever deviations the embedded nlopt code may have. It's simply a trade off.

But right we risk losing nlopt because of the joint decision to a) embed and b) keep that at older and apparently no-longer-good-enough-for-CRAN. The trouble here is that we can never tell ex ante which version would be good enough for CRAN as they insist, for better or worse and here clearly worse, to not make their test system truly replicable.

@astamm
Copy link
Owner

astamm commented Jan 13, 2022

We just merged PR #88 which mainly re-use an existing nlopt system build if pkg-config can detect it, which might improve the situation especially on Linux. Would you mind testing it out please?

@bbolker
Copy link
Author

bbolker commented Jan 13, 2022

Still get the warning.

I don't have nlopt installed locally; I can install it if you like (sudo apt install nlopt0; would I need libnlopt-cxx0 and libnlopt-cxx-dev as well?) — for no good reason I'm still not 100% up to dockerizing etc etc so I decided to hold off on installing it up until now in case we wanted to test the "without nlopt" case.

How is the CRAN communication going? Are they (yet) aware that you're on the job, are they willing to give extra time if necessary? If we could find out that they are willing to install nlopt on their test platforms that would make our set of high-priority testing targets a lot smaller ...

I'm considering communicating with the other 70+ downstream package maintainers - I'm not sure if they are up to speed or not and I got a somewhat panicked e-mail from a maintainer I know asking what I was planning to do ...

If you're not getting a response from CRAN let me/Dirk know, we can probably try some back channels.

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 14, 2022

As always sudo apt install libnlopt-dev because

  • we need the -dev package, not the run-time package, to build and link
  • nobody said anything about the C++ bindings so don't make assumptions
  • per checks we added you do need version 2.7.0 or newer which is not in Ubuntu 20.04
  • I added it to my ppa so sudo add-apt-repository ppa:edd/misc will help you for amd64 focal; I use it in CI here
  • What @astamm designed and implemented is a build from source on all systems that do not have a recent enough NLopt
  • For that you need cmake and nothing else

@bbolker
Copy link
Author

bbolker commented Jan 14, 2022

OK, thanks. FWIW I'm on PopOS! 21.04, which AFAIK is downstream of Ubuntu 21.04.

I'm still not clear (sorry) on whether I should (1) install all the system-level stuff and try again; or (2) keep my system as an "nlopt-free" platform to test that variant. (Depends on part on what CRAN maintainers say ... ?)

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 14, 2022

There are two cases:

  • if you not have libnlopt-dev (or too old a version), then the nloptr package which compile from its own sources; this requires cmake

  • if you do have it (as I do) then they are used; I documented both options with full logs in the first section of PR Minimal working layer to re-use existing nlopt #88

You can also test either case quickly in, say, a Docker container or look at our CI runs.

If you have any other questions please let us know.

@bbolker
Copy link
Author

bbolker commented Jan 14, 2022

I'm in case 1; I don't have libnlopt-dev, I do have cmake, and everything installs and checks fine except for the warning that's reported as the primary subject of this issue ("‘__builtin_memset’ writing between 8 and 34359738368 bytes into a region of size 0"). The complete log is attached ...

I'm happy to ignore the warning if that seems appropriate; just wanted to make sure it was reported in case it was something that should be diagnosed and resolved (I personally have no idea how ...) I can imagine that CRAN test platforms would encounter this warning if they don't have nlopt installed and if their configuration matches mine sufficiently closely.

On the other hand, one could ignore this until directed to fix it by CRAN (at the very least, submitting a version to CRAN that works on almost all platforms would presumably reset the clock for when the package gets archived to sometime more than 8 days from now ...)

@eddelbuettel
Copy link
Contributor

We need more detail.

  • Does the warning happy when you install or check nloptr, or your package?
  • What is your compiler version?
  • Where you able to reproduce on another system, or with other compilers, or ... ?
  • Or you using unusual switches to compilers etc?
  • Something reproducible would help, we have spent hours on this package already but we can't read your mind from a distance so talk to me like I am five and make things explicit

The gold standard for package maintainers is the page at CRAN. Last time it was updated NLopt (as used by the package) was three years older but so such messages were show,

@eddelbuettel
Copy link
Contributor

To be more plain: In your original post above you show two things. An (strongly appreciated output), and a list of your system. But not how the compiler was invoked etc pp. I ran on two very similar systems (Ubuntu 21.04, Debian testing) and include full logs (behind the <details> ... </details> wrapper) here: #88 (comment) That allows everybody fuller access.

@bbolker
Copy link
Author

bbolker commented Jan 14, 2022

  • Does the warning [happen] when you install or check nloptr, or your package?

It happens when I R CMD INSTALL nloptr_2.0.0.tar.gz or (equivalently) when I run R CMD check --as-cran nloptr_2.0.0.tar.gz (tarball built from the git directory via R CMD build nloptr). It has nothing to do with lme4.

  • What is your compiler version?

gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0, cmake 3.18.4

  • Were you able to reproduce on another system, or with other compilers, or ... ?

Not so far.

  • Or you using unusual switches to compilers etc?

Not as far as I know. I have CXXFLAGS = -Wno-ignored-attributes; PKG_CXXFLAGS += -Wno-ignored-attributes in my ~/.R/Makevars file (which throws a warning, but an understandable one). AFAIK everything else is stock.

  • Something reproducible would help, we have spent hours on this package already but we can't read your mind from a distance so talk to me like I am five and make things explicit

I completely appreciate that. As far as I can tell this is a stock system of its type (PopOS! 21.04 → Ubuntu 21.04). I am just as baffled as you are that this is throwing a warning that doesn't seem to happen on any of the r-hub Linux platforms (I haven't tried them myself, but I know @astamm has).

Digging into the logs above I can see that this happens here:

[ 94%] Building CXX object CMakeFiles/nlopt.dir/src/algs/ags/local_optimizer.cc.o
In function ‘hfti_’,
    inlined from ‘lsei_’ at /home/bolker/Documents/Rpkgs/nloptr.Rcheck/00_pkg_src/nloptr/src/nlopt-src/src/algs/slsqp/slsqp.c:1176:5,
    inlined from ‘lsq_’ at /home/bolker/Documents/Rpkgs/nloptr.Rcheck/00_pkg_src/nloptr/src/nlopt-src/src/algs/slsqp/slsqp.c:1417:5:
/home/bolker/Documents/Rpkgs/nloptr.Rcheck/00_pkg_src/nloptr/src/nlopt-src/src/algs/slsqp/slsqp.c:959:12: warning: ‘__builtin_memset’ writing between 8 and 34359738368 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
  959 |  rnorm[jb] = dnrm2___(&i__1, &b[kp1 + jb * b_dim1], 1);
      |  ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I will go check your logs from Ubuntu 21.04 to see if I can detect any differences from my setup. I could also try to do comparisons for the lines above between the hard-wired nloptr version (2.7.0) and the relevant code in the nlopt repo ... ??? git blame seems to think most of this code is very old, but there is one line where "long double" was modified to "double" recently ... ??

The problem with this rabbit hole is that I don't know what I'm looking for ...

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 14, 2022

I wonder if it might be gcc/g++ 10? What is strange is that CRAN is usually pretty forward-looking and tries compilers early and had nothing, and I had at the same or newer and see nothing either.

But long story short at the end of the day this is you seeing a warning. Unplesant, but immaterial. At best as we can tell we do not yet see this at CRAN (though we of course cannot simulate a full submission). What p values do we put on ths being a nothingbuger versus this being a tell. As you yourself pointed out, we may be are up against a more important clock here.

(The one things I will say is that the '94%' step is the linking step during the cmake build. There could be something there but I am then the wrong person to talk to because I am not the one who bet on the farm on building all under cmake.)

@astamm
Copy link
Owner

astamm commented Jan 14, 2022

I got this very same warning on R-Hub Windows since they updated the platform to Windows Server 2022, R-devel, 64 bit : (https://artifacts.r-hub.io/nloptr_2.0.0.tar.gz-ef81a471d8d749aeb311072790c239d0).
Not sure if it can help us understand better.
But then again, the most important thing is to keep nloptr from being archived and this warning seems not to be generated on the win-builder machine.
On winbuilder, as @eddelbuettel knows already, I am facing a more urgent matter as CMake is found on the winbuilder release machine but not on the devel machine, which makes any submission of the current version of nloptr not pass even the pre-tests.
I emailed again (this time Uwe directly) yesterday evening (I am based in France) asking if he could install it on the devel machine as well. I hope he gets back at me soon.

@bbolker
Copy link
Author

bbolker commented Jan 14, 2022

Hmm, looking at the r-hub artifacts it looks like a different warning - albeit at the same place in the code, so presumably the same issue ...

I agree that figuring out what's up with cmake support on CRAN is the priority now.

WARNING
Found the following significant warnings:
  C:/Users/USERVehIAdHXco/nloptr.Rcheck/00_pkg_src/nloptr/src/nlopt-src/src/algs/slsqp/slsqp.c:959:12: warning: '__builtin_memset' offset [8, 15] is out of the bounds [0, 8] of object 'xnorm' with type 'double' [-Warray-bounds]

@eddelbuettel
Copy link
Contributor

There is an option to make CMake be verbose and show all commands (just like <cough, cough> standard build systems do). If we enable that we may be able to triage more.

It seems related to the CMake build, so you could also triage and see what happens when you compile and link a standalone example against the libnlopt.a you built. I usually try to separate R interactions out that way. Could of course also be in the R settings interacting with the (lib)NLopt settings.

@bbolker
Copy link
Author

bbolker commented Jan 14, 2022

FWIW, confirming that the warning does not occur after I install libnlopt-dev (and reappears if I remove it).

@astamm
Copy link
Owner

astamm commented Jan 14, 2022

After reading protocolbuffers/protobuf#7140, it seems that

  1. This warning is legit. I reported it in Offset out of bounds in SLSQP algorithm stevengj/nlopt#431 and someone confirmed that. I do not know however their timeline to look into, if they decide to look into it.
  2. There is a workaround that pertains to silencing the warning by adding to the main CMakeLists.txt something along this line set_source_files_properties(${NLOPT_SOURCES} PROPERTIES COMPILE_FLAGS -Wno-array-bounds), which I did.
  3. When I ran on R-Hub Windows Server 2022, R-devel, 64 bit machine, the warning is gone.

@bbolker could you please test the current state of the master to see if that also makes the warning vanish for you too?

@bbolker
Copy link
Author

bbolker commented Jan 15, 2022

The warning goes away, hurrah.
CRAN are normally very grumpy about suppressing warnings, but I'm guessing/hoping that this is being done in a part of the build process that won't trigger their alarms ...
Hopefully the nlopt folks will fix it upstream eventually ...

@bbolker bbolker closed this as completed Jan 15, 2022
@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 15, 2022

I would urge you to re-open to keep us abreast of trying to get this fixed upstream.

[ Aside: @astamm you could also try to upgrade to 2.7.1 to if it helps ]

I very much do in my packages as we do here: suppress warnings if we must to appease and not trigger the watch dogs. But we also try to be honest about it as engineers. There likely is an out-of-bounds write so this should get fixed in NLopt.

@bbolker bbolker reopened this Jan 15, 2022
@astamm
Copy link
Owner

astamm commented Jan 15, 2022

I updated to NLopt 2.7.1. It does not help with the warning though.
I informed on stevengj/nlopt#431 of what we opted for but that it was likely an out-of-bounds write was made and asked them if they could look into it.

Regarding submission of nloptr which is now in good shape, Uwe has not yet gotten back at me about making cmake available on the winbuilder devel machine.
I was planning on making a re-submission given the updates we've worked on but I am pretty sure it is going to come back with that same pre-test error unless I can first get confirmation from Uwe that cmake can be found on the devel machine (which I see no reason why they would forbid it, given that they installed it on the release machine).
I am happy if you want to back-channel though if you think that can speed things up, given that archiving is scheduled for next Friday and acceptance on CRAN can take up to 5 days...

@eddelbuettel
Copy link
Contributor

I think you should upload to CRAN.

The email by Brian Ripley was (as I understand it) in response to older issues prior to your active maintenance. The package is widely used, I would expect them to take that as well as the renewed active development into consideration. I would think that we can look into the warning in a 2.0,1 or later upload.

@astamm
Copy link
Owner

astamm commented Jan 15, 2022

I agree. But when I submit to CRAN, it comes back with a pre-test error on the winbuilder devel machine where no cmake can be found and the process stops there. That is my current problem.

@eddelbuettel
Copy link
Contributor

Ah, yes. In that case you reply, politely declare that this is a false positive due to the declared dependency on cmake and kindly ask for them to move to automatic checks of reverse dependencies.

(In 2022 I have made two submissions to CRAN, a slightly slower pace than usual -- and they both hit this for different 'known' reasons. This can happen, and can be ok. Give it a shot.)

@astamm
Copy link
Owner

astamm commented Jan 15, 2022

I just submitted and received instantly the pre-test error as expected. I replied asking to move to automatic checks of reverse dependencies. I'll keep you both informed.

@eddelbuettel
Copy link
Contributor

Good.

Thought I just had is of course that on Windows espcially where the build system is so nailed we really should not need cmake given that there is no 'discovery' -- all systems are the same as they all use the same Rtools. So we could have realized that a little earlier and shipped a fixed Makefile for Windows. Oh well.

@astamm
Copy link
Owner

astamm commented Jan 15, 2022

For sure that would help as we would not need to bother with Cmake on Windows. But hopefully it will go through as it is.

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

Still no answers from CRAN. And more e-mails coming in of people worrying about the scheduled archiving date next Friday. Should we think about the idea of shipping a fixed Makefile for Windows so that the package will proceed to the next stage upon submission to CRAN? I have no clue how to do that however. @eddelbuettel: do you have a similar Makefile in one of your other packages that we could reuse here to easily set this up?

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

Actually what I could do maybe would be to use the makefile that Cmake generates on windows and ship it as fixed.

I experimented a bit and what could be done is to ship the output of the cmake --build command in a specific folder for Windows users and then call GNU make and make install on it, since, as @eddelbuettel was saying, they all use the same Rtools compared to Unix-like distributions on which many things can very and it makes sense there (e.g. Unix platforms) to let cmake sort everything out.

So I just need to find a way to emulate a Windows setup with proper Rtools installed. Any chance you already have a way?

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 17, 2022

I had the same idea of pre-cooking a Makefile via cmake, storing it in inst/ of the package made there (RHub, win-builder, ...) and extracting it from the artifact. You can then do as the aforementioned RcppRedis does in this Makefile.in (where configure just fills in hiredis/libhiredis.a as @PKG_TARGET@ if a build is needed).

Another possibility is to do what many other packages (includig RcppRedis) do on Windows: rely on @opencpu [Edit: I meant @jeroen here] to build a library and have it downloaded at build. Also very reliable, fast (no new build) and closer in spirit to what I restored for Linux as it also relies on an external libnlopt.a (and then 'knows' it is there and downloads it).

(I do have Windows on personal computer.)

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

I'll give a try to your first approach (which is closer in spirit to what I already have in mind) which I believe I can do myself.
The second approach seems neater but I have no experience regarding @opencpu so I might switch to that in a later version.

@eddelbuettel
Copy link
Contributor

Yes, it should be doable. When I do what RcppRedis does in a Docker container (and no hiredis library present) I get thje following simple Makefile:

## -*- mode: makefile; -*-

## Configure tells us about locations for
## hiredis headers and library via the variables
## if MsgPack is seen we add it as well
PKG_CXXFLAGS =   -I. 
#-DMSGPACK_DISABLE_LEGACY_NIL -DMSGPACK_DISABLE_LEGACY_CONVERT
PKG_LIBS =       hiredis/libhiredis.a

CXX_STD =       CXX11

.PHONY:         all  libhiredis.a

all:            $(SHLIB)
$(SHLIB):        libhiredis.a

libhiredis.a:
                $(MAKE) CC="$(CC)" CFLAGS="$(CFLAGS) $(CPICFLAGS)" AR="$(AR)" RANLIB="$(RANLIB)" --directory=hiredis libhiredis.a

So to build the shared library, a static library is needed and for that a directory within src/ is visited and another Makefile called. No more, no less. For NLopt you probably want to the internal Makefile generated as it walks over a number of source files. But as those may not change often you can even hardcode all files. I did that in this Makefile.win. Just tell g++ what is needed, and a single static library is build (also on Windows). (That is from a package where I eventually got tired of depending on an external librarie and stuck the (fixed, for all intents) sources in.

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

Thanks Dirk. Do you think that would be even simpler in this case to ship a fixed libnlopt.a for both Windows architectures and be done with it altogether? I got them from the artifacts on the winbuilder release machine.

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 17, 2022

It's up to you. Relying on cmake may be better / more flexible if the upstream sources change (but do they still?)

It is really up to you -- you as maintainer make these choices. I think my first message to you may have been about a hard dependeny on cmake not being the wisest choice in the CRAN context where very few to no other packages do that, and apparently none on Windows. I get by with my 60+ packages without it (and not all of these use compilation or external libraries, of course).

@bbolker
Copy link
Author

bbolker commented Jan 17, 2022

Regarding your comment about shipping a fixed libnlopt.a; CRAN will not accept packages with binary "blobs", so I think that's not an option ...

I'm a little concerned about the radio silence from CRAN. Would you like me to try a back-channel query?

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

You are right about the fixed library I think.
I'll accept any back-channeling you can do ;-), thanks!

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 17, 2022

Ben, that is not what I suggested. Have you actually looked at any of the packages supported by Jeroen's rwinlib ? There are literally eighty plus packages downloading a binary blob. They all downloading at build time and yes, I have a package (that is fairly complex to build) in that set too (and, come to think about it, a few others too -- it is a viable service). So yes, this is a viable approach at CRAN and no I never said to stick a library into a tarball.

I already tried the back-channeling a few days ago via a quiet email to CRAN but with nothing to show. They don't always reply. Ben of course carries a mighty stick in this discussion because if lme4 gets in trouble ... so worth trying too.

@bbolker
Copy link
Author

bbolker commented Jan 17, 2022

Dirk, I (mistakenly?) interpreted @astamm's idea to

ship a fixed libnlopt.a for both Windows architectures and be done with it altogether

as sticking a blob in the package, not downloading at build time. (I see that downloading a fixed libnlopt.a would work ... the verb "ship" threw me off.)

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

I created confusion here, I apologise. I did suggest embedding a fixed library, but that indeed is doable but not as easily as I'd thought.
Jeroen did contribute a PR #43 which I closed 3 days ago (not merged) but could certainly be adapted to compile on Windows using rwinlib. Maybe we should look into that PR which may contain an existing solution to our Windows issue here.

@bbolker
Copy link
Author

bbolker commented Jan 17, 2022

Dirk, did you e-mail cran@r-project.org or someone (e.g. Uwe Ligges) individually? (I'm of two minds about which is better.)

@eddelbuettel
Copy link
Contributor

I replied to the group email sender, the windows maintaier, and the cran handle for record keeping.

Jeroen's PR, ignored three years ago by the previous maintainer, could be updated to NLopt 2.7.1. You could ask him. Or you could insist on building NLopt yourself each and every time. Both methods have a number of plus and minus items going for them.

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

I asked Jeroen and actually contributed a PR in which I updated the headers and libs. If he accepts it, then I can also adapt and merge #43.

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jan 17, 2022

Ah, good. I had not seen that. Also, I goofed two hours ago (lack of coffee ?) when I used his twitter handle here instead of his github handle @jeroen ...

We then of course have the whole drama about @jeroen owning a Windows toolchain and Tomas @kalibera owning another...

@astamm
Copy link
Owner

astamm commented Jan 17, 2022

No worries. He still would have to review, accept and make a release out of it so that we could download it as he does in #43.
Unless I can make a release on my fork temporarily to speed things up a bit regarding Friday's deadline.

@aadler
Copy link
Contributor

aadler commented Jun 27, 2024

@bbolker, is this still an issue with version 2.1.1?

@aadler
Copy link
Contributor

aadler commented Nov 24, 2024

@bbolker, friendly nudge :)

@bbolker
Copy link
Author

bbolker commented Nov 26, 2024

If I don't answer this in the next day or two you can probably assume it can be closed (ditto with the other open issue ...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants