8353665: RISC-V: IR verification fails in TestSubNodeFloatDoubleNegation.java #24421

@Hamlin-Li Hamlin-Li commented Apr 3, 2025

Hi,
Could you help review this patch?
The newly added TestSubNodeFloatDoubleNegation.java (from #24150) checks that 0 - (0 - x) is not folded to x for float and double.
I manually checked the IR and the generated assembly: the expression is not folded on riscv either; there is just an extra SubF on one code path.
So the fix for this test on riscv is simply to check for >= 2 SubNodes rather than exactly 2.

Tested on both x86 and riscv64.

Thanks
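
For context, one reason such an expression must not be folded is the sign of zero: under IEEE 754 arithmetic with the default rounding mode, 0 - (0 - x) evaluates to +0.0 when x is -0.0, so replacing it with x would change the result. A minimal Java sketch of that corner case follows. The class and method names are illustrative assumptions and are not taken from the test.

```java
class NegationFolding {
    // Computes 0 - (0 - x) exactly as written, with no algebraic simplification.
    static float doubleNegate(float x) {
        return 0.0f - (0.0f - x);
    }

    // True when both floats have the same bit pattern; this distinguishes -0.0f from 0.0f,
    // which the == operator on floats does not.
    static boolean sameBits(float a, float b) {
        return Float.floatToRawIntBits(a) == Float.floatToRawIntBits(b);
    }

    public static void main(String[] args) {
        float x = -0.0f;
        float y = doubleNegate(x);
        System.out.println("x bits = 0x" + Integer.toHexString(Float.floatToRawIntBits(x)));
        System.out.println("y bits = 0x" + Integer.toHexString(Float.floatToRawIntBits(y)));
        System.out.println("same bits: " + sameBits(x, y)); // prints "same bits: false"
    }
}
```

For any nonzero finite input the two values agree, so only the signed-zero case separates the folded and unfolded forms in this sketch.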


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8353665: RISC-V: IR verification fails in TestSubNodeFloatDoubleNegation.java (Bug - P4)

Reviewers

Reviewers without OpenJDK IDs

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24421/head:pull/24421
$ git checkout pull/24421

Update a local copy of the PR:
$ git checkout pull/24421
$ git pull https://git.openjdk.org/jdk.git pull/24421/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24421

View PR using the GUI difftool:
$ git pr show -t 24421

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24421.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Apr 3, 2025

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Apr 3, 2025

@Hamlin-Li This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8353665: RISC-V: IR verification fails in TestSubNodeFloatDoubleNegation.java

Reviewed-by: thartmann, luhenry

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 74 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 3, 2025
@openjdk

openjdk bot commented Apr 3, 2025

@Hamlin-Li The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Apr 3, 2025
@mlbridge

mlbridge bot commented Apr 3, 2025

Webrevs

@TobiHartmann
Member

Where does the extra SubF come from?

@@ -55,7 +55,8 @@ public static void assertResults() {
 @Test
 // Match a generic SubNode, because scalar Float16 operations are often
 // performed as float operations.
-@IR(counts = { IRNode.SUB, "2" })
+@IR(counts = { IRNode.SUB, "2" }, applyIfPlatform = {"riscv64", "false"})
+@IR(counts = { IRNode.SUB, ">= 2" }, applyIfPlatform = {"riscv64", "true"})
Contributor


Would it perhaps make sense to fix the number of SubNodes or does the Float16 code add a bunch of them on RISC-V?
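
The difference between the exact count "2" and the relaxed ">= 2" in the rules above can be illustrated with a small stand-alone sketch. The CountConstraint class below is a hypothetical helper written only for this illustration; it is not the IR framework's implementation, and it assumes nothing more than that a constraint is an optional comparator followed by an integer.

```java
class CountConstraint {
    // Hypothetical helper: checks a found node count against a constraint
    // string such as "2", "= 2" or ">= 2". A bare number means "exactly".
    static boolean satisfied(String constraint, int found) {
        String c = constraint.trim();
        String op = "=";
        // Two-character comparators are tried before their one-character prefixes.
        for (String candidate : new String[] {">=", "<=", "!=", ">", "<", "="}) {
            if (c.startsWith(candidate)) {
                op = candidate;
                c = c.substring(candidate.length()).trim();
                break;
            }
        }
        int given = Integer.parseInt(c);
        switch (op) {
            case ">=": return found >= given;
            case "<=": return found <= given;
            case "!=": return found != given;
            case ">":  return found > given;
            case "<":  return found < given;
            default:   return found == given;
        }
    }

    public static void main(String[] args) {
        // Three Sub nodes found, as in the failing case with native float16 support.
        System.out.println(satisfied("2", 3));    // prints "false"
        System.out.println(satisfied(">= 2", 3)); // prints "true"
    }
}
```

With three Sub nodes in the graph, the exact rule fails while the relaxed rule passes; with two nodes both rules pass, which is why the relaxed rule is only a loosening and never rejects a graph the exact rule accepts.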

@Hamlin-Li
Author

Hamlin-Li commented Apr 4, 2025

026     lh  R28, [R11, #12]     # short, #@loadS ! Field: jdk/incubator/vector/Float16.value (constant)
02a     NullCheck R11

02a     B2: #   out( B5 B3 ) <- in( B1 )  Freq: 0.999999
02a +    --     // R23=Thread::current(), empty, #@tlsLoadP
02a     ld  R10, [R23, #464]    # ptr, #@loadP
02e +   fmv.h.x F2, zr  # float, #@loadConH0
032 +   fmv.h.x F0, R28
036 +   ld  R7, [R23, #480]     # ptr, #@loadP
03a +   addi  R28, R10, #16     # ptr, #@addP_reg_imm
03e +   binop_hf F3, F2, F0
042 +   bgeu  R28, R7, B5       #@cmpP_branch  P=0.000100 C=-1.000000

046     B3: #   out( B4 ) <- in( B2 )  Freq: 0.999899
046 +   mv R7, #1       # long, #@loadConL
048 +   sd  R28, [R23, #464]    # ptr, #@storeP
04c +   mv  R29, narrowklass: precise jdk/incubator/vector/Float16: 0x00007fa4dc396ba8 (java/io/Serializable,java/lang/Comparable):Constant:exact *     # compressed klass ptr, #@loadConNKlass
058 +   sd  R7, [R10]   # long, #@storeL
05c +   sw  R29, [R10, #8]      # compressed klass ptr, #@storeNKlass
060 +   prefetch_w [R28, #192]  # Prefetch for write
064 +   sw  zr, [R10, #12]      # int, #@storeimmI0

068     B4: #   out( N1 ) <- in( B6 B3 )  Freq: 0.999999
068 +   fmv.h.x F0, zr  # float, #@loadConH0
06c +   binop_hf F0, F0, F3
070 +   fmv.x.h R28, F0
074 +   sh  R28, [R10, #12]     # short, #@storeC
078
078 +   MEMBAR-store-store      #@membar_storestore
078 +   # checkcastPP of R10, #@checkCastPP
078     # pop frame 48
        add  sp, sp, #48
        ld  ra, [sp,#-16]
        ld  fp, [sp,#-8]
        # test polling word
        ld t0, [xthread,#40]
        bgtu sp, t0, #slow_path
08a +   ret     // return register, #@Ret

08c     B5: #   out( B8 B6 ) <- in( B2 )  Freq: 0.000100016
08c +   fmv.w.x F0, zr  # float, #@loadConF0
090 +   convHF2SAndHF2F F2, F3
094 +   fsub.s  F0, F0, F2      #@subF_reg_reg
098     spill F3 -> [sp, #0]    # spill size = 32
09c +   spill F0 -> [sp, #4]    # spill size = 32
0a0 +   mv  R11, precise jdk/incubator/vector/Float16: 0x00007fa4dc396ba8 (java/io/Serializable,java/lang/Comparable):Constant:exact *  # ptr, #@loadConP
0b8     CALL,static 0x00007fa4cb85ba80  #@CallStaticJavaDirect wrapper for: C2 Runtime new_instance
        # jdk.incubator.vector.Float16::valueOf @ bci:0 (line 329) L[0]=sp + #4
        # jdk.incubator.vector.Float16::subtract @ bci:9 (line 1137) L[0]=_ L[1]=_
        # compiler.floatingpoint.TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:12 (line 62) L[0]=_
        # OopMap {off=196/0xc4}

0d0     B6: #   out( B4 ) <- in( B5 )  Freq: 0.000100014
        # Block is sole successor of call
0d0 +   spill [sp, #0] -> F3    # spill size = 32
0d4 +   j  B4   #@branch

Thanks for having a look, interesting!

Please check the B5 block (which I think is the path taken when the TLAB runs out): there is an extra fsub.s that puts a value into F0, but that value is not really used, because the B4 block simply loads zero into F0 again. This fsub.s should therefore be useless, although it does no harm in terms of correctness.

I checked x86 and found that my CPU lacks the feature needed to generate real float16 instructions, so it only has 2 SubF rather than SubHF, which I think is expected for Float16.
I am not sure whether this useless fsub.s is only an issue on riscv or whether it also appears on x86 when supports_avx512_fp16 returns true. It would be great if someone could help verify that.

I am also not sure whether this useless extra instruction (here fsub.s) could be generated in other situations.

@mhaessig
Contributor

mhaessig commented Apr 4, 2025

To me B5 kind of looks like a backup codepath (see branch at 042). But I cannot see what R7 is there.

One option I see would be to match two SUB_HF nodes for RISC-V, since it seems to always generate two of those. The only reason I match on SUB in the half float case is that I also do not have supports_avx512_fp16. I think I will file an RFE for that separately.

@Hamlin-Li
Author

Hamlin-Li commented Apr 4, 2025

Ah, I just checked on riscv: if I disable float16 (zfh), it does not generate the extra SubF in the slow path, i.e. there are just 2 SubF.
So I guess x86 could have the same issue, and the test could fail there too if supports_avx512_fp16 returns true.

@Hamlin-Li
Author

To me B5 kind of looks like a backup codepath (see branch at 042). But I cannot see what R7 is there.

R7 should be tlab_end of the thread.

@Hamlin-Li
Author

@jatin-bhateja Could you please help to run the test with supports_avx512_fp16 if you're available? Thanks!

@mhaessig
Contributor

mhaessig commented Apr 4, 2025

I ran the test with software emulation of avx512-fp16 and the test failed the same way as on RISC-V:

$ sde64 -gnr -- jtreg [...] test/hotspot/jtreg/compiler/floatingpoint/TestSubNodeFloatDoubleNegation.java
[...]
Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method "public static jdk.incubator.vector.Float16 compiler.floatingpoint.TestSubNodeFloatDoubleNegation.testHalfFloat(jdk.incubator.vector.Float16)" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={}, counts={"_#SUB#_", "2"}, failOn={}, applyIfPlatform={}, applyIfPlatformOr={}, applyIfOr={}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\d+(\s){2}(Sub(I|L|F|D|HF).*)+(\s){2}===.*)"
           - Failed comparison: [found] 3 = 2 [given]
             - Matched nodes (3):
               * 326  SubHF  === _ 560 325  [[ 327 479 ]]  !orig=[478] !jvms: Float16::valueOf @ bci:5 (line 329) Float16::subtract @ bci:9 (line 1137) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:9 (line 63)
               * 443  SubF  === _ 561 442  [[ 607 ]]  !jvms: Float16::subtract @ bci:8 (line 1137) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:12 (line 61)
               * 479  SubHF  === _ 560 326  [[ 480 ]]  !jvms: Float16::valueOf @ bci:5 (line 329) Float16::subtract @ bci:9 (line 1137) TestSubNodeFloatDoubleNegation::testHalfFloat @ bci:12 (line 61)

The ideal graph shows the same "alternative codepath" that your opto assembly shows.
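
The count of 3 in the log above can be reproduced with a short stand-alone sketch. It applies the constraint regex from the log to the three matched node lines, abbreviated here by dropping the trailing !orig/!jvms annotations, and compares it with a narrowed pattern that accepts only SubHF. The narrowed pattern, the class name and the helper are assumptions made for this illustration; this is not how the IR framework itself is invoked.

```java
import java.util.List;
import java.util.regex.Pattern;

class SubNodeCount {
    // The three matched nodes from the failure log, without their annotations.
    static final List<String> NODES = List.of(
        "326  SubHF  === _ 560 325  [[ 327 479 ]]",
        "443  SubF  === _ 561 442  [[ 607 ]]",
        "479  SubHF  === _ 560 326  [[ 480 ]]");

    // Constraint regex copied from the log: matches any of SubI/SubL/SubF/SubD/SubHF.
    static final Pattern ANY_SUB =
        Pattern.compile("(\\d+(\\s){2}(Sub(I|L|F|D|HF).*)+(\\s){2}===.*)");

    // Hypothetical narrowed variant that only accepts half-float subtractions.
    static final Pattern SUB_HF =
        Pattern.compile("(\\d+(\\s){2}(SubHF.*)+(\\s){2}===.*)");

    // Counts the node lines in which the pattern occurs.
    static long count(Pattern p) {
        return NODES.stream().filter(line -> p.matcher(line).find()).count();
    }

    public static void main(String[] args) {
        System.out.println("generic Sub matches: " + count(ANY_SUB)); // prints 3
        System.out.println("SubHF-only matches:  " + count(SUB_HF));  // prints 2
    }
}
```

On this input the generic pattern picks up the extra SubF from the slow path, while the narrowed one counts only the two SubHF nodes, which is the effect a SUB_HF-based rule would rely on when native float16 support is present.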

I guess we need to generally predicate the test on native float16 support. But I can do that in a separate issue, where I also investigate ARM.

@Hamlin-Li
Author

Hamlin-Li commented Apr 4, 2025

It seems to me that the SubF does not need to be generated (maybe it should and could be removed?). Are you going to investigate that, or just modify the test?

@mhaessig
Contributor

mhaessig commented Apr 4, 2025

Filed JDK-8353730. I will first fix the test and then investigate the additional SubF.

@Hamlin-Li
Author

Great!
Let's fix this test first. I'll fix just the riscv part, as I don't have an environment to verify other platforms.
I also filed a bug to track this (potential) SubF "issue": https://bugs.openjdk.org/browse/JDK-8353732. Feel free to take it when you want to start the work.

Contributor

@mhaessig mhaessig left a comment


Looks good to me, with or without my suggestion.

Thank you for catching and fixing this!

…leNegation.java

Co-authored-by: Manuel Hässig <manuel.hassig@oracle.com>
Member

@TobiHartmann TobiHartmann left a comment


Looks good to me too.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 8, 2025
@Hamlin-Li
Author

Thanks for your reviews @mhaessig @TobiHartmann @luhenry !

@Hamlin-Li
Author

/integrate

@openjdk

openjdk bot commented Apr 8, 2025

Going to push as commit 21db0fd.
Since your change was applied there have been 74 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 8, 2025
@openjdk openjdk bot closed this Apr 8, 2025
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Apr 8, 2025
@openjdk openjdk bot removed the rfr Pull request is ready for review label Apr 8, 2025
@openjdk

openjdk bot commented Apr 8, 2025

@Hamlin-Li Pushed as commit 21db0fd.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
4 participants