Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MultipartReader performs badly for large reads #8664

Open
yschimke opened this issue Jan 29, 2025 · 0 comments
Open

MultipartReader performs badly for large reads #8664

yschimke opened this issue Jan 29, 2025 · 0 comments
Labels
bug Bug in existing code

Comments

@yschimke
Copy link
Collaborator

@Test
  public void testParserBenchmarkOkHttp() throws ParsingException, IOException {
    String boundary = "foo";
    RequestBody multipartBody =
        new MultipartBuilder(boundary)
            .addPart(
                Headers.of("header-name", "header-value"),
                new RequestBody() {
                  @Override
                  public MediaType contentType() {
                    return MediaType.parse("application/octet-stream");
                  }

                  @Override
                  public long contentLength() throws IOException {
                    return 1024 * 1024 * 100;
                  }

                  @Override
                  @SuppressWarnings("KotlinInternal")
                  public void writeTo(BufferedSink sink) throws IOException {
                    Thread thread =
                        new Thread(
                            () -> {
                              for (int i = 1; i <= 100; i++) {
                                Buffer buffer = (Buffer) sink;
                                if (buffer.size() > 4 * 1024 * 1024) {
                                  Log.i("yschimke", "Sink has more than 4MB of data, waiting");
                                  while (buffer.size() > 4 * 1024 * 1024) {}
                                }
                                Log.i("yschimke", "Writing 1MB string");
                                buffer.writeUtf8(Strings.repeat("a", 1024 * 1024));
                                Log.i("yschimke", String.valueOf(buffer.size()));
                              }
                            });
                    thread.start();
                    try {
                      thread.join();
                    } catch (InterruptedException e) {
                      throw new RuntimeException(e);
                    }
                  }
                })
            .build();
    Buffer buffer = new Buffer();
    new Thread(
            () -> {
              try {
                multipartBody.writeTo(buffer);
              } catch (IOException e) {
                throw new RuntimeException(e);
              }
            })
        .start();

    MultipartReader multipartReader = new MultipartReader(buffer, boundary);
    while (true) {
      MultipartReader.Part part = multipartReader.nextPart();

      if (part == null) break;

      assertThat(part.headers().get("header-name")).isEqualTo("header-value");
      while (true) {
        Buffer readBuff = new Buffer();
        long red = part.body().read(readBuff, 1024 * 1024);
        if (red == -1) {
          Log.i("yschimke", "no more data to read");
          break;
        } else {
          Log.i("yschimke", "read " + red + " leftover = " + buffer.size());
          assertThat(readBuff.readUtf8()).isEqualTo(Strings.repeat("a", (int) red));
        }
      }
    }

Specifically

        long red = part.body().read(readBuff, 1024 * 1024);

May scan whole 1mb, before only returning 8kb, and then repeat.

Workaround is sticking with @swankjesse's magic number https://publicobject.com/2020/09/14/many-correct-answers/

@yschimke yschimke added the bug Bug in existing code label Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in existing code
Projects
None yet
Development

No branches or pull requests

1 participant