In S3 library ResponseInputStream<?> doesn't seem to support the InputStream int read(byte[] buffer) method correctly #5381

Closed
esteeele opened this issue Jul 9, 2024 · 4 comments
Labels: bug, needs-triage

Comments

esteeele commented Jul 9, 2024

Describe the bug

When requesting a ResponseInputStream<GetObjectResponse> object and then calling read(bytes) with a byte array of a defined size, too few bytes are read into the array. Comparing the SDKs side by side shows that the V1 SDK fills the whole byte array, whereas the new SDK only fills part of it.

Expected Behavior

I expect the two SDKs to behave the same, i.e. the stream returned by S3Object.getObjectContent() in V1 and the new ResponseInputStream response type in V2 should return the same number of bytes from read(bytes).

Current Behavior

Using the code below, I get fewer bytes returned than requested (see the output of the small program I've submitted):

1388 for AWS version 2
10000 for AWS version 1

If I instead call

inputStream.read(bytes, 0, bytes.length);

it works as expected with both the old and the new SDK.

Reproduction Steps

  public void compareOldAndNewDownloads() {
    AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard().withRegion("eu-west-1").build();

    String dummyFile = "/anyFile";
    Path logFile = Path.of(dummyFile);

    String key = "delete-me-" + UUID.randomUUID();
    try (S3Client s3Client = lowHttpPoolClient()) {
      try {
        s3Client.putObject(PutObjectRequest.builder().bucket(destBucket).key(key).build(),
            RequestBody.fromFile(logFile));
      } catch (S3Exception s3Exception) {
        System.out.println("*** V2 SDK ***");

        System.out.println(s3Exception.getMessage());
        System.out.println(s3Exception.awsErrorDetails().errorMessage());
      }

      ResponseInputStream<GetObjectResponse> is = s3Client.getObject(GetObjectRequest.builder().bucket(destBucket).key(key).build());
      loadBytes(is, true);
    }

    InputStream is = amazonS3.getObject(destBucket, key).getObjectContent();
    loadBytes(is, false);
  }

  private void loadBytes(InputStream is, boolean newAws) {
    byte[] bytes = new byte[10_000];
    int result;
    try {
      result = is.read(bytes);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    System.out.println(result + " for AWS version " + (newAws ? 2 : 1));
  }

  private static S3Client lowHttpPoolClient() {
    SdkHttpClient apacheHttpClient = ApacheHttpClient.builder()
        .maxConnections(5)
        .build();
    return S3Client.builder()
        .httpClient(apacheHttpClient)
        .defaultsMode(DefaultsMode.IN_REGION)
        .region(Region.EU_WEST_1)
        .overrideConfiguration(ClientOverrideConfiguration.builder()
            .build())
        .build();
  }

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

s3:2.25.11

JDK version used

openjdk version "21.0.2" 2024-01-16 LTS

Operating System and version

macOS 14.5 (23F79)

esteeele added the bug and needs-triage labels Jul 9, 2024
@steveloughran

Bad news: the Javadoc for java.io.InputStream says that any number of bytes less than the requested number may be returned:

  An attempt is made to read as many as
  len bytes, but a smaller number may be read.
  The number of bytes actually read is returned as an integer.

Reading from any input stream correctly requires you to iterate until the full number of bytes has been read or a -1 comes back.
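
The iteration described above can be sketched as follows. This is a minimal illustration, not code from either SDK: `readFully` and `shortReads` are hypothetical helper names, and `shortReads` is only a local stand-in for a network-backed stream that hands back at most 1388 bytes per call (the figure reported in this issue).

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {

    /**
     * Hypothetical helper: keeps calling read() until the buffer is full
     * or the stream reports end-of-stream (-1). Returns the bytes read.
     */
    public static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n == -1) {
                break; // end of stream before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    /**
     * Hypothetical stand-in for a network stream: returns at most
     * `chunk` bytes per read call, which the InputStream contract allows.
     */
    public static InputStream shortReads(byte[] data, int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[25_000];

        // A single read() may legally stop short of the buffer size.
        System.out.println(shortReads(data, 1388).read(new byte[10_000])); // prints 1388

        // Looping fills the buffer regardless of how the stream chunks its data.
        System.out.println(readFully(shortReads(data, 1388), new byte[10_000])); // prints 10000
    }
}
```

The same loop works unchanged on the V1 `S3ObjectInputStream` and the V2 `ResponseInputStream`, since both are plain `InputStream` subclasses.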


esteeele commented Dec 9, 2024

Closing this, as the InputStream contract makes no guarantee about how many bytes a single read call returns. It is still worth keeping as a reference in case anyone else encounters this when moving from V1 to V2.
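
For anyone hitting this during a migration: on Java 9 and later, the JDK's own `InputStream.readNBytes(byte[], int, int)` blocks until the requested number of bytes has been read or the stream ends, so it may be able to replace a bare `read(bytes)` call. A minimal sketch, assuming a JDK 9+ runtime; `fill` and `shortReads` are hypothetical names, and `shortReads` merely simulates a stream that returns short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadNBytesExample {

    /** Hypothetical helper: fills the buffer using the JDK 9+ readNBytes method. */
    public static int fill(InputStream in, byte[] buffer) throws IOException {
        // Unlike read(), readNBytes keeps reading until len bytes or end of stream.
        return in.readNBytes(buffer, 0, buffer.length);
    }

    /** Hypothetical stand-in for a stream that returns at most `chunk` bytes per read. */
    public static InputStream shortReads(byte[] data, int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[10_000];
        int n = fill(shortReads(new byte[25_000], 1388), buffer);
        System.out.println(n); // prints 10000
    }
}
```

If the object is shorter than the buffer, `readNBytes` returns the number of bytes actually available, so the return value still needs to be checked.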

esteeele closed this as completed Dec 9, 2024
esteeele closed this as not planned Dec 9, 2024

github-actions bot commented Dec 9, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

