S3 putObject via RequestBody.fromContentProvider yields an object with 0 bytes #5824

Closed
tarehart opened this issue Jan 25, 2025 · 3 comments
Labels
bug (This issue is a bug.) · p1 (This is a high priority issue.) · potential-regression (Marking this issue as a potential regression to be checked by a team member.)

Comments

@tarehart

Describe the bug

S3 putObject via RequestBody.fromContentProvider yields an object with 0 bytes. The operation completes as though it were successful, which increases the harm of this bug since there's potential for undetected data loss.

I believe this is related to #5801. The problem goes away when I set the environment variable AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED.

This started with 2.30.0.
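
For reference, the per-client equivalent of that environment variable looks roughly like the sketch below. This is untested here and assumes the requestChecksumCalculation builder option and the RequestChecksumCalculation enum introduced alongside the 2.30.x checksum changes:

import software.amazon.awssdk.core.checksums.RequestChecksumCalculation;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

public class ChecksumWorkaround {
    public static void main(String[] args) {
        // Per-client equivalent of AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED
        // (assumes the builder option shipped with the 2.30.x flexible-checksum changes).
        try (var s3Client = S3Client.builder()
                .region(Region.US_EAST_1)
                .requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED)
                .build()) {
            // Uploads made with this client should only compute checksums when required.
        }
    }
}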

Regression Issue

  • [x] Select this option if this issue appears to be a regression.

Expected Behavior

S3 putObject via RequestBody.fromContentProvider uploads all content from the input stream, yielding an object with non-zero size in the S3 bucket.

Current Behavior

The object in S3 has zero bytes, despite the operation reporting success.

Reproduction Steps

package repro;

import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.net.URL;


public class UploadRepro {

    public static void main(String[] args) throws IOException {
        final var bucket = "XXXXXXXXXXXXXX";
        final var key = "XXXXXXXX";
        
        final var url = new URL(
            "https://raw.githubusercontent.com/aws/aws-sdk-java-v2/ad35231f768e1bb68e6f77cb29f69d1a7278931e/.changes/next-release/feature-AmazonS3-c101d4d.json"
        );

        var s3Client = S3Client.builder()
                .credentialsProvider(ProfileCredentialsProvider.builder().profileName("Dev-Admin").build())
                .region(Region.US_EAST_1)
                .build();

        try (var stream = url.openStream()) {
            s3Client.putObject(
                PutObjectRequest.builder().bucket(bucket).key(key).build(),
                RequestBody.fromContentProvider(() -> stream, "application/json")
            );
        }

        var length = s3Client.headObject(
            HeadObjectRequest.builder().bucket(bucket).key(key).build()
        ).contentLength();

        System.out.println(length); // prints 0
    }
}

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.30.1

JDK version used

openjdk version "17.0.13"

Operating System and version

macOS 15.1.1

@tarehart added the bug and needs-triage labels on Jan 25, 2025
@bhoradc added the p1 and potential-regression labels and removed the needs-triage label on Jan 27, 2025
@bhoradc

bhoradc commented Jan 27, 2025

Hi @tarehart,

Thank you for reporting the issue. I am able to reproduce the behaviour you mentioned. It appears to be specifically related to using the RequestBody.fromContentProvider() method to upload to S3.

Minimal reproducible code sample:

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;

public class Main {
    public static void main(String[] args) throws MalformedURLException {

        final var bucket = "****";
        final var key = "regression.json";

        final var url = new URL(
                "https://raw.githubusercontent.com/aws/aws-sdk-java-v2/ad35231f768e1bb68e6f77cb29f69d1a7278931e/.changes/next-release/feature-AmazonS3-c101d4d.json"
        );

        var s3Client = S3Client.builder()
                .region(Region.US_EAST_1)
                .build();

        try (var stream = url.openStream()) {
            s3Client.putObject(
                    PutObjectRequest.builder().bucket(bucket).key(key).build(),
                    RequestBody.fromContentProvider(() -> stream, "application/json")
            );
        } catch (IOException e) {
          throw new RuntimeException(e);
        }

        var length = s3Client.headObject(
                HeadObjectRequest.builder().bucket(bucket).key(key).build()
        ).contentLength();

        System.out.println(length); // prints 0
    }
}

From Java SDK v2.30.0 onwards, S3 objects are uploaded with 0 bytes despite the operation reporting success. The same code sample works fine for version 2.29.52 and prior.

We are looking into this issue further.

Regards,
Chaitanya

@agicquelamz

Hi @tarehart,

It looks like the implementation of ContentStreamProvider in your sample code does not satisfy its API contract. Per the interface's documentation [1], the result of newStream() must always start at the beginning of the data, and must return the same content over all invocations. Depending on whether the stream implementation supports mark and reset, this requirement can be satisfied in a few different ways:

Use mark and reset

Here, we use mark(int) in the constructor before reading starts to ensure that we can reset back to the beginning, and then on each invocation of newStream(), we ensure that the stream is reset.

import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

import software.amazon.awssdk.http.ContentStreamProvider;

public class MyContentStreamProvider implements ContentStreamProvider {
    // mark() read limit; must cover all bytes read before reset() is called.
    private static final int MAX_LEN = 8 * 1024 * 1024;

    private final InputStream contentStream;

    public MyContentStreamProvider(InputStream contentStream) {
        this.contentStream = contentStream;
        this.contentStream.mark(MAX_LEN);
    }

    @Override
    public InputStream newStream() {
        try {
            contentStream.reset(); // rewind to the beginning on every invocation
            return contentStream;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

Use buffering if mark and reset are not available

If your stream doesn't support mark and reset directly, you can still use the above solution by first wrapping the stream in a BufferedInputStream:

import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

import software.amazon.awssdk.http.ContentStreamProvider;

public class MyContentStreamProvider implements ContentStreamProvider {
    private static final int MAX_LEN = 8 * 1024 * 1024; // mark() read limit; must cover the content

    private final BufferedInputStream contentStream; // BufferedInputStream always supports mark/reset

    public MyContentStreamProvider(InputStream contentStream) {
        this.contentStream = new BufferedInputStream(contentStream);
        this.contentStream.mark(MAX_LEN);
    }

    @Override
    public InputStream newStream() {
        try {
            contentStream.reset();
            return contentStream;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

Always return a new stream, and close the previous one

A simpler approach is to obtain a new stream to your data on each invocation and close the previous one:

import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

import software.amazon.awssdk.http.ContentStreamProvider;

public class MyContentStreamProvider implements ContentStreamProvider {
    private InputStream contentStream;

    @Override
    public InputStream newStream() {
        try {
            if (contentStream != null) {
                contentStream.close();
            }
            contentStream = openStream(); // placeholder: open a fresh stream to your data
            return contentStream;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

[1] https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/ContentStreamProvider.html
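
Applied to the repro in this issue, a minimal (untested) sketch of the third approach would be to open a fresh stream from the URL on every newStream() invocation, instead of handing the SDK a single already-opened stream:

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;

public class UploadWithFreshStreams {
    public static void main(String[] args) throws IOException {
        final var bucket = "XXXXXXXXXXXXXX"; // placeholder, as in the original repro
        final var key = "XXXXXXXX";          // placeholder, as in the original repro
        final var url = new URL(
            "https://raw.githubusercontent.com/aws/aws-sdk-java-v2/ad35231f768e1bb68e6f77cb29f69d1a7278931e/.changes/next-release/feature-AmazonS3-c101d4d.json"
        );

        try (var s3Client = S3Client.builder().region(Region.US_EAST_1).build()) {
            s3Client.putObject(
                PutObjectRequest.builder().bucket(bucket).key(key).build(),
                // Each call to newStream() opens a fresh stream, so retries and
                // checksum passes always read the content from the beginning.
                RequestBody.fromContentProvider(() -> {
                    try {
                        return url.openStream();
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }, "application/json")
            );
        }
    }
}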


This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
