Source URL: https://simonwillison.net/2024/Nov/22/amazon-s3-append-data/#atom-everything
Source: Simon Willison’s Weblog
Title: Amazon S3 Express One Zone now supports the ability to append data to an object
Feedly Summary:
This is a first for Amazon S3: it is now possible to append data to an existing object in a bucket, where previously the only supported operation was to atomically replace the object with an updated version.
This is only available for S3 Express One Zone, a bucket class introduced a year ago which provides storage in just a single availability zone, providing significantly lower latency at the cost of reduced redundancy and a much higher price (16c/GB/month compared to 2.3c for S3 standard tier).
The fact that appends have never been supported for multi-availability zone S3 provides an interesting clue as to the underlying architecture. Guaranteeing that every copy of an object has received and applied an append is significantly harder than doing a distributed atomic swap to a new version.
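As a rough illustration of what an append might look like from client code, here is a minimal sketch. It is not taken from the post. It assumes that boto3's `put_object` accepts a `WriteOffsetBytes` argument for S3 Express One Zone buckets and that the offset must equal the object's current size. The helper name `append_to_object` is hypothetical, and the client is passed in so the logic can be exercised without AWS credentials.

```python
def append_to_object(s3_client, bucket: str, key: str, data: bytes) -> int:
    """Append `data` to an existing object and return the expected new size.

    `s3_client` is expected to behave like a boto3 S3 client.
    """
    # Look up the current size so the write can start at the end of the object.
    current_size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]

    # Assumption: WriteOffsetBytes is the boto3 parameter for the append
    # offset, and it must match the current object size exactly.
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        WriteOffsetBytes=current_size,
    )
    return current_size + len(data)
```

Because the size lookup and the write are separate requests, two concurrent writers could compute the same offset. This sketch does not attempt to handle that case.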
More details from the documentation:
There is no minimum size requirement for the data you can append to an object. However, the maximum size of the data that you can append to an object in a single request is 5GB. This is the same limit as the largest request size when uploading data using any Amazon S3 API.
With each successful append operation, you create a part of the object and each object can have up to 10,000 parts. This means you can append data to an object up to 10,000 times. If an object is created using S3 multipart upload, each uploaded part is counted towards the total maximum of 10,000 parts. For example, you can append up to 9,000 times to an object created by multipart upload comprising of 1,000 parts.
That 10,000 limit means this won’t quite work for constantly appending to a log file in a bucket.
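To make the 10,000-part budget concrete, the arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes that an object created with a single non-multipart upload starts with no parts counted against the limit, which the quoted documentation implies but does not state outright.

```python
MAX_PARTS = 10_000  # per-object part limit quoted from the documentation


def remaining_appends(existing_parts: int, max_parts: int = MAX_PARTS) -> int:
    """Return how many more appends an object can accept."""
    if existing_parts < 0:
        raise ValueError("existing_parts must be non-negative")
    return max(0, max_parts - existing_parts)


def hours_until_exhausted(appends_per_second: float, existing_parts: int = 0) -> float:
    """Return how long a steady append rate lasts before the limit is hit."""
    if appends_per_second <= 0:
        raise ValueError("appends_per_second must be positive")
    return remaining_appends(existing_parts) / appends_per_second / 3600
```

With these definitions, `remaining_appends(1000)` gives 9,000, matching the documentation's multipart example, and a logger that appends once per second would use up a fresh object in a little under three hours.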
Presumably it will be possible to “tail” an object that is receiving appended updates using the HTTP Range header.
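A minimal sketch of how such tailing might work: remember how many bytes have been read so far, then ask only for the bytes from that offset onward. The helper names are hypothetical, and `read_from_offset` is a local stand-in that mimics an open-ended ranged read against an in-memory byte string rather than a real bucket.

```python
from typing import Tuple


def range_header_from(offset: int) -> str:
    """Build an open-ended HTTP Range value starting at `offset`."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return f"bytes={offset}-"


def read_from_offset(stored: bytes, offset: int) -> Tuple[bytes, int]:
    """Simulate a ranged read: return the new bytes and the next offset."""
    new_bytes = stored[offset:]
    return new_bytes, offset + len(new_bytes)
```

Against a real bucket, the header value would presumably be passed as the `Range` argument of boto3's `get_object`. A real service may reject a range that starts at or beyond the current end of the object, so a poller would need to treat that response as "nothing new yet".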
Tags: s3, aws, scaling, architecture
AI Summary and Description: Yes
Summary: Amazon S3 now supports appending data to existing objects, but only in S3 Express One Zone buckets. Previously an object could only be replaced atomically with a new version, so the change may affect how compliance and infrastructure security professionals reason about data handling and storage in AWS.
Detailed Description:
Amazon S3’s recent update to allow appending data to objects in the S3 Express One Zone bucket class is a first for the service. The key points and implications are:
– **Feature Overview**:
– This is the first time Amazon S3 has supported appending data to an existing object; previously the only option was to atomically replace the object with an updated version.
– The new capability is exclusive to the S3 Express One Zone class, which emphasizes lower latency for applications needing faster access to data.
– **Performance Characteristics**:
– The S3 Express One Zone bucket class provides lower latency but comes with reduced data redundancy compared to multi-availability zone deployments.
– S3 Express One Zone is priced significantly higher, at 16c/GB/month compared to 2.3c for the S3 standard tier, which affects the cost-benefit analysis for businesses.
– **Technical Specifications**:
– There is no minimum size for an append, but the maximum size of a single append request is 5GB, the same as the largest request size when uploading data using any Amazon S3 API.
– Each successful append creates one part, and an object can have up to 10,000 parts. Parts from an S3 multipart upload count toward the same limit, so an object created from 1,000 parts can be appended to at most 9,000 times.
– **Limitations and Use Cases**:
– The 10,000-part limit means the feature does not quite work for constantly appending to a log file; workloads that append rapidly and continuously would need a different storage design, or a way to move on to a new object before the limit is reached.
– The use of the HTTP Range header may allow for ‘tailing’ an object receiving appended updates, a potential consideration for systems demanding real-time data access.
– **Impact on Security and Compliance**:
– This update necessitates a reevaluation of security measures regarding how object storage is managed, especially in terms of data integrity and compliance with regulations governing data storage and retrieval.
– Infrastructure security professionals may need to assess new risks associated with the architecture changes that come with appending data, particularly around data consistency and redundancy strategies.
In summary, append support in Amazon S3 Express One Zone changes how organizations can store and manage data in the AWS environment, and it brings new considerations around efficiency, higher storage cost, and security and compliance.