chore(storage): add chunking, composing and cleanup logic for PCU #13600

Open
cpriti-os wants to merge 6 commits into googleapis:main from cpriti-os:pcu

Conversation

cpriti-os (Contributor) commented Jan 14, 2026

Asynchronous Chunking and Uploads

  • Data written by the user is buffered into fixed-size parts. Once a part buffer is full, or during Close(), it's dispatched for asynchronous upload via a dedicated pool of worker goroutines.
  • The pcuState.write() method handles buffering incoming data, dispatching full buffers, and acquiring new buffers from a pool (bufferCh).
  • pcuState.flushCurrentBuffer() encapsulates the logic for taking a filled buffer, incrementing the part number (s.partNum++), creating an uploadTask, and sending it to the uploadCh for processing by a worker (see the sketch below).
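
A minimal sketch of this buffering-and-dispatch flow, under assumed simplified signatures; the real pcuState carries more state (error tracking, options, naming), so the types and method bodies below are illustrative only:

    type uploadTask struct {
        partNum int
        buf     []byte
    }

    type pcuState struct {
        partSize      int
        partNum       int
        currentBuffer []byte
        bufferCh      chan []byte     // pool of reusable part buffers, each pre-allocated to partSize
        uploadCh      chan uploadTask // consumed by the upload worker goroutines
    }

    // write buffers incoming data and dispatches each full part for upload.
    func (s *pcuState) write(p []byte) (int, error) {
        written := 0
        for len(p) > 0 {
            if s.currentBuffer == nil {
                // Acquire a buffer from the pool; blocks if all buffers are in flight.
                s.currentBuffer = (<-s.bufferCh)[:0]
            }
            n := copy(s.currentBuffer[len(s.currentBuffer):s.partSize], p)
            s.currentBuffer = s.currentBuffer[:len(s.currentBuffer)+n]
            p, written = p[n:], written+n
            if len(s.currentBuffer) == s.partSize {
                s.flushCurrentBuffer() // part is full: hand it to a worker
            }
        }
        return written, nil
    }

    // flushCurrentBuffer sends the filled buffer to the workers; Close() would
    // call it once more for the final, partially filled part.
    func (s *pcuState) flushCurrentBuffer() {
        if len(s.currentBuffer) == 0 {
            return
        }
        s.partNum++
        s.uploadCh <- uploadTask{partNum: s.partNum, buf: s.currentBuffer}
        s.currentBuffer = nil // the next write acquires a fresh buffer from bufferCh
    }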

Recursive Composition for Large Objects

  • To handle objects larger than the maximum allowed number of composite components (currently 32), a multi-level composition strategy is implemented.
  • pcuState.composeParts() manages the recursive composition process, creating intermediate objects (up to 32 components per compose operation) until a final object is formed (see the sketch after this list).
  • Each level of intermediate composition can run in parallel using goroutines, with errors aggregated and handled.
  • Temporary object names for parts and intermediates are generated using a default naming strategy that includes random bytes (generateRandomBytes) to prevent hot-spotting.
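
An illustrative sketch of the multi-level strategy, written against the public Composer API rather than the PR's internal helpers; the intermediate naming scheme and the composeAll function below are hypothetical, and parallelism, retries, and single-part handling are omitted:

    import (
        "context"
        "fmt"

        "cloud.google.com/go/storage"
    )

    const maxComposeComponents = 32

    // composeAll composes the uploaded part objects in groups of up to 32 until
    // a single object remains. It assumes len(parts) > 1; a single part would
    // need a copy/rename instead.
    func composeAll(ctx context.Context, bkt *storage.BucketHandle, parts []string, finalName string) error {
        level := 0
        for len(parts) > 1 {
            var next []string
            for i := 0; i < len(parts); i += maxComposeComponents {
                end := i + maxComposeComponents
                if end > len(parts) {
                    end = len(parts)
                }
                // The last level writes the final object; earlier levels write
                // temporary intermediates (hypothetical naming scheme).
                name := finalName
                if len(parts) > maxComposeComponents {
                    name = fmt.Sprintf("%s-intermediate-%d-%d", finalName, level, i/maxComposeComponents)
                }
                srcs := make([]*storage.ObjectHandle, 0, end-i)
                for _, p := range parts[i:end] {
                    srcs = append(srcs, bkt.Object(p))
                }
                if _, err := bkt.Object(name).ComposerFrom(srcs...).Run(ctx); err != nil {
                    return err
                }
                next = append(next, name)
            }
            parts = next
            level++
        }
        return nil
    }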

Robust Cleanup

  • Temporary part and intermediate objects are actively managed and cleaned up on successful completion or on error, based on a configurable cleanupStrategy (e.g., cleanupAlways, cleanupOnSuccess).
  • pcuState.doCleanup() orchestrates the deletion of all temporary parts and intermediate composed objects, performing the deletions concurrently with a semaphore that bounds the number of goroutines (see the sketch below).
  • Deletion errors are recorded in failedDeletes so they can be surfaced later.
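
A rough sketch of that concurrent cleanup using a channel-based semaphore; the function name, tempObjects, and maxCleanupWorkers below are illustrative stand-ins, not the PR's actual identifiers:

    import (
        "context"
        "errors"
        "sync"

        "cloud.google.com/go/storage"
    )

    // doCleanup deletes every temporary object, bounding concurrency with a
    // semaphore and recording failures instead of aborting on the first error.
    func doCleanup(ctx context.Context, bkt *storage.BucketHandle, tempObjects []string) []string {
        const maxCleanupWorkers = 8
        sem := make(chan struct{}, maxCleanupWorkers)

        var (
            mu            sync.Mutex
            wg            sync.WaitGroup
            failedDeletes []string
        )
        for _, name := range tempObjects {
            wg.Add(1)
            sem <- struct{}{} // acquire a worker slot
            go func(name string) {
                defer wg.Done()
                defer func() { <-sem }() // release the slot
                err := bkt.Object(name).Delete(ctx)
                if err != nil && !errors.Is(err, storage.ErrObjectNotExist) {
                    mu.Lock()
                    failedDeletes = append(failedDeletes, name)
                    mu.Unlock()
                }
            }(name)
        }
        wg.Wait()
        return failedDeletes
    }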

@cpriti-os cpriti-os requested a review from a team as a code owner January 14, 2026 11:01
@cpriti-os cpriti-os requested a review from a team January 14, 2026 11:01
@product-auto-label product-auto-label bot added the api: storage label (Issues related to the Cloud Storage API) Jan 14, 2026
gemini-code-assist bot (Contributor)

Summary of Changes

Hello @cpriti-os, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Parallel Composite Upload (PCU) functionality by integrating comprehensive data chunking, multi-level object composition, and robust cleanup mechanisms. These changes enable the system to efficiently handle large file uploads by breaking them into manageable parts, assembling them intelligently, and ensuring that temporary artifacts are properly managed, all while improving the overall testability of the component.

Highlights

  • Chunking and Buffering Logic: Introduced core logic for chunking incoming data, buffering it, and dispatching full parts for parallel upload. This includes write and flushCurrentBuffer methods within pcuState to manage data flow efficiently.
  • Multi-Level Object Composition: Implemented composeParts to handle the composition of uploaded parts into a final object. This includes a multi-level composition strategy to manage scenarios where the number of parts exceeds the maximum allowed for a single compose operation, creating intermediate composite objects as needed.
  • Robust Cleanup Mechanism: Added doCleanup logic to remove temporary and intermediate objects created during the parallel upload process. This cleanup respects configurable strategies (e.g., cleanupAlways, cleanupOnSuccess, cleanupNever) and handles potential failures during deletion.
  • Enhanced Testability: Refactored pcuState to use injectable function fields (deleteFn, doCleanupFn, composePartsFn, composeFn) for key operations, significantly improving the testability and modularity of the parallel composite upload process (a sketch of this pattern follows the list).
  • Cryptographically Secure Randomness: Switched from math/rand to crypto/rand for generating random bytes used in temporary part names, enhancing security and uniqueness.
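
For readers unfamiliar with function-field injection, here is a small sketch under assumed signatures; the field types on the real pcuState and the newTestState helper are illustrative only:

    import (
        "context"
        "testing"
    )

    type pcuState struct {
        deleteFn       func(ctx context.Context, object string) error
        composeFn      func(ctx context.Context, dst string, srcs []string) error
        composePartsFn func(ctx context.Context, parts []string) (string, error)
        doCleanupFn    func(ctx context.Context) error
    }

    // newTestState wires in fakes so unit tests can observe behavior without
    // touching GCS; only deleteFn is shown, the other fields are stubbed the
    // same way.
    func newTestState(t *testing.T, deleted *[]string) *pcuState {
        t.Helper()
        return &pcuState{
            deleteFn: func(ctx context.Context, object string) error {
                *deleted = append(*deleted, object) // record instead of deleting
                return nil
            },
        }
    }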


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command              | Description
Code Review          | /gemini review       | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary      | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist  | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help         | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces significant new functionality for parallel composite uploads (PCU), including chunking data into parts, composing parts (recursively if needed), and cleaning up temporary objects. The changes are well-structured, using dependency injection for testability, and are accompanied by a comprehensive suite of unit tests covering many edge cases. The logic for concurrent part uploads, composition, and cleanup appears robust. I have two main points of feedback: a critical issue in generateRandomBytes where a potential error from crypto/rand.Read is ignored, which could lead to non-random data and object name collisions, and a high-severity issue in the close method's error handling, which could cause the cleanup logic to behave incorrectly on composition failures. After addressing these points, the implementation will be much more robust.

    // Acquire a buffer from the pool if we don't have one.
    if s.currentBuffer == nil {
        // Fail-fast check before taking a new buffer.
        s.mu.Lock()
Contributor

Can you move this logic to a function? It is repeated in multiple places.

Contributor Author

Only three lines could be moved into a function, and that would replace just three instances; I don't think extracting such a small piece of logic would add to readability.

krishnamd-jkp (Contributor)

initial review

@cpriti-os cpriti-os requested review from a team and igorbernstein2 as code owners February 12, 2026 04:28

snippet-bot bot commented Feb 12, 2026

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add the snippet-bot:force-run label.

cpriti-os (Contributor Author)

/gemini review

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces significant new logic for parallel composite uploads, including asynchronous chunking, recursive composition for large objects, and robust cleanup of temporary parts. The changes are well-structured, leveraging goroutines, channels, and other concurrency primitives effectively. The addition of comprehensive unit tests for the new functionality is also a great inclusion.

I've identified a couple of issues related to error handling that could lead to bugs under specific failure conditions, and one minor anti-pattern in the test code. My feedback includes suggestions to address these points.

Comment on lines +629 to +631
    b := make([]byte, n)
    _, _ = rand.Read(b)
    return b
Contributor

Severity: high

The error returned by rand.Read is ignored. According to the documentation, crypto/rand can return an error if the underlying OS entropy source fails. Ignoring this error could result in the function returning a buffer of zeros, which would lead to predictable temporary object names and potential collisions, defeating the purpose of using a cryptographically secure random generator. This is a critical failure condition, and the application should probably panic to avoid operating in an insecure state.

    b := make([]byte, n)
    if _, err := rand.Read(b); err != nil {
        // This is a catastrophic failure. The OS's entropy source is broken.
        panic(fmt.Sprintf("crypto/rand.Read failed: %v", err))
    }
    return b
