Infrastructure teams deploying machine learning models in cloud environments face a fresh class of supply-chain risk. Security researchers at Palo Alto Networks Unit 42 disclosed a flaw in Google's Vertex AI SDK for Python that permitted attackers to hijack model uploads without requiring any access to the victim's cloud project. The technique, dubbed "Pickle in the Middle," exploited a bucket-naming collision vulnerability that could allow code execution inside Google's model serving infrastructure.
How Bucket Squatting Works in This Context
The vulnerability centres on how the Vertex AI SDK resolves storage bucket names during model uploads. When a developer uploads a trained model, the SDK needs to stage it temporarily in a Google Cloud Storage bucket. The flaw allowed an attacker to create a publicly writable bucket with a predictable or guessable name before the legitimate user's bucket was provisioned, then intercept the upload request.
The attacker's bucket would receive the model artefact, which for Python-based frameworks is typically a pickled object. With control over the staging location, the attacker could swap the clean model file for a pickle poisoned with arbitrary Python code. When Google's infrastructure deserialised the uploaded model for serving, that malicious code would execute with the permissions of the serving container, potentially granting access to internal APIs, environment variables, or adjacent workloads.
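To make the deserialisation risk concrete, the minimal sketch below shows how the pickle format lets a serialised object name a callable to run at load time. The `Payload` class and the harmless arithmetic expression are illustrative assumptions, not the payload described in the research.

```python
import pickle


class Payload:
    """Illustrative stand-in for a poisoned model object (hypothetical)."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and invokes it on load.
        # A real attacker would substitute something far less benign.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes calls eval("6 * 7"); no Payload instance is rebuilt.
result = pickle.loads(blob)
print(result)
```

The point of the sketch is that the code runs inside whichever process calls `pickle.loads`, which in the scenario above would be the serving container rather than the developer's machine.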
This is not a novel concept in principle. Bucket squatting has been documented in other cloud contexts. What makes this instance notable is the direct path from unauthenticated attack to code execution inside the infrastructure provider's own serving layer, bypassing the user's own authentication boundaries entirely.
Why This Matters for Cloud Infrastructure Teams
ML workload deployment introduces additional complexity to the supply-chain attack surface. Unlike traditional application deployments, where code is often reviewed and versioned explicitly, model artefacts are frequently treated as opaque binary objects. A developer may trust the model file without realising that the upload mechanism can be compromised.
The SDK flaw also highlights a subtle but critical issue: cloud SDKs often make assumptions about resource naming and isolation that developers never question. When the SDK auto-generates a bucket name or derives one from project metadata, it's easy to assume that name is unique and secure. In practice, attackers can often enumerate naming schemes and race the provisioning process.
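A small sketch of why derived names are guessable follows. The naming pattern is a hypothetical example chosen for illustration, not the scheme the SDK actually uses, and the project ID and region are placeholders. Any name computed only from values an outsider can learn can be computed by the outsider too, whereas a random suffix removes that predictability.

```python
import secrets


def derived_bucket_name(project_id: str, region: str) -> str:
    """Hypothetical deterministic scheme built only from guessable inputs."""
    return f"{project_id}-staging-{region}"


def randomised_bucket_name(project_id: str, region: str) -> str:
    """Same scheme plus 8 random bytes (16 hex characters) as a suffix."""
    return f"{project_id}-staging-{region}-{secrets.token_hex(8)}"


# An attacker who knows the project ID and region arrives at the same name.
victim_name = derived_bucket_name("example-project", "us-central1")
attacker_guess = derived_bucket_name("example-project", "us-central1")
print(victim_name == attacker_guess)

# Two randomised names differ, so pre-registering one is impractical.
first_random = randomised_bucket_name("example-project", "us-central1")
second_random = randomised_bucket_name("example-project", "us-central1")
print(first_random != second_random)
```

Randomisation is only one option. Checking that an existing bucket belongs to the expected project before writing to it closes the same gap.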
For teams running Vertex AI in production, the practical risk depends on how tightly access to the SDK is controlled. If developers can use the SDK from arbitrary networks or untrusted build environments, an attacker positioned on the same network segment or controlling DNS could potentially intercept the bucket-creation and upload sequence. Even where no active exploitation has been observed, the period between disclosure of such a bug and the rollout of the patched SDK is when exposure is highest.
Mitigation and Detection Strategies
Google addressed this in the SDK, so the immediate step is ensuring your Vertex AI SDK is fully patched. Beyond that, several controls reduce exposure.
Explicitly specify bucket names in your deployment scripts rather than relying on auto-generation. Use project-level IAM policies to restrict who can create or modify storage buckets, ensuring that only trusted service accounts can provision upload locations. Enable bucket versioning and object integrity checks (such as signed checksums) so that modifications to uploaded artefacts are detectable before they reach the serving layer.
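One way to approximate the signed-checksum idea is an HMAC over the artefact bytes, computed before upload and verified before the artefact is loaded. The sketch below is a minimal illustration using only the standard library. The key, the sample bytes, and the function names are hypothetical, and this is not a feature of the Vertex AI SDK.

```python
import hashlib
import hmac


def sign_artifact(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the artefact bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_artifact(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_artifact(data, key), expected_tag)


# Hypothetical values; in practice the key would come from a secret manager.
key = b"example-signing-key"
model_bytes = b"serialised model contents"
tag = sign_artifact(model_bytes, key)

untouched_ok = verify_artifact(model_bytes, key, tag)
tampered_ok = verify_artifact(model_bytes + b"tamper", key, tag)
print(untouched_ok, tampered_ok)
```

The tag has to be stored somewhere the attacker cannot write, otherwise a swapped artefact could simply ship with a matching tag.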
Monitor storage bucket creation events through Cloud Audit Logs, alerting on any buckets created outside your normal deployment pipeline. If you're using Vertex AI in a multi-tenant or federated environment, isolate model uploads to dedicated, pre-provisioned buckets under strict access controls rather than auto-provisioning.
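The alerting rule described above can be sketched as a filter over exported audit log entries. The field names (`protoPayload.methodName`, `authenticationInfo.principalEmail`, `resourceName`) are assumed to match the Cloud Audit Logs entry structure and should be checked against your own exports. The service account and the sample entries are hypothetical.

```python
from typing import Iterable

# Hypothetical allowlist of principals used by the deployment pipeline.
TRUSTED_PRINCIPALS = {"deployer@example-project.iam.gserviceaccount.com"}


def unexpected_bucket_creations(entries: Iterable[dict], trusted: set) -> list:
    """Return bucket-creation entries whose principal is not in `trusted`."""
    flagged = []
    for entry in entries:
        payload = entry.get("protoPayload", {})
        if payload.get("methodName") != "storage.buckets.create":
            continue  # not a bucket-creation event
        principal = payload.get("authenticationInfo", {}).get("principalEmail")
        if principal not in trusted:
            flagged.append(entry)
    return flagged


# Hypothetical sample entries, trimmed to the fields the filter reads.
sample_entries = [
    {"protoPayload": {
        "methodName": "storage.buckets.create",
        "resourceName": "projects/_/buckets/pipeline-bucket",
        "authenticationInfo": {
            "principalEmail": "deployer@example-project.iam.gserviceaccount.com"}}},
    {"protoPayload": {
        "methodName": "storage.buckets.create",
        "resourceName": "projects/_/buckets/adhoc-bucket",
        "authenticationInfo": {"principalEmail": "someone@example.com"}}},
    {"protoPayload": {
        "methodName": "storage.objects.get",
        "authenticationInfo": {"principalEmail": "someone@example.com"}}},
]

alerts = unexpected_bucket_creations(sample_entries, TRUSTED_PRINCIPALS)
for alert in alerts:
    print(alert["protoPayload"]["resourceName"])
```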
From a broader infrastructure posture, treating ML model artefacts with the same code-signing and verification rigour as application binaries is worth serious consideration. A model file that can execute arbitrary code during deserialisation or serving deserves the same scrutiny as source code.
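Where pickled artefacts cannot be avoided, one defensive pattern is an unpickler that refuses any global not on an allowlist, following the approach described in the Python `pickle` documentation. The sketch below is a minimal illustration. The allowlist contents and the helper name are assumptions, and real model formats usually need a longer and more carefully reviewed list.

```python
import io
import pickle

# Hypothetical allowlist: (module, name) pairs permitted during loading.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global by name.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Load a pickle, rejecting any global outside ALLOWED_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and numbers reference no globals, so they load normally.
weights = restricted_loads(pickle.dumps({"weights": [0.1, 0.2]}))
print(weights)

# A stream that references builtins.eval is rejected before anything runs.
try:
    restricted_loads(pickle.dumps(eval))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)
```

An allowlist narrows what a poisoned artefact can reach but does not make pickle safe in general, so it complements signing and access controls rather than replacing them.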
Closing Thought
The Vertex AI SDK flaw is a reminder that cloud infrastructure security extends well beyond authentication and encryption at rest. The mechanisms SDKs use to orchestrate resources—naming, ordering, permissions—can introduce subtle but exploitable gaps. As ML workloads become standard infrastructure, treating model pipelines with the same adversarial mindset applied to application pipelines will reduce risk during both development and deployment.
