GCP - Dataproc Enum


tip

Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)

Support HackTricks

Basic Information

Google Cloud Dataproc is a fully managed service for running Apache Spark, Apache Hadoop, Apache Flink, and other big data frameworks. It is primarily used for data processing, querying, machine learning, and stream analytics. Dataproc lets organizations create clusters for distributed computing and integrates with other Google Cloud Platform (GCP) services such as Cloud Storage, BigQuery, and Cloud Monitoring.

Dataproc clusters run on virtual machines (VMs), and the service account associated with these VMs determines the permissions and access level of the cluster.
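Because the attached service account defines what the cluster can reach, it is usually the first thing worth reading. The following is a minimal sketch, assuming `gcloud` is installed and authenticated; the helper name `cluster_service_account` is hypothetical and only wraps the `describe` command with a format projection.

```shell
# Hypothetical helper: print the service account attached to a cluster's VMs.
# An empty result usually means the field is unset, in which case the cluster
# is expected to fall back to the Compute Engine default service account.
cluster_service_account() {
  # $1 = cluster name, $2 = region
  gcloud dataproc clusters describe "$1" \
    --region="$2" \
    --format="value(config.gceClusterConfig.serviceAccount)"
}

# Usage (placeholders): cluster_service_account <cluster-name> <region>
```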

Components

A Dataproc cluster typically includes:

Master Node: Manages cluster resources and coordinates distributed tasks.

Worker Nodes: Execute distributed tasks.

Service Accounts: Handle API calls and access other GCP services.

Enumeration

Dataproc clusters, jobs, and configurations can be enumerated to gather sensitive information, such as service accounts, permissions, and potential misconfigurations.

Cluster Enumeration

To enumerate Dataproc clusters and retrieve their details:

gcloud dataproc clusters list --region=<region>
gcloud dataproc clusters describe <cluster-name> --region=<region>
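Since the commands above are scoped to a single region, checking several regions means repeating them. A minimal sketch of that loop follows; the function name `list_clusters_in_regions` is hypothetical, and the regions to check are passed in by the caller rather than discovered automatically.

```shell
# Hypothetical helper: list clusters (name and service account) in each
# region given as an argument, prefixing every result line with its region.
list_clusters_in_regions() {
  for region in "$@"; do
    # Errors (for example, API disabled or no access) are silenced so the
    # loop continues with the next region.
    gcloud dataproc clusters list --region="$region" \
      --format="value(clusterName,config.gceClusterConfig.serviceAccount)" 2>/dev/null |
      while IFS= read -r line; do
        if [ -n "$line" ]; then
          printf '%s\t%s\n' "$region" "$line"
        fi
      done
  done
}

# Usage (example regions): list_clusters_in_regions us-central1 europe-west1
```

Regions with no clusters, or where the call fails, simply produce no output.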

Job Enumeration

To enumerate Dataproc jobs and retrieve their details:

gcloud dataproc jobs list --region=<region>
gcloud dataproc jobs describe <job-id> --region=<region>
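The full `describe` output for a job can be long. As a sketch, the projection below narrows it to the fields that tend to matter during enumeration: the target cluster, the job state, the driver output location, and the job definition itself (arguments, properties, and file URIs). The helper name `job_summary` is hypothetical, and the selected field names are an assumption about which keys are present for a given job type; keys that do not exist are simply omitted from the output.

```shell
# Hypothetical helper: show a reduced view of a Dataproc job.
job_summary() {
  # $1 = job ID, $2 = region
  gcloud dataproc jobs describe "$1" \
    --region="$2" \
    --format="yaml(reference.jobId,placement.clusterName,status.state,driverOutputResourceUri,pysparkJob,sparkJob,hadoopJob)"
}

# Usage (placeholders): job_summary <job-id> <region>
```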

Privesc

GCP - Dataproc Privesc
