AWS - SageMaker Enum

tip

Learn & practice AWS Hacking:HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Service Overview

Amazon SageMaker is AWS' managed machine-learning platform that glues together notebooks, training infrastructure, orchestration, registries, and managed endpoints. A compromise of SageMaker resources typically provides:

  • Long-lived IAM execution roles with broad S3, ECR, Secrets Manager, or KMS access.
  • Access to sensitive datasets stored in S3, EFS, or inside feature stores.
  • Network footholds inside VPCs (Studio apps, training jobs, endpoints).
  • High-privilege presigned URLs that bypass console authentication.

Understanding how SageMaker is assembled is key before you pivot, persist, or exfiltrate data.

Core Building Blocks

  • Studio Domains & Spaces: Web IDE (JupyterLab, Code Editor, RStudio). Each domain has a shared EFS file system and default execution role.
  • Notebook Instances: Managed EC2 instances for standalone notebooks; use separate execution roles.
  • Training / Processing / Transform Jobs: Ephemeral containers that pull code from ECR and data from S3.
  • Pipelines & Experiments: Orchestrated workflows that describe all steps, inputs, and outputs.
  • Models & Endpoints: Packaged artefacts deployed for inference via HTTPS endpoints.
  • Feature Store & Data Wrangler: Managed services for data preparation and feature management.
  • Autopilot & JumpStart: Automated ML and curated model catalogue.
  • MLflow Tracking Servers: Managed MLflow UI/API with presigned access tokens.

Every resource references an execution role, S3 locations, container images, and optional VPC/KMS configuration—capture all of them during enumeration.

Account & Global Metadata

bash
REGION=us-east-1
# Portfolio status, used when provisioning Studio resources
aws sagemaker get-sagemaker-servicecatalog-portfolio-status --region $REGION

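# List execution roles used by models (extend to other resources as needed)
# list-models returns only names and ARNs, so resolve each role via describe-model
aws sagemaker list-models --region $REGION --query 'Models[].ModelName' --output text | tr '\t' '\n' | \
  xargs -I{} aws sagemaker describe-model --model-name {} --region $REGION --query 'ExecutionRoleArn' --output text | sort -u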

# Generic tag sweep across any SageMaker ARN you know
aws sagemaker list-tags --resource-arn <sagemaker-arn> --region $REGION

Note any cross-account trust (execution roles or S3 buckets with external principals) and baseline restrictions such as service control policies (SCPs).
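One rough way to spot cross-account trust is to pull the 12-digit account IDs out of a role trust policy or bucket policy and drop your own. The sketch below assumes the policy JSON arrives on stdin; the function name `external_accounts` is illustrative and not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): list account IDs in a policy document
# that differ from the caller's own account.

# $1 = own account ID; policy JSON is read from stdin.
external_accounts() {
  local own_account="$1"
  # -o prints each match on its own line, -w keeps only standalone 12-digit tokens.
  grep -owE '[0-9]{12}' | grep -vx "$own_account" | sort -u
}

# Example usage (requires valid credentials):
#   ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
#   aws iam get-role --role-name <role> \
#     --query 'Role.AssumeRolePolicyDocument' --output json | external_accounts "$ACCOUNT"
#   aws s3api get-bucket-policy --bucket <bucket> \
#     --query Policy --output text | external_accounts "$ACCOUNT"
```

This is a coarse text match, so treat hits as leads to verify by reading the policy statement, not as proof of access.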

Studio Domains, Apps & Shared Spaces

bash
aws sagemaker list-domains --region $REGION
aws sagemaker describe-domain --domain-id <domain-id> --region $REGION
aws sagemaker list-user-profiles --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-user-profile --domain-id <domain-id> --user-profile-name <profile> --region $REGION

# Enumerate apps (JupyterServer, KernelGateway, RStudioServerPro, CodeEditor, Canvas, etc.)
aws sagemaker list-apps --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-app --domain-id <domain-id> --user-profile-name <profile> --app-type JupyterServer --app-name default --region $REGION

# Shared collaborative spaces
aws sagemaker list-spaces --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-space --domain-id <domain-id> --space-name <space> --region $REGION

# Studio lifecycle configurations (shell scripts at start/stop)
aws sagemaker list-studio-lifecycle-configs --region $REGION
aws sagemaker describe-studio-lifecycle-config --studio-lifecycle-config-name <name> --region $REGION

What to record:

  • DomainArn, AppSecurityGroupIds, SubnetIds, DefaultUserSettings.ExecutionRole.
  • Mounted EFS (HomeEfsFileSystemId) and S3 home directories.
  • Lifecycle scripts (often contain bootstrap credentials or push/pull extra code).
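Lifecycle script bodies are returned base64-encoded in `StudioLifecycleConfigContent`, so they need decoding before review. A minimal sketch, assuming a hypothetical helper name `dump_studio_lifecycle_configs` and that `base64 --decode` is available on the host:

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): print every Studio lifecycle script in cleartext.

# $1 = region
dump_studio_lifecycle_configs() {
  local region="$1" name
  for name in $(aws sagemaker list-studio-lifecycle-configs --region "$region" \
      --query 'StudioLifecycleConfigs[].StudioLifecycleConfigName' --output text); do
    echo "### $name"
    # The API returns the script body base64-encoded.
    aws sagemaker describe-studio-lifecycle-config \
      --studio-lifecycle-config-name "$name" --region "$region" \
      --query 'StudioLifecycleConfigContent' --output text | base64 --decode
    echo
  done
}

# Example usage:
#   dump_studio_lifecycle_configs "$REGION" | grep -iE 'aws_secret|token|curl|wget'
```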

tip

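Principals allowed to call sagemaker:CreatePresignedDomainUrl can mint a URL that opens Studio as the targeted user profile without any console login, so a broad grant on that action effectively bypasses authentication.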

Notebook Instances & Lifecycle Configs

bash
aws sagemaker list-notebook-instances --region $REGION
aws sagemaker describe-notebook-instance --notebook-instance-name <name> --region $REGION
aws sagemaker list-notebook-instance-lifecycle-configs --region $REGION
aws sagemaker describe-notebook-instance-lifecycle-config --notebook-instance-lifecycle-config-name <cfg> --region $REGION

Notebook metadata reveals:

  • Execution role (RoleArn) and network mode (DirectInternetAccess: direct internet access vs. VPC-only).
  • Git repositories in DefaultCodeRepository and AdditionalCodeRepositories, and whether RootAccess is enabled.
  • Lifecycle scripts for credentials or persistence hooks.
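To compare notebooks side by side, the describe output can be flattened to one tab-separated row per instance. The helper names `notebook_posture` and `risky_notebooks` below are illustrative, and the "risky" filter (internet access plus root) is only one possible heuristic.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helpers): one row per notebook with role and exposure flags.

# $1 = region. Output columns: name, role ARN, DirectInternetAccess, RootAccess.
notebook_posture() {
  local region="$1" name
  for name in $(aws sagemaker list-notebook-instances --region "$region" \
      --query 'NotebookInstances[].NotebookInstanceName' --output text); do
    aws sagemaker describe-notebook-instance --notebook-instance-name "$name" \
      --region "$region" \
      --query '[NotebookInstanceName,RoleArn,DirectInternetAccess,RootAccess]' \
      --output text
  done
}

# Reads notebook_posture rows on stdin and prints the names of notebooks
# that have both direct internet access and root access enabled.
risky_notebooks() {
  awk -F'\t' '$3 == "Enabled" && $4 == "Enabled" { print $1 }'
}

# Example usage:
#   notebook_posture "$REGION" | risky_notebooks
```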

Training, Processing, Transform & Batch Jobs

bash
aws sagemaker list-training-jobs --region $REGION
aws sagemaker describe-training-job --training-job-name <job> --region $REGION

aws sagemaker list-processing-jobs --region $REGION
aws sagemaker describe-processing-job --processing-job-name <job> --region $REGION

aws sagemaker list-transform-jobs --region $REGION
aws sagemaker describe-transform-job --transform-job-name <job> --region $REGION

Scrutinise:

  • AlgorithmSpecification.TrainingImage / AppSpecification.ImageUri – which ECR images are deployed.
  • InputDataConfig & OutputDataConfig – S3 buckets, prefixes, and KMS keys.
  • ResourceConfig.VolumeKmsKeyId, VpcConfig, EnableNetworkIsolation – determine network or encryption posture.
  • HyperParameters may leak environment secrets or connection strings.
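Hyperparameters come back as a flat string map, so a keyword filter is a quick first pass for leaked secrets. The sketch below assumes a hypothetical helper name `suspicious_hyperparameters`; the keyword list is a guess and should be tuned per target.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): surface hyperparameters that look like secrets.

# $1 = region, $2 = training job name
suspicious_hyperparameters() {
  local region="$1" job="$2"
  # The CLI pretty-prints JSON with one key per line, so a line-based grep works.
  # First pattern: key names hinting at secrets. Second: URLs with user:pass@host.
  aws sagemaker describe-training-job --training-job-name "$job" --region "$region" \
    --query 'HyperParameters' --output json \
    | grep -iE -e '"[^"]*(pass|secret|token|key|credential|conn)[^"]*"[[:space:]]*:' \
               -e '://[^/"]*:[^/"]*@'
}

# Example usage:
#   suspicious_hyperparameters "$REGION" <job>
```

Expect false positives (for example a key named `num_keys`); the goal is to narrow down what to read manually.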

Pipelines, Experiments & Trials

bash
aws sagemaker list-pipelines --region $REGION
aws sagemaker list-pipeline-executions --pipeline-name <pipeline> --region $REGION
aws sagemaker describe-pipeline --pipeline-name <pipeline> --region $REGION

aws sagemaker list-experiments --region $REGION
aws sagemaker list-trials --experiment-name <experiment> --region $REGION
aws sagemaker list-trial-components --trial-name <trial> --region $REGION

Pipeline definitions detail every step, associated roles, container images, and environment variables. Trial components often contain training artefact URIs, S3 logs, and metrics that hint at sensitive data flow.
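The pipeline definition is returned as a single JSON string in `PipelineDefinition`, which makes it easy to grep for roles, ECR images, and S3 URIs in one pass. A minimal sketch with a hypothetical helper name `pipeline_references`; it only finds literal values, so anything supplied through pipeline parameters will not show up.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): list role ARNs, ECR images, and S3 URIs
# that appear literally in a pipeline definition.

# $1 = region, $2 = pipeline name
pipeline_references() {
  local region="$1" pipeline="$2"
  aws sagemaker describe-pipeline --pipeline-name "$pipeline" --region "$region" \
    --query 'PipelineDefinition' --output text \
    | grep -oE -e 'arn:aws[a-z-]*:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+' \
               -e '[0-9]{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/[^"[:space:]]+' \
               -e 's3://[^"[:space:]]+' \
    | sort -u
}

# Example usage:
#   pipeline_references "$REGION" <pipeline>
```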

Models, Endpoint Configurations & Deployed Endpoints

bash
aws sagemaker list-models --region $REGION
aws sagemaker describe-model --model-name <name> --region $REGION

aws sagemaker list-endpoint-configs --region $REGION
aws sagemaker describe-endpoint-config --endpoint-config-name <cfg> --region $REGION

aws sagemaker list-endpoints --region $REGION
aws sagemaker describe-endpoint --endpoint-name <endpoint> --region $REGION

Focus areas:

  • Model artefact S3 URIs (PrimaryContainer.ModelDataUrl) and inference container images.
  • Endpoint data capture configuration (S3 bucket, KMS) for possible log exfil.
  • Multi-model endpoints using S3DataSource or ModelPackage (check for cross-account packaging).
  • Network configs and security groups attached to endpoints.
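Data capture settings live on the endpoint configuration, so they can be summarised across all configs in one loop. The helper name `endpoint_data_capture` is illustrative; fields that are unset print as `None` in text output.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): show where each endpoint config captures traffic.

# $1 = region. Output columns: config name, capture enabled, S3 destination, KMS key.
endpoint_data_capture() {
  local region="$1" cfg
  for cfg in $(aws sagemaker list-endpoint-configs --region "$region" \
      --query 'EndpointConfigs[].EndpointConfigName' --output text); do
    aws sagemaker describe-endpoint-config --endpoint-config-name "$cfg" \
      --region "$region" \
      --query '[EndpointConfigName,DataCaptureConfig.EnableCapture,DataCaptureConfig.DestinationS3Uri,DataCaptureConfig.KmsKeyId]' \
      --output text
  done
}

# Example usage:
#   endpoint_data_capture "$REGION" | awk -F'\t' '$2 == "True"'
```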

Feature Store, Data Wrangler & Clarify

bash
aws sagemaker list-feature-groups --region $REGION
aws sagemaker describe-feature-group --feature-group-name <feature-group> --region $REGION

# Data Wrangler exposes no list/describe API; flows are .flow files saved in Studio storage (EFS) or S3
aws s3 ls s3://<bucket>/ --recursive | grep '\.flow$'

aws sagemaker list-model-quality-job-definitions --region $REGION
aws sagemaker list-monitoring-schedules --region $REGION

Security takeaways:

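  • Feature groups with an offline store copy online records to S3 (OfflineStoreConfig.S3StorageConfig.S3Uri); check that bucket, OnlineStoreConfig.SecurityConfig.KmsKeyId, and whether access is restricted to VPC endpoints.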
  • Data Wrangler flows often embed JDBC/Redshift credentials or private endpoints.
  • Clarify/Model Monitor jobs export data to S3 which might be world-readable or cross-account accessible.

MLflow Tracking Servers, Autopilot & JumpStart

bash
aws sagemaker list-mlflow-tracking-servers --region $REGION
aws sagemaker describe-mlflow-tracking-server --tracking-server-name <name> --region $REGION

aws sagemaker list-auto-ml-jobs --region $REGION
aws sagemaker describe-auto-ml-job --auto-ml-job-name <name> --region $REGION

# JumpStart content is exposed through SageMaker hubs
aws sagemaker list-hubs --region $REGION
aws sagemaker list-hub-contents --hub-name <hub> --hub-content-type Model --region $REGION

Key points:

  • MLflow tracking servers store experiments and artefacts; presigned URLs can expose everything.
  • Autopilot jobs spin up multiple training jobs; enumerate their outputs for hidden data.
  • JumpStart reference architectures may deploy privileged roles into the account.

IAM & Networking Considerations

  • Enumerate IAM policies attached to all execution roles (Studio, notebooks, training jobs, pipelines, endpoints).
  • Check network contexts: subnets, security groups, VPC endpoints. Many organisations isolate training jobs but forget to restrict outbound traffic.
  • Review S3 bucket policies referenced in ModelDataUrl, DataCaptureConfig, InputDataConfig for external access.
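For each execution role ARN collected during enumeration, the attached and inline policies can be listed with standard IAM calls. A minimal sketch, assuming a hypothetical helper name `role_policies`; it takes the full role ARN because that is what SageMaker describe calls return.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): list managed and inline policies for a role ARN.

# $1 = role ARN, e.g. arn:aws:iam::<account-id>:role/service-role/<name>
role_policies() {
  local role_arn="$1" role_name
  # IAM APIs take the bare role name, so strip everything up to the last "/".
  role_name="${role_arn##*/}"
  echo "# managed policies for $role_name"
  aws iam list-attached-role-policies --role-name "$role_name" \
    --query 'AttachedPolicies[].PolicyArn' --output text | tr '\t' '\n'
  echo "# inline policies for $role_name"
  aws iam list-role-policies --role-name "$role_name" \
    --query 'PolicyNames[]' --output text | tr '\t' '\n'
}

# Example usage:
#   role_policies arn:aws:iam::<account-id>:role/<execution-role>
```

Follow up with `aws iam get-policy-version` or `aws iam get-role-policy` to read the actual statements.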

Privilege Escalation

AWS - Sagemaker Privesc

Persistence

Aws Sagemaker Persistence

Post-Exploitation

AWS - SageMaker Post-Exploitation

Unauthorized Access

AWS - SageMaker Unauthenticated Enum
