AWS - SageMaker Enum
tip
Learn & practice AWS Hacking:
HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking:
HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking:
HackTricks Training Azure Red Team Expert (AzRTE)
Support HackTricks
- Check the subscription plans!
- Join the 💬 Discord group or the telegram group or follow us on Twitter 🐦 @hacktricks_live.
- Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.
Service Overview
Amazon SageMaker is AWS' managed machine-learning platform that glues together notebooks, training infrastructure, orchestration, registries, and managed endpoints. A compromise of SageMaker resources typically provides:
- Long-lived IAM execution roles with broad S3, ECR, Secrets Manager, or KMS access.
- Access to sensitive datasets stored in S3, EFS, or inside feature stores.
- Network footholds inside VPCs (Studio apps, training jobs, endpoints).
- High-privilege presigned URLs that bypass console authentication.
Understanding how SageMaker is assembled is key before you pivot, persist, or exfiltrate data.
Core Building Blocks
- Studio Domains & Spaces: Web IDE (JupyterLab, Code Editor, RStudio). Each domain has a shared EFS file system and default execution role.
- Notebook Instances: Managed EC2 instances for standalone notebooks; use separate execution roles.
- Training / Processing / Transform Jobs: Ephemeral containers that pull code from ECR and data from S3.
- Pipelines & Experiments: Orchestrated workflows that describe all steps, inputs, and outputs.
- Models & Endpoints: Packaged artefacts deployed for inference via HTTPS endpoints.
- Feature Store & Data Wrangler: Managed services for data preparation and feature management.
- Autopilot & JumpStart: Automated ML and curated model catalogue.
- MLflow Tracking Servers: Managed MLflow UI/API with presigned access tokens.
Every resource references an execution role, S3 locations, container images, and optional VPC/KMS configuration; capture all of them during enumeration.
Account & Global Metadata
REGION=us-east-1
# Portfolio status, used when provisioning Studio resources
aws sagemaker get-sagemaker-servicecatalog-portfolio-status --region $REGION
# List execution roles used by models (extend to other resources as needed)
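# list-models summaries do not include the role, so describe each model
for m in $(aws sagemaker list-models --region $REGION --query 'Models[].ModelName' --output text); do
  aws sagemaker describe-model --model-name "$m" --region $REGION --query 'ExecutionRoleArn' --output text
done | sort -u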
# Generic tag sweep across any SageMaker ARN you know
aws sagemaker list-tags --resource-arn <sagemaker-arn> --region $REGION
Note any cross-account trust (execution roles or S3 buckets with external principals) and baseline restrictions such as service control policies (SCPs).
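As a rough sketch of the cross-account check, the helper below scans a role trust policy for AWS principals outside your own account. It assumes the policy was saved as JSON from `aws iam get-role --role-name <role> --query 'Role.AssumeRolePolicyDocument'`; the function names and the sample account IDs are hypothetical, and conditions, `NotPrincipal`, and federated principals are ignored.

```python
import re

# Matches a bare 12-digit account ID or an IAM/STS ARN and captures the account ID.
ACCOUNT_RE = re.compile(r"^(?:arn:aws[a-z-]*:(?:iam|sts)::)?(\d{12})(?::.*)?$")


def _as_list(value):
    """IAM JSON allows a single item or a list in most positions."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def external_principals(policy, own_account_id):
    """Return AWS principals in Allow statements that are not in own_account_id."""
    found = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            found.append("*")
            continue
        if not isinstance(principal, dict):
            continue
        for arn in _as_list(principal.get("AWS")):
            match = ACCOUNT_RE.match(arn)
            if arn == "*" or (match and match.group(1) != own_account_id):
                found.append(arn)
    return found


# Hypothetical trust policy used only for illustration.
sample_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Principal": {"Service": "sagemaker.amazonaws.com"},
         "Action": "sts:AssumeRole"},
        {"Effect": "Allow",
         "Principal": {"AWS": ["arn:aws:iam::111122223333:root",
                               "arn:aws:iam::999999999999:role/ci"]},
         "Action": "sts:AssumeRole"},
    ],
}
print(external_principals(sample_trust, "111122223333"))
```

The same function can be pointed at an S3 bucket policy, since the `Principal` element has the same shape there.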
Studio Domains, Apps & Shared Spaces
aws sagemaker list-domains --region $REGION
aws sagemaker describe-domain --domain-id <domain-id> --region $REGION
aws sagemaker list-user-profiles --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-user-profile --domain-id <domain-id> --user-profile-name <profile> --region $REGION
# Enumerate apps (JupyterServer, KernelGateway, RStudioServerPro, CodeEditor, Canvas, etc.)
aws sagemaker list-apps --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-app --domain-id <domain-id> --user-profile-name <profile> --app-type JupyterServer --app-name default --region $REGION
# Shared collaborative spaces
aws sagemaker list-spaces --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-space --domain-id <domain-id> --space-name <space> --region $REGION
# Studio lifecycle configurations (shell scripts at start/stop)
aws sagemaker list-studio-lifecycle-configs --region $REGION
aws sagemaker describe-studio-lifecycle-config --studio-lifecycle-config-name <name> --region $REGION
What to record:
- DomainArn, AppSecurityGroupIds, SubnetIds, DefaultUserSettings.ExecutionRole.
- Mounted EFS (HomeEfsFileSystemId) and S3 home directories.
- Lifecycle scripts (often contain bootstrap credentials or push/pull extra code).
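To keep those fields together per domain, a small helper along these lines can flatten the JSON from `describe-domain` and resolve which execution role a given user profile actually runs with. This is only a sketch: the function names are hypothetical, the sample values are made up, and it reads only response fields I am confident exist (`DomainArn`, `AuthMode`, `VpcId`, `SubnetIds`, `AppNetworkAccessType`, `HomeEfsFileSystemId`, `DefaultUserSettings`, and `UserSettings` on the profile).

```python
def summarize_domain(domain):
    """Flatten the parts of a describe-domain response worth recording."""
    defaults = domain.get("DefaultUserSettings", {})
    return {
        "DomainArn": domain.get("DomainArn"),
        "AuthMode": domain.get("AuthMode"),
        "VpcId": domain.get("VpcId"),
        "SubnetIds": domain.get("SubnetIds", []),
        "AppNetworkAccessType": domain.get("AppNetworkAccessType"),
        "HomeEfsFileSystemId": domain.get("HomeEfsFileSystemId"),
        "DefaultExecutionRole": defaults.get("ExecutionRole"),
        "DefaultSecurityGroups": defaults.get("SecurityGroups", []),
    }


def effective_execution_role(domain, user_profile):
    """A profile-level role overrides the domain default when it is set."""
    profile_role = user_profile.get("UserSettings", {}).get("ExecutionRole")
    if profile_role:
        return profile_role
    return domain.get("DefaultUserSettings", {}).get("ExecutionRole")


# Made-up responses shaped like describe-domain / describe-user-profile output.
sample_domain = {
    "DomainArn": "arn:aws:sagemaker:us-east-1:111122223333:domain/d-example",
    "AuthMode": "IAM",
    "VpcId": "vpc-0example",
    "SubnetIds": ["subnet-0a"],
    "AppNetworkAccessType": "VpcOnly",
    "HomeEfsFileSystemId": "fs-0example",
    "DefaultUserSettings": {
        "ExecutionRole": "arn:aws:iam::111122223333:role/studio-default",
        "SecurityGroups": ["sg-0a"],
    },
}
sample_profile = {
    "UserProfileName": "alice",
    "UserSettings": {"ExecutionRole": "arn:aws:iam::111122223333:role/studio-alice"},
}
print(summarize_domain(sample_domain))
print(effective_execution_role(sample_domain, sample_profile))
```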
tip
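Presigned Studio URLs bypass console authentication: any principal allowed to call sagemaker:CreatePresignedDomainUrl for a user profile can open Studio as that user, so check how broadly that action is granted.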
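One way to spot broad grants is to scan identity policies for Allow statements that cover `sagemaker:CreatePresignedDomainUrl` with a wildcard resource. The sketch below assumes the policy document was exported as JSON (for example with `aws iam get-policy-version`); the helper names and the sample policy are hypothetical, and `NotAction`, `Deny` statements, and conditions are deliberately not evaluated.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM JSON allows a single item or a list in most positions."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allow_statements_for(policy, action):
    """Return Allow statements whose Action patterns match the given action."""
    action = action.lower()
    hits = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        patterns = [p.lower() for p in _as_list(stmt.get("Action"))]
        if any(fnmatchcase(action, p) for p in patterns):
            hits.append(stmt)
    return hits


def wildcard_presign_statements(policy):
    """Statements granting CreatePresignedDomainUrl on a wildcard resource."""
    hits = allow_statements_for(policy, "sagemaker:CreatePresignedDomainUrl")
    return [s for s in hits if any("*" in r for r in _as_list(s.get("Resource")))]


# Hypothetical identity policy used only for illustration.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sagemaker:*", "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["sagemaker:CreatePresignedDomainUrl"],
         "Resource": "arn:aws:sagemaker:us-east-1:111122223333:user-profile/d-example/alice"},
    ],
}
print(len(wildcard_presign_statements(sample_policy)))
```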
Notebook Instances & Lifecycle Configs
aws sagemaker list-notebook-instances --region $REGION
aws sagemaker describe-notebook-instance --notebook-instance-name <name> --region $REGION
aws sagemaker list-notebook-instance-lifecycle-configs --region $REGION
aws sagemaker describe-notebook-instance-lifecycle-config --notebook-instance-lifecycle-config-name <cfg> --region $REGION
Notebook metadata reveals:
- Execution role (RoleArn), direct internet access vs. VPC-only mode.
- Code repositories (DefaultCodeRepository) and the DirectInternetAccess and RootAccess settings.
- Lifecycle scripts for credentials or persistence hooks.
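Lifecycle scripts come back base64-encoded in the `OnCreate` and `OnStart` lists of `describe-notebook-instance-lifecycle-config`, so they need decoding before review. The following is a minimal sketch under that assumption; the helper names, the flag labels, and the secret-hint regex are illustrative choices, not a complete detector.

```python
import base64
import re

# Loose hints only; tune for the environment being reviewed.
SECRET_HINT = re.compile(
    r"(AKIA[0-9A-Z]{16}|aws_secret_access_key|password\s*=|token\s*=)",
    re.IGNORECASE,
)


def decode_lifecycle_scripts(config):
    """Decode every OnCreate/OnStart script into readable text."""
    scripts = {}
    for hook in ("OnCreate", "OnStart"):
        for index, entry in enumerate(config.get(hook, [])):
            raw = base64.b64decode(entry.get("Content", ""))
            scripts[f"{hook}[{index}]"] = raw.decode("utf-8", errors="replace")
    return scripts


def suspicious_lines(scripts):
    """Return (script name, line) pairs that look like embedded secrets."""
    return [
        (name, line.strip())
        for name, text in scripts.items()
        for line in text.splitlines()
        if SECRET_HINT.search(line)
    ]


def notebook_flags(notebook):
    """Label risky settings from a describe-notebook-instance response."""
    flags = []
    if notebook.get("DirectInternetAccess") == "Enabled":
        flags.append("direct-internet")
    if notebook.get("RootAccess") == "Enabled":
        flags.append("root-access")
    if not notebook.get("SubnetId"):
        flags.append("no-customer-vpc")
    return flags


# Made-up lifecycle config for illustration.
sample_config = {
    "OnStart": [{
        "Content": base64.b64encode(
            b"#!/bin/bash\nexport API_TOKEN=abc123\necho hi\n"
        ).decode("ascii")
    }]
}
print(suspicious_lines(decode_lifecycle_scripts(sample_config)))
```

Studio lifecycle configs can be handled the same way, as their script body is also returned base64-encoded.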
Training, Processing, Transform & Batch Jobs
aws sagemaker list-training-jobs --region $REGION
aws sagemaker describe-training-job --training-job-name <job> --region $REGION
aws sagemaker list-processing-jobs --region $REGION
aws sagemaker describe-processing-job --processing-job-name <job> --region $REGION
aws sagemaker list-transform-jobs --region $REGION
aws sagemaker describe-transform-job --transform-job-name <job> --region $REGION
Scrutinise:
- AlgorithmSpecification.TrainingImage / AppSpecification.ImageUri: which ECR images are deployed.
- InputDataConfig & OutputDataConfig: S3 buckets, prefixes, and KMS keys.
- ResourceConfig.VolumeKmsKeyId, VpcConfig, EnableNetworkIsolation: determine network or encryption posture.
- HyperParameters may leak environment secrets or connection strings.
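The fields above can be pulled out of a `describe-training-job` response with a helper like the one below. It is a sketch that assumes the JSON was saved from the CLI; the function names and the keyword regex for "sensitive-looking" parameter names are hypothetical, and processing or transform jobs would need their own field paths.

```python
import re

# Parameter names that often indicate a secret or connection string.
SENSITIVE_KEY = re.compile(
    r"(secret|passw|token|api[_-]?key|credential|conn)", re.IGNORECASE
)


def summarize_training_job(job):
    """Collect image, data locations, and network/encryption posture."""
    inputs = [
        channel.get("DataSource", {}).get("S3DataSource", {}).get("S3Uri")
        for channel in job.get("InputDataConfig", [])
    ]
    output = job.get("OutputDataConfig", {})
    return {
        "RoleArn": job.get("RoleArn"),
        "Image": job.get("AlgorithmSpecification", {}).get("TrainingImage"),
        "InputS3Uris": [uri for uri in inputs if uri],
        "OutputS3Path": output.get("S3OutputPath"),
        "OutputKmsKeyId": output.get("KmsKeyId"),
        "VolumeKmsKeyId": job.get("ResourceConfig", {}).get("VolumeKmsKeyId"),
        "InVpc": bool(job.get("VpcConfig")),
        "NetworkIsolation": bool(job.get("EnableNetworkIsolation")),
    }


def sensitive_parameters(job):
    """Names of hyperparameters or environment variables worth reading."""
    merged = {**job.get("HyperParameters", {}), **job.get("Environment", {})}
    return sorted(name for name in merged if SENSITIVE_KEY.search(name))


# Made-up job description for illustration.
sample_job = {
    "RoleArn": "arn:aws:iam::111122223333:role/train",
    "AlgorithmSpecification": {
        "TrainingImage": "111122223333.dkr.ecr.us-east-1.amazonaws.com/train:latest"
    },
    "InputDataConfig": [
        {"ChannelName": "train",
         "DataSource": {"S3DataSource": {"S3Uri": "s3://example-data/train/"}}}
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://example-models/out/"},
    "HyperParameters": {"epochs": "10", "db_password": "redacted"},
    "Environment": {"MLFLOW_TOKEN": "redacted"},
    "EnableNetworkIsolation": False,
}
print(summarize_training_job(sample_job))
print(sensitive_parameters(sample_job))
```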
Pipelines, Experiments & Trials
aws sagemaker list-pipelines --region $REGION
aws sagemaker list-pipeline-executions --pipeline-name <pipeline> --region $REGION
aws sagemaker describe-pipeline --pipeline-name <pipeline> --region $REGION
aws sagemaker list-experiments --region $REGION
aws sagemaker list-trials --experiment-name <experiment> --region $REGION
aws sagemaker list-trial-components --trial-name <trial> --region $REGION
Pipeline definitions detail every step, associated roles, container images, and environment variables. Trial components often contain training artefact URIs, S3 logs, and metrics that hint at sensitive data flow.
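`describe-pipeline` returns the definition as a JSON string in `PipelineDefinition`, so it has to be parsed before the steps can be inspected. A rough sketch of that follows: it walks the parsed definition and collects string values under a few key names. The key list and function names are assumptions chosen for illustration, and values expressed as pipeline parameter references (nested objects rather than strings) are skipped.

```python
import json

# Key names that usually point at roles, images, or data locations.
INTERESTING_KEYS = {
    "RoleArn", "TrainingImage", "ImageUri", "Image",
    "S3Uri", "S3OutputPath", "ModelDataUrl",
}


def collect(node, found=None):
    """Recursively gather string values stored under INTERESTING_KEYS."""
    found = {} if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            if key in INTERESTING_KEYS and isinstance(value, str):
                found.setdefault(key, set()).add(value)
            else:
                collect(value, found)
    elif isinstance(node, list):
        for item in node:
            collect(item, found)
    return found


def pipeline_references(describe_output):
    """Parse PipelineDefinition and return sorted references per key."""
    definition = json.loads(describe_output["PipelineDefinition"])
    refs = collect(definition)
    return {key: sorted(values) for key, values in sorted(refs.items())}


# Made-up describe-pipeline output for illustration.
sample_pipeline = {
    "PipelineName": "demo",
    "RoleArn": "arn:aws:iam::111122223333:role/pipeline",
    "PipelineDefinition": json.dumps({
        "Version": "2020-12-01",
        "Steps": [{
            "Name": "Train",
            "Type": "Training",
            "Arguments": {
                "AlgorithmSpecification": {
                    "TrainingImage": "111122223333.dkr.ecr.us-east-1.amazonaws.com/train:latest"
                },
                "RoleArn": "arn:aws:iam::111122223333:role/train",
                "OutputDataConfig": {"S3OutputPath": "s3://example-models/out"},
            },
        }],
    }),
}
print(pipeline_references(sample_pipeline))
```

The pipeline's own role is the top-level `RoleArn` of the response and sits outside the definition, so record it separately.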
Models, Endpoint Configurations & Deployed Endpoints
aws sagemaker list-models --region $REGION
aws sagemaker describe-model --model-name <name> --region $REGION
aws sagemaker list-endpoint-configs --region $REGION
aws sagemaker describe-endpoint-config --endpoint-config-name <cfg> --region $REGION
aws sagemaker list-endpoints --region $REGION
aws sagemaker describe-endpoint --endpoint-name <endpoint> --region $REGION
Focus areas:
- Model artefact S3 URIs (PrimaryContainer.ModelDataUrl) and inference container images.
- Endpoint data capture configuration (S3 bucket, KMS) for possible log exfil.
- Multi-model endpoints using S3DataSource or ModelPackage (check for cross-account packaging).
- Network configs and security groups attached to endpoints.
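A sketch of how these focus areas map onto the `describe-model` and `describe-endpoint-config` responses is shown below. The function names and sample values are hypothetical; the code reads `PrimaryContainer`, `Containers`, `VpcConfig`, and `DataCaptureConfig`, and treats a missing field as "not configured".

```python
def model_containers(model):
    """Return PrimaryContainer (if any) followed by the Containers list."""
    containers = list(model.get("Containers", []))
    if model.get("PrimaryContainer"):
        containers.insert(0, model["PrimaryContainer"])
    return containers


def summarize_model(model):
    """Collect images, artefact locations, and packaging hints for a model."""
    containers = model_containers(model)
    return {
        "ExecutionRoleArn": model.get("ExecutionRoleArn"),
        "Images": [c["Image"] for c in containers if c.get("Image")],
        "ModelDataUrls": [c["ModelDataUrl"] for c in containers if c.get("ModelDataUrl")],
        "MultiModel": any(c.get("Mode") == "MultiModel" for c in containers),
        "ModelPackages": [c["ModelPackageName"] for c in containers if c.get("ModelPackageName")],
        "InVpc": bool(model.get("VpcConfig")),
    }


def data_capture_target(endpoint_config):
    """Return where captured requests/responses are written, or None."""
    capture = endpoint_config.get("DataCaptureConfig") or {}
    if not capture.get("EnableCapture"):
        return None
    return {
        "DestinationS3Uri": capture.get("DestinationS3Uri"),
        "KmsKeyId": capture.get("KmsKeyId"),
    }


# Made-up responses for illustration.
sample_model = {
    "ModelName": "demo",
    "ExecutionRoleArn": "arn:aws:iam::111122223333:role/inference",
    "PrimaryContainer": {
        "Image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/serve:latest",
        "ModelDataUrl": "s3://example-models/multi/",
        "Mode": "MultiModel",
    },
}
sample_endpoint_config = {
    "DataCaptureConfig": {
        "EnableCapture": True,
        "DestinationS3Uri": "s3://example-capture/demo",
    }
}
print(summarize_model(sample_model))
print(data_capture_target(sample_endpoint_config))
```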
Feature Store, Data Wrangler & Clarify
aws sagemaker list-feature-groups --region $REGION
aws sagemaker describe-feature-group --feature-group-name <feature-group> --region $REGION
# Data Wrangler exposes no list/describe API; flows are .flow files kept in Studio storage (EFS) or S3
aws s3 ls s3://<bucket>/ --recursive | grep '\.flow$'
aws sagemaker list-model-quality-job-definitions --region $REGION
aws sagemaker list-monitoring-schedules --region $REGION
Security takeaways:
- Feature groups with both stores replicate online records to the offline store in S3; check OnlineStoreConfig.SecurityConfig.KmsKeyId, the offline S3 location, and VPC.
- Data Wrangler flows often embed JDBC/Redshift credentials or private endpoints.
- Clarify/Model Monitor jobs export data to S3 which might be world-readable or cross-account accessible.
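For the feature store checks, a helper along these lines can turn a `describe-feature-group` response into a short list of findings. It is a sketch only: the function name and finding strings are hypothetical, and a missing `KmsKeyId` is reported simply as "not set" rather than judged as a vulnerability.

```python
def feature_group_findings(group):
    """List encryption and storage observations for one feature group."""
    findings = []
    online = group.get("OnlineStoreConfig") or {}
    offline = group.get("OfflineStoreConfig") or {}

    if online.get("EnableOnlineStore"):
        if not online.get("SecurityConfig", {}).get("KmsKeyId"):
            findings.append("online store: no KmsKeyId set")

    s3_config = offline.get("S3StorageConfig") or {}
    if s3_config.get("S3Uri"):
        findings.append(f"offline store: writes to {s3_config['S3Uri']}")
        if not s3_config.get("KmsKeyId"):
            findings.append("offline store: no KmsKeyId set")

    return findings


# Made-up response for illustration.
sample_group = {
    "FeatureGroupName": "customers",
    "RoleArn": "arn:aws:iam::111122223333:role/feature-store",
    "OnlineStoreConfig": {"EnableOnlineStore": True},
    "OfflineStoreConfig": {
        "S3StorageConfig": {
            "S3Uri": "s3://example-features/offline",
            "KmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/example",
        }
    },
}
print(feature_group_findings(sample_group))
```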
MLflow Tracking Servers, Autopilot & JumpStart
aws sagemaker list-mlflow-tracking-servers --region $REGION
aws sagemaker describe-mlflow-tracking-server --tracking-server-name <name> --region $REGION
aws sagemaker list-auto-ml-jobs --region $REGION
aws sagemaker describe-auto-ml-job --auto-ml-job-name <name> --region $REGION
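# JumpStart has no dedicated list API; its models are exposed through SageMaker Hub
aws sagemaker list-hubs --region $REGION
aws sagemaker list-hub-contents --hub-name SageMakerPublicHub --hub-content-type Model --region $REGION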
- MLflow tracking servers store experiments and artefacts; presigned URLs can expose everything.
- Autopilot jobs spin up multiple training jobs; enumerate outputs for hidden data.
- JumpStart reference architectures may deploy privileged roles into the account.
IAM & Networking Considerations
- Enumerate IAM policies attached to all execution roles (Studio, notebooks, training jobs, pipelines, endpoints).
- Check network contexts: subnets, security groups, VPC endpoints. Many organisations isolate training jobs but forget to restrict outbound traffic.
- Review S3 bucket policies referenced in ModelDataUrl, DataCaptureConfig, InputDataConfig for external access.
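Before reviewing bucket policies it helps to reduce every S3 URI gathered during enumeration to a unique set of buckets. A minimal sketch, assuming the URIs have already been collected into a list (the function name and sample URIs are hypothetical):

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_bucket(uris):
    """Map each bucket name to the sorted prefixes/keys seen for it."""
    buckets = defaultdict(set)
    for uri in uris:
        parsed = urlparse(uri)
        # Skip anything that is not an s3://bucket/... URI.
        if parsed.scheme != "s3" or not parsed.netloc:
            continue
        buckets[parsed.netloc].add(parsed.path.lstrip("/"))
    return {bucket: sorted(prefixes) for bucket, prefixes in sorted(buckets.items())}


# Made-up URIs for illustration.
sample_uris = [
    "s3://example-models/a/model.tar.gz",
    "s3://example-models/b/",
    "s3://example-capture/logs",
    "https://example.com/not-s3",
]
print(group_by_bucket(sample_uris))
```

Each resulting bucket can then be checked with `aws s3api get-bucket-policy --bucket <bucket>`.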
Privilege Escalation
Persistence
Post-Exploitation
AWS - SageMaker Post-Exploitation
Unauthorized Access
AWS - SageMaker Unauthenticated Enum