AWS - SageMaker Enum

Tip

Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Az Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Check the subscription plans!

Join the 💬 Discord group or the telegram group, or follow us on Twitter 🐦 @hacktricks_live.

Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud github repos.

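Service Overview

Amazon SageMaker is AWS's managed machine-learning platform; it brings together notebooks, training infrastructure, orchestration, registries, and managed endpoints. Compromising a SageMaker resource typically yields:

Long-lived IAM execution roles with broad access to S3, ECR, Secrets Manager, or KMS.
Access to sensitive datasets stored in S3, EFS, or feature stores.
A network foothold inside VPCs (Studio apps, training jobs, endpoints).
High-privilege presigned URLs that bypass console authentication.

Understanding how SageMaker is put together is essential before you pivot, persist, or exfiltrate data.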

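Core Building Blocks

Studio Domains & Spaces: Web IDE (JupyterLab, Code Editor, RStudio). Each domain has a shared EFS file system and a default execution role.
Notebook Instances: Managed EC2 instances for standalone notebooks; each uses its own execution role.
Training / Processing / Transform Jobs: Ephemeral containers that pull code from ECR and data from S3.
Pipelines & Experiments: Orchestrated workflows that describe every step, input, and output.
Models & Endpoints: Packaged artifacts deployed behind HTTPS endpoints for inference.
Feature Store & Data Wrangler: Managed services for data preparation and feature management.
Autopilot & JumpStart: Automated ML and a curated model catalog.
MLflow Tracking Servers: Managed MLflow UI/API accessed with presigned access tokens.

Every resource references an execution role, S3 locations, container images, and optionally a VPC/KMS configuration. Collect all of them during enumeration.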

Account & Global Metadata

REGION=us-east-1
# Portfolio status, used when provisioning Studio resources
aws sagemaker get-sagemaker-servicecatalog-portfolio-status --region $REGION

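# List execution roles used by models (list-models does not return the role, so describe each model; extend to other resources as needed)
for m in $(aws sagemaker list-models --region $REGION --query 'Models[].ModelName' --output text); do
  aws sagemaker describe-model --model-name "$m" --region $REGION --query 'ExecutionRoleArn' --output text
done | sort -u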

# Generic tag sweep across any SageMaker ARN you know
aws sagemaker list-tags --resource-arn <sagemaker-arn> --region $REGION

Note any cross-account trust (execution roles or S3 buckets with external principals) and baseline restrictions such as service control policies (SCPs).
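As a rough way to spot cross-account trust, the trust policy of an execution role can be searched for account IDs other than your own. The sketch below is an assumption-laden helper, not an AWS feature: `foreign_accounts` is a hypothetical name, and it treats any 12-digit number in the policy JSON as an account ID, so it can produce false positives (for example on external IDs in conditions).

```shell
#!/usr/bin/env bash
# Sketch: list 12-digit account IDs in a policy document that differ from ours.
# `foreign_accounts` is a hypothetical helper name.

# $1    = our own account ID
# stdin = policy JSON (for example a role's AssumeRolePolicyDocument)
foreign_accounts() {
  local own="$1"
  # Pull out every 12-digit run, de-duplicate, and drop our own account.
  # `|| true` keeps the exit status clean when nothing foreign is found.
  grep -oE '[0-9]{12}' | sort -u | grep -vx "$own" || true
}

# Example usage (needs valid credentials, so it is left commented out):
#   OWN=$(aws sts get-caller-identity --query Account --output text)
#   aws iam get-role --role-name <role-name> \
#     --query 'Role.AssumeRolePolicyDocument' --output json | foreign_accounts "$OWN"
```

Any ID printed here is worth checking by hand against the `Principal` element of the trust policy.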

Studio Domains, Apps & Shared Spaces

aws sagemaker list-domains --region $REGION
aws sagemaker describe-domain --domain-id <domain-id> --region $REGION
aws sagemaker list-user-profiles --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-user-profile --domain-id <domain-id> --user-profile-name <profile> --region $REGION

# Enumerate apps (JupyterServer, KernelGateway, RStudioServerPro, CodeEditor, Canvas, etc.)
aws sagemaker list-apps --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-app --domain-id <domain-id> --user-profile-name <profile> --app-type JupyterServer --app-name default --region $REGION

# Shared collaborative spaces
aws sagemaker list-spaces --domain-id-equals <domain-id> --region $REGION
aws sagemaker describe-space --domain-id <domain-id> --space-name <space> --region $REGION

# Studio lifecycle configurations (shell scripts at start/stop)
aws sagemaker list-studio-lifecycle-configs --region $REGION
aws sagemaker describe-studio-lifecycle-config --studio-lifecycle-config-name <name> --region $REGION

What to record:

DomainArn, SecurityGroupIdForDomainBoundary, DefaultUserSettings.SecurityGroups, SubnetIds, DefaultUserSettings.ExecutionRole.
The mounted EFS (HomeEfsFileSystemId) and S3 home directories.
Lifecycle scripts (these often contain bootstrap credentials or extra code for push/pull).

Tip

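If the permission to create them (sagemaker:CreatePresignedDomainUrl) is granted too broadly, presigned Studio URLs can bypass authentication.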
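Studio lifecycle scripts are returned base64-encoded in the `StudioLifecycleConfigContent` field, so they need decoding before they can be reviewed. A minimal sketch, assuming `REGION` is set as above; `dump_studio_lcc` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: print the decoded shell script of one Studio lifecycle config.
REGION="${REGION:-us-east-1}"

# $1 = lifecycle config name (as shown by list-studio-lifecycle-configs)
dump_studio_lcc() {
  aws sagemaker describe-studio-lifecycle-config \
    --studio-lifecycle-config-name "$1" --region "$REGION" \
    --query 'StudioLifecycleConfigContent' --output text \
    | base64 --decode
}

# Example usage (needs valid credentials, so it is left commented out):
#   dump_studio_lcc <name> | grep -nEi 'token|secret|password|curl|wget'
```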

Notebook Instances & Lifecycle Configs

aws sagemaker list-notebook-instances --region $REGION
aws sagemaker describe-notebook-instance --notebook-instance-name <name> --region $REGION
aws sagemaker list-notebook-instance-lifecycle-configs --region $REGION
aws sagemaker describe-notebook-instance-lifecycle-config --notebook-instance-lifecycle-config-name <cfg> --region $REGION

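Notebook metadata reveals:

The execution role (RoleArn) and whether the instance has direct internet access or runs in VPC-only mode (DirectInternetAccess).
Attached code repositories (DefaultCodeRepository, AdditionalCodeRepositories) and whether users get root (RootAccess).
Lifecycle scripts, which may contain credentials or persistence hooks.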
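The settings above can be pulled for every notebook instance in one pass. The following is a sketch under the assumption that `REGION` is set; `notebook_settings` and `risky_notebooks` are hypothetical helper names, and "risky" here simply means direct internet access or root access is enabled.

```shell
#!/usr/bin/env bash
# Sketch: summarize notebook instances and flag permissive settings.
REGION="${REGION:-us-east-1}"

# Prints one tab-separated line per instance:
#   name, DirectInternetAccess, RootAccess, RoleArn
notebook_settings() {
  local name
  for name in $(aws sagemaker list-notebook-instances --region "$REGION" \
      --query 'NotebookInstances[].NotebookInstanceName' --output text); do
    aws sagemaker describe-notebook-instance --notebook-instance-name "$name" \
      --region "$REGION" --output text \
      --query '[NotebookInstanceName,DirectInternetAccess,RootAccess,RoleArn]'
  done
}

# Keeps only instances where internet access or root access is "Enabled".
risky_notebooks() {
  notebook_settings | awk -F'\t' '$2 == "Enabled" || $3 == "Enabled"'
}

# Example usage (needs valid credentials, so it is left commented out):
#   risky_notebooks
```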

Training, Processing, Transform & Batch Jobs

aws sagemaker list-training-jobs --region $REGION
aws sagemaker describe-training-job --training-job-name <job> --region $REGION

aws sagemaker list-processing-jobs --region $REGION
aws sagemaker describe-processing-job --processing-job-name <job> --region $REGION

aws sagemaker list-transform-jobs --region $REGION
aws sagemaker describe-transform-job --transform-job-name <job> --region $REGION

Review closely:

AlgorithmSpecification.TrainingImage / AppSpecification.ImageUri – which ECR images are being run.
InputDataConfig & OutputDataConfig – S3 buckets, prefixes, and KMS keys.
ResourceConfig.VolumeKmsKeyId, VpcConfig, EnableNetworkIsolation – determine the network and encryption posture.
HyperParameters may leak environment secrets or connection strings.
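To triage hyperparameters quickly, the job description can be filtered for secret-looking keys and connection strings. This is a sketch only: the regular expression is an illustrative guess at common patterns, and `secretish_lines` and `scan_training_job` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: grep a training job's hyperparameters and environment for secrets.
REGION="${REGION:-us-east-1}"

# Keeps only stdin lines that look like credentials or connection strings.
# The pattern is a heuristic; `|| true` avoids a failing status on no match.
secretish_lines() {
  grep -iE 'passw(or)?d|secret|token|api[_-]?key|jdbc:|://[^/ "]+:[^@ "]+@' || true
}

# $1 = training job name
scan_training_job() {
  aws sagemaker describe-training-job --training-job-name "$1" \
    --region "$REGION" --output json \
    --query '{HyperParameters: HyperParameters, Environment: Environment}' \
    | secretish_lines
}

# Example usage (needs valid credentials, so it is left commented out):
#   scan_training_job <job>
```

Anything the filter misses still deserves a manual read; hyperparameter names are free-form.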

Pipelines, Experiments & Trials

aws sagemaker list-pipelines --region $REGION
aws sagemaker list-pipeline-executions --pipeline-name <pipeline> --region $REGION
aws sagemaker describe-pipeline --pipeline-name <pipeline> --region $REGION

aws sagemaker list-experiments --region $REGION
aws sagemaker list-trials --experiment-name <experiment> --region $REGION
aws sagemaker list-trial-components --trial-name <trial> --region $REGION

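Pipeline definitions spell out every step, the associated roles, container images, and environment variables. Trial components often include training artifact URIs, S3 logs, and metrics that hint at how sensitive data flows.

Models, Endpoint Configurations & Deployed Endpoints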

aws sagemaker list-models --region $REGION
aws sagemaker describe-model --model-name <name> --region $REGION

aws sagemaker list-endpoint-configs --region $REGION
aws sagemaker describe-endpoint-config --endpoint-config-name <cfg> --region $REGION

aws sagemaker list-endpoints --region $REGION
aws sagemaker describe-endpoint --endpoint-name <endpoint> --region $REGION

Focus areas:

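Model artifact S3 URIs (PrimaryContainer.ModelDataUrl) and inference container images.
Endpoint data capture configuration (S3 bucket, KMS), a possible source of logs to exfiltrate.
Multi-model endpoints that use S3DataSource or ModelPackage (check for cross-account packaging).
Network configuration and the security groups attached to endpoints.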
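Once a model's `ModelDataUrl` is known, the bucket behind it can be checked for a public or overly broad policy. A minimal sketch, assuming the URL has the usual `s3://bucket/key` form; `s3_bucket_of` and `bucket_exposure` are hypothetical helper names. Note that `get-bucket-policy-status` returns an error when the bucket has no policy at all.

```shell
#!/usr/bin/env bash
# Sketch: map an S3 URI to its bucket and inspect the bucket policy.

# Extracts the bucket name from an s3://bucket/key URI.
s3_bucket_of() {
  local rest="${1#s3://}"      # strip the scheme
  printf '%s\n' "${rest%%/*}"  # keep everything before the first slash
}

# $1 = S3 URI such as a ModelDataUrl or a data capture destination
bucket_exposure() {
  local bucket
  bucket=$(s3_bucket_of "$1")
  # True/False: does S3 consider the bucket policy public?
  aws s3api get-bucket-policy-status --bucket "$bucket" \
    --query 'PolicyStatus.IsPublic' --output text
  # Full policy text, to look for external principals by hand.
  aws s3api get-bucket-policy --bucket "$bucket" --query Policy --output text
}

# Example usage (needs valid credentials, so it is left commented out):
#   bucket_exposure "$(aws sagemaker describe-model --model-name <name> \
#     --query 'PrimaryContainer.ModelDataUrl' --output text)"
```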

Feature Store, Data Wrangler & Clarify

aws sagemaker list-feature-groups --region $REGION
aws sagemaker describe-feature-group --feature-group-name <feature-group> --region $REGION

# The AWS CLI has no list-data-wrangler-flows / describe-data-wrangler-flow subcommands;
# Data Wrangler flows are saved as .flow files, so search Studio storage (EFS) and S3 for them

aws sagemaker list-model-quality-job-definitions --region $REGION
aws sagemaker list-monitoring-schedules --region $REGION

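Security takeaways:

Feature groups can replicate online store data to an offline store in S3; check OnlineStoreConfig.SecurityConfig.KmsKeyId, OfflineStoreConfig.S3StorageConfig, and the VPC.
Data Wrangler flows often embed JDBC/Redshift credentials or private endpoints.
Clarify/Model Monitor jobs export data to S3, which may be publicly readable or accessible cross-account.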

MLflow Tracking Servers, Autopilot & JumpStart

aws sagemaker list-mlflow-tracking-servers --region $REGION
aws sagemaker describe-mlflow-tracking-server --tracking-server-name <name> --region $REGION

aws sagemaker list-auto-ml-jobs --region $REGION
aws sagemaker describe-auto-ml-job --auto-ml-job-name <name> --region $REGION

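# JumpStart content is exposed through SageMaker hubs (there are no list-jumpstart-* subcommands)
aws sagemaker list-hubs --region $REGION
aws sagemaker list-hub-contents --hub-name <hub> --hub-content-type Model --region $REGION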

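MLflow tracking servers store experiments and artifacts; presigned URLs can expose all of it.
Autopilot jobs launch many training jobs; enumerate their outputs to find hidden data.
JumpStart reference architectures may deploy privileged roles into the account.

IAM & Networking Considerations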

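Enumerate the IAM policies attached to every execution role (Studio, notebooks, training jobs, pipelines, endpoints).
Check the network context: subnets, security groups, VPC endpoints. Many organizations isolate training jobs but forget to restrict outbound traffic.
Check whether the S3 bucket policies (for buckets referenced in ModelDataUrl, DataCaptureConfig, InputDataConfig) allow external access.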
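The first item can be sketched as follows: IAM calls take a role name rather than an ARN, so the name is taken from the last path segment of the ARN collected during enumeration. `role_name_from_arn` and `role_policies` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: list managed and inline policies for an execution role ARN.

# The role name is the last "/"-separated segment of the ARN; this also
# handles roles created under a path such as role/service-role/<name>.
role_name_from_arn() {
  printf '%s\n' "${1##*/}"
}

# $1 = execution role ARN
role_policies() {
  local name
  name=$(role_name_from_arn "$1")
  # Managed policies attached to the role
  aws iam list-attached-role-policies --role-name "$name" \
    --query 'AttachedPolicies[].PolicyArn' --output text
  # Inline policy names embedded in the role
  aws iam list-role-policies --role-name "$name" \
    --query 'PolicyNames' --output text
}

# Example usage (needs valid credentials, so it is left commented out):
#   role_policies arn:aws:iam::<account-id>:role/<role-name>
```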

Privilege Escalation

AWS - Sagemaker Privesc

Persistence

Aws Sagemaker Persistence

Post-Exploitation

AWS - SageMaker Post-Exploitation

Unauthenticated Access

AWS - SageMaker Unauthenticated Enum
