Enhance Generative AI Workflows Using NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import


NVIDIA and AWS rolled out DGX Cloud on AWS, a managed AI training platform with massive GPU clusters aimed at slashing time-to-train for generative AI models.

The platform pairs NVIDIA’s latest GPU technology with AWS’s scalable cloud compute. It offers up to 3,200 Gbps of network bandwidth using Amazon EC2 P5 (p5.48xlarge) instances, each loaded with eight H100 GPUs. Amazon FSx for Lustre provides high-speed shared storage, and Run:ai handles workload orchestration, boosting GPU utilization via scheduling and prioritization.
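The arithmetic behind these specs is easy to sketch. A minimal example, using only the per-node figures quoted above (8 H100 GPUs and up to 3,200 Gbps per p5.48xlarge node):

```python
# Back-of-the-envelope cluster sizing for DGX Cloud on AWS.
# Per-node figures from the article: 8 H100 GPUs and up to
# 3,200 Gbps of network bandwidth per p5.48xlarge instance.
GPUS_PER_NODE = 8
BANDWIDTH_GBPS_PER_NODE = 3200

def cluster_size(target_gpus: int) -> dict:
    """Return node count, GPU count, and aggregate bandwidth for a GPU target."""
    nodes = -(-target_gpus // GPUS_PER_NODE)  # ceiling division
    return {
        "nodes": nodes,
        "gpus": nodes * GPUS_PER_NODE,
        "aggregate_bandwidth_gbps": nodes * BANDWIDTH_GBPS_PER_NODE,
    }

# The 4-node, 32-GPU setup described later in the article:
print(cluster_size(32))  # {'nodes': 4, 'gpus': 32, 'aggregate_bandwidth_gbps': 12800}
```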

DGX Cloud users get private connectivity options via AWS PrivateLink and Transit Gateway, plus 24/7 NVIDIA AI and cloud expert support. The service is live in AWS Marketplace.


The launch demo shows a full pipeline: fine-tune Meta’s Llama 3.1 70B model using NVIDIA’s NeMo framework on DGX Cloud with multi-GPU setups, then deploy it at scale with Amazon Bedrock’s Custom Model Import. Bedrock lets customers import their own tuned models from Amazon S3 or SageMaker and run serverless inference via a single API. It also includes tools like Knowledge Bases, Guardrails, and Agents.
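The import side of the pipeline boils down to one API call. A hedged sketch with boto3, assuming the standard Bedrock `create_model_import_job` operation; the bucket, IAM role ARN, and names below are placeholders, not values from the article:

```python
# Sketch of starting a Bedrock Custom Model Import job. All identifiers
# here (job name, model name, role ARN, S3 URI) are hypothetical.

def build_import_job_request(job_name: str, model_name: str,
                             role_arn: str, s3_uri: str) -> dict:
    """Assemble the request body for bedrock.create_model_import_job."""
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,  # IAM role with read access to the model artifacts
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }

request = build_import_job_request(
    "llama31-70b-import",
    "llama31-70b-finetuned",
    "arn:aws:iam::123456789012:role/BedrockImportRole",
    "s3://my-model-bucket/llama31-70b-finetuned/",
)

# With AWS credentials configured, the job would be started like this:
# import boto3
# bedrock = boto3.client("bedrock")
# bedrock.create_model_import_job(**request)
```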

Sample workflows demonstrate spinning up Jupyter notebooks for data prep, training across 4 nodes (32 GPUs), monitoring GPU metrics, and then migrating models into Bedrock’s inference playground for live testing.

“DGX Cloud on AWS is optimized for faster time to train at every layer of the full stack platform to deliver productivity from day one,” NVIDIA stated.

Developers can try fine-tuning with datasets like daring-anteater for instruction tuning. Imported models support encryption with customer-managed keys through AWS KMS (Key Management Service).
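Enabling KMS encryption is a one-field addition to the import-job request. A sketch, assuming the `importedModelKmsKeyId` parameter of the Bedrock import API; the key ARN is a placeholder:

```python
# Sketch: attach a customer-managed KMS key to an import-job request so the
# imported model artifacts are encrypted with it. Key ARN is hypothetical.

def with_kms_key(request: dict, kms_key_arn: str) -> dict:
    """Return a copy of an import-job request with KMS encryption enabled."""
    encrypted = dict(request)  # leave the original request untouched
    encrypted["importedModelKmsKeyId"] = kms_key_arn
    return encrypted

base = {
    "jobName": "llama31-70b-import",
    "importedModelName": "llama31-70b-finetuned",
}
secured = with_kms_key(
    base, "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID"
)
```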

Cleanup instructions detail how to delete imported models and KMS keys to avoid extra charges.
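A minimal sketch of that cleanup step, assuming the Bedrock `delete_imported_model` and KMS `schedule_key_deletion` operations; the model name and key ID are placeholders (note that KMS only schedules deletion, with a mandatory waiting period):

```python
# Sketch of cleanup: remove the imported model, then schedule the KMS key
# for deletion. Identifiers below are hypothetical placeholders.

def cleanup(bedrock, kms, model_id: str, kms_key_id: str) -> None:
    """Delete the imported Bedrock model and schedule its KMS key for deletion."""
    bedrock.delete_imported_model(modelIdentifier=model_id)
    # KMS enforces a pending window (7-30 days) before the key is destroyed.
    kms.schedule_key_deletion(KeyId=kms_key_id, PendingWindowInDays=7)

# Usage with real clients (requires AWS credentials):
# import boto3
# cleanup(boto3.client("bedrock"), boto3.client("kms"),
#         "llama31-70b-finetuned", "EXAMPLE-KEY-ID")
```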

The toolchain is ready now for enterprises aiming to speed up AI innovation while minimizing ops overhead.

For more NVIDIA DGX Cloud workflows, see the dgxc-benchmarking GitHub repo.
