Amazon SageMaker Unified Studio Notebooks now support EMR Serverless

Amazon Web Services has added support for Amazon EMR Serverless with Apache Spark Connect to SageMaker Unified Studio Notebooks. The integration gives data engineers and analysts a second Spark runtime for interactive analytics and data engineering workloads, alongside the existing Amazon Athena Spark option, so users can choose an engine based on their requirements and workload characteristics.

With the new feature, users can run PySpark and Spark SQL directly in notebook cells, with EMR Serverless as the underlying compute engine. The runtime is selected from the notebook's side panel, and the chosen engine applies to both Python and SQL cells. The integration also works with SageMaker Data Agent, an AI-powered assistant that generates code and execution plans from natural language prompts, potentially accelerating development of Spark-based data processing tasks.

The enhancement includes enterprise-focused capabilities: pre-initialized capacity for faster session startup, unified Spark UI monitoring across all supported engines, and VPC connectivity for network-isolated workloads. The feature is available in all AWS Regions that support SageMaker Unified Studio, and it works with both SageMaker Unified Studio notebooks and JupyterLab IDE environments.
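To make the Spark Connect model concrete, here is a minimal sketch of how a generic PySpark client attaches to a Spark Connect endpoint and then mixes the DataFrame API with Spark SQL. It is not taken from the release. In a Unified Studio notebook the session is presumably provisioned by the runtime chosen in the side panel, so the host name, the helper names `spark_connect_url` and `run_example`, and the explicit connection step are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import quote


def spark_connect_url(host: str, port: int = 15002, token: Optional[str] = None) -> str:
    """Build a Spark Connect connection string of the form sc://host:port/;key=value.

    Hypothetical helper: host and token are placeholders, not values
    published for EMR Serverless.
    """
    url = f"sc://{host}:{port}/"
    if token is not None:
        # "token" is one of the parameters the Spark Connect connection string accepts.
        url += f";token={quote(token, safe='')}"
    return url


def run_example(url: str):
    """Run the same small computation through the DataFrame API and Spark SQL."""
    # Imported lazily so the URL helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    # Attach to the remote Spark Connect endpoint (requires PySpark 3.4 or later).
    spark = SparkSession.builder.remote(url).getOrCreate()

    # PySpark DataFrame API: a single-column table holding 0..4.
    numbers = spark.range(5).withColumnRenamed("id", "n")
    numbers.createOrReplaceTempView("numbers")

    # Spark SQL against the same session and the same data.
    return spark.sql("SELECT sum(n) AS total FROM numbers").collect()
```

Calling `run_example(spark_connect_url("spark.example.internal"))` would only succeed against a reachable Spark Connect server. The point of the sketch is that the Python and SQL steps share one remote session, which matches the release's statement that the selected engine applies to both cell types.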

Why It Matters

This expansion addresses a common challenge in enterprise data engineering: the need for runtime flexibility within a single development environment. By offering more than one Spark engine, AWS lets organizations match each workload to the runtime that best fits its performance, cost, and feature requirements, rather than being locked into a single option. AI-assisted code generation could shorten development time for complex Spark workloads, while enterprise features such as VPC support and unified monitoring address production deployment needs.

Note

This summary is generated using AI analysis of the original press release. Always refer to the original source for complete details.