
Amazon EMR now supports Apache Spark 4.0.2 in general availability

Amazon Web Services has announced the general availability of Apache Spark 4.0.2 across all Amazon EMR deployment models. The release introduces ANSI SQL support and the VARIANT data type for JSON and other semi-structured data, so users who know standard SQL can do data engineering work without learning Spark-specific syntax. It also enables fine-grained access control (FGAC) at the row and column level for tables registered with AWS Lake Formation.

On the governance side, the release adds support for the Apache Iceberg v3 table format, which according to the announcement provides stronger transaction guarantees and data lineage tracking for regulatory audit trails. Streaming in Spark 4.0.2 gains improved controls for complex stateful operations and better monitoring, which the announcement ties to faster deployment of real-time applications such as fraud detection and personalization. The upgrade is available in all regions where EMR operates, and AWS provides an Apache Spark upgrade agent to help existing users migrate their applications.
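To make the VARIANT and ANSI SQL features more concrete, here is a minimal PySpark sketch. It is not taken from the AWS announcement: the sample event, its field names, the `events` view, and the `run_demo` helper are all hypothetical. It assumes an environment with PySpark 4.0 or later, where `parse_json`, `variant_get`, `try_variant_get`, and `try_cast` are available as SQL functions.

```python
# Hypothetical sample event; the field names are illustrative only.
SAMPLE_EVENT = '{"user": {"id": 42, "tier": "gold"}, "amount": 19.99}'

# parse_json turns a JSON string into a VARIANT value; variant_get extracts a
# typed field by path. try_variant_get yields NULL instead of raising an error
# when the value cannot be cast to the requested type.
VARIANT_QUERY = """
SELECT
  variant_get(v, '$.user.id', 'int')       AS user_id,
  variant_get(v, '$.user.tier', 'string')  AS tier,
  try_variant_get(v, '$.amount', 'double') AS amount
FROM (SELECT parse_json(raw) AS v FROM events)
"""

# Under ANSI mode an invalid CAST raises an error rather than silently
# returning NULL; try_cast is the explicit way to ask for NULL on failure.
ANSI_QUERY = "SELECT try_cast('abc' AS INT) AS safe_value"


def run_demo():
    """Run both queries against a Spark session (assumes PySpark 4.0+)."""
    # Imported lazily so the constants above can be inspected without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("variant-sketch").getOrCreate()
    # Register the raw JSON string as a one-row temporary view named "events".
    spark.createDataFrame([(SAMPLE_EVENT,)], ["raw"]).createOrReplaceTempView("events")
    spark.sql(VARIANT_QUERY).show()
    spark.sql(ANSI_QUERY).show()
    spark.stop()
```

Calling `run_demo()` on a cluster or local session running Spark 4.0 or later should print one row of extracted fields, followed by a NULL for the failed cast. How the job is submitted to EMR is outside the scope of this sketch.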

Why It Matters

This release makes big data processing on Amazon EMR more accessible and easier to govern. ANSI SQL support lowers the barrier to entry for developers who already know SQL, while fine-grained access control and Iceberg v3 support address growing enterprise governance requirements. The streaming improvements matter most to organizations doing real-time analytics, as businesses increasingly depend on immediate data insights for critical operations such as fraud prevention and customer personalization.

Note

This summary is generated using AI analysis of the original press release. Always refer to the original source for complete details.