Google Spanner gets a columnar engine to unite OLTP and OLAP workloads

Wednesday, August 6, 2025, 12:15, by InfoWorld

Google has updated Spanner, its managed database service, with a new columnar engine to help enterprises run complex analytical queries on real-time transactional data for better decision-making.

The update, currently in preview, will tackle a typical challenge that most enterprises face: running a database that can handle both online transaction processing (OLTP) and online analytical processing (OLAP) without increasing operational overhead.

While databases designed for OLTP workloads, such as Spanner, excel at rapid-fire, high-volume transactions using row-oriented storage, OLAP workloads demand sweeping scans and aggregations across vast datasets, which are usually handled by separate columnar data warehouses such as Amazon Redshift, BigQuery, and Snowflake.

To bridge the gap between the two types of databases, enterprises typically have to transfer data periodically from one to the other, which results in stale data, complex ETL pipelines, and higher operational overhead.

Columnar storage is key

Spanner’s new columnar engine combines transactional and analytical processing without adding overhead by storing data in a columnar format alongside the existing row-oriented storage, according to Google.

Columnar storage typically offers several advantages for analytical workloads, including less time spent on input/output (I/O) operations, improved compression, and efficient scanning of columns.
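
To illustrate the compression point: values in a single column share a type and often repeat, so simple schemes can shrink them considerably. The following is a minimal Python sketch of run-length encoding, one common technique for columnar data. It is not taken from Google's announcement, which does not say which encodings Spanner uses; the `region_column` data and the function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse each run of consecutive equal values into (value, run_length)."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical "region" column, stored contiguously as a column store would.
region_column = ["EU", "EU", "EU", "US", "US", "EU", "APAC", "APAC", "APAC", "APAC"]

encoded = run_length_encode(region_column)
decoded = run_length_decode(encoded)

print(encoded)
print(len(region_column), "values stored as", len(encoded), "runs")
```

In a row-oriented layout the same region values would be interleaved with order IDs, amounts, and other fields, so runs like these would not sit next to each other.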

“Analytical queries often access only a few columns at a time. With columnar storage, only the relevant columns need to be read from disk, significantly reducing I/O operations,” Google explained in a blog post.

Columnar storage also boosts performance on scans, allowing consecutive values to be processed in bulk, it added.
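
The I/O argument can be sketched in a few lines of Python. The example below is a toy model, not Spanner's storage format: the table, its column names, and the "values read" counter are all assumptions made for illustration. It sums one column from a row layout and from a column layout and counts how many stored values each approach has to touch.

```python
# Hypothetical orders table; names and values are illustrative only.
rows = [
    {"order_id": 1, "customer": "a", "region": "EU", "amount": 10.0},
    {"order_id": 2, "customer": "b", "region": "US", "amount": 25.5},
    {"order_id": 3, "customer": "c", "region": "EU", "amount": 4.5},
]


def to_columns(rows):
    """Pivot a list of row dicts into one contiguous list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(rows, column):
    """Row layout: each whole row is fetched to reach a single field."""
    total, values_read = 0.0, 0
    for row in rows:
        values_read += len(row)  # every field of the row is read
        total += row[column]
    return total, values_read


def sum_column_store(columns, column):
    """Column layout: only the requested column is fetched."""
    values = columns[column]
    return sum(values), len(values)


columns = to_columns(rows)
row_total, row_reads = sum_row_store(rows, "amount")
col_total, col_reads = sum_column_store(columns, "amount")

print("row store:   ", row_total, "after reading", row_reads, "values")
print("column store:", col_total, "after reading", col_reads, "values")
```

Both layouts return the same total, but the row layout reads all four fields of every row while the column layout reads only the `amount` values. The gap widens as tables gain more columns.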

To further boost performance and improve CPU utilization, Google has integrated the new columnar engine with Spanner’s existing vectorized execution capabilities.

“While traditional query engines process data tuple-by-tuple (row by row), a vectorized engine processes data in batches (vectors) of rows,” Google explained, adding that this optimises memory access.
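
The contrast Google describes can be sketched as follows. This is a simplified Python illustration, not Spanner's execution engine: the function names, the `batch_size` value, and the `amounts` data are hypothetical. Pure Python also does not get the CPU-level gains of a real vectorized engine, so the sketch only shows the change in control flow, where the operator is invoked once per batch instead of once per row.

```python
from itertools import islice


def filter_sum_tuple_at_a_time(amounts, threshold):
    """Tuple-at-a-time: the filter-and-sum operator runs once per row."""
    total, operator_calls = 0.0, 0
    for amount in amounts:
        operator_calls += 1
        if amount > threshold:
            total += amount
    return total, operator_calls


def batches(values, batch_size):
    """Yield consecutive slices (vectors) of at most batch_size values."""
    iterator = iter(values)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def filter_sum_vectorized(amounts, threshold, batch_size=4):
    """Batch-at-a-time: the operator runs once per vector of rows."""
    total, operator_calls = 0.0, 0
    for batch in batches(amounts, batch_size):
        operator_calls += 1
        total += sum(value for value in batch if value > threshold)
    return total, operator_calls


# Hypothetical "amount" column.
amounts = [5.0, 12.0, 7.5, 30.0, 1.0, 18.0, 9.0, 22.0, 3.0, 40.0]

tuple_total, tuple_calls = filter_sum_tuple_at_a_time(amounts, 8.0)
vector_total, vector_calls = filter_sum_vectorized(amounts, 8.0, batch_size=4)

print("tuple-at-a-time:", tuple_total, "in", tuple_calls, "operator calls")
print("vectorized:     ", vector_total, "in", vector_calls, "operator calls")
```

Batching pairs naturally with columnar storage, because a batch of values from one column is already contiguous in memory.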

Columnar engine to aid integration with BigQuery

The columnar engine is also expected to help enterprises integrate Spanner with BigQuery more easily, the hyperscaler said.

Typically, if an enterprise wanted to run large, complex analyses in BigQuery on live data stored in Spanner, it would have to spend considerable time managing data pipelines, and the queries would put extra load on Spanner’s main systems.

In contrast, the new columnar engine, when combined with Spanner’s Data Boost feature, can handle these complex queries much faster without slowing down day-to-day transactions, Google said.

“You (enterprises) can get the best of both worlds — Spanner’s transactional consistency and BigQuery’s analytical prowess — without the need for complex ETL pipelines to duplicate data,” the hyperscaler explained.

However, Google isn’t the only database and data warehouse provider that is looking to offer both OLTP and OLAP.

While AWS has been blending OLTP and OLAP capabilities across Aurora and Redshift, Microsoft offers Azure Cosmos DB with integrated analytical features. Snowflake, too, has added transactional workloads to its analytics-first platform.

In the open-source camp, databases like Apache Doris, ClickHouse, and MariaDB’s ColumnStore are also moving toward hybrid processing. Enterprises can also extend PostgreSQL with Citus or Timescale for the same purpose, and Google’s own PostgreSQL-based AlloyDB offers a columnar engine as well.
