navigation

Account

In your account you can view the status of your application, save incomplete applications and view current news and events

enEnglish

deGerman

February 21, 2024

Spanner's Distributed Nature

Data science

What is the article about?

Joining tables in Google Cloud Spanner poses is challenging due to its distributed setup. At OTTO, where Google Cloud Spanner is integral to multiple services, achieving optimal join performance is crucial to ensure a great customer experience.

Spanner differs from typical SQL databases – it's highly distributed, requiring careful consideration when crafting join queries. Additionally, it doesn't fully adhere to the SQL standard. This article explores the unique aspects of joins in Spanner and offers insights into optimizing their performance.

Spanner’s Distributed Nature

Spanner is a distributed database system. This architecture has implications for all executed queries, especially those covering multiple tables, such as joins.

Firstly, everything in Spanner can be thought of as a network call. Simplified: when you select columns, Spanner makes a network call to the server where the table resides, executes the query, and then transfers the data to you. Thus, if you join two tables, there's a high likelihood that those tables are stored on different servers. This scenario can result in a significant number of network calls. Given that networks are inherently unreliable, this can directly impact performance. You can gauge this by examining your query's execution plan. If it contains distributed operators, Spanner has to make several network calls to execute it. The fewer the network calls, the better the performance.

Let's examine how to minimize distributed operators in the first place.

Avoid Joins by Accident

Every time you use a secondary index and query for columns that are not in the index, Spanner does a back-join to get the missing columns. You can think of the index as a table that has the indexed column as a Key and the primary key of the original table as a Column. Spanner then uses a join between the index table and the original table to return the query result. If you only think about the index, this is counterintuitive.

There are three simple ways to mitigate the problem:

Split your query into two. First, use the secondary index to query for the IDs, then use the IDs to query the original table.
Use Stored Columns. By doing this, Spanner adds the Columns to the Index Table as not indexed fields, and therefore does not need to do a back-join on them.
Add the columns to the index itself.

Every solution has its advantages and disadvantages. The first solution must be done on the application side but is very flexible. You do not have to alter the index at all.

The second solution – my preferred one – uses more disk space as the columns are effectively duplicated.

The third way has the same downsides as the second one, yet for each insert, the index must be updated on multiple columns, which is often slower than just storing the columns.

Design for Joins – Interleaved Tables

If you anticipate that certain tables will be frequently joined, for example, customers and shipping addresses, consider using interleaved tables in Spanner. When two tables are interleaved, Spanner stores them closely together, reducing the need for network calls.

However, this approach is only viable for genuine parent-child relationships. If you interleave table A and B, you cannot insert into table B without a corresponding parent in table A; thus, interleaved tables aren't suitable for every situation.

As this article focuses on the general join use case, we will not delve deeper into the topic of interleaved tables. For more details, please refer to the official documentation.

Join Methods in Spanner

Most of the time, we do not think about what happens if we type something like:

select A.*, B.Foo from A join B on A.B_ID = B.ID

And to be honest, this is great because it reduces the cognitive load while using a database. However, as mentioned above, if you need a performant query, you have to invest some time and figure out what works best.

In the default scenario, Spanner’s query optimizer will choose the appropriate join type for your query, based on statistics collected in the background. However, if you add a new join, maybe on new tables, the optimizer does not have enough knowledge and you may want to enforce one join method to keep the performance up. You can do this with join hints:

select A.*, B.Foo from A join@{JOIN_METHOD=apply_join} B on A.B_ID = B.ID

We will take a closer look at different join types in Spanner right now. For illustrative purposes, we will use this join on the given schema:

SELECT c.FirstName, o.ProductName
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID;

CREATE TABLE Customers (
CustomerID INT64 NOT NULL,
FirstName STRING(255),
Email STRING(255)
) PRIMARY KEY (CustomerID);

CREATE TABLE Orders (
OrderID INT64 NOT NULL,
CustomerID INT64 NOT NULL,
ProductName STRING(255)
) PRIMARY KEY (OrderID);

Hash Join

Hash Joins are a good choice if you want to join two large tables. A Hash Join runs in two phases.

First, it takes the “build” table (the table that has fewer rows, in our case, it's Customers) and creates an in-memory hash table where the hash key is derived from the join. In our scenario, the hash key will be based on CustomerID.

After that, the other table (the “probe” table, Orders in our case) is scanned. For each row in Orders, the hash key of CustomerID is calculated and then searched in the Hash Table built previously. If there is a match, the row will be in the result set.

You can enforce a Hash Join in Spanner in two ways:

select A.*, B.Foo from A hash join B on A.B_ID = B.ID

select A.*, B.Foo from A join@{JOIN_METHOD=hash_join} B on A.B_ID = B.ID

You can provide additional join hints for Hash Joins:

HASH_JOIN_BUILD_SIDE with values BUILD_LEFT and BUILD_RIGHT to decide which table should be used for the build side. By default, the first table (the one after from) will be used, which is equivalent to BUILD_LEFT.
HASH_JOIN_EXECUTION with MULTI_PASS (default) and SINGLE_PASS. This is important if you reach the memory limit while building the Hash Table – SINGLE_PASS writes the Hash Table to disk.

Hash Joins come with their advantages and disadvantages.

Advantages

Hash Joins are very efficient for large data sets, especially if you expect that the join result will be large as well.
Hashing and probing the hash table are typically very fast.

Disadvantages

The Join performs a full table scan on the probe table. This is especially inefficient if only small parts of the tables are required for the join.
If the hash table does not fit into memory, it must be partially written and read from the disk, which is significantly slower.
Indexes on the join columns are not used! Thus, it does not make sense to create indexes on CustomerID and use Hash Joins simultaneously.

You can improve the performance of a Hash Join by reducing the size of the build table with an appropriate WHERE condition. Adding an index to the columns in the WHERE condition will speed up the creation of the build table, but does not affect the underlying join.

Hash Joins are useful for large ad hoc joins and data analysis but may not be the best choice to fetch the orders for a single customer.

Apply Join

Apply Joins, also known as Nested Loop Joins, work as follows: For each row on the left side (outer loop), the join condition is checked against every row on the right side (inner loop). For many developers, nested loops can be concerning due to their inherent inefficiency and quadratic runtime.

Apply Joins are effective when both tables are small or can be significantly reduced using WHERE conditions. For instance, a nested loop involving 4 rows on the left side and 5 rows on the right side is manageable

Apply Joins can also be optimized for larger tables. If the right side has an index on the join column, Spanner only needs to loop over the left side and can use the index to efficiently fetch the rows from the right side. When the left side is narrowed down via an appropriate WHERE condition (ideally backed up by an index), Apply Joins can efficiently handle larger tables. In some scenarios, they can be more efficient than Hash Joins, especially when a small subset of both tables is required.

You can enforce an Apply Join via:

select A.*, B.Foo from A join@{JOIN_METHOD=apply_join} B on A.B_ID = B.ID

Merge Join

A Merge Join takes both sides of the join and sorts them by the join columns. Spanner then reads the beginning of both sorted tables and compares them based on the join condition. If they match, the merged rows are added to the result set. If the join column of one table is smaller than the other, that row is skipped, and the next row from that table is read. If you need another mental model, think of Python's zip function, which works similarly (yet different).

A Merge Join is efficient if the data is already sorted, or can be quickly sorted using an index on the join columns.

Similar to Hash Joins, a Merge Join performs a full table scan on both tables. Therefore, the downsides of both join types are alike. The performance comes from the fact that neither hashing nor looping is needed with the sorted tables. If you only need a small subset of both tables, consider an Apply Join first.

It's worth noting that a Merge Join is never choosen by the query optimizer by default, so you'll need to enforce it with:

select A.*, B.Foo from A join@{JOIN_METHOD=merge_join} B on A.B_ID = B.ID

In day to day work, Merge Joins are not likely to be your first choice, but it is important to keep in mind that they exist.

Push Broadcast Hash Join

A Push Broadcast Hash Join acts as a distributed version of a Hash Join. Although the merge logic remains the same as the standard Hash Koin, in this method, the build side is serialized into a batch and dispatched to all remote servers holding data from the probe side. Once received, the batch is deserialized and the Hash Join logic takes over. So, when is this approach preferable to a regular Hash Join?

Push Broadcast Hash Joins facilitate parallel processing of the join, potentially resulting in swifter execution times. They tend to be more efficient when the build side is significantly smaller than the probe side. However, if both sides are of comparable size, transmitting the hash table to all the other servers might introduce a network overhead that isn't counteracted by faster query execution. Another consideration is data distribution. For instance, if 80% of the probe side data resides on one server and the remaining 20% is dispersed across multiple servers, the advantages might be negligible.

It's evident that this join type caters to specific scenarios, much like the merge join. Consequently, it isn't selected by the query optimizer by default. To enable it, use:

select A.*, B.Foo from A join@{JOIN_METHOD=push_broadcast_hash_join} B on A.B_ID = B.ID

Key Takeaways

Now that we got to know the different join methods in Spanner, let us summarize how to improve join performance.

Trust the Query Optimizer: Let the query optimizer do its job first. Only optimize when you experience bottlenecks.
Reduce Left Side: Try to reduce the left side of a join. Choose the smallest table as the left side if possible and reduce its size with a WHERE condition. Apply an index to the WHERE condition if suitable.
Use Stored Columns: Utilize Stored Columns to minimize the number of back-joins required, especially when dealing with secondary indexes.
Understand Data Distribution: Be aware of how your data is distributed across servers. Avoid join methods that might introduce unnecessary network overhead due to data distribution.
Consider Indexes on Join Columns: Especially for Apply Joins and Merge Joins, having indexes on the join columns can significantly speed up the join operation.
Monitor and Analyze Execution Plans: Regularly check the execution plans of your queries. Look for distributed operators or signs of full table scans, as they can indicate potential performance bottlenecks.
Data Volume and Query Frequency: The efficiency of join operations can be affected by the volume of data in your tables and the frequency of your queries. Large tables can slow down certain joins, especially if they're not optimized. If specific join queries are frequently executed, consider optimizing them regularly.

Thank you for exploring join optimizations in Spanner with me!

Want to become part of the team?

Jobs

15 people like this.

0No comments yet.

Write a comment

Answer to: Reply directly to the topic

Written by

Mirco Franzek

Backend Developer

About the author

We want to improve out content with your feedback.

How interesting is this blogpost?

We have received your feedback.

Allow cookies?

OTTO and four partners need your consent (click on "OK") for individual data uses in order to store and/or retrieve information on your device (IP address, user ID, browser information).
Data is used for personalized ads and content, ad and content measurement, and to gain insights about target groups and product development. More information on consent can be found here at any time. You can refuse your consent at any time by clicking on the link "refuse cookies".

Data uses

OTTO works with partners who also process data retrieved from your end device (tracking data) for their own purposes (e.g. profiling) / for the purposes of third parties. Against this background, not only the collection of tracking data, but also its further processing by these providers requires consent. The tracking data will only be collected when you click on the "OK" button in the banner on otto.de. The partners are the following companies:
Google Ireland Limited, Meta Platforms Ireland Limited, LinkedIn Ireland Unlimited Company, TikTok Information Technologies UK Limited
For more information on the data processing by these partners, please see the privacy policy at otto.de/jobs. The information can also be accessed via a link in the banner.
You can also withdraw your consent at any time without giving any reason by clicking on the button 'Cookie Settings' in the footer of the website and 'Refuse Cookies'.

more information