Streamlining Time Series Data Analysis using Snowflake's ASOF Join

Streamlining Time Series Data Analysis using Snowflake's ASOF Join

Dealing with time series data is a common challenge for data engineers, especially when joining two tables where dates or timestamps do not align perfectly. For instance, Table 1 may have dates missing in Table 2 and vice versa.

Typically, this problem is addressed by generating a table that includes dates for every day and then joining this table with both Table 1 and Table 2 to obtain a daily view. Alternatively, one might perform a left join between Table 1 and Table 2 and then forward fill the null values with the last known value. Both methods can be cumbersome.

However, Snowflake has introduced a new feature to tackle this problem efficiently—the 'ASOF' join. This feature joins two time series tables and fills in null values with the last known value, streamlining the process significantly.

Enough of the theory, let's deep dive into a example:

Table 1: Contains exchange rates on specific dates.

DateExchange Rate
2023-04-0193
2023-04-0394
2023-04-0692
2023-04-0891
2023-04-1090

Table 2: Contains payment amounts on specific days.

The typical approach to calculate the exchanged amount, even for dates missing exchange rates (assuming the rate stays the same until updated), involves complex SQL queries:

WITH all_dates AS (
    SELECT 
        p.date,
        p.payment_amount,
        e.exchange_rate
    FROM payments p
    LEFT JOIN exchange e ON p.date = e.date
),
filled_exchange_rates AS (
    SELECT
        date,
        payment_amount,
        -- Use LAST_VALUE to carry forward the non-null values and IGNORE NULLS
        LAST_VALUE(exchange_rate) IGNORE NULLS OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS exchange_rate
    FROM all_dates
)

SELECT 
    date,
    payment_amount,
    exchange_rate,
    payment_amount * exchange_rate AS exchanged_amount
FROM filled_exchange_rates
ORDER BY date;

Result:

Streamlining with Snowflake's ASOF Join

Snowflake's ASOF join can simplify these calculations significantly:

select P.date, P.payment_amount, 
  P.payment_amount * E.exchange_rate as exchanged_amount
from PAYMENTS P
ASOF join exchange E
  MATCH_CONDITION (P.date >= E.date);

Result:

Explanation

  • ASOF Join Condition: Instead of a simple LEFT JOIN, an ASOF JOIN is specified, allowing for a time-sensitive match.

  • MATCHING CONDITIONS Clause: This clause is specific to Snowflake's ASOF join syntax. It specifies that each payment should be aligned with the last known exchange rate on or before the payment date, ensuring that the exchange rate applied is the latest available up to that day.

  • Column Selection and Calculation: The query selects dates and amounts from both tables and calculates the exchanged amount by multiplying the payment amount by the applicable exchange rate.

For a more in-depth explanation and additional examples, you can refer to the following links: