I am learning Snowflake right now, and the way FLATTEN() works is a bit counter intuitive.
A simple query.
SELECT
raw:first_name::STRING AS FName,
raw:last_name::STRING AS LName,
f.value::STRING AS Skill
FROM tbl,
TABLE(FLATTEN(raw:Skills)) f
ORDER BY raw:id::INT
LIMIT 5;
Elegantly, it flattens the Skills
array and returns this.
FNAME | LNAME | SKILL |
---|---|---|
Flossy | Fasson | PS3 |
Flossy | Fasson | Vlookup |
Flossy | Fasson | Go |
Celeste | Hubert | Tcl-Tk |
Celeste | Hubert | Zines |
My questions are:
-
Does Snowflake just infer that the source of
raw:Skills
is tabletbl
, as table name is not explicitly expressed here? -
How does Snowflake align the array (as the right table) to the left table? It looks like a Cross Join with no join keys, if so, every array should be joined to each and every row on the left and result in incorrect alignment.
2
Answers
Related: Lateral Join
the first question is same "inference" as when you use a column name but do not use table name or the alias, thus these are all the same:
now when this gets tricky is if two tables have the same column name, you must say which one you want
if they both have an
id
this will not compile.In some databases if you join on id between two tables:
this is valid, but not in Snowflake.
the FLATTEN is like a LEFT JOIN every input row is match to every expanded row, and both sources and the flatten objects (and you can do many FLATTEN and it just combinations, aka cross joins all the way down).
the comma method you have show is the "old SQL style" and the new style would be a CROSS JOIN, but LATERAL is also valid as that is more akin to what is happening. Snowflake support mean dialects from other RMDBs, to make porting SQL easier, but it does not cover all cases.