I have a JSON file like this:
{
  "col1": 5905,
  "col2": "4540",
  "col3": "23",
  "obj1": {
    "col4": 5905,
    "col5": 3100000000000,
    "col6": null
  },
  "obj2": [
    {
      "col7": null,
      "col8": 1
    },
    {
      "col7": null,
      "col8": 2
    }
  ],
  "obj3": {
    "col9": 2611937,
    "col10": "ABCD",
    "obj4": {
      "col11": "1-1PT-515",
      "col12": "XXXXXXX"
    },
    "col13": 152
  }
}
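For reference, the flat table I'm after from the sample above would have one row per element of `obj2`, with every `colX` as a column, something like:

```
col_1 | col_2 | col_3 | col_4 | col_5         | col_6 | col_7 | col_8 | col_9   | col_10 | col_11    | col_12  | col_13
5905  | 4540  | 23    | 5905  | 3100000000000 | null  | null  | 1     | 2611937 | ABCD   | 1-1PT-515 | XXXXXXX | 152
5905  | 4540  | 23    | 5905  | 3100000000000 | null  | null  | 2     | 2611937 | ABCD   | 1-1PT-515 | XXXXXXX | 152
```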
I'm not an expert on JSON formatting, so I wrote this script in Snowflake to flatten the file into a columnar table, with each "colX" as a column and each value as a new row, but the performance is really bad:
SELECT
     tab1.col1::VARCHAR        AS col_1
    ,tab1.col2::VARCHAR        AS col_2
    ,tab1.col3::VARCHAR        AS col_3
    ,sub1.this:col4::VARCHAR   AS col_4
    ,sub1.this:col5::VARCHAR   AS col_5
    ,sub1.this:col6::VARCHAR   AS col_6
    ,sub2.this:col7::VARCHAR   AS col_7
    ,sub2.this:col8::VARCHAR   AS col_8
    ,sub3.this:col9::VARCHAR   AS col_9
    ,sub3.this:col10::VARCHAR  AS col_10
    ,sub4.this:col11::VARCHAR  AS col_11
    ,sub4.this:col12::VARCHAR  AS col_12
    ,sub3.this:col13::VARCHAR  AS col_13
FROM DB.SCHEMA.TABLE_1 tab1
    ,LATERAL FLATTEN(input => obj1) sub1
    ,LATERAL FLATTEN(input => obj2) sub2
    ,LATERAL FLATTEN(input => obj3) sub3
    ,LATERAL FLATTEN(input => sub3.this:obj4) sub4
GROUP BY ALL;
I had to add a `GROUP BY ALL` at the end because the query generates thousands of duplicate rows: for example, from 4,500 input rows it produces over 260k rows, repeating each row multiple times.
Any thoughts on how to approach this? I really hope there's a better way to do this, because it's taking a long time!
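For what it's worth, I wondered whether I only need to FLATTEN the `obj2` array and can read the single nested objects by path instead of flattening them too (a sketch, assuming `obj1`, `obj3` and `obj4` are always plain objects and never arrays), so the query produces exactly one row per array element and no `GROUP BY ALL` is needed:

```sql
SELECT
     tab1.col1::VARCHAR              AS col_1
    ,tab1.col2::VARCHAR              AS col_2
    ,tab1.col3::VARCHAR              AS col_3
    -- nested objects read directly by path, no FLATTEN needed
    ,tab1.obj1:col4::VARCHAR         AS col_4
    ,tab1.obj1:col5::VARCHAR         AS col_5
    ,tab1.obj1:col6::VARCHAR         AS col_6
    -- sub2.value is the current array element of obj2
    ,sub2.value:col7::VARCHAR        AS col_7
    ,sub2.value:col8::VARCHAR        AS col_8
    ,tab1.obj3:col9::VARCHAR         AS col_9
    ,tab1.obj3:col10::VARCHAR        AS col_10
    ,tab1.obj3:obj4:col11::VARCHAR   AS col_11
    ,tab1.obj3:obj4:col12::VARCHAR   AS col_12
    ,tab1.obj3:col13::VARCHAR        AS col_13
FROM DB.SCHEMA.TABLE_1 tab1
    -- only the array needs LATERAL FLATTEN
    ,LATERAL FLATTEN(input => tab1.obj2) sub2;
```

Would something like this be the right direction?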
Thanks