Alessandro Mangone
01/25/2023, 4:00 PMrow_number()
window function.
When using the row_number() > 1
condition, I get the error that windows and aggregations are not supported by DeltaTable’s delete.
I tried rewriting that as a subquery, but I get the error that subqueries are not supported by delete.
Is collecting the unwanted values and use Column.isin(unwantedValues)
my only option? Are there any alternatives I might consider?Nick Karpov
01/25/2023, 4:53 PMAlessandro Mangone
01/25/2023, 5:00 PMrow_number()
function.
What I would like to achieve is periodically removing from the table old record versions (using a row_number()
in this case as well), so that our data grows but in a controlled way.Matthew Powers
01/25/2023, 5:06 PMrow_number() > 1
logic in the drop_duplicates_pkey
function in mack: https://github.com/MrPowers/mack/blob/main/mack/__init__.py#L246. You mind find that code useful. cc: @Robert KossendeyAlessandro Mangone
01/25/2023, 5:07 PMLennart Skogmo
01/25/2023, 5:15 PMdf.dropDuplicates("col1", "col2")
Nick Karpov
01/25/2023, 5:16 PMAlessandro Mangone
01/25/2023, 5:16 PMNick Karpov
01/25/2023, 5:18 PMAlessandro Mangone
01/25/2023, 5:18 PMLennart Skogmo
01/25/2023, 5:22 PMMERGE INTO pbc.mrt_leave t
USING
(
SELECT
member_id,
COUNT(*) AS order_cnt,
SUM(CASE WHEN order_status = 'Shipped' THEN 1 ELSE 0 END) AS shipped_order_cnt,
SUM(CASE WHEN order_status = 'Cancelled' THEN 1 ELSE 0 END) AS cancelled_order_cnt,
SUM(CASE WHEN order_status = 'Shipped' THEN price + discount ELSE 0 END) AS revenue
FROM pbc.arc_order
GROUP BY member_id
)
s
ON t.member_id = s.member_id
WHEN MATCHED THEN UPDATE SET
t.order_cnt = s.order_cnt,
t.shipped_order_cnt = s.shipped_order_cnt,
t.cancelled_order_cnt = s.cancelled_order_cnt,
t.revenue = s.revenue
-- Untested delete merge
MERGE INTO table t
USING
(
SELECT id, date, row_number() OVER (PARTITION BY id, date DESC) AS rank
FROM table
)
s
ON t.id = s.id AND t.date = s.date AND s.rank > 1
WHEN MATCHED THEN DELETE
Alessandro Mangone
01/25/2023, 5:23 PMMatthew Powers
01/25/2023, 5:24 PMAlessandro Mangone
01/25/2023, 5:25 PM