Morgan
02/06/2023, 10:59 AM
Scott Sandre (Delta Lake)
02/06/2023, 4:57 PM
logRetentionDuration refers to how long "log" files (the .json commit files) are kept in the Delta log.
However, I can't just write a 5.json and then immediately clean it up (delete it). We have to wait until we do the next checkpoint (e.g. 10.checkpoint.parquet) before we can clean up previous logs.
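To make the rule above concrete, here is a minimal sketch in Python. It is not Delta Lake's actual cleanup code: the function name `removable_commits` is hypothetical, the file names use the short form from this thread (real log files are zero-padded), and it deliberately ignores the retention-time check and multi-part checkpoints. It only shows the ordering constraint, assuming that a commit becomes eligible once a checkpoint with a higher version exists.

```python
import re

def removable_commits(file_names):
    """Return the .json commit files that come strictly before the newest checkpoint.

    Simplified illustration only: retention timestamps are ignored.
    """
    commits, checkpoints = [], []
    for name in file_names:
        m = re.fullmatch(r"(\d+)\.(json|checkpoint\.parquet)", name)
        if not m:
            continue  # skip anything that is not a commit or a single-file checkpoint
        version = int(m.group(1))
        if m.group(2) == "json":
            commits.append((version, name))
        else:
            checkpoints.append((version, name))
    if not checkpoints:
        return []  # no checkpoint yet, so no commit file can be cleaned up
    latest_checkpoint = max(version for version, _ in checkpoints)
    return [name for version, name in sorted(commits) if version < latest_checkpoint]

# A lone 5.json cannot be removed: there is no checkpoint after it.
print(removable_commits(["5.json"]))
# Once 10.checkpoint.parquet exists, earlier commits become candidates.
print(removable_commits(["9.json", "10.json", "10.checkpoint.parquet", "11.json"]))
```

With no checkpoint the function returns an empty list, which matches the point being made here: a retention of 0 days does not by itself make a freshly written commit file disappear.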
So - are you doing enough transactions to perform a checkpoint?
Morgan
02/06/2023, 5:22 PM
{
  "protocol": {
    "minReaderVersion": 1,
    "minWriterVersion": 2
  }
}
{
  "metaData": {
    "id": "9824d1d0-ccac-4cec-a9c1-4bb174179b33",
    "format": {
      "provider": "parquet",
      "options": {}
    },
    "schemaString": "{\"type\":\"struct\",\"fields\":[]}",
    "partitionColumns": [],
    "configuration": {
      "delta.deletedFileRetentionDuration": "interval 0 days",
      "delta.appendOnly": "true",
      "delta.logRetentionDuration": "interval 0 days",
      "delta.dataSkippingNumIndexedCols": "0",
      "delta.checkpointRetentionDuration": "0 days"
    },
    "createdTime": 1675678319032
  }
}
{
  "commitInfo": {
    "timestamp": 1675678319533,
    "operation": "CREATE TABLE",
    "operationParameters": {
      "isManaged": "false",
      "description": null,
      "partitionBy": "[]",
      "properties": "{\"delta.deletedFileRetentionDuration\":\"interval 0 days\",\"delta.appendOnly\":\"true\",\"delta.logRetentionDuration\":\"interval 0 days\",\"delta.dataSkippingNumIndexedCols\":\"0\",\"delta.checkpointRetentionDuration\":\"0 days\"}"
    },
    "isolationLevel": "Serializable",
    "isBlindAppend": true,
    "operationMetrics": {},
    "engineInfo": "Apache-Spark/3.3.1 Delta-Lake/2.1.1",
    "txnId": "86d2a833-d470-4d11-b4dd-1cdeac1be011"
  }
}
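For anyone who wants to check which retention settings a table was created with, the commit shown above can be read programmatically. The following is a rough sketch in Python, assuming the commit file stores one JSON action per line (the pretty-printed form above is for readability). The helper name `table_configuration` is hypothetical, and the sample lines are trimmed copies of the actions posted above.

```python
import json

def table_configuration(commit_lines):
    """Return the `configuration` map of the first metaData action, or {} if none."""
    for line in commit_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        action = json.loads(line)
        if "metaData" in action:
            return action["metaData"].get("configuration", {})
    return {}

# Trimmed version of the commit posted above, one action per line.
sample = [
    '{"protocol": {"minReaderVersion": 1, "minWriterVersion": 2}}',
    '{"metaData": {"configuration": {"delta.logRetentionDuration": "interval 0 days", '
    '"delta.checkpointRetentionDuration": "0 days"}}}',
]

config = table_configuration(sample)
print(config["delta.logRetentionDuration"])         # interval 0 days
print(config["delta.checkpointRetentionDuration"])  # 0 days
```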
Scott Sandre (Delta Lake)
02/07/2023, 5:07 PM
Morgan
03/13/2023, 7:38 AM