Performance improvements on Delta tables

In one scenario, a query on a Delta table that joined one or more other Delta tables was taking a long time to run.
The reason was that the tables consisted of a large number of small Parquet files.
The Auto Optimize feature can avoid creating this many small files and compact the ones that are written.

For a Delta table, we can set the table properties delta.autoOptimize.optimizeWrite = true and delta.autoOptimize.autoCompact = true in the CREATE TABLE command.
Optimized writes reduce the number of small files produced by each write, and auto compaction merges the remaining small files after a write completes.
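
As a rough sketch of how these properties fit into a CREATE TABLE statement, the Python snippet below builds the SQL text. The table name sales_orders, its columns, and the helper build_create_table_sql are hypothetical and used only for illustration. The final spark.sql call is left as a comment because it assumes a Databricks notebook where spark is already defined.

```python
# Auto Optimize table properties named in the post.
AUTO_OPTIMIZE_PROPERTIES = {
    "delta.autoOptimize.optimizeWrite": "true",
    "delta.autoOptimize.autoCompact": "true",
}


def build_create_table_sql(table_name, columns, properties=AUTO_OPTIMIZE_PROPERTIES):
    """Return a CREATE TABLE ... USING DELTA statement with TBLPROPERTIES.

    Hypothetical helper for illustration; it only formats a SQL string.
    """
    column_sql = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    props_sql = ", ".join(f"{key} = {value}" for key, value in properties.items())
    return (
        f"CREATE TABLE {table_name} ({column_sql}) "
        f"USING DELTA "
        f"TBLPROPERTIES ({props_sql})"
    )


# Hypothetical table and columns, chosen only for the example.
statement = build_create_table_sql(
    "sales_orders",
    [("order_id", "BIGINT"), ("amount", "DOUBLE")],
)
print(statement)

# In a Databricks notebook (assumption: `spark` is predefined there):
# spark.sql(statement)
```

The generated statement can also be pasted directly into a SQL cell instead of being run through spark.sql.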

Reference:
https://docs.microsoft.com/en-us/azure/databricks/delta/optimizations/auto-optimize

