Performance improvements on Delta tables

In one scenario, a query that joined one or more Delta tables was taking a long time to run.
The reason was that the tables were made up of a large number of small Parquet files.
The Auto Optimize feature makes it possible to avoid creating so many small files and to compact the ones that are written.

For a Delta table, we can set the table properties delta.autoOptimize.optimizeWrite = true and delta.autoOptimize.autoCompact = true in the CREATE TABLE command.
With these properties set, writes to the table produce fewer small files, and the small files that are still written get compacted.
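
As a rough sketch of what such a CREATE TABLE command could look like, the Python snippet below builds the statement as a string. The two property names come from this post; the table name `sales_orders`, its columns, and the helper `build_create_table_sql` are hypothetical and only there for illustration.

```python
# Property names are the ones mentioned in the post.
AUTO_OPTIMIZE_PROPERTIES = {
    "delta.autoOptimize.optimizeWrite": "true",
    "delta.autoOptimize.autoCompact": "true",
}


def build_create_table_sql(table_name, columns, properties=AUTO_OPTIMIZE_PROPERTIES):
    """Return a CREATE TABLE ... USING DELTA statement with the given table properties.

    Hypothetical helper: `columns` is a list of (name, type) pairs.
    """
    column_defs = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    property_defs = ", ".join(f"{key} = {value}" for key, value in properties.items())
    return (
        f"CREATE TABLE {table_name} ({column_defs}) "
        f"USING DELTA "
        f"TBLPROPERTIES ({property_defs})"
    )


# Hypothetical table name and columns, for illustration only.
create_sql = build_create_table_sql(
    "sales_orders", [("order_id", "BIGINT"), ("amount", "DOUBLE")]
)
print(create_sql)

# In a Databricks notebook, where a `spark` session is already defined,
# the statement could then be run with:
# spark.sql(create_sql)
```

For a table that already exists, the same two properties can be applied with ALTER TABLE ... SET TBLPROPERTIES instead of recreating the table.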

Reference:
https://docs.microsoft.com/en-us/azure/databricks/delta/optimizations/auto-optimize

