Delta Lake Performance

Databricks Performance: Fixing the Small File Problem with Delta Lake

A common Databricks performance problem we see in enterprise data lakes are that of the “Small Files” issue.  One of our customers is a great example – we ingest 0.5TB of JSON and CSV data per day made of 5kb files which equates to millions of files a week in the data lake Raw zone. …

Databricks Performance: Fixing the Small File Problem with Delta Lake Read More »