Deep dive into building a Data Lakehouse with Delta Lake and Spark

Watch the recorded webinar that dives deep into building a data lakehouse with Delta Lake and Spark: https://youtu.be/mrHfdeH6az0 A big thank you to Jonathan Neo for a great presentation at the October Sydney Databricks meetup. Here’s what was covered: What makes up the Lakehouse architecture […]

Things You Wish You Had Known Earlier About Databricks Performance

Databricks August Meetup - Databricks Performance Tuning with Gin Jia

A big thank you to Jixin Jia (Gin), Databricks Solution Architect, for a brilliant presentation, one that I personally found very interesting and from which I learned a few things I didn’t know. Watch the online video recording to learn more about how to improve Databricks performance! To pique your interest, here are some of the topics covered: 1. […]

Databricks Performance: Fixing the Small File Problem with Delta Lake

Delta Lake Performance

A common Databricks performance problem we see in enterprise data lakes is the “Small Files” issue. One of our customers is a great example – we ingest 0.5 TB of JSON and CSV data per day, made up of 5 KB files, which equates to millions of files a week in the data lake Raw zone. […]
