Enterprise Governance on Data Lake with Unity Catalog & Databricks Implementation with R Jobs Migration Use Case

A big thank you to Vinny Vijeyakumaar from Databricks and Deon Jacobs from Data-Driven for a great presentation at the first in-person Sydney Databricks meetup on July 14, 2022. What was covered: Enterprise Governance on Data Lake with Unity Catalog […]
Deep dive into building a Data Lakehouse with Delta Lake and Spark

Watch the recorded webinar that takes a deep dive into building a data lakehouse with Delta Lake and Spark. https://youtu.be/mrHfdeH6az0 A big thank you to Jonathan Neo for a great presentation at the October Sydney Databricks meetup. Here’s what was covered: What makes up the Lakehouse architecture […]
Things You Wish You Had Known Earlier About Databricks Performance

A big thank you to Jixin Jia (Gin), Databricks Solution Architect, for a brilliant presentation, one that I personally found very interesting and from which I learned a few things I didn’t know. Watch the online video recording to learn more about how to improve Databricks performance! To pique your interest, here are some of the topics covered: 1. […]
Databricks Performance: Fixing the Small File Problem with Delta Lake

A common Databricks performance problem we see in enterprise data lakes is the “Small Files” issue. One of our customers is a great example: we ingest 0.5 TB of JSON and CSV data per day, made up of 5 KB files, which equates to millions of files a week in the data lake Raw zone. […]