Delta Lake

Enterprise Governance on Data Lake with Unity Catalog & Databricks Implementation with R Jobs Migration Use Case

Marketing and Sales Director @Data-Driven. I am a passionate professional who strives to deliver excellent customer service, solve problems using modern technology and deliver practical solutions under resource and time constraints.
Sofia Oropeza

A big thank you to Vinny Vijeyakumaar from Databricks and Deon Jacobs from Data-Driven for a great presentation at the first in-person Sydney Databricks meetup on July 14, 2022. What was covered: Enterprise Governance on Data Lake with Unity Catalog …

Deep dive into building a Data Lakehouse with Delta Lake and Spark

Sofia Oropeza

Watch the recorded webinar, which takes a deep dive into building a data lakehouse with Delta Lake and Spark: https://youtu.be/mrHfdeH6az0. A big thank you to Jonathan Neo for a great presentation at the October Sydney Databricks meetup. Here’s what was covered: What makes up the Lakehouse architecture …

Enabling Self-Service Analytics & ML at Transport for NSW with Databricks

Sofia Oropeza

Sydney Databricks Meetup – December 2020. In this business-focused Databricks Meetup, Shelby Ferson (Sr. Manager ANZ – Databricks), Sandeep Mathur (Program Manager – TfNSW) and Rodney Joyce (Practice Director – Data-Driven) discuss how Transport for NSW (TfNSW) leveraged Databricks to enable Self-Service Analytics and …

Databricks and Data-Driven Announce Strategic New Partnership

Sofia Oropeza

Sydney, Australia – September 17, 2020 – Data-Driven AI has announced a strategic partnership with Databricks, the leader in Unified Data Analytics, to deliver Azure Databricks solutions to its customers. This partnership combines Databricks’ simplified approach to data science and analytics with Data-Driven’s precision engineering and consulting services to enable smarter and better outcomes for clients. The business …

Databricks Performance: Fixing the Small File Problem with Delta Lake

Anjana Rupasinghege is the Technical Director and Lead Architect at Data-Driven, specialising in Cloud, Security, Data and Analytics.

With a background in Azure modern data architecture, he has over 15 years of experience working in Information Technology in industries such as Government, Banking, Telecommunication and Consulting.

Anjana Rupasinghege

A common Databricks performance problem we see in enterprise data lakes is the “Small Files” issue. One of our customers is a great example: we ingest 0.5 TB of JSON and CSV data per day, made up of 5 KB files, which equates to millions of files a week in the data lake Raw zone. …
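To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes) and treats every file as exactly 5 KB, which is a simplification of the figures quoted above. The 128 MB target size is a hypothetical value chosen purely for comparison and is not taken from the post.

```python
# Back-of-the-envelope file counts for the ingestion volume quoted above.
# Assumptions (not from the post): decimal units, uniform file sizes,
# and a hypothetical 128 MB target size for comparison only.

KB = 10**3
MB = 10**6
TB = 10**12


def files_needed(total_bytes: int, file_bytes: int) -> int:
    """Number of files of size file_bytes needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // file_bytes)  # integer ceiling division


DAILY_BYTES = TB // 2          # 0.5 TB ingested per day
SMALL_FILE_BYTES = 5 * KB      # 5 KB per source file
TARGET_FILE_BYTES = 128 * MB   # hypothetical larger file size

daily_small_files = files_needed(DAILY_BYTES, SMALL_FILE_BYTES)
weekly_small_files = daily_small_files * 7
daily_large_files = files_needed(DAILY_BYTES, TARGET_FILE_BYTES)

print(f"Small files per day:  {daily_small_files:,}")
print(f"Small files per week: {weekly_small_files:,}")
print(f"Files per day at the hypothetical target size: {daily_large_files:,}")
```

Under these assumptions the same daily volume needs several orders of magnitude fewer files at the larger size, which is why file count rather than data volume tends to be the bottleneck.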
