Spark

Databricks Performance: Fixing the Small File Problem with Delta Lake

Anjana Rupasinghege is the Technical Director and Lead
Architect at Data Driven, specialised in Cloud, Security, Data
and Analytics.

With a background in Azure modern data architecture, he
has over 15 years of experience working in Information
Technology in industries such as Government, Banking,
Telecommunication and Consulting.

Avatar of Anjana Rupasinghege
Latest posts by Anjana Rupasinghege (see all)

A common Databricks performance problem we see in enterprise data lakes are that of the “Small Files” issue.  One of our customers is a great example – we ingest 0.5TB …

Databricks Performance: Fixing the Small File Problem with Delta Lake Read More »

Delta Lake Performance

Data Science for Dummies – Data Engineering with Titanic dataset + Databricks + Python (Tech Talk 3 of 9)

Azure-certified Data Architect with a focus on delivering business value and guiding customers through the maze of analytical architectures, design and implementation activities.

Experienced in setting up modern data platforms with advanced predictive analytic workloads. Brings strong people skills and a devops-centric, entrepreneurial approach to Enterprise software delivery.


Avatar of Rodney Joyce

I put together a tech talk on Machine Learning and Databricks which is the 3rd part of an 9 part Data Science for Dummies series: Data Engineering with Titanic dataset …

Data Science for Dummies – Data Engineering with Titanic dataset + Databricks + Python (Tech Talk 3 of 9) Read More »

Data Science for dummies

Subscribed! We'll let you know when we have new blogs and events...