Learning

Databricks August Meetup - Databricks Performance Tuning with Gin Jia

Things You Wish You Had Known Earlier About Databricks Performance

Azure-certified Data Architect with a focus on delivering business value and guiding customers through the maze of analytical architectures, design and implementation activities.

Experienced in setting up modern data platforms with advanced predictive analytic workloads. Brings strong people skills and a devops-centric, entrepreneurial approach to Enterprise software delivery.


Avatar of Rodney Joyce

A big thank you to Jixin Jia (Gin), Databricks Solution Architect, for a brilliant presentation, one that I personally found very interesting and from which I learned a few things I didn’t know. Watch the online video recording to learn more about how to improve Databricks performance! To pique your interest, here are some of the topics covered: 1. …


Delta Lake Performance

Databricks Performance: Fixing the Small File Problem with Delta Lake

Anjana Rupasinghege is the Technical Director and Lead Architect at Data Driven, specialised in Cloud, Security, Data and Analytics.

With a background in Azure modern data architecture, he has over 15 years of experience working in Information Technology in industries such as Government, Banking, Telecommunication and Consulting.

Avatar of Anjana Rupasinghege

A common Databricks performance problem we see in enterprise data lakes is the “Small Files” issue. One of our customers is a great example: we ingest 0.5 TB of JSON and CSV data per day, made up of 5 KB files, which equates to millions of files a week in the data lake Raw zone. …
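
As a rough back-of-the-envelope check on those numbers, the sketch below estimates the file count implied by the daily volume. It assumes decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes) and that every file is about 5 KB; the function name is ours, purely for illustration.

```python
# Back-of-the-envelope estimate of how many small files a daily ingest produces.
# Assumptions (not from the post): decimal units and a uniform 5 KB file size.

TB = 10**12  # bytes in a decimal terabyte
KB = 10**3   # bytes in a decimal kilobyte


def estimate_file_count(total_bytes: float, avg_file_bytes: float) -> int:
    """Return the approximate number of files for a given data volume."""
    return round(total_bytes / avg_file_bytes)


files_per_day = estimate_file_count(0.5 * TB, 5 * KB)
files_per_week = files_per_day * 7

print(f"{files_per_day:,} files/day, {files_per_week:,} files/week")
```

Under these assumptions the daily volume alone is on the order of a hundred million files. Real counts depend on the actual file-size distribution.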


Model Training Data

Visual Machine Learning with Teachable Machine – Not Happy Baby

Today I came across a site called “Teachable Machine”, made by some smart people and built with TensorFlow.js. It lets you build and train simple models on sound, video and images, and then export your models in various formats to integrate with your own app or system! I have done something similar with Microsoft PowerApps and the AI Builder, but this takes it to a new level.


Data Science for Dummies – Titanic survival prediction with Azure Machine Learning Studio + Kaggle (Tech Talk 2 of 9)

Avatar of Rodney Joyce

Here is the next tech talk in the Data Science for Dummies series I am presenting around Sydney: Part 2 of 9: Titanic survival prediction with Azure Machine Learning Studio + Kaggle, along with the slides. In this session we will use Azure Machine Learning Studio and focus on building the prediction model to get quick …



Data Science for Dummies – Data Science Overview with Databricks (Tech Talk 1 of 9)

Avatar of Rodney Joyce

You might have heard of Spark and how it’s the evolution of Hadoop, great for processing Big Data, but have you heard of Databricks? Here are the slides for the next tech talk in the Data Science for Dummies series I am presenting around Sydney: Part 1 of 9: Data Science Overview with Databricks. Think Spark-as-a-service, …


Data Science for dummies

Data Science for Dummies – Data Engineering with Titanic dataset + Databricks + Python (Tech Talk 3 of 9)

Avatar of Rodney Joyce

I put together a tech talk on Machine Learning and Databricks, which is the 3rd part of a 9-part Data Science for Dummies series: Data Engineering with Titanic dataset + Databricks + Python. Data preparation and feature engineering highlighted the importance of domain knowledge, even with something as simple as a 10-column dataset! It …

