This blog series about Microsoft Azure Synapse will start with a 10,000-foot overview of the data analytics service and take you from what it is and how to get started through to a technical deep-dive to kick the tires. Whether you are a CIO or business decision maker, IT Manager, Data Engineer, Data Scientist, student or merely lost on Google, we have something for you!
By the end of this series you’ll be able to answer some key questions you are probably facing in your organization:
- What is Microsoft Synapse – is it a bird, is it a plane? No, it’s a DataLakeHouse unified data analytics platform.
- Should I use Azure Synapse or not?
- How much will it cost? (to which you will probably hear the answer: “It depends” – if you don’t, please contact us and our consultants will tell you the same, although we’ll also help you set a firm scope and price)
- Is there anything better and will it last longer than the last analytics service we just finished putting in?
We all know that feeling. Technology is moving at breakneck speed and you are the guy or gal who is supposed to know what is happening. And yet, somewhere between work, Netflix, Coronavirus, kids and that thing called life, we suddenly find ourselves hearing buzzwords that we know little about. You’re not getting old (well, technically you are); it’s just that there’s more to know and learn and less time to do it in. It’s ok – that’s the point of a blog series such as this. We’re going to do all the hard work so that you can look smart when your boss asks you about Azure Synapse at the watercooler (or Teams Meeting).
The only way to find out if Azure Synapse is vaporware and something the marketing team cooked up after the Christmas party is to put it through its paces! We’ll be taking a Covid-19 Big Dataset and defining a PoC advanced analytics use-case so we can (hopefully) do some cool stuff with it – a bit of EDA, some ETL/ELT/transformations at scale, Visualisations and of course some good old-fashioned Machine Learning, all on Synapse. We’ll try to find its strengths and expose its weaknesses.
For the decision makers, we’ll be looking at things like:
- What good use-cases for Synapse look like and whether your business can benefit from it
- Why we have yet-another-data-analytics-service (YADAS) to choose from
- Market fit – how does Synapse fit into a crowded market with services like Databricks, Snowflake and others?
- What is its value prop and differentiator, if any?
- What is a DataLakeHouse and do we need it? (and what happened to the data lake and my fancy data warehouse?)
- Other options for achieving the same end-result from a Total Cost of Ownership, Time to Market and Time to Insights point of view
For the Data Engineers, Data Scientists and techies, we’ll deep-dive into:
- How to get started and choose a good advanced analytics use-case
- Find out if it is a good tool for Analysts, Data Engineers AND Data Scientists (i.e. is it a true unified analytics platform?)
- Run some BIG Data (my Outlook unread Inbox) through the Spark engine and compare it to tools like Databricks, HDInsight, IaaS Spark and others
- Do some low-level testing of the Data Warehousing component compared to Snowflake, BigQuery, Redshift, Azure SQL Data Warehouse Gen2 (Oh hang on, that’s Synapse now… wait, what?)
- Do some bog-standard business intelligence reporting
- Do some EDA and Machine Learning on the data to get some predictive insights
- Compare the final end-to-end solution to other Microsoft and non-Microsoft solutions that achieve the same end-result in terms of:
  - Functionality (e.g. real-time streaming and batch, small and big data, machine learning and transformations at scale, etc.)
  - Solution Architecture
  - Complexity
  - Extensibility
  - Cost (at rest, running and under average load)
  - Security
As you can tell, I’m a bullet-point fan. Let’s do one more.
Contents list for the Azure Synapse blog posts:
- Series Overview (you made it this far – 10 points!)
- What is Microsoft Azure Synapse – A unified analytics platform or vaporware?
- Choosing an Advanced Analytics use-case and defining the objective
- How to get started with Azure Synapse Analytics
- Exploratory Data Analysis (EDA) with Notebooks in Azure Synapse
- No-Code, Low-Code and Code approaches for transformations – Dataflows VS ADF in Azure Synapse
- Transformation and performance at scale with Big Data & Spark
- Cosmos Link – HTAP with Synapse (No ETL)
- Data Warehousing with Synapse and SQL
- Machine Learning with Synapse
- Business Intelligence Visualisations with Power BI, PowerApps and Azure Synapse
- End-to-End Demo: Ingestion, Transformation, Staging and Visualization with Microsoft Synapse
- Wrap up – Did we answer all of the questions and objectives?
We’re going to do this in an agile way – so every time we do a blog post we’ll publish it and leave you hanging for the next one (think “Lord of the Rings – Part 1”). It also means our content list is going to be ever-evolving (this is version 0.1) as we think of new talking points or get good questions about Microsoft Synapse to answer.
Follow us on LinkedIn and/or sign up to our Data and AI Blog for the next post on “How to Get Started with Microsoft Synapse”. Let’s make this interactive too – let me know in the comments if I have missed anything or if you have any particular questions you’d like answered (ideally related to Microsoft Synapse).
Experienced in setting up modern data platforms with advanced predictive analytic workloads. Brings strong people skills and a devops-centric, entrepreneurial approach to Enterprise software delivery.