advancing-spark

Welcome to the Advancing Analytics blog. We’re a bunch of data and AI specialists who build modern platforms and make messy data behave.

You'll find a trove of posts and videos here, focused on AI and machine learning, data engineering and analytics. Our blogs are all based on what we’re seeing in the real world, with lessons learned from the sharp end of delivery.

Recent Blogs

Azure CosmosDB for GraphRAG: A Single-Platform Solution for GenAI
Introduction In the evolving landscape of enterprise architecture, organisations often find...
Liquid Clustering 101 - How you should be storing & optimising your...
Liquid Clustering (LC) was designed to replace table partitioning and the ZORDER command, to...
Databricks Delta Cache and Spark Cache
As data sizes and demand increase over time, you often see slowness on Databricks this...
Spark 3.0 Questions and answers from the Data AI Summit
At the Data + AI Summit, Simon delivered a session on “Achieving Lakehouse Models with Spark...
Identifying Data Outliers in Apache Spark 3.0
The secret to getting machine learning to work effectively is in ensuring that the data we are...
Will Koalas replace PySpark?
One of the first of many big announcements at the 2020 Spark and AI Summit was the official...