ARTICLES
A New Ranking Framework for Better Notification Quality on Instagram | 7 min | ML Applications | Xian Sun, Shawn Pachgade, Abhishek Mandal, Christopher Hay, Nimit Desai, Zhuoran Yu | Meta Engineering Blog
Meta introduces a ranking system to improve notification relevance and cut noise, boosting user engagement and satisfaction.
Why Your Team Doesn’t Trust AI | 6 min | AI | Sjoerd Pieksma | Xebia Blog
Xebia explores why AI adoption fails, from data gaps to explainability issues, and how governance and transparency rebuild trust.
Data Engineering Was Hard Until I Learned These 15 System Design Concepts | 9 min | Data Engineering | Akanksha Singh | Personal Blog
A practical guide to system design essentials like sharding, caching, CAP theorem, and queues, tailored for data engineers.
The Two Versions of Parquet | 5 min | Data Infrastructure | Jéronimo Castrillon | Personal Blog
A quick breakdown of Parquet v1 vs v2, what changed, and why it matters for schema evolution and performance tuning.
TUTORIALS
Building Slack’s Anomaly Event Response | 8 min | Data Infrastructure | Nathan Lehotsky, Ryan Persaud | Slack Engineering Blog
Slack details how they automated anomaly detection and alerting to keep systems reliable at scale.
LLM Evaluation: Practical Tips at Booking.com | 7 min | MLOps | George Chouliaras | MLOps Community
Booking.com shares real-world lessons for evaluating LLMs, from dataset design to balancing automated and human reviews.
How We Migrated 1 Billion Records from DB1 to DB2 Without Downtime | 7 min | Data Infrastructure | Himanshu Singour | Personal Blog
A step-by-step look at moving a billion records between databases with phased cutovers and zero downtime.
NEWS
Launch of Polars Cloud and Distributed Polars | 6 min | Data Engineering | Polars Blog
Polars launches a managed cloud platform with distributed support, bringing Rust-powered performance to big data analytics.
PODCAST
Leaders of Code EP#11 with Ryan J. Salva (Google) | 32 min | AI & Developer Experience | Peter O’Connor, Ryan J. Salva | Stack Overflow YouTube
Google’s Ryan Salva discusses how AI tools are reshaping developer experience, DevOps, and collaboration workflows.
DATA TUBE
Building Realtime End to End Sales Forecasting AI from Scratch | 5 h | ML | Yusuf Ganiyu | CodeWithYu
A hands-on project building a production-ready ML pipeline with Astro and Apache Airflow.
PINNACLE PICKS
Your top picks from last week:
Why It’s High Time to Switch from Terraform to OpenTofu | 3 min | DevOps | Nikhil Donthula | KPMG UK Engineering Blog
HashiCorp’s license shift and IBM’s acquisition make Terraform’s future uncertain. OpenTofu, backed by the Linux Foundation, offers a safer, fully open alternative.
OLake | 7 min | Data Engineering | Olake.io
OLake replicates Postgres, MySQL, MongoDB, and Oracle to Apache Iceberg at up to 64K RPS, with CDC, schema discovery, and a lightweight Docker UI.
Starting Power BI Deployment Pipelines from Azure DevOps | 8 min | DevOps | Adrian Chodkowski | Seequality Blog
How to connect Azure DevOps with Power BI deployment pipelines using service principals, extensions, and YAML-based CI/CD.
