ARTICLES
I have built around 300 agents, worked at 5 startups. Here's what I learnt about AI Agent | 8 min | AI Agents | Sai Yashwanth | Personal Blog
Practical takeaways from building agents at five startups, covering prompt design, tool use, memory, and user feedback loops.
ContextClues Graph Builder: Open Source Knowledge Graphs from Messy Docs | 7 min | Data Infrastructure | Eric Lisowski | Personal Blog
An open-source tool that turns unstructured text into knowledge graphs for analytics and retrieval.
7 Free Web Search APIs for AI Agents | 5 min | AI & Tools | Abid Ali Awan | KDnuggets
Quick roundup of free APIs like Tavily, Bing, and Serper to power real-time retrieval in agents.
Engineering Data Governance: From Theory to Production | 6 min | Data Governance | Pavel Kuchin | LinkedIn
A guide to moving governance from policy slides to production systems with lineage, metadata, and role-based controls.
TUTORIALS
Online Feature Store for AI and ML with Apache Kafka and Flink | 8 min | Streaming & AI | Kai Waehner | Personal Blog
Step-by-step on building low-latency feature pipelines for consistent ML training and serving.
Accelerate Data and AI Workflows by Connecting Amazon SageMaker Unified Studio with VS Code | 6 min | MLOps | Lauren Mullennex, Anagha Barve, Anchit Gupta, and Bhargava Varadharajan | AWS Big Data Blog
How to connect SageMaker Unified Studio to VS Code for faster model building and deployment.
TOOLS
An open-source platform that merges streaming and OLAP for real-time analytics with SQL queries and time travel.
A GPU-accelerated vector database built for massive-scale similarity search and RAG pipelines.
PODCAST
Dashboards Must Die, Long Live Dashboards | 1 h 6 min | Data Analytics | Andy Cotgreave, Michael Helbling, Moe Kiss, Tim Wilson | Analytics Power Hour Podcast
Debate on whether dashboards are fading or just evolving as automated insights and new visualization models emerge.
Google’s Ryan Salva discusses how AI tools are reshaping developer experience, DevOps, and collaboration workflows.
DATA TUBE
The Big LLM Architecture Comparison | 1 h 26 min | LLM | Sebastian Raschka | Personal Channel
Deep dive into 11 open-weight models of 2025 with a clear walkthrough of attention, MoE, normalization, and design trade-offs.
PINNACLE PICKS
Your top picks from last week:
Building Slack’s Anomaly Event Response | 8 min | Data Infrastructure | Nathan Lehotsky, Ryan Persaud | Slack Engineering Blog
Slack details how they automated anomaly detection and alerting to keep systems reliable at scale.
Launch of Polars Cloud and Distributed Polars | 6 min | Data Engineering | Polars Blog
Polars launches a managed cloud platform with distributed support, bringing Rust-powered performance to big data analytics.
Data Engineering Was Hard Until I Learned These 15 System Design Concepts | 9 min | Data Engineering | Akanksha Singh | Personal Blog
A practical guide to system design essentials like sharding, caching, CAP theorem, and queues, tailored for data engineers.
________________________
Have any interesting content to share in the DATA Pill newsletter?
