ARTICLES
Data Streaming: The Key to Tackling Data Challenges for AI Success | 5 min | Data Streaming | Lyndon Hedderly | Confluent Blog
Confluent shows why streaming is the backbone of AI, powering continuous training, dynamic RAG, and real-time agents.
How Python 3.14 t-Strings Differ from f-Strings | 3 min | Data Engineering | Stack Overflow
Python 3.14’s t-strings evaluate to a template object that keeps literal text and interpolated values separate, enabling safer interpolation in SQL, HTML, and regex contexts.
Expanding the Hive Ecosystem with Iceberg REST | 4 min | Data Infrastructure | Dmitriy Fingerman | Medium
Hive gains modern table management with Iceberg REST, simplifying hybrid architectures via APIs.
Intelligent Kubernetes Load Balancing at Databricks | 10 min | Platform Engineering | Gaurav Nanda, Vincent Cheng and Rohit Agrawal | Databricks Engineering Blog
Databricks built a custom K8s load balancer that adapts to traffic in real time, cutting costs and boosting reliability.
TUTORIALS
Finetuning LLMs Yourself | 7 min | LLM | Jeroen Overschie | Xebia Blog
Hands-on guide to fine-tuning LLMs with parameter-efficient methods, dataset prep, and evaluation tips.
NEWS
Airbyte v2 | 6 min | Data Integration | Airbyte Blog
Airbyte v2 launches with faster syncs, scalable connectors, and cloud-native orchestration for ELT pipelines.
Apache Airflow 3.1.0 | 6 min | Orchestration | Kaxil Naik | Apache Airflow Blog
Airflow 3.1.0 improves scheduling, observability, async ops, and Kubernetes support for large-scale workflows.
Claude Sonnet 4.5 | 5 min | LLM | Anthropic
Claude Sonnet 4.5 brings faster responses, better reasoning, and broader context for enterprise use cases.
DATA TUBE
AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank) | 41 min | AI | Shreya Shankar, Hugo Bowne-Anderson | Vanishing Gradients
Shreya Shankar explains frameworks and guardrails for running agents and LLM judges at massive scale.
EVENTS, CONFS, AND MEETUPS
P99 CONF 2025 | October 22–23, 2025 | Online + NYC
P99 CONF dives deep into performance, low-latency infra, and distributed systems with talks from top engineers.
PINNACLE PICKS
Your top picks from last week:
MIT Says 95% of AI Pilots Fail. McKinsey Explains Why. Agentic Engineering Shows How to Fix It | 8 min | AI Strategy | Yi Zhou | Personal Blog
Most GenAI pilots stall after demos. McKinsey’s rules for survival and Zhou’s Agentic Engineering framework aim to make AI scalable and sustainable.
Vortex | Data Infrastructure
An open, next-gen columnar file format optimized for analytics and AI, with modern compression and extensibility.
Adding Document Understanding to Claude Code | 7 min | AI Agents | Jerry Liu | LlamaIndex Blog
Practical takeaways from building agents at five startups, covering prompt design, tool use, memory, and user feedback loops.
