DATA Pill feed

DATA Pill #186 – Agent Standards Wars, Real‑Time Pipelines, Free Kafka & DevStral 2. Check what you mis

ARTICLES

8 Best Neptune AI Alternatives to Track Your ML Experiments Better | ZenML | Hamza Tahir | 17 min | MLOps & Experiment Tracking
With Neptune.ai joining OpenAI and winding down its public service, this piece reviews eight experiment‑tracking platforms that can replace and expand on Neptune’s capabilities. The author compares open‑source and commercial alternatives on scalability, developer experience, model registry features and pipeline orchestration, with the goal of covering a complete MLOps stack.
ZenML argues that the same pipeline abstraction that powers offline ML should now underpin agentic workflows. By turning pipelines into persistent HTTP services with warm state and lineage tracking, teams get deterministic retries, observability and instant rollbacks. The article contends that as AI apps evolve from single‑step calls to multi‑step orchestration, pipelines — not ad‑hoc web frameworks — provide the right execution boundary
Claude Code Is Coming to Slack — and It’s a Bigger Deal Than It Sounds | TechCrunch | Rebecca Bellan | 6 min | AI Productivity
Anthropic’s Claude Code is being integrated directly into Slack, turning the chat platform into a full coding environment. Instead of only generating snippets, developers can tag @Claude to spin up complete coding sessions using contextual information from their threads; the assistant analyzes recent messages to pick the right repository, then posts progress updates and opens pull requests.
It’s Just Free Kafka in the Cloud | Aiven | Filip Yonov & Gemma Minihan | 5 min | Streaming & Cloud
Aiven has launched a fully managed, production‑grade Apache Kafka service that’s free forever — no credit card required. The free tier allows up to 250 kb/s throughput, three‑day retention and includes Schema Registry and REST proxy
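To put those limits in perspective, here is a rough back‑of‑the‑envelope sketch of how much data the free tier could hold if a producer wrote at the cap for the whole retention window. It assumes "250 kb/s" means 250 kilobytes (250,000 bytes) per second of sustained ingress; the summary does not say whether the figure is kilobits or kilobytes, or whether it covers reads as well, so treat the result as an upper‑bound estimate. The function and constant names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_retained_bytes(throughput_bytes_per_s: int, retention_days: int) -> int:
    """Upper bound on stored bytes when writing at full rate for the whole window."""
    return throughput_bytes_per_s * retention_days * SECONDS_PER_DAY


# Assumption: "250 kb/s" is read as 250 kilobytes (250_000 bytes) per second.
FREE_TIER_THROUGHPUT_BYTES_PER_S = 250_000
FREE_TIER_RETENTION_DAYS = 3

if __name__ == "__main__":
    total = max_retained_bytes(FREE_TIER_THROUGHPUT_BYTES_PER_S, FREE_TIER_RETENTION_DAYS)
    print(f"{total / 1e9:.1f} GB")  # prints "64.8 GB"
```

Under these assumptions the ceiling is about 65 GB in flight, which is plenty for prototypes and small event streams but not for high‑volume production traffic.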
We Spent 2 Years Building a Data Mesh — It Was a $4M Disaster | Medium | Amįń | 9 min | Data Leadership
In this candid postmortem, a data engineer recounts how a two‑year, $4 million data‑mesh initiative turned into 47 competing fiefdoms. Despite following best practices and even consulting Zhamak Dehghani, the team ended up with worse data quality, unused “data products” and missed deadlines; they are now abandoning the mesh in favor of a centralized platform

NEWS

OpenAI, Anthropic & Block Form the Agentic AI Foundation (AAIF)
American AI heavyweights OpenAI, Anthropic and Block have co‑founded the Agentic AI Foundation to create open standards for autonomous agents. Each company donated a core project to seed the foundation: Anthropic’s Model Context Protocol (MCP), OpenAI’s AGENTS.md and Block’s Goose. The foundation is hosted under the Linux Foundation.

TUTORIALS & BOOKS

SSO with Authentik | Xebia | Jetze Schuurmans | 9 min
This step‑by‑step guide shows how to deploy Authentik, an open‑source identity provider, as a single sign‑on system for a Kubernetes cluster. It explains the OpenID Connect authentication flow and provides Helm/Kustomize snippets for integrating Authentik with K3s, Sealed Secrets and Argo CD
Quickstart: Run an MCP Server on JVM and Integrate with Copilot | Allegro Tech Blog | Tomasz Gryl | 7 min
Allegro’s tutorial walks through building a Model Context Protocol (MCP) server in Kotlin using Spring AI and making it available to GitHub Copilot. It introduces MCP’s communication protocols and demonstrates creating a custom MCP tool and exposing it via a Streamable‑HTTP server
Cloud Native Geospatial Analytics with Apache Sedona (Book) | Paweł Tokaj, Jia Yu, Mo Sarwat | Wherobots & O’Reilly
This open‑source book and repository provide hands‑on examples for building geospatial analytics pipelines with Apache Sedona. Chapters cover data loading, spatial SQL on points/lines/polygons, raster processing, PyData and Airflow integrations, and building a geospatial lakehouse with Parquet and Iceberg.

DATA TUBE

This session explores the shift from monolithic models to agentic systems, focusing on how developers should structure orchestration layers, evaluation harnesses and safety guardrails for autonomous agents. Examples cover multi‑agent planning, tool selection and error handling.
A live demo showing how to augment streaming data pipelines with generative AI and multi‑agent logic. The speaker builds an end‑to‑end pipeline that uses streaming context to feed real‑time agents, illustrating state management, observability and deployment patterns.

TOOL

Concurrent is a free desktop application that lets you benchmark over a thousand AI models across 30+ providers and local runtimes. You type a prompt and the tool sends it to multiple models, then displays responses side‑by‑side with metrics for quality, speed (tokens per second) and cost.
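The fan‑out pattern behind this kind of comparison can be sketched in a few lines. This is not Concurrent's implementation: the stub models, the whitespace "tokenizer" and every function name below are hypothetical stand‑ins, used only to show one prompt being sent to several models in parallel and a tokens‑per‑second figure being computed for each reply.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput metric: generated tokens divided by wall-clock seconds."""
    return token_count / elapsed_s if elapsed_s > 0 else 0.0


def run_one(name: str, model: Callable[[str], str], prompt: str) -> Dict[str, object]:
    """Call one model and record its reply, token count and speed."""
    start = time.perf_counter()
    reply = model(prompt)
    elapsed = time.perf_counter() - start
    # Whitespace splitting is a crude stand-in for a real tokenizer.
    n_tokens = len(reply.split())
    return {
        "model": name,
        "reply": reply,
        "tokens": n_tokens,
        "tokens_per_s": tokens_per_second(n_tokens, elapsed),
    }


def compare(prompt: str, models: Dict[str, Callable[[str], str]]) -> List[Dict[str, object]]:
    """Send the same prompt to every model in parallel; keep results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = [pool.submit(run_one, name, fn, prompt) for name, fn in models.items()]
        return [f.result() for f in futures]


# Hypothetical stand-ins for real provider clients.
STUB_MODELS: Dict[str, Callable[[str], str]] = {
    "echo": lambda p: p,
    "shout": lambda p: p.upper() + " !",
}

if __name__ == "__main__":
    for row in compare("hello streaming world", STUB_MODELS):
        print(row["model"], row["tokens"], row["reply"])
```

Swapping the stubs for real client calls would keep the same shape; quality and cost scoring, which the tool also reports, are left out of this sketch.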
Mistral unveiled DevStral 2, a family of code‑generation models built on a shared architecture. The flagship 123B model targets high‑end agentic development, while the 24B “Small” version runs offline on a single GPU or even a laptop. Both offer 256K context windows and top‑tier SWE‑Bench results, with the small model outperforming many 70B systems. The accompanying Vibe CLI is a terminal‑native interface for real‑time project orchestration, designed to let developers plan, code and review within their shell.

CONFS, EVENTS, WEBINARS AND MEETUPS

From Chaos to Control: MLOps in 2025 | JFrog Webinar | Dec 12, 2025
A practical workshop on stabilizing ML pipelines, packaging models, managing dependencies and reducing drift. Includes demos on automating model lifecycle workflows with artifact repositories.

PINNACLE PICKS

Your top picks from last week:
Talk to Your Data Model: Introducing the Power BI Modeling MCP | pbidax | Jeffrey Wang | 7 min | BI Tools
On day one of Microsoft Ignite 2025, the Power BI team introduced powerbi‑modeling‑mcp, their first public Model Context Protocol (MCP) server. Built on the same APIs (TOM for metadata and ADOMD.NET for querying) that underpin Analysis Services and Power BI, it allows users to create and maintain models using natural language. The semantic interface supports synonyms, batch updates and transaction control. New capabilities include multi‑model orchestration, headless TMDL editing and cross‑platform independence
A comprehensive tutorial on constructing a self-healing, agentic data pipeline. CodeWithYu demonstrates multi-agent orchestration for extraction, transformation and monitoring, highlighting automated error handling and continuous optimisation.
_____________________
Have any interesting content to share in the DATA Pill newsletter?
2025-12-11 23:51