DATA Pill feed

DATA Pill #151 - My data governance framework, From MCP to MOA — Model-Oriented Architecture

SURVEY

We’re proud to be a community partner of the Data & AI Monitor 2025! Share your perspective on the evolving world of data & AI by joining this quick 5-minute survey on the latest trends, tools, and technologies.

📊 Get the full Data & AI Monitor 2025 report for free, packed with expert insights — launching May/June.

🎁 Plus, enter to win great prizes like LEGO sets, AI books, Data Expo 2025 tickets, and Amazon gift cards!

Your personal details aren’t required — but sharing them means you’ll get the report straight to your inbox and a chance to win.

ARTICLES

My data governance framework | 13 min | Data Governance | Willem Koenders | ZS Associates Blog
Willem Koenders shares a practical framework built from a decade of experience. It covers strategy, roles, capabilities, and how to embed governance into day-to-day operations.
Agentic AI Will Make Things Worse Before It Makes Them Better | 7 min | AI | Eric Sammer | Decodable Blog
Many AI projects fail from poor context and weak data pipelines. This piece outlines how to build real-time, context-aware AI systems that actually deliver value.
From MCP to MOA — Model-Oriented Architecture | 4 min | LLM | Jing Ge | Personal Blog
MCP helps connect LLMs with services—laying the groundwork for Model-Oriented Architecture (MOA). A concise look at why LLM-native design is becoming essential.
Flash: A Next-gen Vectorized Stream Processing Engine Compatible with Apache Flink | 8 min | Stream Processing | Wang Feng | Alibaba Cloud Blog
Alibaba Cloud’s new Flash engine brings 5–10x faster stream processing while staying Flink-compatible. Learn how it works, where it’s used, and why it matters.

TUTORIALS

Mastering Spark: The Art and Science of Table Compaction | 20 min | Data Engineering | Miles Cole | Personal Blog
A benchmark of compaction strategies in Delta Lake on Fabric Spark. Learn why Auto Compaction + Optimized Write offers the best long-term performance.
The Gemini API and the Internet of Things | 3 min | AI | Paul Ruiz | Google for Developers Blog
Use Gemini API to turn ESP32-based devices into voice-controlled smart tools. A clean example of bringing AI and function calls to the edge.

PODCAST

Fraud Networks | 42 min | Data Science | Kyle Polich, Bavo De Cock Campo | Data Skeptic Podcast
Explore how social network analytics and BiRank are used to detect fraud rings in insurance. Featuring Bavo De Cock Campo and the iFraud simulator.

DATA TUBE

Model serving 101 with FastAPI & LitServe | 37 min | ML | Marcin Zabłocki | MOPS Community
Get practical with model deployment using FastAPI and LitServe. Covers serving basics, pitfalls, and how to streamline ML workflows for production.

CONFS, EVENTS AND MEETUPS

MOPS - Meetup #7 | Warsaw | 8th April
Talks on LLM deployment at scale, optimizing ReAct, and budget MLOps, plus networking and pizza. Hosted at Allegro’s Warsaw HQ.

PINNACLE PICKS

Your top picks from last week:
Parquet pruning in DataFusion | 3 min | Data Engineering | Xiangpeng Hao | Personal Blog
DataFusion boosts query performance with Parquet pruning, cutting I/O and processing time through optimized data retrieval.
UniLink: Your Universal “Tableflow” for Kafka—At Your Fingertips | 15 min | Data Streaming | Sijie Guo | StreamNative Blog
UniLink offers seamless real-time data replication between Kafka clusters and data lakehouses, without vendor lock-in.
Timeplus Proton is a lightweight stream processing engine that leverages ClickHouse to manage multi-stream JOINs and incremental materialized views.
________________________
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
2025-04-03 13:49