Uber explains how Apache Hudi underpins its lakehouse by providing ACID transactions, schema evolution and efficient upserts at massive scale. The post details ingestion patterns, compaction strategies and Hudi’s evolution to handle trillions of records, enabling low‑latency analytics without rewriting entire datasets.
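The record-level upsert semantics the post describes can be sketched in plain Python (a conceptual illustration only, not Hudi's API): incoming records merge into the table by record key, with the newer commit timestamp winning, which is what Hudi does at table scale without rewriting unaffected files. The field names `id`, `ts`, and `fare` are made up for the example.

```python
def upsert(existing, incoming, key="id", ts="ts"):
    """Merge incoming records into existing ones by key; newest ts wins."""
    merged = {r[key]: r for r in existing}
    for r in incoming:
        current = merged.get(r[key])
        # Keep the incoming record only if it is at least as recent.
        if current is None or r[ts] >= current[ts]:
            merged[r[key]] = r
    return sorted(merged.values(), key=lambda r: r[key])

table = [{"id": 1, "ts": 10, "fare": 12.5}, {"id": 2, "ts": 10, "fare": 8.0}]
batch = [{"id": 2, "ts": 20, "fare": 9.5}, {"id": 3, "ts": 20, "fare": 4.0}]
print(upsert(table, batch))
# record 2 is updated in place, record 3 is inserted, record 1 is untouched
```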
The author argues that neither a pure data mesh nor a fully centralised warehouse fits enterprise realities. She proposes a hybrid hub‑and‑spoke pattern where domain teams own their data products but connect through a shared integration layer and platform guardrails. This preserves a single source of truth while still allowing local autonomy.
Chaudhary tackles the logit bottleneck—the final layer in LLMs where the model multiplies its hidden state by a massive output matrix—showing that it dominates memory usage. By fusing the linear projection and cross‑entropy loss into a single Triton GPU kernel, the fused kernel reduces peak memory by ~84%, unlocking larger batch sizes and longer contexts.
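The memory-saving idea can be sketched in NumPy (this is an illustration of the principle, not the article's Triton kernel): compute the output projection and cross-entropy over the vocabulary in chunks with an online log-sum-exp, so the full `[batch, vocab]` logit matrix is never materialised. All sizes below are arbitrary.

```python
import numpy as np

def chunked_cross_entropy(hidden, W, targets, chunk=1024):
    """Mean NLL without materialising the full [batch, vocab] logits."""
    batch, vocab = hidden.shape[0], W.shape[1]
    m = np.full(batch, -np.inf)   # running max per row
    s = np.zeros(batch)           # running sum of exp(logit - m) per row
    z_tgt = np.zeros(batch)       # logit of each row's target token
    for start in range(0, vocab, chunk):
        logits = hidden @ W[:, start:start + chunk]        # [batch, chunk]
        m_new = np.maximum(m, logits.max(axis=1))
        # Online log-sum-exp update: rescale old sum, add new chunk.
        s = s * np.exp(m - m_new) + np.exp(logits - m_new[:, None]).sum(axis=1)
        m = m_new
        idx = np.where((targets >= start) & (targets < start + chunk))[0]
        z_tgt[idx] = logits[idx, targets[idx] - start]
    return (m + np.log(s) - z_tgt).mean()

rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 64))
W = rng.standard_normal((64, 5000))
targets = rng.integers(0, 5000, size=4)

# Reference: naive loss with the full logit matrix materialised.
logits = hidden @ W
lse = np.log(np.exp(logits - logits.max(1, keepdims=True)).sum(1)) + logits.max(1)
naive = (lse - logits[np.arange(4), targets]).mean()
print(np.allclose(chunked_cross_entropy(hidden, W, targets), naive))  # True
```

The fused kernel goes further by also computing gradients in the same pass, but the chunked log-sum-exp above is the core reason peak memory drops.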
With MinIO pivoting away from open source, Moffatt evaluates alternatives for a simple, S3‑compatible storage layer. He lists essential criteria—Docker support, S3 API compatibility, simplicity and active community—and surveys options like SeaweedFS, Cloudflare R2, Zenko and others, comparing trade‑offs.
AWS announces a physically and logically separate cloud located entirely within the EU, designed to meet sovereignty and data‑residency requirements. The European Sovereign Cloud will expand with new Local Zones in Belgium, the Netherlands and Portugal, and AWS plans to invest over €7.8 billion in Germany.
A short course for software engineers covering three themes: continual learning with agents.md, token maxxing—designing tasks around token budgets—and parallel workflows using multiple agents.
This updated Airflow manual teaches readers how to build reliable, scalable pipelines with the latest Airflow features. Authors Bas Harenslak and Julian de Ruiter draw on extensive engineering experience and contributions to Airflow’s codebase.
CloudWork is a research preview of a collaborative workspace built on Claude Code. Available to Claude Max subscribers, it lets users give Claude file‑level access to read, edit and create documents on their machine, queue multiple tasks for execution and maintain safety via explicit permissions and sandboxing.
A recorded walkthrough of key themes shaping AI in 2026: agentic systems, evaluation, observability, and the shift from models to end-to-end AI system design.
Xebia webinar on how Model Context Protocol (MCP), spec‑driven development and conversational data interfaces enable AI‑assisted data‑lakehouse engineering.