Backfilling training data can be slow and clunky. Pinterest built a faster pipeline using Iceberg and Ray—cutting down on delays and letting teams ship better features, faster.
Instagram grew its rec system to 1,000+ ML models with tools like a model registry and auto-launch infra—keeping launches fast and model quality high.
Going distroless, using multi-stage builds, and running as non-root helped trim a Node.js Docker image from 380MB to 60MB. Cleaner, faster, and more secure.
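A minimal sketch of what that combination can look like, assuming a typical Node.js project layout with a `dist/` build output (image tags, paths, and scripts here are illustrative, not taken from the article):

```dockerfile
# Stage 1: build with the full Node.js image (toolchain included)
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

# Stage 2: ship only the runtime artifacts on a distroless base
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
# Distroless images ship a dedicated non-root user
USER nonroot
CMD ["dist/index.js"]
```

The size win comes from the second stage: the compilers, npm cache, and dev dependencies never leave stage one.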
The Model Context Protocol (MCP) could be the standard for how AI agents talk to tools and APIs. Still early, but full of potential.
Four LLMs tried designing a crypto exchange architecture. Results? Grok stood out, but humans still hold the edge—especially those who use AI well.
Build a system that reads and queries legal contracts using a graph-based RAG setup. Great example of structured retrieval for messy data.
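To make the graph-retrieval idea concrete, here is a toy sketch in Python (all clause text, node names, and links are illustrative, not from the article): match a query to a clause node by keyword overlap, then pull in cross-referenced clauses as extra context before handing everything to the LLM.

```python
import re
from collections import defaultdict

# Toy "contract graph": clauses are nodes, cross-references are edges.
clauses = {
    "termination": "Either party may terminate with 30 days written notice.",
    "notice": "Notices must be sent to addresses listed in Exhibit A.",
    "liability": "Liability is capped at fees paid in the prior 12 months.",
}
edges = defaultdict(set)
edges["termination"].add("notice")  # termination clause references the notice clause

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, hops: int = 1) -> list[str]:
    """Return the best-matching clause plus its graph neighbors."""
    terms = tokens(query)
    # Score nodes by keyword overlap with the query (stand-in for embeddings).
    best = max(clauses, key=lambda c: len(terms & tokens(clauses[c])))
    context, frontier = {best}, {best}
    # Walk cross-references to pull in related clauses.
    for _ in range(hops):
        frontier = {n for node in frontier for n in edges[node]} - context
        context |= frontier
    return [clauses[c] for c in sorted(context)]

print(retrieve("How many days notice to terminate?"))
```

A real setup would swap the keyword overlap for embedding similarity and the dict for a graph store, but the retrieval shape (match a node, then expand along edges) is the same.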
Allegro sped up data pivoting by ditching Pandas and hacking Snowflake’s object_agg. Major boost for ML preprocessing at scale.
Sida Shen explains how StarRocks bridges lakehouse and OLAP needs—real-time joins, low latency, and support for open formats like Iceberg and Delta Lake.
A deep dive into how Confluent is blending batch and streaming to power real-time AI systems. Includes product updates and customer stories.
Join this session to explore the data lakehouse storage layer—formats like Parquet and Avro, open tables like Delta, Hudi, and Iceberg, plus tips on performance tuning, encryption, and GDPR-compliant data handling. A live demo is included.
Bonus: Register and you’ll also get on-demand access to the recording of the previous webinar in the series.
Build an AI-powered data pipeline with Airflow 3 and Gemini, using LLMs to generate and rank tier lists from LoL match data.
How Lyft predicts minute-level demand across millions of locations, balancing latency, noisy signals, and the tradeoff between deep learning and classical time-series models.
A real-world example of replacing Kafka pipelines with Rust, cutting CO₂ emissions and cloud costs by 99%.