Uber enhanced Kubernetes with custom schedulers and GPU-aware logic to scale multi-tenant Ray workloads for ML efficiently and reliably.
A breakdown of GitHub Copilot’s three modes, showing how to use each for tasks ranging from quick answers to autonomous code changes.
Learn how ClickHouse replaced Spark and Elasticsearch to power real-time analytics at massive scale for Microsoft Clarity.
Shopify details its transition from ML classifiers to a powerful Vision Language Model system for AI-driven product classification at scale.
A real-world example of replacing Kafka pipelines with Rust, cutting CO₂ emissions and cloud costs by 99%.
GlassFlow simplifies building real-time pipelines between Kafka and ClickHouse with built-in support for deduplication and temporal joins.
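For readers new to the two terms, here is a minimal in-memory sketch of deduplication and a temporal join over timestamped events. It is not GlassFlow's API: the `Event` type, both function names, and the window parameters are hypothetical, and events are assumed to arrive ordered by timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str        # join key, e.g. an order id (hypothetical)
    event_id: str   # identifier used to detect duplicates (hypothetical)
    ts: float       # event time in seconds


def deduplicate(events, window_s):
    """Keep an event unless one with the same event_id was kept within window_s seconds."""
    last_kept = {}
    out = []
    for e in events:  # assumes events are ordered by ts
        prev = last_kept.get(e.event_id)
        if prev is None or e.ts - prev > window_s:
            out.append(e)
            last_kept[e.event_id] = e.ts
    return out


def temporal_join(left, right, window_s):
    """Pair left and right events that share a key and lie within window_s seconds."""
    by_key = {}
    for r in right:
        by_key.setdefault(r.key, []).append(r)
    return [
        (l, r)
        for l in left
        for r in by_key.get(l.key, [])
        if abs(l.ts - r.ts) <= window_s
    ]


orders = deduplicate(
    [Event("o1", "a", 0.0), Event("o1", "a", 2.0), Event("o2", "b", 3.0)],
    window_s=5.0,
)
payments = [Event("o1", "p1", 4.0), Event("o2", "p2", 60.0)]
joined = temporal_join(orders, payments, window_s=10.0)
```

In this example the second `o1` order is dropped as a duplicate, and only the `o1` payment falls inside the join window.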
An open-source Kafka proxy focused on encryption, multi-tenancy, and schema validation—snappy and production-ready.
LakeVilla introduces multi-query transactional guarantees for Iceberg and Delta Lake, with minimal performance impact.
A talk on how modular open standards like Iceberg and Arrow are reshaping modern, composable data platforms.
Join us for a deep dive into the data lakehouse storage layer. We'll explore the evolution of file formats like Parquet and Avro, the rise of open table formats such as Delta, Hudi, and Iceberg, and how openness shapes modern data architecture. Learn performance tuning tips like partitioning and Z-ordering, plus key topics like deletion vectors, encryption, and GDPR-compliant lifecycle policies. The session includes a live demo.
Topics include:
- What is Cloud Object Storage?
- Overview of Big Data File Formats
- Columnar vs. Row-Oriented: Parquet and Avro
- Benefits of Delta, Hudi, and Iceberg
- Optimizing Files for Query Performance
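As background on Z-ordering, one of the tuning techniques the session mentions, here is a rough sketch of the underlying idea: interleaving the bits of two column values into a single sort key (a Morton code) so that rows close in both columns end up close together in the file. The function name and the 16-bit width are illustrative assumptions; table formats and query engines implement this in their own ways.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) code of two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return code


# Sort a small 4x4 grid of (x, y) rows by their Z-order key.
rows = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
```

Sorting by this key before writing files tends to keep min/max statistics tight on both columns, which is what lets a reader skip files when filtering on either one.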
A raw, honest take on setting up Polaris as an Iceberg REST catalog in production. Spoiler: it hurt, but it worked.
Starlake lets you define extract, load, transform, and test tasks in YAML, and auto-generates DAGs—like Terraform for your data pipelines.
PyCharm merges Community and Pro editions into a single product with a free Pro trial and built-in Jupyter support for all.