DataFusion boosts query performance with Parquet pruning, cutting I/O and processing time through optimized data retrieval.
UniLink offers seamless real-time data replication between Kafka clusters and data lakehouses, without vendor lock-in.
Set up S3 tables and run efficient analytics with Redshift Serverless and Apache Iceberg integration.
Vertex AI pipeline caching reuses intermediate outputs, reducing costs and speeding up development.
dbt Copilot, an AI-powered data assistant from dbt Labs, automates routine data tasks and integrates metadata context to boost accuracy and streamline data engineering workflows.
Timeplus Proton is a lightweight stream processing engine that leverages ClickHouse to manage multi-stream JOINs and incremental materialized views.
A declarative programming model that unifies reasoning-based query pipelines for structured and unstructured data with a Pandas-like API.
LinkedIn engineers share insights on running Kubernetes at scale and lessons learned along the way.
Tableflow simplifies data pipeline management by materializing Kafka topics as Apache Iceberg and Delta Lake tables without the need for ETL.
Join over 600 attendees and 90 speakers for technical sessions, workshops, and networking opportunities at one of the biggest Big Data events of the year.
Explore how LLM confidence scores help filter poor-quality responses, improving AI reliability in customer support and automated workflows.
Discover how Event-Driven Architecture empowers AI agents to communicate asynchronously and scale without rigid dependencies, enabling adaptive and resilient AI systems.
Master building real-time data pipelines by combining Apache Flink’s Java API with SQL for efficient data ingestion and processing.