In a recent data project, Pedro suggested Apache Airflow, but based on past experiences the company opted for a managed platform, and the project landed on Azure Cloud with Databricks. He compared Azure Data Factory and Databricks Workflows with Airflow to determine which works best as an orchestrator/scheduler.
LLMs have recently gained significant traction, inspiring innovative use cases and demos. As the hype evolves into practical applications, information retrieval emerges as a focal point, prompting considerations about deployment strategies, data privacy, and access to information beyond what is stored in an LLM's parameters.
Optimize data management with a Data Staging Area in your Lakehouse architecture for enhanced security, efficiency, and scalability.
Learn how the Lyft team extended their existing machine learning ecosystem to support Reinforcement Learning models, how they develop models using Off-Policy Evaluation, and the lessons they learned along the way.
- Building Generative AI Based Applications With LLMs and Data Augmentation Architectures
Join a one-day workshop on Generative AI and large language models. This event aims to provide participants with in-depth knowledge of the latest advancements in natural language processing, computer vision, and machine learning techniques for Gen AI.
- Data Streaming: Analyze Your Data in Real-Time With Flink
In this one-day workshop you will learn how to build streaming analytics apps that continuously deliver instant results on data-intensive streams. You will discover how to configure streaming pipelines, transformations, aggregations, and triggers using SQL and Python in a user-friendly development environment built on open-source tools: Apache Flink, Apache Kafka, and GetInData OSS projects. A minimal PyFlink sketch follows the workshop list below.
- Advanced analytics engineering with Snowflake and dbt
Learn to build and optimize data pipelines using dbt and Snowflake in a one-day workshop. Discover how to enhance performance, quality, and cost-efficiency through materialization techniques, version control, testing, monitoring, and scheduling. Solve common data transformation challenges with modern tools, using hands-on exercises in a public cloud (GCP or AWS). A minimal dbt model sketch also follows the workshop list below.
REMEMBER! Use the DataPill200 code to get the 200 PLN discount.
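To give a taste of what the streaming workshop covers, here is a minimal PyFlink sketch that continuously counts click events per user in one-minute windows. The `clicks` Kafka topic, its schema, and the broker address are illustrative assumptions, and the Kafka SQL connector JAR needs to be available to Flink.

```python
# A minimal sketch, not workshop material: PyFlink Table API over an
# illustrative Kafka topic; requires apache-flink and the Kafka SQL connector JAR.
from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming table environment: SQL queries run continuously over unbounded data
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: read click events from Kafka as a dynamic table
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        url     STRING,
        ts      TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# Continuous aggregation: clicks per user in one-minute tumbling windows
result = t_env.sql_query("""
    SELECT user_id,
           TUMBLE_START(ts, INTERVAL '1' MINUTE) AS window_start,
           COUNT(*) AS clicks
    FROM clicks
    GROUP BY user_id, TUMBLE(ts, INTERVAL '1' MINUTE)
""")

# Print results as they are emitted; a real job would sink them to Kafka or a database
result.execute().print()
```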
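For the dbt workshop, the sketch below shows one way materialization can be configured in code: a dbt Python model running on Snowflake (dbt-core 1.3+ with dbt-snowflake). The `shop.raw_orders` source, the columns, and the incremental logic are illustrative assumptions, not workshop materials.

```python
# models/marts/completed_orders.py -- illustrative dbt Python model on Snowflake
def model(dbt, session):
    # Materialization is configured in code; incremental avoids full rebuilds
    dbt.config(materialized="incremental", unique_key="order_id")

    # dbt.source returns a Snowpark DataFrame when running on Snowflake
    orders = dbt.source("shop", "raw_orders")

    # Keep only completed orders
    cleaned = orders.filter(orders["status"] == "COMPLETED")

    # On incremental runs, process only rows newer than what is already materialized
    if dbt.is_incremental:
        max_ts = session.sql(
            f"select max(updated_at) from {dbt.this}"
        ).collect()[0][0]
        if max_ts is not None:
            cleaned = cleaned.filter(cleaned["updated_at"] > max_ts)

    return cleaned
```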
This article explores how powerful models simplify everyday tasks: writing database queries from plain-language questions, building knowledgeable chatbots, and creating custom dashboards for the information you care about. It also uncovers the potential of combining Large Language Models (LLMs) with structured data to unlock new possibilities and streamline data interaction.
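The text-to-SQL pattern mentioned above can be sketched in a few lines. The snippet below is a minimal illustration, assuming an OpenAI-compatible client, a local SQLite database, and an illustrative schema and model name; none of it is taken from the article.

```python
# A minimal text-to-SQL sketch; schema, model name, and database are illustrative.
import sqlite3
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCHEMA = """
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    customer   TEXT,
    amount     REAL,
    created_at TEXT
);
"""

def question_to_sql(question: str) -> str:
    # Pass the schema as context so the model grounds its query in real tables
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system",
             "content": "Translate the user's question into a single SQLite SELECT "
                        f"statement. Use only this schema:\n{SCHEMA}\nReturn SQL only."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip().strip("`")

def answer(question: str, db_path: str = "shop.db"):
    sql = question_to_sql(question)
    # Review or sandbox generated SQL before running it against production data
    with sqlite3.connect(db_path) as conn:
        return conn.execute(sql).fetchall()

# Example: answer("What was the total order amount last week?")
```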
Arroyo is coming up on its one-year anniversary as an open-source project, and we've got something big planned. Our next release, 0.10, will feature an entirely new SQL engine built around Apache Arrow and the DataFusion SQL toolkit. This post will detail Arroyo's current implementation and why it's changing.
The Apache Flink PMC is pleased to announce the release of Apache Flink 1.19.0. As usual, we are looking at a packed release with a wide variety of improvements and new features. Overall, 162 people contributed to this release, completing 33 FLIPs and 600+ issues.
Confluent has launched Confluent Cloud for Apache Flink, which is available on all major cloud platforms. This integration offers a unified, enterprise-grade solution for real-time data processing with Apache Kafka® and Flink. The blog details the unique features of this fully managed service and its readiness for mission-critical use cases at scale.
Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and what’s on the horizon for LLMs.
World-class expertise and practical content packed into 3 days. You can look forward to an excellent lineup of 40 international experts in business and academic applications of ML and AI at ML Prague 2024. They will present advanced practical talks, hands-on workshops, and other forms of interactive content.