Start by reviewing the introductory article on the Data Dirtiness Score, which explains the key assumptions and demonstrates how to calculate this score. This article is the second in a series about cleaning data using Large Language Models (LLMs), with a focus on identifying errors in tabular data sets.
This blog post describes the Allegro team's journey: how they used Kafka protocol sniffing and eBPF to identify and remove a performance bottleneck.
This article thoroughly examines LLM system evaluation, distinguishing between model and system evaluation and scrutinizing online and offline strategies. It focuses on AI assessing AI and Responsible AI metrics. The article highlights the relevance of diverse evaluation tools and frameworks across application scenarios, urging readers to stay informed about evolving metrics and frameworks for a comprehensive understanding.
This transition highlights a user-centric approach, focusing on building a domain-oriented, self-service data platform through experimentation. BackMarket aims to improve user experience and operational efficiency by prioritizing seamless data organization and access policies.
TL;DR
- MLflow is a popular experiment-tracking and end-to-end ML platform
- Since MLflow is open source, it’s free to download, and hosting an instance does not incur license fees
- Hosting MLflow requires multiple infrastructure components and comes with maintenance responsibilities, the cost of which can be difficult to estimate
- On AWS, which offers various options for hosting MLflow, a medium-sized instance comes in at about $200 per month, plus storage and data transfer costs
Would you like to test one of our courses before investing money in it? Then come to our Data Learning Week, a series of 4 free hands-on workshops. Each session is a free first-trial lesson for the full training. We will also have a special bonus from the Academy for all workshop participants.
Choose your topic, check the agenda, and sign up:
This tutorial dives into such a custom solution:
- Deploy our ML model using a custom Docker image.
- Use a blue-green deployment strategy to ensure there is no downtime when deploying our model.
- Run smoke tests to see if our deployment is working as expected, before we replace our previous model.
- Use the Azure ML Python SDK to configure and manage deployment to Azure ML.
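The blue-green flow listed above can be sketched in a few lines. This is a minimal illustration in plain Python, not the tutorial's code and not the Azure ML SDK: `Endpoint`, `smoke_test`, and `blue_green_rollout` are hypothetical names, and the "models" are ordinary functions standing in for deployed containers. The point is only the ordering: deploy the new version with no traffic, smoke-test it directly, and switch traffic only if the test passes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Model = Callable[[dict], dict]


@dataclass
class Endpoint:
    """Toy stand-in for a serving endpoint (hypothetical, not an Azure ML class)."""
    deployments: Dict[str, Model] = field(default_factory=dict)
    traffic: Dict[str, int] = field(default_factory=dict)  # percent per deployment

    def deploy(self, name: str, model: Model) -> None:
        # A new deployment starts with 0% of live traffic.
        self.deployments[name] = model
        self.traffic.setdefault(name, 0)

    def invoke(self, payload: dict, deployment_name: str) -> dict:
        # Call one specific deployment directly, bypassing the traffic split.
        return self.deployments[deployment_name](payload)


def smoke_test(endpoint: Endpoint, name: str, sample: dict,
               check: Callable[[dict], bool]) -> bool:
    """Return True if the deployment answers the sample request acceptably."""
    try:
        return bool(check(endpoint.invoke(sample, name)))
    except Exception:
        return False


def blue_green_rollout(endpoint: Endpoint, new_name: str, new_model: Model,
                       sample: dict, check: Callable[[dict], bool]) -> bool:
    """Deploy next to the live version; switch traffic only after a passing smoke test."""
    endpoint.deploy(new_name, new_model)
    if not smoke_test(endpoint, new_name, sample, check):
        # Roll back: remove the failed deployment, live traffic is untouched.
        del endpoint.deployments[new_name]
        del endpoint.traffic[new_name]
        return False
    # Cut over: all traffic goes to the new deployment.
    endpoint.traffic = {n: (100 if n == new_name else 0) for n in endpoint.traffic}
    return True


if __name__ == "__main__":
    ep = Endpoint()
    ep.deploy("blue", lambda p: {"score": 0.5})
    ep.traffic["blue"] = 100
    ok = blue_green_rollout(ep, "green", lambda p: {"score": 0.9},
                            sample={"x": 1}, check=lambda r: "score" in r)
    print(ok, ep.traffic)
```

In a real setup the old deployment would typically be deleted only after the new one has served traffic for a while, so that a rollback stays cheap.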
In this tutorial, Marcin Zabłocki shows how to deploy an LLM in your private Kubernetes cluster in 5 simple steps, using Mistral as the example.
Jay Kreps, Co-creator of Apache Kafka and CEO of Confluent, will present his vision of unifying the operational and analytical worlds with data streams and showcase exciting new product capabilities. During this keynote, the winner and finalists of the $1M Data Streaming Startup Challenge will showcase how their use of data streaming is disrupting their categories.
On challenges for ML in quantitative trading and investing, and telling stories through data.
Join the independent conference, whose agenda arranges presentations into nine categories, and find the topics you care about most. The categories include, for example:
- Data Engineering
- Streaming and real-time analytics
- ML & Data Science
- Gen AI
And more! Learn from speakers from companies like Dropbox, IKEA, Cloudera, Allegro, Ververica, and Freenow.
Shhh… Use the DataPill200 code to get a 200 PLN discount!