DATA Pill feed

DATA Pill #084 - MLOps BABY! MLOps -> MLFlops -> LLMOps?

MLOps Baby! MLOps is a huge topic this year, and we believe it will still be one in ‘24. It will probably evolve toward LLMOps, but still.

So save this pill for ‘24! Maybe you will need it.

ARTICLES

Building an End-to-End MLOps Pipeline with Open-Source Tools | 11 min | MLOps & Open Source | Grig Duta | Qwak Blog
This article is a focused guide on the transition from experimental machine learning to production-ready MLOps pipelines. It identifies the limitations of traditional ML setups and introduces you to essential open-source tools that can help you build a more robust, scalable and maintainable ML system. How is it different from the traditional setup?
MLOps and MLflops | 9 min | MLOps architecture | Andrew Blance | Better Programming Blog
This is an introduction to standard and modern methods of storing data, creating resources and deploying AI. Let’s make sure your next model deployment isn't an MLflop.
  • How to bring a sales forecasting prototype to production in less than 2 months.
  • The solution architecture
  • Nixtla + Kedro
Declarative Feature Engineering at PayPal | 4 min | Feature Engineering | Marina Lyan | The PayPal Technology Blog
How the declarative feature engineering approach helps our engineers to address scale, TTM and TCO requirements.
5 Levels of MLOps Maturity | 10 min | MLOps | Maciej Balawejder | Towards Data Science Blog
This blog post aims to synthesize and take the best from both MLOps frameworks: Google and Microsoft. Maciej analyzes five maturity levels and shows the progression from manual processes to advanced automated infrastructures. He also argues that some of the points presented by Microsoft and Google should not be followed blindly but rather be adjusted to your needs.
Building a large scale unsupervised model anomaly detection system — Part 1 | 8 min | ML Platform | Anindya Saha, Han Wang, Rajeev Prabhakar | Lyft Engineering Blog
Lyft’s ML Platform is a machine learning infrastructure built on top of Kubernetes that powers diverse applications such as dispatch, pricing, ETAs, fraud detection and support. This post focuses on how Lyft uses the compute layer of LyftLearn to profile model features and predictions and perform anomaly detection at scale.

DATA LIBRARY

Build Feature Stores Faster. An Introduction to Vertex AI, Snowflake and dbt Cloud | 7 min | MLOps & Feature Store | Jakub Jurczak | GetInData Blog
  • Introduction to MLOps,
  • A step-by-step guide to designing and building a Feature Store,
  • Example of MLOps architecture and workflow,
  • How to integrate GCP with Snowflake using Terraform,
  • Vertex AI platform - how it works in practice.

CLICK THROUGH ARCHITECTURE SCHEME

We didn't know what category to put here, but since there is a lot of content that refers to solution architecture, we thought this would be a good resource - a diagram of the MLOps Platform architecture that you can click through to see the technological details.

TUTORIALS

Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch | 15 min | LLMOps | Sebastian Raschka | Lightning.AI Blog
10 techniques to reduce the memory consumption of PyTorch models. When applying these techniques to a vision transformer, they reduced the memory consumption 20x on a single GPU.
Writing modular MLOps-ready Python code for easy explainability and interpretation | 7 min | MLOps | Samar Deen, Ceren Altincekic | Data Science at Microsoft Blog
Covers what is required to productionize Python scripts into fully fledged outputs ready for use in actual business cases. An overview of the Python main function and its importance in getting code to be production ready.
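To make the point about the main function concrete, here is a minimal sketch of the pattern; it is not taken from the article. The names `parse_args`, `count_above` and the `--threshold` flag are hypothetical and exist only for illustration.

```python
import argparse
import sys


def parse_args(argv):
    # Hypothetical CLI: a threshold option plus a list of scores.
    parser = argparse.ArgumentParser(description="Toy scoring script")
    parser.add_argument("--threshold", type=float, default=0.5)
    parser.add_argument("scores", nargs="*", type=float)
    return parser.parse_args(argv)


def count_above(scores, threshold):
    # Pure function: easy to unit test and to reuse from other modules.
    return sum(1 for s in scores if s > threshold)


def main(argv=None):
    # Accepting argv makes main() callable from tests and schedulers,
    # not only from the command line.
    args = parse_args(sys.argv[1:] if argv is None else argv)
    n = count_above(args.scores, args.threshold)
    print(f"{n} of {len(args.scores)} scores above {args.threshold}")
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when imported.
    main()
```

Keeping the logic in small functions and the wiring in `main()` means the same module can be imported by a pipeline step or run as a script without changes.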
How to use LLMs for data enrichment in BigQuery? | 16 min | LLMOps | Piotr Pilis | GetInData Blog
This post details the integration of LLMs with Google's BigQuery for data enrichment. By leveraging Cloud Functions and BigQuery Remote Functions, you can easily interface BigQuery with LLM APIs. How can dbt help with data transformations? How should you address limitations and security concerns of LLMs?
LLMs with BigQuery offer an easy to deploy and cost-effective solution for enhancing data analysis capabilities.
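As a rough sketch of the Cloud Function side of this setup (not the code from the post): a BigQuery Remote Function sends a JSON body with a `calls` array, one inner list of arguments per row, and expects a `replies` array of the same length back. The helper `fake_llm_enrich` below is a hypothetical stand-in for a real LLM API call, and `handle_remote_call` is an illustrative name.

```python
def fake_llm_enrich(text):
    # Hypothetical placeholder for an LLM API call; returns a toy label.
    return "positive" if "good" in text.lower() else "neutral"


def handle_remote_call(request_json, enrich=fake_llm_enrich):
    # BigQuery batches rows: each element of "calls" holds one row's arguments.
    calls = request_json.get("calls")
    if not isinstance(calls, list):
        # A non-200 status with "errorMessage" signals failure to BigQuery.
        return {"errorMessage": "missing 'calls' array"}, 400
    # One reply per call, in the same order as the input rows.
    replies = [enrich(str(args[0])) for args in calls]
    return {"replies": replies}, 200


body, status = handle_remote_call({"calls": [["Good service"], ["Arrived late"]]})
print(status, body)
```

In a real deployment this function would sit behind an HTTP entry point, and `enrich` would call the chosen LLM API with appropriate rate limiting and error handling.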

NEWS: COURSE

Building Maintainable Data Pipelines | 12 min intro | MLOps | Kedro
Finally! A free MLOps course from Kedro! On the agenda:
  • How to get started with Kedro
  • How to run Kedro pipelines
  • Kedro project deployed on Apache Airflow
  • Kedro nodes
  • and more
As you can see in the intro video, our DATA Pill community contributes quite a bit to Kedro development. Have you noticed GetInData's mention or Marcin Zabłocki as a committer (active DATA Pill contributor who gets FRIENDS jokes)? Marcin, we are proud!
BTW there is the possibility to schedule a free consultation on MLOps with Marcin.
Just saying ;)

DATA TUBE

The story of the MLOps platform that makes you productive, everywhere! | 26 min | MLOps Platform | Marcin Zabłocki | Big Data Warsaw
This is a recording of a presentation from the conference: Big Data Technology Warsaw ‘23.
There is a wide selection of managed and cloud-native machine learning services on which you can run your data science pipelines and deploy your trained models. But there is no single way of interacting with platforms like Amazon SageMaker, Google Vertex AI, Microsoft AzureML and Kubeflow. In this presentation you will learn how battle-tested technologies such as Kedro, MLflow and Terraform can make your data scientists’ lives easier and more productive - regardless of which cloud provider you use.
Building ML pipelines with Kedro and Vertex AI on GCP | 1 h 5 min | MLOps Workshop | Michał Bryś | GetInData
  • Two practical exercises to help you build the ML pipeline yourself in an hour (links to GitHub on YouTube)
  • Why do we need a pipeline for Machine Learning models?
  • Kedro, an open-source Python framework for creating reproducible, maintainable and modular data science code

PODCAST

MLOps in the Cloud at Swedbank - Enterprise Analytics Platform | 55 min | MLOps + Cloud | Adam Kawa & Varun Bhatnagar | Radio Data Podcast
  • An overview of the solution - What is an Enterprise Analytics Platform (EAP)?
  • Evolution of MLOps at Swedbank
  • Iterative development for ML models - How can one improve the iterative development process for ML models?
  • Key take-away points and the lessons learned from ML cloud transformation.
Best Practices for Building LLM-Backed Applications | 53 min | LLMOps | Ben Lorica & Waleed Kadous | The Data Exchange Podcast
  • Open Source LLMs: when and how to use them
  • Code Llama vs. GitHub Copilot
  • Deploying open source LLMs
  • Reimagining "AutoML" in the age of LLMs
  • AMD and other hardware options for LLM inference
2023-12-20 16:50