DATA Pill feed

DATA Pill #092 - MLflow iceberg, Meta ♥️ Python


What Dagster Believes About Data Platforms | 11 min | Data Engineering | Sandy Ryza | Dagster Blog
Dagster Labs' mission is to help organizations build effective data platforms, guided by a specific set of beliefs. This post explores those beliefs and how they shape the design principles behind Dagster.
From Silos to Standardization: Leveraging DBT for a Democratized Data Framework | 8 min | Data Engineering | Abhishek Pathania | Urban Engineering
This blog explores the challenges encountered by various teams and Urban's journey in addressing them through a unified solution called the Common Computation Framework (CCF).
Comparing LangChain and LlamaIndex with 4 tasks | 10 min | LLM | Ming | Personal Blog
Let’s compare the two frameworks in completing four common tasks:

  1. Connecting to a local LLM instance and building a chatbot.
  2. Indexing local files and building a RAG system.
  3. Combining the two above and making a chatbot with RAG capabilities.
  4. Converting the chatbot into an agent so that it may use more tools and do simple reasoning.


Read about how RAG works, how it helps LLMs, and why it’s so important for making computer programs smarter in understanding and using language.
Advanced RAG using Llama Index | 13 min | LLM | Plaban Nayak | AI Planet
Let's implement a technique that improves retrieval for context-aware text processing by also considering the surrounding context of a sentence to extract valuable insights.
How to use UMAP dimensionality reduction for Embeddings to show Questions, Answers and their relationships to source documents with OpenAI, Langchain and ChromaDB.


MLflow iceberg: from basics to hidden depths | 43 min | LLM | Marcin Zabłocki | MOPS
A journey through MLflow, a favorite platform among MLOps practitioners. Just like an iceberg, MLflow has its visible features and a trove of powerful tools hidden beneath the surface. We'll explore everything from basics like metrics logging to nuanced deployment challenges and recent LLM features.
How Rivian builds real-time analytics from electric vehicles | 51 min | Real-time | Vidhi Taneja, Anirban Kunbu, Rupesh More | AWS re:Invent 2023
Rivian's data platform is vital for managing vehicle data across commerce and reliability. Rivian quickly identifies issues and insights in their electric vehicles using real-time processing. Learn how they build and maintain this platform with Amazon MSK and AWS services.


Meta loves Python | 38 min | Data Engineering | Pascal Hartig, Itamar Oren, Carl Meyer | Meta Tech Podcast
Meta engineers discuss their contributions to the latest Python release, including custom JITs, type system improvements, and faster comprehensions. They share insights on their collaboration with the Python community.


Radar - The Analytics Edition | Online | 21st March
Join RADAR: The Analytics Edition—a free digital event exploring transformative outcomes with business intelligence and analytics.


  • A Tipping Point in Data Democratization
  • The Art of Data Storytelling: Driving Impact with Analytics
  • ChatGPT & Generative AI: Boon or Bane for Data Democratization?
  • Scaling Data ROI: Driving Analytics Adoption Within Your Organization
  • Building a Learning Culture for Analytics Functions
  • From Data Governance to Data Discoverability: Building Trust in Data Within Your Organization
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
➡ Dig into previous editions of DATA Pill