LLMs such as ChatGPT rely on fixed training datasets, so their responses go stale and updating their knowledge requires retraining. RAG, including Graph RAG, addresses this by integrating external knowledge bases, enriching responses with current information for improved accuracy and depth.
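The core retrieval-then-generate loop can be sketched in a few lines. This is a minimal illustration, not any specific framework: the toy corpus, the bag-of-words retriever, and the prompt template are all made up for demonstration; a real system would use embeddings, a vector store, and an LLM call.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    qv = tokens(query)
    return sorted(corpus, key=lambda d: cosine(qv, tokens(d)), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Augment the prompt with retrieved context before generation."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Graph RAG links retrieved facts through a knowledge graph.",
    "Vector search finds documents similar to the query embedding.",
    "Snowflake is a cloud data warehouse.",
]
print(build_prompt("How does Graph RAG use a knowledge graph?", corpus))
```

The augmented prompt is then sent to the LLM, which grounds its answer in the retrieved context instead of its frozen training data.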
This project applies recently acquired DevOps skills to build a comprehensive data platform that refreshes analytics daily. Dorian streamlines data processing and workflow orchestration using modern tools like Snowflake, Airbyte, and DBT, prioritizing simplicity and functionality throughout.
Platform engineering offers immense potential for enhancing organizational efficiency and developer experience. Yet navigating its complexities requires addressing challenges such as conflicting objectives, ambiguous goals, and the urgency of adoption. Drawing from firsthand experiences implementing IDPs and CDaaS, this text will highlight five key insights for successful platform initiatives.
Learn how real-time sentiment analysis with Apache Flink can help creators decipher audience emotions and steer content toward viewer satisfaction.
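The article uses Apache Flink; as a plain-Python illustration of the underlying idea, here is a tumbling-window sentiment aggregation over a comment stream. The lexicon and events are invented for the example, and the windowing, state management, and parallelism shown manually here are exactly what Flink would provide.

```python
from collections import defaultdict

# Toy sentiment lexicon (a real pipeline would use a trained model).
POSITIVE = {"love", "great", "awesome"}
NEGATIVE = {"boring", "bad", "lag"}

def score(comment: str) -> int:
    """Net sentiment of one comment: +1 per positive word, -1 per negative."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def windowed_sentiment(events, window_s=60):
    """events: iterable of (timestamp_s, comment).
    Returns {window_start: net sentiment} for tumbling windows of window_s."""
    windows = defaultdict(int)
    for ts, comment in events:
        windows[ts // window_s * window_s] += score(comment)
    return dict(windows)

events = [(5, "love this stream"), (42, "the lag is bad"), (70, "awesome demo")]
print(windowed_sentiment(events))  # → {0: -1, 60: 1}
```

A per-minute sentiment signal like this is what lets a creator react to audience mood while the stream is still live.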
The course is split into 11 lessons. Every Medium article will be its own lesson.
- An End-to-End Framework for Production-Ready LLM Systems by Building Your LLM Twin
- The importance of Data Pipelines in the Era of Generative AI
- CDC [Module 1] …WIP
- Streaming ingestion pipeline [Module 2] …WIP
- Vector DB retrieval clients [Module 2] …WIP
- Training data preparation [Module 3] …WIP
- Fine-tuning LLM [Module 3] …WIP
- LLM evaluation [Module 4] …WIP
- Quantization [Module 5] …WIP
- Build the digital twin inference pipeline [Module 6] …WIP
- Deploy the digital twin as a REST API [Module 6] …WIP
Rubens explored the free "Knowledge Graphs for RAG" course, which meticulously details building Knowledge Graphs from SEC forms, defining nodes and relationships. He aims to replicate the results by combining the code snippets and visualizing the Knowledge Graph in Neo4j Workspace. Check out how it went.
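The node-and-relationship step can be sketched as generating Cypher `MERGE` statements from facts extracted from a filing. This is a toy example: the company, form, and section names are made up, and the labels/relationship types are illustrative rather than the course's actual schema.

```python
def to_cypher(company: str, form: str, section: str) -> list[str]:
    """Emit idempotent Cypher MERGE statements for one filing's subgraph."""
    return [
        f"MERGE (c:Company {{name: '{company}'}})",
        f"MERGE (f:Form {{id: '{form}'}})",
        f"MERGE (s:Section {{name: '{section}'}})",
        "MERGE (c)-[:FILED]->(f)",
        "MERGE (f)-[:HAS_SECTION]->(s)",
    ]

for stmt in to_cypher("Acme Corp", "10-K-2023", "Risk Factors"):
    print(stmt)
```

Running statements like these against a Neo4j instance builds up the graph incrementally, and Neo4j Workspace can then visualize the resulting nodes and relationships.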
This post explores how Dremio's data lakehouse platform simplifies data delivery for business intelligence by building a prototype version you can run on your laptop.
GID Data Copilot - An extensible AI programming assistant for SQL and dbt code:
- Powered by state-of-the-art large language models (SOTA LLMs)
- Robust Retrieval Augmented Generation (RAG) architecture
- Hybrid search techniques
- Fast Vector Database
- Curated Prompts
- Built-in data commands
The home team is joined by Michael Foree, Stack Overflow’s director of data science and data platform, and occasional cohost Cassidy Williams, CTO at Contenda, for a conversation about long context windows, retrieval-augmented generation, and how Databricks’ new open LLM could change the game for developers.
Key Takeaways:
- Learn how to create and run a deep learning model.
- Learn how to build machine learning workflows in PyTorch Lightning.
- See how Lightning Studio can be used for deep learning and AI development.