DATA Pill feed

DATA Pill #125 - Exposing dbt models in Looker, RAG with Postgres

ARTICLES

Beyond the numbers: how a solid business case drives analytics | 8 min | Data Analytics Strategy | Iris Snuverink | Xebia Blog
Analytics projects need more than just data—they need a strong business case. Iris breaks down how to balance costs and benefits, structure your approach, and track the business impact from start to finish.
Kafka Has Reached a Turning Point | 7 min | Real-Time Analytics | Yingjun Wu | Personal Blog
After the Current 2024 conference, the future of Kafka seems to be in flux. Yingjun explores how AI and evolving data streaming technologies influence Kafka’s role in the real-time analytics world.

TUTORIALS

Building RAG with Postgres | 4 min | RAG | Eric Zakariasson | Any Blockers Blog
Learn how to build a robust RAG system with Postgres. Eric walks through every step, from setting up data pipelines to optimizing retrieval for high-performance AI-driven responses.
Choosing Between LLM Agent Frameworks | 12 min | LLM | Aparna Dhinakaran | Towards Data Science Blog
AI agents are getting more powerful, but how do you choose the right framework? Aparna compares new frameworks like LangGraph and LlamaIndex Workflows, giving practical insights on building agents from scratch versus leveraging existing tools.
Exposing dbt models in Looker | 15 min | Analytics Engineering | Silja Märdla | Personal Blog
Want to automate syncing your dbt models with Looker? Silja shows how her team replaced manual LookML writing with an automated process, ensuring the latest models and metrics are always at hand for Looker users.

DATA LIBRARY

Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation | RAG | Satyapriya Krishna, Kalpesh Krishna, Anhad Mohananey, Steven Schwarcz, Adam Stambler, Shyam Upadhyay, Manaal Faruqui
This paper introduces FRAMES, a dataset designed to assess the performance of LLMs in retrieval-augmented systems. The team evaluates factual accuracy, retrieval efficiency, and reasoning abilities, making it a must-read for those building cutting-edge RAG solutions.

TOOL

Langfun | LLM
Langfun makes working with language models feel like a breeze. Powered by PyGlove, it treats language as functions and allows object-oriented prompting—offering more control over how you interact with LLMs. Worth checking out for those working on LLM-powered applications!

PODCAST

GraphRAG (beyond the hype) | 55 min | RAG | Chris Benson, Daniel Whitenack, Prashanth Rao | Practical AI Podcast
Curious about GraphRAG and its practical use cases? Prashanth Rao joins the show to break down GraphRAG and what it means for the future of graph data. If you’re hearing a lot of buzz around this topic, this episode is a great place to cut through the hype.

DATA TUBE

500TB of audit logs for 500$/mo and still searchable in real time | 23 min | DevOps | Jacek Marmuszewski | Infoshare
How they managed to store over 500TB of audit data at a cost under $500/mo while keeping response times fast, no matter how large the query time range is.

This is the unique journey of locating, classifying, and effectively storing 0.5PB of customer data. A distinctive setup leveraging Google BigQuery technology enabled the support & compliance team to execute complex queries against the entire dataset with a remarkable response time of 11s.

BTW, as a community partner, we have a discount for Infoshare DEV, where there will be more presentations like this. Buy discounted tickets here.
Databricks VS Code Extension v2: Setup and Feature Demo | 28 min | Data Engineering | Dustin Vannoy | Personal Channel
Databricks + VS Code = a dream for many data engineers. This video covers setting it up and leveraging the new features in the latest Databricks extension. From deploying code to avoiding common pitfalls—this demo has it all.

CONFS EVENTS AND MEETUPS

Whether you're in Gdynia or tuning in online, this AWS User Group event is not to be missed! Connect with professionals, sharpen your skills, and learn about the latest GenAI developments. Save the date!
________________________
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
➡ Dig into previous editions of DATA Pill
2024-10-03 12:15