DATA Pill feed

DATA Pill #094 - PyAirbyte and why Gemini 1.5 is bullish for RAG


12 Golden Signals To Discover Anomalies And Performance Issues on Your AWS RDS Fleet | 5 min | Data Engineering | Dmitry Kolesnikov | Zalando Engineering
The Database per Service pattern in microservices requires managing database instances, monitoring health, and detecting anomalies. Standardization in methodology and tooling is vital for scaling success. A methodology using 12 golden signals has been developed to observe health, with an open-source utility called rds-health facilitating its implementation.
Why Gemini 1.5 (and other large context models) are bullish for RAG | 7 min | RAG | Chia Jeng Yang | Enterprise RAG Blog
The introduction of Gemini 1.5, with its 1 million token context window, has sparked debate in the AI community, particularly about its impact on RAG. Contrary to those concerns, the author argues that Gemini 1.5 is a positive development for RAG, because it highlights RAG's core strengths: optimizing for cost, accuracy, and latency in ways that are not a black box.
Production-ready Data Stack in a week | 10 min | Data Engineering | Jeremy Surget | Personal Blog
This article shows a relatively simple yet powerful approach to setting up a data stack:

  • The tools with their pros and cons
  • How to deploy them
  • What’s the total cost of this stack
Combining knowledge graph databases with large language models has changed RAG apps, addressing knowledge problems such as hallucinations and knowledge cut-offs. The combination improves QA chatbots by capturing complex relationships between entities, making answers smarter.


Running Doom on Snowflake | 7 min | Data Engineering | Daniel Palma | Areca Data
Curiosity about whether Snowflake could host Doom sparked a spontaneous exploration of Snowpark Container Services.
Flink SQL and the Joy of JARs | 4 min | Data Engineering | Robin Moffatt | Decodable
This tutorial shares some useful tips and tricks for working with JARs, including how to understand which ones you need and how to check the right one is being used.


Klarna announced its AI assistant powered by OpenAI. Now live globally for 1 month, the numbers speak for themselves:

  • The AI assistant has had 2.3 million conversations, two-thirds of Klarna’s customer service chats
  • It is doing the equivalent work of 700 full-time agents
  • It is on par with human agents in regard to customer satisfaction score

and much more.


Airbyte Winter Release 2024 | 33 min | Data Engineering | Airbyte Blog
Meet PyAirbyte: the newest addition to the Airbyte family. It simplifies data pipeline management for Python users by offering access to Airbyte's connector library without strict dependencies. PyAirbyte aligns with engineers' preference for Python, making data integration more accessible and flexible.


The AI Infrastructure Revolution: From Cloud Computing to Data Center Design | 43 min | AI | Ben Lorica, Bryan Cantrill | The Data Exchange Podcast
This episode explores the evolving landscape of infrastructure for AI, key innovations like new data center designs and virtualization, and how the economics and optimization of these systems will impact the advancement of artificial intelligence.


Big Data Tech 2024 Online Webinar | Online | 14th and 21st March
Join the special events on March 14th and March 21st to meet experts from Fandom, GetInData | Part of Xebia, and more. Explore the latest in data, analytics, machine learning, and cloud tech to stay ahead.
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
➡ Dig into previous editions of DATA Pill