ARTICLES
Exploring the Potential of Graph Neural Networks to Transform Recommendations at Zalando | 4 min | Recommendation Systems | Mariia Bulycheva | Zalando Engineering Blog
Zalando leverages Graph Neural Networks to enhance homepage CTR predictions, achieving a 0.6% improvement in ROC-AUC. Discover how user-content interaction patterns drive personalized recommendations.

Scaling Large Language Models for e-Commerce: The Development of a Llama-Based Customized LLM | 12 min | LLM | Christian Herold, Shahram Khadivi | eBay Blog
Learn how eBay’s custom Llama-based models deliver domain-specific capabilities while maintaining efficiency and security, setting new standards for AI in e-commerce.

TUTORIALS
Automate topic provisioning and configuration using Terraform with Amazon MSK | 5 min | DevOps | Vijay Kardile | AWS Blog
Simplify Amazon MSK topic provisioning with Terraform. A step-by-step guide to improving consistency and scalability while reducing manual errors.

Airflow in a multi-teams / multi-tenant environment. Deployment strategies | 22 min | Data Engineering | Kacper Muda | GetInData | Part of Xebia Blog
Explore deployment solutions for Apache Airflow in multi-team environments. Highlights include resource isolation, shared access options, and a glimpse at Airflow 3's upcoming capabilities.

Building an On-Premise Document Intelligence Stack with Docling, Ollama, Phi-4 | ExtractThinker | 7 min | Gen AI | Júlio Almeida | Towards AI
Discover tools like ExtractThinker and Ollama for secure, on-prem document processing. Tailored for industries with strict data privacy regulations, like finance and healthcare.

DATA LIBRARY
The Dawn of AI Agents | AI | Maja Vujinovic, Jensen Huang | 51x
A comprehensive white paper on AI agents’ real-world applications, key metrics, and the convergence of generative AI with blockchain technologies.

How to build effective AI Agents | AI | Rakesh Gohel | Anthropic
Dive into the core components and frameworks like LangChain for constructing AI agents tailored for complex decision-making tasks.
TOOL
drawdata | Data Visualization
A Python library for interactive dataset creation directly in Jupyter notebooks. Perfect for machine learning tutorials and algorithm demos.
NEWS
Paimon 1.0: Unified Lake Format for Data + AI | 4 min | AI | Martin Grund, Stefania Leone | Alibaba Cloud Blog
Introducing Apache Paimon, a groundbreaking data lakehouse solution integrating batch and streaming operations for real-time AI workflows.
CONFS, EVENTS AND MEETUPS
AI Decisioning in Action: Whoop’s Journey to Hyper-personalized Customer Experiences With Hightouch and Snowflake | Online | 28th January
Discover how WHOOP maximizes customer lifetime value with AI-powered personalization using Hightouch and Snowflake.
PINNACLE PICKS
Your top picks from last week:
Apache Kafka + Vector Database + LLM = Real-Time GenAI | 12 min | Gen AI | Kai Waehner | Personal Blog
An exploration of how event-driven architectures with Kafka and Flink enable real-time GenAI use cases, combining large language models (LLMs) with vector databases for semantic search.

Building a Unified Healthcare Data Platform: Architecture | 14 min | Data Platform | Alexandre Guitton | Doctolib blog
Doctolib's shift from a centralized monolithic platform to a data mesh architecture that supports scalable AI, analytics, and robust data governance.

The Turning Point: Agentic AI, Inference Optimization, and Society's Next Challenge | 5 min | AI | Wesley Pasfield | Personal Blog
An overview of trends in AI, highlighting agentic workflows, inference optimizations, and the societal impacts of AI-driven automation.
________________________
Have any interesting content to share in the DATA Pill newsletter?