DATA Pill feed

DATA Pill #122 - Master Dashboards, Terraform Databricks, and Boost Your Data Strategy

ARTICLES

Unlocking Insights with High-Quality Dashboards at Scale | 13 min | Data Science | Skyler Johnson | Spotify Engineering Blog
Discover how Spotify uses Tableau and Looker Studio, guided by a Dashboard Quality Framework, to create thousands of dashboards with consistently high standards. The centralized Dashboard Portal makes it easy to access curated, high-quality insights.
Explore common data strategy pitfalls and learn how to avoid them. This article covers key organizational gaps and offers practical steps to help build a more aligned and effective data strategy.

TUTORIALS

Terraforming Databricks #1: Unity Catalog Metastore | 5 min | Data Engineering | Tomasz Kostyrka | Seequality Blog
Learn how to automate Azure Databricks deployment with Terraform. In this first post, Tomasz dives into the Unity Catalog Metastore and explains how to set it up as a central data management hub across workspaces.
This article covers migrating on-prem data infrastructure to Google Cloud Platform, focusing on how Google Cloud Composer functions as a process orchestrator and supports enterprises in expanding data capabilities to meet modern business requirements.
Data Conversations with BigQuery Connectors and Looker Studio | 4 min | Data Science | Łukasz Olejniczak | Personal Blog
Integrate sparse and dense vectors to enhance knowledge retrieval in RAG using Amazon OpenSearch Service | 7 min | RAG | Yuanbo Li, River Xie, Ren Guo, Charlie Yang | AWS Blog
A deep dive into using sparse vectors for better term expansion and interpretability in RAG applications. The tutorial includes code examples and experiments to guide you through integrating sparse and dense vectors with Amazon OpenSearch.
CI/CD for Microsoft Fabric Data Warehouses using GitHub Actions | 6 min | Data Engineering | Kevin Chant | Personal Blog
This guide walks through implementing CI/CD for Microsoft Fabric Data Warehouses via GitHub Actions. Learn how to configure GitHub repositories, build and deploy database projects (dacpac), and optimize your process for security and performance.
Learn how full-text and vector indexes can be combined to enhance retrieval performance in GraphRAG applications. This tutorial covers building a GraphRAG application using the Neo4j GenAI Python library.

PODCAST

No Priors Ep. 80 | With Andrej Karpathy from OpenAI and Tesla | 44 min | Gen AI | Andrej Karpathy, Sarah Guo, Elad Gil | No Priors Podcast
Andrej Karpathy, former Tesla Autopilot leader and OpenAI founding member, joins the hosts to discuss self-driving cars, Tesla's Optimus robot, and AI's future. He also shares insights on AI education and his new mission, Eureka Labs.

DATA TUBE

Build Trust in Data through Automatable Data Contracts | 20 min | Data Governance | Max Schultze | Hyperight AB
A Data Governance and Data Quality talk on automating data contracts, enforcing data standards, and ensuring the integrity of data processing frameworks.

CONFS, EVENTS AND MEETUPS

Data Council Amsterdam invites you to an evening of learning and networking at Xebia Data's office on Wibautstraat. The event will feature two talks on data governance and democratization, covering knowledge graphs and Databricks Unity Catalog.