DATA Pill #170 - How Agentic AI is Transforming Wall Street, Meta’s AI risk tool, Google’s data agents

ARTICLES

Global Bank Achieves 90% Cost Savings with Mainframe Offloading! | 8 min | Data Infrastructure | Johnson Noel | Ververica Blog
A leading bank kept its mainframe as the source of truth but shifted heavy compute to Flink, cutting MIPS by 90 percent in three weeks and saving $1M annually. Architecture includes COBOL-to-Java refactors, Kafka, Azure Blob, and exactly-once streaming.
The EU Data Act: Your Roadmap from Regulatory Burden to Business Opportunity | 7 min | Data Governance | Włodzimierz Marat | Xebia Blog
Breaks down new rights, penalties, and access rules under the EU Data Act, plus a timeline to the September 2025 deadline and first steps like APIs and granular access control.
Meta’s Diff Risk Score uses a fine-tuned Llama model to score code diffs and flag risky snippets before release, now integrated into review and deploy workflows.
Redefining enterprise data with agents and AI-native foundations | 3 min | Yasmeen Ahmad | Data Analytics | Google Cloud Blog
Google Cloud adds data agents to BigQuery and Vertex AI, plus toolkits for building your own. Includes agents for engineering, data science, and conversational analytics, all tied into a unified AI-native foundation.

TUTORIAL

Build a scalable and up-to-date generative AI chatbot with Amazon Bedrock and Confluent Cloud for business loan specialists | 7 min | Data Streaming | Pascal Vantrepote, Mario Bourgoin, Shruti Arora | Confluent Blog
Production blueprint for a chatbot streaming sessions via Kafka while Amazon Bedrock handles generation, document prep, and real-time masking.

NEWS

BaseModel.ai runs natively in Snowflake to turn behavioral logs into models without data egress, delivering week-one deployments and benchmark wins over top models.

PODCAST

How Agentic AI is Transforming Wall Street | 40 min | AI | Ben Lorica, Josh Pantony | The Data Exchange Podcast
Boosted AI’s Alfa creates persistent AI workers for finance, using multi-LLM architectures and blended datasets for proactive analysis.

DATA TUBE

All you need to know about Databricks One | 10 min | Data Platform | NextGenLakehouse
Databricks One gives business users a single interface for data and AI insights without coding.

EVENTS, CONFS, AND MEETUPS

Intro to lakehouse concepts, problems solved, and design principles like modularity and open standards.
Tour of open table formats, partitioning, tiering, and compliance strategies, plus a live demo.

PINNACLE PICKS

Your top picks from last week:
Build a Streaming Lakehouse with Flink, Kafka, Iceberg, and Polaris | 8 min | Data Engineering | Gilles Philippart | Personal Blog
A hands-on guide to setting up a streaming data lakehouse with schema evolution and end-to-end reliability using open-source tools.
Spark pipelines can now read Snowflake data without data movement. This new integration simplifies hybrid workflows and keeps full access control in place.
Introducing LangExtract: A Gemini powered information extraction library | 4 min | NLP | Akshay Goel, Atilla Kiraly | Google for Developers Blog
A lightweight Python library for information extraction with built-in schema validation and few-shot support. Built for fast, type-safe NLP pipelines.
________________________
Have any interesting content to share in the DATA Pill newsletter?
➡ Join us on GitHub
2025-08-13 11:19