About · 25-second overview
Video transcript
- At 0:00: Every morning, grocery stores face the same question: what do we reorder, what do we discount, what do we promote?
- At 0:04: Stockouts lose sales. Overstock turns into waste. Both quietly eat margin.
- At 0:09: Grocery Data Intelligence ingests raw point-of-sale and inventory data into a Bronze → Silver → Gold lakehouse on Databricks Free Edition.
- At 0:14: A Gold risk view scores every product in every store and ranks the actions that matter today.
- At 0:19: Store managers open one dashboard, see the day's plan, and act in minutes — not hours.
- At 0:23: Smarter inventory. Less waste. Happier shelves.
Audio companion — full walkthrough
For when you'd rather listen than read — what this app does, how it's built, and how a store uses it every morning.
0:00/6:59
voice enhanced by ai
A grocery data product
built for DAIS 2026
Grocery Data Intelligence is a Databricks Free Edition project, submitted to the DAIS 2026 Community Virtual Contest. It turns raw store data into a daily plan — what to reorder, what to discount, what to promote — for every store, every morning.
DAIS 2026 Community Virtual Contest entryBuilt by Brahmareddy · Databricks Community Esteemed Contributor·k.b.r••••12@gmail.com
Thank you, Databricks leaders, for your time and feedback
When I shared this project on r/databricks, I never expected senior Databricks leaders to read it carefully and write back. Denny Lee, Youssef Mrini, and Sunmin Lee — thank you for the generosity of your time. Your feedback genuinely changed the product, and the measurable improvements below are because of you.
- 4
- Improvements shipped
- <1s
- Page loads, cached
- —
- Reddit upvotes (live)
- —
- Reddit comments (live)
“I really appreciate you making this app. I completely agree with you that having data and AI work together to build something actionable is exactly where things should go.”
Denny Lee
PM Director, Developer Relations at Databricks
Co-author of "Learning Spark" and "Delta Lake: The Definitive Guide" (O'Reilly) · Apache Spark Contributor · Delta Lake Maintainer
“The idea is brilliant and the execution is great. I really like how you used everything that was available to you starting with Databricks Free Edition to Genie Code. Good Job again.”
Youssef Mrini
Solutions Architect at Databricks
Data+AI Summit 2026 Speaker · Reddit Top 1% Poster
“super cool, very impressive”
Sunmin Lee
DatabricksSenior Solutions Engineer at Databricks
Databricks
What your feedback made better
Every item below was shipped (or is actively in progress) as a direct result of the suggestions above — with the measurable outcome it produced.
Action flow redesigned for transparency
1 → 3 clear actions per itemThanks to Denny Lee
The single Take Action button was replaced with "Create Reorder Plan" and "Review Recommended Actions," and every item now shows quantity, priority, and a confirmation step before anything is committed.
Full architecture written up for the community
Bronze → Silver → Gold, 8 gold objects documentedThanks to Denny Lee
Posted the end-to-end medallion design to Databricks Community so peers can review the modeling choices, the risk logic, and the Gold views that power every screen.
Semantic layer + Lakebase-ready actions
2 read paths · Genie-style live for some intentsThanks to Youssef Mrini
Gold Views via SQL Warehouse now power the dashboard, with a safety-net fallback path. Ask Data + AI runs as an AI/BI Genie-style experience over those approved views — live for some intents today, powered by a Databricks-hosted Foundation Model, with UC Metric Views and Lakebase for action state on the roadmap.
Sub-second page loads, end to end
<1s cached · ~1.5s coldPerformance outcome
Server-side stale-while-revalidate caching, in-flight request dedup, and a token-gated cache-warm endpoint keep every page snappy on both cold and warm loads.
See the full discussion on r/databricks
Original post, expert replies, and follow-up questions
The story — why this exists
Most grocery analytics tools stop at the dashboard. The store manager still has to translate charts into a morning routine: what to pull, what to mark down, what to push. That gap is where shrink and lost sales actually happen.
Grocery Data Intelligence closes that gap. It models the same lakehouse data every chain already has — POS, inventory, expiry, sales velocity — through a clean Bronze → Silver → Gold pipeline, and surfaces the Gold layer as a daily inventory story and a per-product action list.
It was built as a single-builder experiment on Databricks Free Edition for the DAIS 2026 Community Virtual Contest — to prove that an entire decision-grade data product can ship on the Lakehouse alone, without a separate analytics stack.
Medallion architecture on Databricks Free Edition
Every screen in this app is powered by a named Delta object in a single schema: workspace.grocery_data_intelligence.*. The data flows Bronze → Silver → Gold; the app only ever reads Gold.
┌──────────────────────────────────────────────────────────┐
│ BRONZE — raw, append-only │
│ bronze_sales_raw 5,150 rows +150/day │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ SILVER — cleaned + validated masters │
│ silver_stores 10 silver_products 15 │
│ silver_sales 5,150 silver_inventory 150 │
└──────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ GOLD — business metrics the app reads │
│ gold_inventory_risk_daily 150 rows / day │
│ + 7 business-logic views (KPI, risk, story, actions) │
└──────────────────────────────────────────────────────────┘gold_inventory_risk_daily
Per-store, per-product daily risk + recommended action (150 rows/day)
vw_dashboard_kpi_summary
Chain-wide KPI tiles on the Dashboard
vw_top_stockout_risk_products
Lowest days of supply, ranked
vw_top_waste_risk_products
Closest to expiry, ranked
vw_store_risk_summary
Per-store attention score
vw_category_risk_summary
Stockout vs waste by category
vw_today_inventory_story
Daily narrative for the Story page
gold_product_action_recommendations
Per-product recommended action (drives the Actions page)
Delta Lake gold views
Business logic lives in SQL, not in the app.
SQL Statement API
Server functions call Databricks directly — no extra warehouse.
SWR + cache-warm
Stale-while-revalidate cache keeps pages sub-second.
Risk model + recommended actions
Every recommendation in the app comes from these deterministic rules in the Gold view — auditable, explainable, and identical across the Dashboard, Actions, and Story pages.
Stockout risk
- High · < 2 days of stock remaining
- Medium · 2 – 5 days remaining
- Low · > 5 days remaining
Waste risk (perishables)
- High · ≤ 1 day until expiry
- Medium · 2 – 3 days until expiry
- Low · > 3 days until expiry
- N/A · non-perishable items
Recommended actions
Reorder Now — stockout imminent
Discount or Donate — expiry within 1 day
Promote Soon — expiry within 2 – 3 days
Monitor Closely — medium stockout risk
No Action Needed — optimal inventory
The daily 6 AM Workflow
One scheduled Databricks Workflow refreshes the entire pipeline every morning before stores open.
Schedule
Daily at 06:00 America/Chicago
cron: 0 0 6 * * ? *
Compute
Serverless Starter Warehouse · PRO 2X-Small · Photon enabled · 10-min auto-stop
Execution flow
Bronze append (+150 sales) → Silver clean & validate → Gold risk + actions → Pipeline summary
Data growth — 90-day projection
The dataset is live and growing. Each morning the Workflow appends another day of sales and recomputes risk for every store-product pair.
Day Bronze Silver Sales Gold Risk ───── ────── ──────────── ────────── Day 1 5,150 5,150 150 Day 30 9,650 9,650 4,500 Day 60 14,150 14,150 9,000 Day 90 18,650 18,650 13,500
The Databricks stack we use
Today the app is a Databricks Lakehouse data product: raw grocery data flows Bronze → Silver → Gold, gets surfaced through SQL Warehouse + Gold Views and a semantic layer, and is answered through an AI/BI Genie-style Ask Data + AI experience. Lakebase, UC Metric Views, Materialized tables, and Databricks Apps are the next steps as the project matures.
Current
Databricks Free Edition
Build and demo without enterprise setup cost.
Lakehouse Architecture
Foundation that moves grocery data from raw to clean to business-ready.
Bronze Layer
Raw stores, products, sales, and inventory data.
Silver Layer
Cleans, standardizes, and joins raw data so it's reliable for analysis.
Gold Layer
Business metrics: days of supply, stockout risk, waste risk, priority, recommended action.
Databricks SQL / SQL Warehouse
Powers the app, dashboard, Ask Data + AI, and approved views.
Gold Views
Surfaces trusted data to the app instead of raw tables.
Unity Catalog
Catalog / schema / table / view governance for trusted data assets.
Semantic Layer (concept)
Consistent business definitions across dashboard, Ask Data, and actions.
AI/BI Genie-style experience
Plain-English questions answered from trusted data — live for some intents today, powered by a Databricks-hosted Foundation Model.
Next steps
UC Metric Views
Native Databricks semantic layer for reusable metrics like days of supply and reorder priority.
Materialized / Serving Tables
Faster dashboard and AI queries as data grows.
Lakebase
Store action state — reorder plans, task status, user decisions, feedback history.
Databricks Apps
Host the app closer to Databricks data and security.
Build this yourself on Databricks Free Edition
If you are a data engineer exploring the Lakehouse, you can stand up the same project in an afternoon. Everything below runs on Databricks Free Edition — no extra services required.
- 1Sign up for Databricks Free Edition and start a Serverless SQL Warehouse.
- 2Create a Unity Catalog schema:
CREATE SCHEMA workspace.grocery_data_intelligence. - 3Land raw POS into a Bronze Delta table (
bronze_sales_raw) — append-only, one row per transaction. - 4Clean into Silver:
silver_stores,silver_products,silver_sales,silver_inventorywith ETL timestamps. - 5Compute the Gold risk + recommended action in one SQL statement (sample below). Wrap derived metrics in views so the app stays thin.
- 6Schedule a daily Workflow at 06:00, and point any front-end at the result via the Databricks SQL Statement API.
-- Gold: daily inventory risk + recommended action
CREATE OR REPLACE TABLE workspace.grocery_data_intelligence.gold_inventory_risk_daily AS
SELECT
current_date() AS report_date,
s.store_id, s.store_name, s.city, s.state,
p.product_id, p.product_name, p.category,
i.current_stock,
COALESCE(d.avg_daily_demand, 0) AS avg_daily_demand,
i.days_until_expiry,
ROUND(i.current_stock / NULLIF(d.avg_daily_demand, 0), 2) AS days_of_stock_remaining,
CASE
WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'High'
WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Medium'
ELSE 'Low'
END AS stockout_risk_level,
CASE
WHEN i.days_until_expiry IS NULL THEN 'N/A'
WHEN i.days_until_expiry <= 1 THEN 'High'
WHEN i.days_until_expiry <= 3 THEN 'Medium'
ELSE 'Low'
END AS waste_risk_level,
CASE
WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'Reorder Now'
WHEN i.days_until_expiry <= 1 THEN 'Discount or Donate'
WHEN i.days_until_expiry <= 3 THEN 'Promote Soon'
WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Monitor Closely'
ELSE 'No Action Needed'
END AS recommended_action,
current_timestamp() AS gold_created_at
FROM workspace.grocery_data_intelligence.silver_inventory i
JOIN workspace.grocery_data_intelligence.silver_stores s USING (store_id)
JOIN workspace.grocery_data_intelligence.silver_products p USING (product_id)
LEFT JOIN (
SELECT store_id, product_id, AVG(quantity_sold) AS avg_daily_demand
FROM workspace.grocery_data_intelligence.silver_sales
WHERE sale_date >= current_date() - INTERVAL 14 DAYS
GROUP BY store_id, product_id
) d USING (store_id, product_id);Everything you see in this app comes from those six steps — no extra services, no second warehouse, no separate BI stack.
Project journey
Bronze — raw POS
Landed 5,150 raw sales transactions (Jan 24 – May 17, 2026) into bronze_sales_raw on Delta. Grows by +150 rows every day.
Silver — cleaned masters
Cleaned and validated into silver_stores (10), silver_products (15), silver_sales (5,150), and silver_inventory (150) with ETL timestamps.
Gold — risk + actions
Built gold_inventory_risk_daily plus seven business-logic views — KPI summary, stockout risk, waste risk, store/category risk, product actions, and a daily inventory story.
Ask Data + AI (Genie-style)
An AI/BI Genie-style experience answers plain-English questions from approved Gold views via SQL Warehouse — live for some intents today and powered by a Databricks-hosted Foundation Model. No raw SQL crosses the boundary.
Caching & performance
Server-side stale-while-revalidate cache with in-flight request dedup, plus a token-gated cache-warm endpoint for sub-second page loads.
DAIS 2026 entry
Recorded a short demo, wrote a Community Articles post, and submitted the form between May 15–31, 2026.
See today's inventory story
Open the dashboard, or read the Methodology page to see every Gold view, field, and decision rule behind this submission.