About · 25-second overview

25-second overview · no sound

Audio companion — full walkthrough

For when you'd rather listen than read — what this app does, how it's built, and how a store uses it every morning.

0:006:59

voice enhanced by ai

A grocery data product
built for DAIS 2026

Grocery Data Intelligence is a Databricks Free Edition project, submitted to the DAIS 2026 Community Virtual Contest. It turns raw store data into a daily plan — what to reorder, what to discount, what to promote — for every store, every morning.

DAIS 2026 Community Virtual Contest entry

Built by Brahmareddy · Databricks Community Esteemed Contributor·k.b.r••••12@gmail.com

A heartfelt thank you

Thank you, Databricks leaders, for your time and feedback

When I shared this project on r/databricks, I never expected senior Databricks leaders to read it carefully and write back. Denny Lee, Youssef Mrini, and Sunmin Lee — thank you for the generosity of your time. Your feedback genuinely changed the product, and the measurable improvements below are because of you.

4
Improvements shipped
<1s
Page loads, cached
Reddit upvotes (live)
Reddit comments (live)
I really appreciate you making this app. I completely agree with you that having data and AI work together to build something actionable is exactly where things should go.
DL

Denny Lee

PM Director, Developer Relations at Databricks

Co-author of "Learning Spark" and "Delta Lake: The Definitive Guide" (O'Reilly) · Apache Spark Contributor · Delta Lake Maintainer

The idea is brilliant and the execution is great. I really like how you used everything that was available to you starting with Databricks Free Edition to Genie Code. Good Job again.
YM

Youssef Mrini

Solutions Architect at Databricks

Data+AI Summit 2026 Speaker · Reddit Top 1% Poster

super cool, very impressive
SL

Sunmin Lee

Databricks

Senior Solutions Engineer at Databricks

Databricks

What your feedback made better

Every item below was shipped (or is actively in progress) as a direct result of the suggestions above — with the measurable outcome it produced.

  1. Action flow redesigned for transparency

    1 → 3 clear actions per item

    Thanks to Denny Lee

    The single Take Action button was replaced with "Create Reorder Plan" and "Review Recommended Actions," and every item now shows quantity, priority, and a confirmation step before anything is committed.

  2. Full architecture written up for the community

    Bronze → Silver → Gold, 8 gold objects documented

    Thanks to Denny Lee

    Posted the end-to-end medallion design to Databricks Community so peers can review the modeling choices, the risk logic, and the Gold views that power every screen.

  3. Semantic layer + Lakebase-ready actions

    2 read paths · Genie-style live for some intents

    Thanks to Youssef Mrini

    Gold Views via SQL Warehouse now power the dashboard, with a safety-net fallback path. Ask Data + AI runs as an AI/BI Genie-style experience over those approved views — live for some intents today, powered by a Databricks-hosted Foundation Model, with UC Metric Views and Lakebase for action state on the roadmap.

  4. Sub-second page loads, end to end

    <1s cached · ~1.5s cold

    Performance outcome

    Server-side stale-while-revalidate caching, in-flight request dedup, and a token-gated cache-warm endpoint keep every page snappy on both cold and warm loads.

See the full discussion on r/databricks

Original post, expert replies, and follow-up questions

Live from Reddit

The story — why this exists

Most grocery analytics tools stop at the dashboard. The store manager still has to translate charts into a morning routine: what to pull, what to mark down, what to push. That gap is where shrink and lost sales actually happen.

Grocery Data Intelligence closes that gap. It models the same lakehouse data every chain already has — POS, inventory, expiry, sales velocity — through a clean Bronze → Silver → Gold pipeline, and surfaces the Gold layer as a daily inventory story and a per-product action list.

It was built as a single-builder experiment on Databricks Free Edition for the DAIS 2026 Community Virtual Contest — to prove that an entire decision-grade data product can ship on the Lakehouse alone, without a separate analytics stack.

Medallion architecture on Databricks Free Edition

Every screen in this app is powered by a named Delta object in a single schema: workspace.grocery_data_intelligence.*. The data flows Bronze → Silver → Gold; the app only ever reads Gold.

  ┌──────────────────────────────────────────────────────────┐
  │  BRONZE — raw, append-only                               │
  │  bronze_sales_raw                  5,150 rows  +150/day  │
  └──────────────────────────────────────────────────────────┘
                              │
                              ▼
  ┌──────────────────────────────────────────────────────────┐
  │  SILVER — cleaned + validated masters                    │
  │  silver_stores      10    silver_products      15        │
  │  silver_sales    5,150    silver_inventory    150        │
  └──────────────────────────────────────────────────────────┘
                              │
                              ▼
  ┌──────────────────────────────────────────────────────────┐
  │  GOLD — business metrics the app reads                   │
  │  gold_inventory_risk_daily        150 rows / day         │
  │  + 7 business-logic views (KPI, risk, story, actions)    │
  └──────────────────────────────────────────────────────────┘

gold_inventory_risk_daily

Per-store, per-product daily risk + recommended action (150 rows/day)

vw_dashboard_kpi_summary

Chain-wide KPI tiles on the Dashboard

vw_top_stockout_risk_products

Lowest days of supply, ranked

vw_top_waste_risk_products

Closest to expiry, ranked

vw_store_risk_summary

Per-store attention score

vw_category_risk_summary

Stockout vs waste by category

vw_today_inventory_story

Daily narrative for the Story page

gold_product_action_recommendations

Per-product recommended action (drives the Actions page)

Delta Lake gold views

Business logic lives in SQL, not in the app.

SQL Statement API

Server functions call Databricks directly — no extra warehouse.

SWR + cache-warm

Stale-while-revalidate cache keeps pages sub-second.

Risk model + recommended actions

Every recommendation in the app comes from these deterministic rules in the Gold view — auditable, explainable, and identical across the Dashboard, Actions, and Story pages.

Stockout risk

  • High · < 2 days of stock remaining
  • Medium · 2 – 5 days remaining
  • Low · > 5 days remaining

Waste risk (perishables)

  • High · ≤ 1 day until expiry
  • Medium · 2 – 3 days until expiry
  • Low · > 3 days until expiry
  • N/A · non-perishable items

Recommended actions

Reorder Now — stockout imminent

Discount or Donate — expiry within 1 day

Promote Soon — expiry within 2 – 3 days

Monitor Closely — medium stockout risk

No Action Needed — optimal inventory

The daily 6 AM Workflow

One scheduled Databricks Workflow refreshes the entire pipeline every morning before stores open.

Schedule

Daily at 06:00 America/Chicago
cron: 0 0 6 * * ? *

Compute

Serverless Starter Warehouse · PRO 2X-Small · Photon enabled · 10-min auto-stop

Execution flow

Bronze append (+150 sales) Silver clean & validate Gold risk + actions Pipeline summary

Data growth — 90-day projection

The dataset is live and growing. Each morning the Workflow appends another day of sales and recomputes risk for every store-product pair.

Day      Bronze     Silver Sales    Gold Risk
─────    ──────     ────────────    ──────────
Day 1     5,150          5,150            150
Day 30    9,650          9,650          4,500
Day 60   14,150         14,150          9,000
Day 90   18,650         18,650         13,500

The Databricks stack we use

Today the app is a Databricks Lakehouse data product: raw grocery data flows Bronze → Silver → Gold, gets surfaced through SQL Warehouse + Gold Views and a semantic layer, and is answered through an AI/BI Genie-style Ask Data + AI experience. Lakebase, UC Metric Views, Materialized tables, and Databricks Apps are the next steps as the project matures.

Current

  • Databricks Free Edition

    Build and demo without enterprise setup cost.

  • Lakehouse Architecture

    Foundation that moves grocery data from raw to clean to business-ready.

  • Bronze Layer

    Raw stores, products, sales, and inventory data.

  • Silver Layer

    Cleans, standardizes, and joins raw data so it's reliable for analysis.

  • Gold Layer

    Business metrics: days of supply, stockout risk, waste risk, priority, recommended action.

  • Databricks SQL / SQL Warehouse

    Powers the app, dashboard, Ask Data + AI, and approved views.

  • Gold Views

    Surfaces trusted data to the app instead of raw tables.

  • Unity Catalog

    Catalog / schema / table / view governance for trusted data assets.

  • Semantic Layer (concept)

    Consistent business definitions across dashboard, Ask Data, and actions.

  • AI/BI Genie-style experience

    Plain-English questions answered from trusted data — live for some intents today, powered by a Databricks-hosted Foundation Model.

Next steps

  • UC Metric Views

    Native Databricks semantic layer for reusable metrics like days of supply and reorder priority.

  • Materialized / Serving Tables

    Faster dashboard and AI queries as data grows.

  • Lakebase

    Store action state — reorder plans, task status, user decisions, feedback history.

  • Databricks Apps

    Host the app closer to Databricks data and security.

Build this yourself on Databricks Free Edition

If you are a data engineer exploring the Lakehouse, you can stand up the same project in an afternoon. Everything below runs on Databricks Free Edition — no extra services required.

  1. 1Sign up for Databricks Free Edition and start a Serverless SQL Warehouse.
  2. 2Create a Unity Catalog schema: CREATE SCHEMA workspace.grocery_data_intelligence.
  3. 3Land raw POS into a Bronze Delta table (bronze_sales_raw) — append-only, one row per transaction.
  4. 4Clean into Silver: silver_stores, silver_products, silver_sales, silver_inventory with ETL timestamps.
  5. 5Compute the Gold risk + recommended action in one SQL statement (sample below). Wrap derived metrics in views so the app stays thin.
  6. 6Schedule a daily Workflow at 06:00, and point any front-end at the result via the Databricks SQL Statement API.
-- Gold: daily inventory risk + recommended action
CREATE OR REPLACE TABLE workspace.grocery_data_intelligence.gold_inventory_risk_daily AS
SELECT
  current_date()                              AS report_date,
  s.store_id, s.store_name, s.city, s.state,
  p.product_id, p.product_name, p.category,
  i.current_stock,
  COALESCE(d.avg_daily_demand, 0)             AS avg_daily_demand,
  i.days_until_expiry,
  ROUND(i.current_stock / NULLIF(d.avg_daily_demand, 0), 2) AS days_of_stock_remaining,
  CASE
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'High'
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Medium'
    ELSE 'Low'
  END                                         AS stockout_risk_level,
  CASE
    WHEN i.days_until_expiry IS NULL          THEN 'N/A'
    WHEN i.days_until_expiry <= 1             THEN 'High'
    WHEN i.days_until_expiry <= 3             THEN 'Medium'
    ELSE 'Low'
  END                                         AS waste_risk_level,
  CASE
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'Reorder Now'
    WHEN i.days_until_expiry <= 1                            THEN 'Discount or Donate'
    WHEN i.days_until_expiry <= 3                            THEN 'Promote Soon'
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Monitor Closely'
    ELSE 'No Action Needed'
  END                                         AS recommended_action,
  current_timestamp()                         AS gold_created_at
FROM workspace.grocery_data_intelligence.silver_inventory i
JOIN workspace.grocery_data_intelligence.silver_stores   s USING (store_id)
JOIN workspace.grocery_data_intelligence.silver_products p USING (product_id)
LEFT JOIN (
  SELECT store_id, product_id, AVG(quantity_sold) AS avg_daily_demand
  FROM workspace.grocery_data_intelligence.silver_sales
  WHERE sale_date >= current_date() - INTERVAL 14 DAYS
  GROUP BY store_id, product_id
) d USING (store_id, product_id);

Everything you see in this app comes from those six steps — no extra services, no second warehouse, no separate BI stack.

Project journey

Phase 1

Bronze — raw POS

Landed 5,150 raw sales transactions (Jan 24 – May 17, 2026) into bronze_sales_raw on Delta. Grows by +150 rows every day.

Phase 2

Silver — cleaned masters

Cleaned and validated into silver_stores (10), silver_products (15), silver_sales (5,150), and silver_inventory (150) with ETL timestamps.

Phase 3

Gold — risk + actions

Built gold_inventory_risk_daily plus seven business-logic views — KPI summary, stockout risk, waste risk, store/category risk, product actions, and a daily inventory story.

Phase 4

Ask Data + AI (Genie-style)

An AI/BI Genie-style experience answers plain-English questions from approved Gold views via SQL Warehouse — live for some intents today and powered by a Databricks-hosted Foundation Model. No raw SQL crosses the boundary.

Phase 5

Caching & performance

Server-side stale-while-revalidate cache with in-flight request dedup, plus a token-gated cache-warm endpoint for sub-second page loads.

Submission

DAIS 2026 entry

Recorded a short demo, wrote a Community Articles post, and submitted the form between May 15–31, 2026.

See today's inventory story

Open the dashboard, or read the Methodology page to see every Gold view, field, and decision rule behind this submission.