About · 25-second overview

25-second overview · no sound

Audio companion — full walkthrough

For when you'd rather listen than read — what this app does, how it's built, and how a store uses it every morning.

0:006:59

voice enhanced by ai

A grocery data product
built for DAIS 2026

Grocery Data Intelligence is a Databricks Free Edition project, submitted to the DAIS 2026 Community Virtual Contest. It turns raw store data into a daily plan — what to reorder, what to discount, what to promote — for every store, every morning.

DAIS 2026 Community Virtual Contest entry

Built by Brahmareddy · Databricks Community Esteemed Contributor·k.b.r••••12@gmail.com

The story — why this exists

Most grocery analytics tools stop at the dashboard. The store manager still has to translate charts into a morning routine: what to pull, what to mark down, what to push. That gap is where shrink and lost sales actually happen.

Grocery Data Intelligence closes that gap. It models the same lakehouse data every chain already has — POS, inventory, expiry, sales velocity — through a clean Bronze → Silver → Gold pipeline, and surfaces the Gold layer as a daily inventory story and a per-product action list.

It was built as a single-builder experiment on Databricks Free Edition for the DAIS 2026 Community Virtual Contest — to prove that an entire decision-grade data product can ship on the Lakehouse alone, without a separate analytics stack.

Medallion architecture on Databricks Free Edition

Every screen in this app is powered by a named Delta object in a single schema: workspace.grocery_data_intelligence.*. The data flows Bronze → Silver → Gold; the app only ever reads Gold.

  ┌──────────────────────────────────────────────────────────┐
  │  BRONZE — raw, append-only                               │
  │  bronze_sales_raw                  5,150 rows  +150/day  │
  └──────────────────────────────────────────────────────────┘
                              │
                              ▼
  ┌──────────────────────────────────────────────────────────┐
  │  SILVER — cleaned + validated masters                    │
  │  silver_stores      10    silver_products      15        │
  │  silver_sales    5,150    silver_inventory    150        │
  └──────────────────────────────────────────────────────────┘
                              │
                              ▼
  ┌──────────────────────────────────────────────────────────┐
  │  GOLD — business metrics the app reads                   │
  │  gold_inventory_risk_daily        150 rows / day         │
  │  + 7 business-logic views (KPI, risk, story, actions)    │
  └──────────────────────────────────────────────────────────┘

gold_inventory_risk_daily

Per-store, per-product daily risk + recommended action (150 rows/day)

vw_dashboard_kpi_summary

Chain-wide KPI tiles on the Dashboard

vw_top_stockout_risk_products

Lowest days of supply, ranked

vw_top_waste_risk_products

Closest to expiry, ranked

vw_store_risk_summary

Per-store attention score

vw_category_risk_summary

Stockout vs waste by category

vw_today_inventory_story

Daily narrative for the Story page

gold_product_action_recommendations

Per-product recommended action (drives the Actions page)

Delta Lake gold views

Business logic lives in SQL, not in the app.

SQL Statement API

Server functions call Databricks directly — no extra warehouse.

SWR + cache-warm

Stale-while-revalidate cache keeps pages sub-second.

Risk model + recommended actions

Every recommendation in the app comes from these deterministic rules in the Gold view — auditable, explainable, and identical across the Dashboard, Actions, and Story pages.

Stockout risk

  • High · < 2 days of stock remaining
  • Medium · 2 – 5 days remaining
  • Low · > 5 days remaining

Waste risk (perishables)

  • High · ≤ 1 day until expiry
  • Medium · 2 – 3 days until expiry
  • Low · > 3 days until expiry
  • N/A · non-perishable items

Recommended actions

Reorder Now — stockout imminent

Discount or Donate — expiry within 1 day

Promote Soon — expiry within 2 – 3 days

Monitor Closely — medium stockout risk

No Action Needed — optimal inventory

The daily 6 AM Workflow

One scheduled Databricks Workflow refreshes the entire pipeline every morning before stores open.

Schedule

Daily at 06:00 America/Chicago
cron: 0 0 6 * * ? *

Compute

Serverless Starter Warehouse · PRO 2X-Small · Photon enabled · 10-min auto-stop

Execution flow

Bronze append (+150 sales) Silver clean & validate Gold risk + actions Pipeline summary

Data growth — 90-day projection

The dataset is live and growing. Each morning the Workflow appends another day of sales and recomputes risk for every store-product pair.

Day      Bronze     Silver Sales    Gold Risk
─────    ──────     ────────────    ──────────
Day 1     5,150          5,150            150
Day 30    9,650          9,650          4,500
Day 60   14,150         14,150          9,000
Day 90   18,650         18,650         13,500

Build this yourself on Databricks Free Edition

If you are a data engineer exploring the Lakehouse, you can stand up the same project in an afternoon. Everything below runs on Databricks Free Edition — no extra services required.

  1. 1Sign up for Databricks Free Edition and start a Serverless SQL Warehouse.
  2. 2Create a Unity Catalog schema: CREATE SCHEMA workspace.grocery_data_intelligence.
  3. 3Land raw POS into a Bronze Delta table (bronze_sales_raw) — append-only, one row per transaction.
  4. 4Clean into Silver: silver_stores, silver_products, silver_sales, silver_inventory with ETL timestamps.
  5. 5Compute the Gold risk + recommended action in one SQL statement (sample below). Wrap derived metrics in views so the app stays thin.
  6. 6Schedule a daily Workflow at 06:00, and point any front-end at the result via the Databricks SQL Statement API.
-- Gold: daily inventory risk + recommended action
CREATE OR REPLACE TABLE workspace.grocery_data_intelligence.gold_inventory_risk_daily AS
SELECT
  current_date()                              AS report_date,
  s.store_id, s.store_name, s.city, s.state,
  p.product_id, p.product_name, p.category,
  i.current_stock,
  COALESCE(d.avg_daily_demand, 0)             AS avg_daily_demand,
  i.days_until_expiry,
  ROUND(i.current_stock / NULLIF(d.avg_daily_demand, 0), 2) AS days_of_stock_remaining,
  CASE
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'High'
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Medium'
    ELSE 'Low'
  END                                         AS stockout_risk_level,
  CASE
    WHEN i.days_until_expiry IS NULL          THEN 'N/A'
    WHEN i.days_until_expiry <= 1             THEN 'High'
    WHEN i.days_until_expiry <= 3             THEN 'Medium'
    ELSE 'Low'
  END                                         AS waste_risk_level,
  CASE
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) < 2 THEN 'Reorder Now'
    WHEN i.days_until_expiry <= 1                            THEN 'Discount or Donate'
    WHEN i.days_until_expiry <= 3                            THEN 'Promote Soon'
    WHEN i.current_stock / NULLIF(d.avg_daily_demand, 0) <= 5 THEN 'Monitor Closely'
    ELSE 'No Action Needed'
  END                                         AS recommended_action,
  current_timestamp()                         AS gold_created_at
FROM workspace.grocery_data_intelligence.silver_inventory i
JOIN workspace.grocery_data_intelligence.silver_stores   s USING (store_id)
JOIN workspace.grocery_data_intelligence.silver_products p USING (product_id)
LEFT JOIN (
  SELECT store_id, product_id, AVG(quantity_sold) AS avg_daily_demand
  FROM workspace.grocery_data_intelligence.silver_sales
  WHERE sale_date >= current_date() - INTERVAL 14 DAYS
  GROUP BY store_id, product_id
) d USING (store_id, product_id);

Everything you see in this app comes from those six steps — no extra services, no second warehouse, no separate BI stack.

Project journey

Phase 1

Bronze — raw POS

Landed 5,150 raw sales transactions (Jan 24 – May 17, 2026) into bronze_sales_raw on Delta. Grows by +150 rows every day.

Phase 2

Silver — cleaned masters

Cleaned and validated into silver_stores (10), silver_products (15), silver_sales (5,150), and silver_inventory (150) with ETL timestamps.

Phase 3

Gold — risk + actions

Built gold_inventory_risk_daily plus seven business-logic views — KPI summary, stockout risk, waste risk, store/category risk, product actions, and a daily inventory story.

Phase 4

Intent-mapped Ask Data

Natural-language questions are classified into a small set of safe business intents that resolve to Gold-view queries — no raw SQL ever crosses the boundary.

Phase 5

Caching & performance

Server-side stale-while-revalidate cache with in-flight request dedup, plus a token-gated cache-warm endpoint for sub-second page loads.

Submission

DAIS 2026 entry

Recorded a short demo, wrote a Community Articles post, and submitted the form between May 15–31, 2026.

See today's inventory story

Open the dashboard, or read the Methodology page to see every Gold view, field, and decision rule behind this submission.