About Project

Favorita — EDA, Forecasting & Inventory Optimization

Live EDA to understand demand (promotions, holidays, cities, perishables) → forecasting by SKU×store with multi-window backtesting → ordering policies that balance service level and waste.

1) The Challenge

In perishables, deciding how much and when to order simultaneously impacts fill rate, stockouts, and waste. We need granular visibility to design features and rules that support daily decisions by category and store.

2) Approach

  1. Interactive EDA: filters by family/city/promo/perishable; sales vs. oil/transactions; DOW analysis.
  2. Feature & split: lags/moving averages, promo/holiday flags, seasonality; multi-window backtesting.
  3. Forecasting: LightGBM with calendar, lag, and rolling features per SKU×store; evaluated with WAPE/MAPE, with prediction intervals that feed the inventory step.
  4. Perishable inventories: ordering policy with shelf life, lead time, and service target.
  5. Executive KPIs: forecast accuracy, service, stockouts, and waste (Looker).
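
Steps 2 and 3 above can be sketched as follows. This is a minimal, standard-library illustration rather than the project's pipeline: the function names, the lag set (1 and 7 days), the 7-day rolling window, and the expanding-window split scheme are all assumptions chosen for the example.

```python
from statistics import mean


def lag_and_rolling_features(sales, lags=(1, 7), window=7):
    """Build one feature row per day from a single SKU x store daily series.

    Only past values are used, so the row for day t never sees sales[t].
    Days without enough history for every lag and the window are skipped.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(sales)):
        row = {"t": t, "target": sales[t]}
        for k in lags:
            row[f"lag_{k}"] = sales[t - k]
        row[f"roll_mean_{window}"] = mean(sales[t - window:t])
        rows.append(row)
    return rows


def backtest_windows(n_days, n_windows=3, horizon=7):
    """Return (train_end, test_end) pairs, oldest window first.

    Each window trains on days [0, train_end) and tests on
    [train_end, test_end); the last window ends at n_days.
    """
    windows = []
    for i in range(n_windows, 0, -1):
        test_end = n_days - (i - 1) * horizon
        train_end = test_end - horizon
        if train_end <= 0:
            raise ValueError("not enough history for the requested windows")
        windows.append((train_end, test_end))
    return windows
```

Scoring each window separately and then averaging shows whether accuracy is stable over time, which a single train/test split cannot reveal.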

  • Forecast accuracy: WAPE 14.7% (target < 20% ✓)
  • Service level: 96.3% (target ≥ 95% ✓)
  • Stockout reduction: −42% vs. rule-of-thumb
  • Waste reduction: −18% (shelf-life aware LP)
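
For reference, the two accuracy metrics named in the approach can be computed as below. The function names are illustrative. WAPE divides total absolute error by total actual volume, so it stays defined when individual days have zero sales; MAPE is undefined on those days, and this sketch simply skips them, which is one common convention and an assumption here.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


def mape(actual, forecast):
    """Mean absolute percentage error over days with non-zero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)
```

Because slow-moving SKU×store series often contain zero-sales days, WAPE is usually the more stable of the two at this granularity.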

3) Findings

Seasonality & DOW

Day-of-week explains 38% of variance

Strong weekly patterns enable optimized restocking cadence, reducing emergency orders by aligning supply with predictable demand cycles.
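
One way to quantify a statement like "day-of-week explains X% of variance" is the between-group share of the total sum of squares (eta squared from a one-way ANOVA). The source does not say how its figure was derived, so the sketch below is an assumed definition, and the function name is illustrative.

```python
from collections import defaultdict
from statistics import mean


def variance_explained_by_group(values, groups):
    """Share of total variance explained by group means (eta squared)."""
    grand = mean(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant series has nothing to explain
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    between_ss = sum(
        len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values()
    )
    return between_ss / total_ss
```

Here `values` would be daily unit sales and `groups` the matching weekday index (0 to 6).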

Promos/Holidays

Promotions spike demand 2.1× baseline

Promo flags and cooldown windows in the model reduce post-event over-ordering bias, preventing the waste that follows demand spikes.
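
A promo flag plus a cooldown window could be encoded roughly as follows. The 3-day cooldown length, the column names, and the function name are assumptions for illustration; the source does not specify them.

```python
def promo_features(on_promo, cooldown_days=3):
    """Derive promo, cooldown, and days-since-promo features per day.

    `on_promo` is a daily sequence of truthy/falsy promotion indicators.
    A day is "in cooldown" when it is not itself a promo day and the
    most recent promo day was at most `cooldown_days` days ago.
    """
    rows = []
    days_since = None  # stays None until the first promo day is seen
    for flag in on_promo:
        if flag:
            days_since = 0
        elif days_since is not None:
            days_since += 1
        in_cooldown = (
            not flag
            and days_since is not None
            and days_since <= cooldown_days
        )
        rows.append({
            "promo": int(bool(flag)),
            "cooldown": int(in_cooldown),
            "days_since_promo": days_since,
        })
    return rows
```

The cooldown flag lets the model learn that lagged sales taken from a promo period overstate near-term demand.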

Heterogeneity

City-level policies lift service 4 pts

Differentiated targets by city and perishability class raise fill rate from 92% to 96% without increasing waste, versus a one-size-fits-all approach.
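
As a rough illustration of how differentiated targets change order quantities, the sketch below maps a per-(city, class) service target to a safety factor and an order-up-to level. It uses the textbook normal-demand safety stock formula, not the LP described in this project, and the target table, its values, and all names are hypothetical. The factor z corresponds to a cycle service level (probability of no stockout in a cycle), which is related to fill rate but not the same measure.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical targets per (city, perishability class); illustrative values only.
SERVICE_TARGETS = {
    ("Quito", "perishable"): 0.95,
    ("Quito", "non_perishable"): 0.98,
}
DEFAULT_TARGET = 0.95


def order_up_to_level(city, item_class, daily_mean, daily_std,
                      lead_time_days, review_days=1):
    """Order-up-to level under independent, normally distributed daily demand."""
    target = SERVICE_TARGETS.get((city, item_class), DEFAULT_TARGET)
    z = NormalDist().inv_cdf(target)  # safety factor for the target
    protection = lead_time_days + review_days  # days the order must cover
    return daily_mean * protection + z * daily_std * sqrt(protection)
```

Raising the target from 95% to 98% raises z from about 1.64 to about 2.05, which is why uniform high targets tend to inflate waste on short-shelf-life items.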

Inventories

LP cuts waste 18% vs. heuristic

Including shelf life (7–14 days) and lead time (3–5 days) in the linear program reduces both stockouts and spoilage versus rule-of-thumb ordering.
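
A shelf-life-aware LP of this kind could be written as below. This is one plausible deterministic formulation, not necessarily the one used in the project, and every symbol is an assumption: d_t is forecast demand on day t, q_t the order placed on day t, L the lead time, m the shelf life in days, x_{t,a} the units available at the start of day t with a days of shelf life remaining, s_{t,a} the units sold from that age class, u_t unmet demand, w_t expired units, c, p, h the unit costs of ordering, shortage, and waste, and β the target fill rate. Orders q_t with t ≤ 0 and the opening stock x_{1,a} for a < m are given data.

```latex
\[
\begin{aligned}
\min_{q,\,s,\,x,\,u,\,w}\quad
  & \sum_{t=1}^{T} \left( c\,q_t + p\,u_t + h\,w_t \right) \\
\text{s.t.}\quad
  & x_{t,m} = q_{t-L}
    && t = 1,\dots,T \\
  & x_{t+1,\,a-1} = x_{t,a} - s_{t,a}
    && a = 2,\dots,m,\quad t = 1,\dots,T-1 \\
  & w_t = x_{t,1} - s_{t,1}
    && t = 1,\dots,T \\
  & \sum_{a=1}^{m} s_{t,a} + u_t = d_t
    && t = 1,\dots,T \\
  & s_{t,a} \le x_{t,a}
    && a = 1,\dots,m,\quad t = 1,\dots,T \\
  & \sum_{t=1}^{T} u_t \le (1-\beta) \sum_{t=1}^{T} d_t \\
  & q_t,\; s_{t,a},\; x_{t,a},\; u_t,\; w_t \ge 0
\end{aligned}
\]
```

The first constraint delivers each order L days later with full shelf life, the second ages unsold stock by one day, and the third counts units that reach their last day unsold as waste. The LP is free to choose which age class serves demand, which is optimistic if shoppers tend to pick the freshest units.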

Figure: Forecast Approach Comparison (RMSE, WAPE & MASE across ordering policies)

4) Next Steps

  1. Wave rollout: critical categories → rest; store-level guardrail KPIs.
  2. A/B/holdout by category: measure uplift in service and waste; weekly report.
  3. Orchestration: daily job for forecast + order recommendation; drift monitoring.
  4. Governance: feature catalog, model versioning, experiment logbook.
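
For the drift monitoring in step 3, one commonly used check is the population stability index (PSI) between a reference sample and a recent sample of a feature or of forecast errors. The source does not name a drift metric, so PSI is a stand-in here, and the bin count, the epsilon floor, and the function name are assumptions.

```python
from math import log


def population_stability_index(reference, recent, n_bins=10, eps=1e-6):
    """PSI between two samples, using equal-width bins from the reference range."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / n_bins

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # floor empty bins at eps so the logarithm stays finite
        return [max(c / len(values), eps) for c in counts]

    ref, new = shares(reference), shares(recent)
    return sum((n - r) * log(n / r) for r, n in zip(ref, new))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but any threshold that triggers retraining should be calibrated on the project's own history.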

Analytical Modules

Exploratory

Interactive EDA

Filters by family/city/promo/perishable; sales vs. oil/transactions; DOW analysis.

  • Hypotheses & drivers
  • Outliers & anomalies
  • Feature inputs
BI

Executive KPIs

Forecast accuracy, service, stockouts, and waste by category/store.

  • Store filters
  • Series & comparisons
  • Export & share
Paper

Technical Paper

Perishable Inventory Optimization — full consultancy-style write-up.

  • Deterministic & scenario LP
  • Shelf life + lead time
  • Service vs. waste trade-off

Tech Stack

Docker (DB runtime)

Containerized database for reproducible local/CI runs and isolated test data.

Linear Optimization

OR-Tools / PuLP / SciPy linprog for LP/MILP ordering policies with shelf-life & lead-time constraints.

Streamlit

Interactive web app to explore forecasts and simulate ordering policies. Enables business users to test service–waste trade-offs and scenario plans without coding.
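
The kind of what-if run such an app exposes could be backed by a simulation like the one below. It is a simplified sketch, not the app's code: it assumes a daily order-up-to policy, oldest-first (FIFO) issuing, shelf life counted from arrival at the store, a lead time of at least one day, and lost sales when stock runs out. All names are illustrative.

```python
from collections import deque


def simulate_policy(demand, order_up_to, shelf_life=7, lead_time=3):
    """Simulate an order-up-to policy; return (fill_rate, wasted_units)."""
    if shelf_life < 1 or lead_time < 1:
        raise ValueError("shelf_life and lead_time must be at least 1")
    # on_hand[i] holds units with i + 1 days of shelf life remaining
    on_hand = [0.0] * shelf_life
    pipeline = deque([0.0] * lead_time)  # orders in transit, oldest first
    sold = wasted = 0.0
    for d in demand:
        # 1. receive the order placed lead_time days ago, at full shelf life
        on_hand[-1] += pipeline.popleft()
        # 2. order up to the target based on on-hand plus in-transit stock
        position = sum(on_hand) + sum(pipeline)
        pipeline.append(max(order_up_to - position, 0.0))
        # 3. serve demand from the oldest stock first
        remaining = d
        for i in range(shelf_life):
            take = min(on_hand[i], remaining)
            on_hand[i] -= take
            remaining -= take
        sold += d - remaining
        # 4. expire units on their last day, then age the rest by one day
        wasted += on_hand[0]
        on_hand = on_hand[1:] + [0.0]
    total = sum(demand)
    fill_rate = sold / total if total else 1.0
    return fill_rate, wasted
```

Sweeping `order_up_to` over a range and plotting fill rate against wasted units gives the service versus waste curve that business users can explore.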

Looker

Executive dashboards for forecast accuracy, service, stockouts, and waste.

LaTeX

Technical paper (formulations, duals, KKT) and publication-ready figures.

Scope & Limitations

Scope
  • Operational EDA → Forecast → Order pipeline.
  • Auditable metrics (WAPE/MAPE, service, stockouts, waste).
  • Targets differentiated by city/category.
  • Dashboards in Looker for executive tracking.
Limitations
  • Results are sensitive to promos/holidays and store mix.
  • Demand drifts over time, so models require periodic retraining.
  • Policy adoption depends on integration with store operations.
  • Results depend on the quality of the transactional data.