Unsupervised — EFA + Clustering
We discover segments and their drivers, and define actions per profile.
- EFA (KMO/Bartlett) → latent factors
- K-Means / PAM / CLARA / hierarchical
- Silhouette, stability, and naming
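The steps listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's pipeline: the feature matrix, the number of factors, and the candidate range for k are all assumptions. scikit-learn's `FactorAnalysis` is an unrotated maximum-likelihood factor model standing in for a full EFA, Bartlett's sphericity statistic is computed by hand from the correlation matrix, and KMO is not computed here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for customer features: 3 hidden groups in a 2-factor
# space, expressed through 6 observed columns plus noise (all values made up).
centers = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
latent = np.vstack([c + rng.normal(scale=0.5, size=(100, 2)) for c in centers])
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + rng.normal(scale=0.3, size=(300, 6))

# Bartlett's test of sphericity: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|,
# with p * (p - 1) / 2 degrees of freedom. A large value suggests the
# correlations are strong enough for factoring to make sense.
n, p = X.shape
corr = np.corrcoef(X, rowvar=False)
_, logdet = np.linalg.slogdet(corr)
bartlett_chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
bartlett_df = p * (p - 1) // 2

# Standardize, then reduce the observed columns to latent factor scores.
X_std = StandardScaler().fit_transform(X)
scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(X_std)

# Choose k by average silhouette width over a small candidate range.
results = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scores)
    results[k] = silhouette_score(scores, labels)

best_k = max(results, key=results.get)
print(f"Bartlett chi2={bartlett_chi2:.1f} (df={bartlett_df}), best k={best_k}")
```

Clustering on factor scores rather than raw columns keeps correlated features from dominating the distance metric. Stability could be checked by repeating the loop on bootstrap resamples and comparing label agreement, which is left out of this sketch.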
SQL architecture to consolidate facts/dims → segmentation with EFA + clustering → calibrated repurchase models to drive targeted campaigns by probability × ticket.
Multi-product marketplace with heterogeneous customers. We need to identify segments and predict repurchase to focus campaigns and offers, increasing frequency and ticket while reducing churn.
vw_customer_features (R/F/Monetary, logistics, reviews, categories).
The top 20% capture most expected repurchases → prioritize contacts.
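One way to check a "top 20%" claim of this kind is to rank customers by predicted probability and measure what share of the total expected repurchases the top slice holds. The sketch below uses made-up probabilities, and the helper name `top_share_capture` is hypothetical.

```python
def top_share_capture(probabilities, top_fraction=0.20):
    """Share of total expected repurchases held by the highest-scored customers."""
    if not probabilities:
        raise ValueError("need at least one probability")
    ranked = sorted(probabilities, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Made-up calibrated probabilities for ten customers.
probs = [0.90, 0.80, 0.10, 0.05, 0.05, 0.04, 0.03, 0.01, 0.01, 0.01]
capture = top_share_capture(probs)
print(f"top 20% capture {capture:.0%} of expected repurchases")
```

Summing probabilities as "expected repurchases" is only meaningful when the scores are calibrated. With raw, uncalibrated scores the ranking still holds, but the share does not.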
Calibration curve close to the perfect diagonal; per-segment thresholds maximize ROI.
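When probabilities are well calibrated, a per-segment contact threshold can be derived from campaign economics. The sketch below uses a simple break-even rule: contact a customer when probability × margin exceeds the contact cost. The segment names, costs, and margins are hypothetical, and the rule assumes the margin is earned only through the contact, so it ignores customers who would have repurchased anyway.

```python
def break_even_threshold(contact_cost, margin_per_repurchase):
    """Lowest calibrated probability at which a contact pays for itself in expectation."""
    if margin_per_repurchase <= 0:
        raise ValueError("margin must be positive")
    return min(1.0, contact_cost / margin_per_repurchase)


# Hypothetical per-segment economics (not taken from the project).
segments = {
    "high_ticket": {"contact_cost": 2.0, "margin": 40.0},
    "low_ticket": {"contact_cost": 2.0, "margin": 8.0},
}

thresholds = {
    name: break_even_threshold(s["contact_cost"], s["margin"])
    for name, s in segments.items()
}
print(thresholds)
```

Segments with a higher margin per repurchase tolerate a lower probability cutoff, which is why a single global threshold tends to over-contact low-ticket customers and under-contact high-ticket ones.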
Recency, logistics rating, and post-sale experience stand out; ticket is modeled separately.
Messaging, discounts, and cross-sell vary by profile and probability × ticket.
Features catalog, model versioning, and experiment logbook.
Calibrated repurchase probability and expected ticket to prioritize campaigns.
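The prioritization described here amounts to ranking customers by expected value, that is, repurchase probability multiplied by expected ticket. A minimal sketch with invented customers follows; the `Customer` class and its field names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    p_repurchase: float     # calibrated repurchase probability
    expected_ticket: float  # predicted order value, modeled separately

    @property
    def expected_value(self) -> float:
        return self.p_repurchase * self.expected_ticket


# Invented examples: a likely small buyer can rank below an unlikely large one.
customers = [
    Customer("A", 0.60, 50.0),
    Customer("B", 0.20, 400.0),
    Customer("C", 0.90, 20.0),
]

ranked = sorted(customers, key=lambda c: c.expected_value, reverse=True)
print([c.customer_id for c in ranked])
```

Multiplying the two predictions treats ticket size as independent of the repurchase decision, which is a simplification that holds only roughly in practice.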
avg_ticket
Executive sequence that integrates dashboards and key findings.
SQL (Snowflake)
Snowflake schema (facts/dims), analytic views, and warehouse orchestration.
scikit-learn
Clustering & supervised models, calibration, validation, and evaluation.
Tableau
Executive story, dashboards, KPI tracking, and sharing.
pandas
Feature engineering, time windows, joins, and I/O.
NumPy
Vectorized math and numerical helpers.