Analytics Portfolio Project: a production-oriented analytics platform that transforms raw, synthetically generated retail transaction data into actionable business insights. The project simulates a real-world analytics environment by integrating data engineering, exploratory analysis, KPI development, and dashboard reporting into a single end-to-end workflow.
| # | Problem | Estimated impact (synthetic data) |
|---|---|---|
| 1 | Customer Churn Prediction — Identify at-risk customers before they leave | ~$2.1M revenue saved |
| 2 | Product Performance & Inventory Optimization — Reduce dead stock and stockouts | ~18% margin improvement |
| 3 | Regional Sales Forecasting — Accurate 90-day revenue forecasts by region | ~$340K planning cost reduction |
retail-analytics/
├── data/
│ ├── raw/ # Synthetic data generated by scripts
│ └── processed/ # Cleaned, feature-engineered datasets
├── src/
│ ├── data/
│ │ ├── generate_data.py # Synthetic data generation (Faker + NumPy)
│ │ └── preprocess.py # ETL pipeline
│ ├── analysis/
│ │ ├── churn_analysis.py # Problem 1: Churn scoring
│ │ ├── inventory_analysis.py # Problem 2: ABC/XYZ inventory analysis
│ │ └── forecasting.py # Problem 3: Time-series forecasting
│ └── visualization/
│ └── chart_builder.py # Reusable Plotly chart components
├── dashboard/
│ └── app.py # Plotly Dash interactive dashboard
├── notebooks/
│ └── EDA.ipynb # Exploratory Data Analysis
├── tests/ # Unit tests
├── .github/workflows/ # CI/CD pipeline
├── requirements.txt
└── README.md
- Python 3.11+
- macOS (M-series), Linux, or Windows
# 1. Clone the repo
git clone https://github.com/YOUR_USERNAME/retail-analytics.git
cd retail-analytics
# 2. Create virtual environment
python3 -m venv venv
source venv/bin/activate # macOS/Linux
# On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Generate synthetic data
python src/data/generate_data.py
# 5. Run ETL pipeline
python src/data/preprocess.py
# 6. Run analysis modules
python src/analysis/churn_analysis.py
python src/analysis/inventory_analysis.py
python src/analysis/forecasting.py
# 7. Launch dashboard
python dashboard/app.py
# Open http://localhost:8050
- 23.4% of customers identified as high churn risk (RFM score < 30)
- Top churn drivers: more than 90 days since last purchase, declining average order value
- Recommended intervention: targeted email campaign for 2,847 at-risk customers
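
The RFM score behind the high-risk flag above could be computed along the following lines. This is a minimal sketch, not the code in `churn_analysis.py`: the `Customer` fields, the equal-weight percentile scoring, the helper names, and the sample customers are all assumptions made for illustration. Only the `< 30` threshold comes from the findings.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    # Hypothetical record layout; the real processed dataset may differ.
    customer_id: str
    last_purchase: date
    order_count: int
    total_spend: float


def percentile_ranks(values):
    """Scale values to 0-100 by the share of other values strictly below each one."""
    n = len(values)
    if n < 2:
        return [100.0] * n
    return [100.0 * sum(other < v for other in values) / (n - 1) for v in values]


def rfm_scores(customers, as_of):
    """Average the recency, frequency, and monetary percentile ranks (assumed equal weights)."""
    # Negate days since last purchase so that recent buyers rank higher.
    recency = percentile_ranks([-(as_of - c.last_purchase).days for c in customers])
    frequency = percentile_ranks([c.order_count for c in customers])
    monetary = percentile_ranks([c.total_spend for c in customers])
    return {
        c.customer_id: (r + f + m) / 3
        for c, r, f, m in zip(customers, recency, frequency, monetary)
    }


def high_risk(scores, threshold=30):
    """Return the IDs of customers whose RFM score falls below the threshold."""
    return sorted(cid for cid, score in scores.items() if score < threshold)


# Made-up customers, for illustration only.
customers = [
    Customer("A", date(2024, 6, 25), 12, 1500.0),
    Customer("B", date(2024, 5, 1), 5, 400.0),
    Customer("C", date(2024, 1, 10), 1, 40.0),
    Customer("D", date(2024, 3, 15), 3, 220.0),
]
scores = rfm_scores(customers, as_of=date(2024, 6, 30))
print(high_risk(scores))
```

Percentile ranks keep the three components on a common 0-100 scale, so a fixed cutoff such as 30 has the same meaning regardless of currency or order volume.
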
- 31% of SKUs classified as "C" items (low value, high holding cost)
- 12 products show chronic stockout patterns, causing ~$180K in lost sales
- Economic order quantity (EOQ) model applied to reduce carrying costs by an estimated 18%
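
The two inventory techniques named above can be illustrated briefly. The sketch below assumes the conventional 80% / 95% cumulative-revenue cutoffs for ABC classes and the textbook EOQ formula, sqrt(2DS/H). The function names, cutoffs, and sample figures are hypothetical and are not taken from `inventory_analysis.py`.

```python
import math


def abc_classify(revenue_by_sku, a_cutoff=0.80, b_cutoff=0.95):
    """Label SKUs A/B/C by the cumulative revenue share of the SKUs ranked above them."""
    total = sum(revenue_by_sku.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    classes = {}
    cumulative = 0.0
    ranked = sorted(revenue_by_sku.items(), key=lambda item: item[1], reverse=True)
    for sku, revenue in ranked:
        share_before = cumulative / total  # share covered by higher-revenue SKUs
        if share_before < a_cutoff:
            classes[sku] = "A"
        elif share_before < b_cutoff:
            classes[sku] = "B"
        else:
            classes[sku] = "C"
        cumulative += revenue
    return classes


def economic_order_quantity(annual_demand, order_cost, holding_cost_per_unit):
    """Classic EOQ: the order size that balances ordering cost against holding cost."""
    if annual_demand < 0 or order_cost < 0 or holding_cost_per_unit <= 0:
        raise ValueError("demand and order cost must be >= 0, holding cost > 0")
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


# Made-up figures, for illustration only.
revenue = {"SKU-1": 5000, "SKU-2": 2500, "SKU-3": 1500, "SKU-4": 600, "SKU-5": 400}
print(abc_classify(revenue))
print(economic_order_quantity(annual_demand=1200, order_cost=50, holding_cost_per_unit=3))
```

Classifying on the share accumulated before each SKU guarantees that the top seller is always an "A" item, even when it alone exceeds the first cutoff.
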
- Prophet model achieves a mean absolute percentage error (MAPE) of 8.3% on the 90-day regional forecast
- Q4 Northeast region projected at +22% YoY growth
- Southwest is underperforming its forecast by 14%; flagged for root cause analysis
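
For reference, the two forecast metrics quoted above can be computed as follows. This sketch covers only the evaluation arithmetic, not the Prophet model itself. The function names and sample numbers are assumptions for illustration and do not come from `forecasting.py`.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def deviation_from_forecast_pct(actual_total, forecast_total):
    """Signed gap between actual and forecast revenue, as a percent of the forecast."""
    if forecast_total == 0:
        raise ValueError("forecast_total must be non-zero")
    return 100.0 * (actual_total - forecast_total) / forecast_total


# Made-up revenue figures, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(actual, forecast), 2))
print(deviation_from_forecast_pct(actual_total=86.0, forecast_total=100.0))
```

A negative deviation means the region came in below its forecast, which is the situation flagged for Southwest.
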
| Layer | Tools |
|---|---|
| Data Generation | Python, Faker, NumPy, Pandas |
| ETL/Processing | Pandas, SQLite |
| Analysis | Scikit-learn, Prophet, SciPy |
| Visualization | Plotly, Plotly Dash |
| Testing | pytest |
| CI/CD | GitHub Actions |
| Deployment | Render / Railway (free tier) |
- End-to-end analytics pipeline design
- Translation of raw data into business-relevant insights
- Application of structured analytical frameworks
- Production-aware development and deployment practices
- Clear communication of analytical results for decision-making
Possible future extensions:
- Customer lifetime value modeling
- Machine learning-based churn prediction
- Forecast backtesting and model optimization
- Market basket analysis
- Real-time data pipeline integration
I, Matthew Trigg, built this project to demonstrate end-to-end analytical thinking, engineering discipline, and business communication.
MIT License — see LICENSE