Job Market Insights Pipeline
An end-to-end data pipeline that ingests job listings from the Adzuna API, transforms data using dbt, and presents interactive insights through a Streamlit dashboard.
Project Links
Overview
This project builds a complete data pipeline to collect, process, store, and visualize job market data. The pipeline ingests job listings from the Adzuna API, transforms and stores them in PostgreSQL via dbt models, and presents insights through an interactive Streamlit dashboard.
Tech Stack
- Python
- AWS S3
- PostgreSQL
- dbt
- Streamlit
- GitHub Actions
Architecture
Data Flow: Adzuna API → Raw Data (JSON) → Data Transformation → S3 (raw + processed) → PostgreSQL (data warehouse) → dbt (staging + marts) → Streamlit Dashboard

Features
- Automated data ingestion from Adzuna API
- Data transformation and schema standardization
- Storage of raw and processed data in S3
- Structured data loading into PostgreSQL
- Data modeling using dbt (staging + marts)
- Interactive dashboard for job market insights
- End-to-end pipeline execution with a single script
Sample Insights
- Total number of job listings
- Top hiring companies
- Top job locations
- Work type distribution (full-time, part-time, etc.)
Key decisions
- Used Adzuna API for comprehensive, real-time job market data across multiple regions.
- Implemented S3 as an intermediate storage layer for both raw and processed data.
- Adopted dbt for data transformation to ensure maintainability and reusability of models.
- Built interactive Streamlit dashboard for real-time exploration of job market trends.
- Designed for future scalability with plans to deploy PostgreSQL on AWS RDS and automate with scheduled workflows.