Job Market Insights Pipeline

An end-to-end data pipeline that ingests job listings from the Adzuna API, transforms data using dbt, and presents interactive insights through a Streamlit dashboard.

Project Links

GitHub Repository

Overview

This project builds a complete data pipeline to collect, process, store, and visualize job market data. It ingests job listings from the Adzuna API, keeps raw and processed copies in S3, loads the processed data into PostgreSQL, models it with dbt (staging and marts), and presents insights through an interactive Streamlit dashboard.

Tech Stack

Python
AWS S3
PostgreSQL
dbt
Streamlit
GitHub Actions

Architecture

Data Flow: Adzuna API → Raw Data (JSON) → Data Transformation → S3 (raw + processed) → PostgreSQL (data warehouse) → dbt (staging + marts) → Streamlit Dashboard
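
The data flow above can be read as an ordered list of stages, each handing its result to the next. The following is a minimal sketch of that idea, not the project's actual script: the stage functions `ingest` and `transform`, the `run_pipeline` helper, and the shared context dictionary are all hypothetical names, and the API call is replaced by canned records.

```python
from typing import Callable

# A stage takes the shared context dict and returns it, possibly extended.
Stage = Callable[[dict], dict]


def ingest(ctx: dict) -> dict:
    # Stand-in for the Adzuna API call (assumption): canned raw records.
    ctx["raw"] = [{"title": "Data Engineer"}, {"title": ""}]
    return ctx


def transform(ctx: dict) -> dict:
    # Drop records that have no title; real cleaning would do more.
    ctx["processed"] = [r for r in ctx["raw"] if r["title"]]
    return ctx


def run_pipeline(stages: list[tuple[str, Stage]]) -> dict:
    """Run stages in order and record which ones completed."""
    ctx: dict = {"completed": []}
    for name, stage in stages:
        ctx = stage(ctx)
        ctx["completed"].append(name)
    return ctx


# Later stages (S3 upload, PostgreSQL load, dbt run) would be appended here.
STAGES = [("ingest", ingest), ("transform", transform)]

if __name__ == "__main__":
    result = run_pipeline(STAGES)
    print(result["completed"], len(result["processed"]))
```

Keeping the stages in a plain list makes the execution order explicit and lets a single entry point run the whole flow.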

Features

Automated data ingestion from Adzuna API
Data transformation and schema standardization
Storage of raw and processed data in S3
Structured data loading into PostgreSQL
Data modeling using dbt (staging + marts)
Interactive dashboard for job market insights
End-to-end pipeline execution with a single script
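
As an illustration of the schema standardization step, the sketch below flattens one raw listing into a fixed set of columns. The input field names (`id`, `title`, `company.display_name`, `location.display_name`, `contract_time`, `created`) are assumptions about the shape of an Adzuna result and should be checked against the API response; the output column names and the `standardize_listing` function are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _to_utc_iso(ts: str) -> str:
    # Replace a trailing "Z" so fromisoformat works on Python < 3.11.
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).isoformat()


def standardize_listing(raw: dict[str, Any]) -> dict[str, Optional[str]]:
    """Flatten one raw listing into a flat row with fixed column names."""
    company = raw.get("company") or {}
    location = raw.get("location") or {}
    created = raw.get("created")
    return {
        "job_id": str(raw["id"]),  # required; a missing id raises KeyError
        "title": (raw.get("title") or "").strip(),
        "company": company.get("display_name"),
        "location": location.get("display_name"),
        "work_type": raw.get("contract_time") or "unknown",
        "posted_at": _to_utc_iso(created) if created else None,
    }
```

Missing optional fields become `None` (or `"unknown"` for the work type) instead of raising, so every row has the same columns when it is loaded into PostgreSQL.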

Sample Insights

Total number of job listings
Top hiring companies
Top job locations
Work type distribution (full-time, part-time, etc.)
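
The insights listed above are simple counts and rankings. However the project computes them (for instance in dbt marts), the same aggregations can be sketched in plain Python. The `summarize` function and the `company`, `location`, and `work_type` keys are hypothetical names chosen for this example.

```python
from collections import Counter
from typing import Optional


def summarize(listings: list[dict[str, Optional[str]]], top_n: int = 5) -> dict:
    """Compute the headline figures shown on the dashboard."""
    companies = Counter(r["company"] for r in listings if r.get("company"))
    locations = Counter(r["location"] for r in listings if r.get("location"))
    work_types = Counter(r.get("work_type") or "unknown" for r in listings)
    return {
        "total_listings": len(listings),
        "top_companies": companies.most_common(top_n),
        "top_locations": locations.most_common(top_n),
        "work_type_distribution": dict(work_types),
    }


if __name__ == "__main__":
    sample = [
        {"company": "Acme", "location": "London", "work_type": "full_time"},
        {"company": "Acme", "location": "Leeds", "work_type": "part_time"},
        {"company": "Globex", "location": "London", "work_type": "full_time"},
    ]
    print(summarize(sample, top_n=1))
```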

Key Decisions

Used the Adzuna API for up-to-date job listings across multiple regions.
Implemented S3 as an intermediate storage layer for both raw and processed data.
Adopted dbt for data transformation to ensure maintainability and reusability of models.
Built an interactive Streamlit dashboard for exploring job market trends.
Designed for future scalability, with plans to deploy PostgreSQL on AWS RDS and automate pipeline runs with scheduled workflows.
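
One way to keep raw and processed data apart in S3 is a key prefix per layer plus a date partition. This is only a sketch of such a convention: the `s3_key` helper, the `dt=` partition name, and the file names are assumptions, not the project's actual layout.

```python
from datetime import date

LAYERS = {"raw", "processed"}


def s3_key(layer: str, run_date: date, filename: str) -> str:
    """Build an object key such as raw/dt=2024-05-01/jobs.json."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"{layer}/dt={run_date.isoformat()}/{filename}"


if __name__ == "__main__":
    print(s3_key("raw", date(2024, 5, 1), "jobs.json"))
    print(s3_key("processed", date(2024, 5, 1), "jobs.csv"))
```

A date partition in the key makes it straightforward to reload a single day's data when a scheduled run needs to be repeated.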