1. Purpose
This guide describes how to ingest data from the NBA Stats API into a structured data warehouse, and how to model and query it once it is there. It aims to help data engineers build scalable ETL pipelines and to give data analysts reliable, queryable data.
🧱 Common Endpoints:
| Endpoint Name | Description | Frequency |
|---|---|---|
| leaguegamelog | All games played in a season | Daily |
| boxscoretraditionalv2 | Box score per game | Daily |
| teamdetails | Static metadata for teams | Occasional |
| playercareerstats | Career stats per player | Weekly |
| scoreboardv2 | Daily game schedule and results | Daily |
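
The refresh frequencies above can be encoded as a small scheduling config. The following is a minimal sketch, not a definitive scheduler: `REFRESH_INTERVALS` and `due_endpoints` are hypothetical names, and "Occasional" is assumed here to mean roughly every 30 days.

```python
from datetime import datetime, timedelta

# Refresh intervals taken from the endpoint table.
# "Occasional" has no precise meaning in the table; 30 days is an assumption.
REFRESH_INTERVALS = {
    "leaguegamelog": timedelta(days=1),
    "boxscoretraditionalv2": timedelta(days=1),
    "teamdetails": timedelta(days=30),  # assumption
    "playercareerstats": timedelta(weeks=1),
    "scoreboardv2": timedelta(days=1),
}


def due_endpoints(last_run, now):
    """Return the endpoints whose last successful run is older than their interval.

    last_run maps endpoint name -> datetime of the last successful load;
    endpoints that have never run are always considered due.
    """
    due = []
    for name, interval in REFRESH_INTERVALS.items():
        previous = last_run.get(name)
        if previous is None or now - previous >= interval:
            due.append(name)
    return due


# Example: only the game log (one day old) and the never-loaded box scores are due.
now = datetime(2024, 1, 10)
last_run = {
    "leaguegamelog": datetime(2024, 1, 9),
    "teamdetails": datetime(2024, 1, 1),
    "playercareerstats": datetime(2024, 1, 5),
    "scoreboardv2": datetime(2024, 1, 9, 12),
}
pending = due_endpoints(last_run, now)
```

Keeping the intervals in one dictionary means an orchestrator can read a single source of truth instead of hard-coding a schedule per job.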
🗂️ 4. Data Warehouse Schema Design
📁 Schema: nba
| Table Name | Description |
|---|---|
| dim_game | One record per game (game ID, date, teams, venue) |
| dim_team | Team metadata (location, abbreviation, conference) |
| dim_player | Player metadata (name, team, position) |
| fact_boxscore | Game-level player stats |
| fact_team_stats | Aggregated team-level stats |
| dim_season | Season metadata |
📌 Fact-Table Example: fact_boxscore
| Column | Description |
|---|---|
| game_id | Foreign key to dim_game |
| player_id | Foreign key to dim_player |
| team_id | Foreign key to dim_team |
| pts | Points scored |
| reb | Rebounds |
| ast | Assists |
| tov | Turnovers |
| min | Minutes played |
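
To make the column list concrete, here is a minimal sketch of `fact_boxscore` and trimmed-down versions of its dimension tables, using Python's built-in `sqlite3` as a stand-in for the warehouse. Several things are assumptions for illustration: the column types, the composite primary key, the extra dimension columns (`game_date`, `player_name`, `abbreviation`), and the sample rows, which are made up. SQLite has no named schemas, so the `nba` prefix is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables reduced to the columns this example needs.
    CREATE TABLE dim_game   (game_id TEXT PRIMARY KEY, game_date TEXT);
    CREATE TABLE dim_player (player_id INTEGER PRIMARY KEY, player_name TEXT);
    CREATE TABLE dim_team   (team_id INTEGER PRIMARY KEY, abbreviation TEXT);

    -- One row per player per game; types and key are assumptions.
    CREATE TABLE fact_boxscore (
        game_id   TEXT    NOT NULL REFERENCES dim_game (game_id),
        player_id INTEGER NOT NULL REFERENCES dim_player (player_id),
        team_id   INTEGER NOT NULL REFERENCES dim_team (team_id),
        pts INTEGER,
        reb INTEGER,
        ast INTEGER,
        tov INTEGER,
        "min" REAL,  -- quoted because MIN is also a SQL function name
        PRIMARY KEY (game_id, player_id)
    );
    """
)

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO dim_game VALUES ('G1', '2024-01-01')")
conn.executemany("INSERT INTO dim_player VALUES (?, ?)", [(1, "Player A"), (2, "Player B")])
conn.execute("INSERT INTO dim_team VALUES (10, 'AAA')")
conn.executemany(
    "INSERT INTO fact_boxscore VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [("G1", 1, 10, 20, 5, 7, 2, 34.5), ("G1", 2, 10, 12, 9, 3, 1, 28.0)],
)

# Team totals per game, joining the fact table to dim_team.
team_totals = conn.execute(
    """
    SELECT t.abbreviation, SUM(f.pts), SUM(f.ast)
    FROM fact_boxscore AS f
    JOIN dim_team AS t ON t.team_id = f.team_id
    GROUP BY t.abbreviation
    """
).fetchall()
```

A query like the last one is also a plausible way to derive `fact_team_stats` from `fact_boxscore`, though the guide does not specify how that table is populated.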
📈 6.
⚠️ 8. Challenges & Tips
| Issue | Recommendation |
|---|---|
| API rate limits | Use delay + rotating user agents |
| Changing endpoint schemas | Store raw payloads in raw tables |
| Nested JSON | Use flattening libraries or SQL UNNEST equivalents |
| Missing games/data | Implement retries + logging |
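
Three of these recommendations (retries with a delay and logging, keeping the raw payload, and flattening nested JSON) can be sketched together. This is an illustration under stated assumptions, not a definitive client: `fetch_with_retries` and `flatten_result_set` are hypothetical helpers, the network call is replaced by a simulated `flaky_fetch`, and the payload is assumed to use a `resultSets` list with `headers` and `rowSet` keys, which should be verified against real responses. User-agent rotation is not shown.

```python
import json
import logging
import time

logger = logging.getLogger("nba_ingest")


def fetch_with_retries(fetch, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, logging each failure and pausing in between."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # give up after the final attempt
            sleep(delay_seconds * attempt)  # linearly increasing delay


def flatten_result_set(payload, index=0):
    """Turn one headers/rowSet block into a list of dicts (assumed payload shape)."""
    result = payload["resultSets"][index]
    return [dict(zip(result["headers"], row)) for row in result["rowSet"]]


# Simulated endpoint that fails twice before returning a made-up payload.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated timeout")
    return {
        "resultSets": [
            {
                "name": "PlayerStats",
                "headers": ["GAME_ID", "PLAYER_ID", "PTS"],
                "rowSet": [["G1", 1, 20], ["G1", 2, 12]],
            }
        ]
    }


payload = fetch_with_retries(flaky_fetch, sleep=lambda seconds: None)  # no real waiting in the demo
raw_json = json.dumps(payload)  # store this string unchanged in a raw table
rows = flatten_result_set(payload)  # flattened rows for the modeled tables
```

Storing `raw_json` before flattening means a later change in the endpoint's headers can be handled by reprocessing the raw table rather than re-fetching from the API.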