Case Study — Portfolio Project
CPBL Advanced Analytics Portfolio
Domain
Baseball Analytics
CPBL advanced stats
Status
Live · 2025–2026
Updated daily
Type
Portfolio Project
Portfolio + research
Role
Data Engineer + Analyst
Solo full-stack development
Why I Built This
CPBL analytics is years behind MLB in public tooling. There is no public Statcast equivalent and no counterpart to FanGraphs or Baseball Reference with league-specific metrics, so most analysis relies on raw batting average and ERA from the official site. I wanted to know what the numbers actually say, so I built the infrastructure to find out.
Taiwan's baseball community deserves the same depth of data analysis that MLB fans get. FanGraphs and Baseball Savant do not cover the CPBL, so I built my own.
By the Numbers
377
Games analyzed
CPBL 2024–2025
28K
Plate appearances
Batted ball events
112K
Pitches tracked
Pitch-level data
84%
Data completeness
vs CPBL official
17
Analysis modules
Across 5 categories
10
Chart types
ECharts interactive
19
API endpoints
FastAPI backend
168
Players profiled
Batters + pitchers
Analysis Modules
Seventeen modules across five analytical categories: batting, pitching, defense, game states, and season trends. Each module is independently queryable and rendered as interactive ECharts charts.
Batting Average on Balls in Play
BABIP by batter and pitcher. Identifies regression candidates and outliers across the full season.
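As a minimal sketch of the calculation behind this module, the standard BABIP formula is (H - HR) / (AB - K - HR + SF). The `BattingLine` class and its field names are hypothetical, chosen for illustration, and the numbers are made up rather than real CPBL data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BattingLine:
    """Season counting stats for one batter (hypothetical field names)."""
    hits: int
    home_runs: int
    at_bats: int
    strikeouts: int
    sac_flies: int


def babip(line: BattingLine) -> Optional[float]:
    """BABIP = (H - HR) / (AB - K - HR + SF); None when nothing was put in play."""
    balls_in_play = line.at_bats - line.strikeouts - line.home_runs + line.sac_flies
    if balls_in_play <= 0:
        return None
    return (line.hits - line.home_runs) / balls_in_play


# Illustrative numbers only, not a real player.
example = BattingLine(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(round(babip(example), 3))
```

The same function works for pitchers if the fields hold hits, home runs, and strikeouts allowed. Comparing the result against the league-wide BABIP is one simple way to spot regression candidates.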
xFIP & ERA- Analysis
Defense-independent xFIP alongside league-indexed ERA-. Surfaces pitchers outperforming or underperforming their true skill level.
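For readers new to these metrics, here is a hedged sketch of the commonly published formulas. It assumes the league constant and league HR/FB rate are computed from CPBL totals, and it leaves out the park adjustment that a full ERA- normally includes. Function names and inputs are illustrative, not taken from the project.

```python
def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """League constant that puts FIP and xFIP on the ERA scale."""
    return lg_era - (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_k) / lg_ip


def xfip(fly_balls, walks, hbp, strikeouts, innings, lg_hr_per_fb, constant):
    """xFIP: FIP with actual home runs replaced by fly balls times league HR/FB."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (walks + hbp) - 2 * strikeouts) / innings + constant


def era_minus(era, lg_era):
    """ERA- without park adjustment: 100 is league average, lower is better."""
    return 100 * era / lg_era


# Illustrative inputs only, not real CPBL figures.
# innings must be a true decimal (150 1/3 innings is 150.333..., not 150.1).
print(round(xfip(100, 30, 5, 120, 150.0, 0.10, 3.10), 2))
print(era_minus(3.0, 4.0))
```

A pitcher whose ERA sits well below his xFIP is a candidate for regression, which is the comparison this module surfaces.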
Pitch Mix Breakdown
Usage rates and outcomes by pitch type. Visualized as stacked bars and polar charts per pitcher.
Zone Contact Maps
Spray charts and zone-based contact rates. Shows where each batter hits the ball and where pitchers attack.
Lineup Efficiency
wOBA and wRC+ by batting order position. Identifies misaligned lineups and platoon opportunities.
Bullpen Load Monitor
Tracks appearances, IP, and rest days for relief pitchers. Flags overused arms before performance drops.
Run Expectancy Matrix
Base-out state run expectancy built from CPBL-specific data. More accurate than applying an MLB-derived run expectancy table to CPBL games.
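One way to build such a matrix, sketched under assumptions: each plate appearance is a record with a half-inning identifier, the base-out state before the play, and the runs scored on the play. The key names (`half_inning`, `bases`, `outs`, `runs_on_play`) are hypothetical. The expectancy for a state is the average number of runs scored from that point to the end of the half-inning.

```python
from collections import defaultdict


def run_expectancy(events):
    """Average runs scored from each (bases, outs) state to the end of the half-inning.

    events: chronologically ordered dicts with hypothetical keys
    'half_inning', 'bases', 'outs', 'runs_on_play'.
    """
    by_half = defaultdict(list)
    for ev in events:
        by_half[ev["half_inning"]].append(ev)

    totals = defaultdict(float)
    counts = defaultdict(int)
    for plays in by_half.values():
        # Runs still to come, counted from the state *before* each play.
        runs_remaining = sum(p["runs_on_play"] for p in plays)
        for p in plays:
            state = (p["bases"], p["outs"])
            totals[state] += runs_remaining
            counts[state] += 1
            runs_remaining -= p["runs_on_play"]
    return {state: totals[state] / counts[state] for state in totals}


# Two made-up half-innings: a two-run homer with a runner on first, then a clean inning.
sample = [
    {"half_inning": "g1-t1", "bases": "---", "outs": 0, "runs_on_play": 0},
    {"half_inning": "g1-t1", "bases": "1--", "outs": 0, "runs_on_play": 2},
    {"half_inning": "g1-t1", "bases": "---", "outs": 0, "runs_on_play": 0},
    {"half_inning": "g1-t1", "bases": "---", "outs": 1, "runs_on_play": 0},
    {"half_inning": "g1-t1", "bases": "---", "outs": 2, "runs_on_play": 0},
    {"half_inning": "g1-b1", "bases": "---", "outs": 0, "runs_on_play": 0},
    {"half_inning": "g1-b1", "bases": "---", "outs": 1, "runs_on_play": 0},
    {"half_inning": "g1-b1", "bases": "---", "outs": 2, "runs_on_play": 0},
]
matrix = run_expectancy(sample)
print(matrix[("1--", 0)])
```

This sketch does not exclude truncated half-innings such as walk-offs, which a production version would usually filter out before averaging.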
Season Trajectory
Rolling 15-game averages for key metrics. Visualizes hot/cold streaks and team-level performance trends.
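A trailing-window average like this can be sketched in a few lines. This version assumes one value per game and returns `None` until a full window is available. Whether the project averages per-game values or pools numerators and denominators across the window is not stated, so treat the approach as illustrative.

```python
from collections import deque


def rolling_mean(values, window=15):
    """Trailing-window means; None until `window` games have been played."""
    out = []
    buf = deque(maxlen=window)
    running = 0.0
    for v in values:
        if len(buf) == window:
            running -= buf[0]  # value about to be evicted from the window
        buf.append(v)
        running += v
        out.append(running / window if len(buf) == window else None)
    return out


# Small window so the behaviour is easy to follow by hand.
print(rolling_mean([1, 2, 3, 4, 5], window=3))
```

For rate stats such as batting average, summing hits and at-bats over the window and dividing once usually gives a more faithful number than averaging per-game rates.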
Tech Stack
Data Layer
Backend
Frontend
Infrastructure
What I Learned
Building from raw data teaches you what analytics platforms hide. The CPBL API has field inversions, missing values, and non-standard encoding that only appear when you query at scale.
ECharts is the right tool for data-dense, interactive sports charts. Recharts and Chart.js both hit walls when rendering 112K data points with hover and drill-down.
A FastAPI + SQLite stack handles 19 endpoints and 28K rows trivially. Over-engineering the backend is the most common mistake on portfolio analytics projects.
Presenting sabermetrics to a CPBL audience requires translation: xFIP means nothing without a familiar ERA reference point and a plain-language explanation.
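To illustrate the point above about keeping the backend simple, here is a sketch of the kind of query an endpoint handler could wrap, using only the standard-library `sqlite3` module. The table `batting_games`, its columns, and the function name are hypothetical. The FastAPI route itself is left out so the example runs without extra dependencies.

```python
import sqlite3


def top_batters_by_avg(conn, min_ab=1, limit=10):
    """Return batters ranked by batting average (hypothetical schema)."""
    min_ab = max(min_ab, 1)  # guards the division below against zero at-bats
    rows = conn.execute(
        """
        SELECT player_id, SUM(hits) AS h, SUM(at_bats) AS ab
        FROM batting_games
        GROUP BY player_id
        HAVING SUM(at_bats) >= ?
        ORDER BY CAST(SUM(hits) AS REAL) / SUM(at_bats) DESC
        LIMIT ?
        """,
        (min_ab, limit),
    ).fetchall()
    return [{"player_id": pid, "avg": round(h / ab, 3)} for pid, h, ab in rows]


# In-memory database with made-up rows, standing in for the real data file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batting_games (player_id TEXT, hits INTEGER, at_bats INTEGER)")
conn.executemany(
    "INSERT INTO batting_games VALUES (?, ?, ?)",
    [("A", 2, 4), ("A", 1, 4), ("B", 3, 4), ("C", 0, 1)],
)
result = top_batters_by_avg(conn, min_ab=4)
print(result)
```

At tens of thousands of rows, an aggregate like this runs directly against SQLite without a cache or a separate database server.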