Pipeline Documentation

Pivot/TBL Re-Rates
Pipeline Guide

A comprehensive look at how 20+ real-world data sources (60+ individual files) are transformed into analytically-grounded player ratings for Eastside Hockey Manager. Built for transparency, designed for the community.

1,675 Players Rated
20+ Data Sources
10 Modeling Phases
27,800+ Lines of Code

What This Is & Why It Exists

For years, EHM database attribute updates were done by hand. Ratings were subjective, best-effort, and not consistently grounded in data. The Re-Rates pipeline changes that by leveraging the same rich analytics that NHL front offices use (expected goals models, isolated impact metrics, tracking data, and WAR-style value frameworks) to produce player ratings that reflect what's actually happening on the ice.

The pipeline takes raw statistical data from sources like MoneyPuck, Evolving Hockey, NHL EDGE tracking, LB Hockey, HockeyStatCards, and more, then transforms it through a series of modeling phases that each handle a specific domain of hockey ability. The result is a complete set of EHM attribute ratings for every NHL and AHL player with sufficient data — ratings that can be explained and defended with evidence rather than gut feel.

This is the vNext release, built during the February 2026 Olympic break. It represents the most significant overhaul to the pipeline since its inception — every script was audited, refactored, and improved, with two entirely new modeling phases added and the data foundation expanded dramatically.

AI Disclosure

This project was created with the assistance of AI (originally ChatGPT, now Claude by Anthropic). All scripting, mathematical modeling, and much of this documentation was AI-assisted. The human (xECK29x) provided project oversight, hockey expertise, design direction, and validation throughout. AI is a powerful tool, but it requires deep domain knowledge to guide — it will make assumptions, hit boundaries, and needs constant course correction. The hockey knowledge driving every design decision is human.

Design Principles

📊
Analytics First, Hockey Always

Ratings are driven by data, but every formula was designed with hockey knowledge. A shutdown defenseman who suppresses chances gets rewarded even if he doesn't rack up blocked shots. A fourth-liner trusted by his coaching staff gets a deployment floor even if his shot metrics are modest.

⚖️
Rates Over Totals

The pipeline overwhelmingly prefers per-60 and per-game rate metrics over raw counting stats. A player who missed 20 games to injury shouldn't be penalized for lower totals — their per-minute impact is what matters for EHM ratings.

🛡️
Conservative by Default

Extreme ratings (1–3 or 18–20) should be rare and earned. The pipeline uses reliability weighting, Bayesian shrinkage, and PA-gated elite tail boosts to ensure only truly elite or truly deficient players hit the extremes.

🔍
Transparent & Auditable

Every modeling decision can be traced to specific data signals with specific weights. The pipeline generates QA viewers, distribution reports, and per-player decision logs. This documentation exists so anyone can challenge or improve the methodology.

Pipeline Architecture

The pipeline runs in two parallel tracks — NHL and AHL — that merge at the end into a single combined database. Each track follows the same principle: collect data, build a master file, then run a series of modeling phases that each handle a specific domain of hockey skill. The full pipeline spans 24 Python scripts totaling nearly 28,000 lines of code.

The Simple Version

Think of it like an assembly line. Raw data goes in one end. Each station on the line adds one type of rating — physicality, defense, skating, offense, faceoffs, goaltending. At the end, a quality inspector (Phase 8) checks the overall picture and suggests whether the player's CA needs adjusting. Then the NHL and AHL lines merge into one database.

NHL Track

NHL Pipeline
EDGE Collection
Master Build
Phase 2: Physicality
Phase 3: Defense
Phase 4: Skating
Phase 5: Offense & Cognition
Phase 5b: Career Mentals
Phase 6: Faceoffs
Phase 7: Goalies
Phase 7b: Goalie Mentals
Phase 8: CA & Roles

AHL Track

AHL Pipeline
AHL Master Build
AHL Skaters Model
AHL Goalies Model
AHL Phase 3: CA & Roles

Final Merge

Final Output
NHL Modeled Master + AHL Modeled Master → Quality-Preferred Merge → Combined Database (1,675 players)

Shared Mathematical Foundations

Every modeling phase in the pipeline shares a common set of mathematical tools. Understanding these once will help you follow the logic in any phase.

Per-60 Rate Conversion

Raw counting stats are converted to rates per 60 minutes of ice time. This prevents ice-time bias — a first-liner who plays 22 minutes per game and a fourth-liner who plays 8 minutes are compared on an equal footing.

Formula
value_per60 = (raw_count / TOI_minutes) × 60
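As a quick illustration of the formula above, here is a minimal Python sketch. The function name `per_60` and the sample hit and TOI totals are made up for the example; they are not taken from the pipeline's scripts.

```python
def per_60(raw_count: float, toi_minutes: float) -> float:
    """Convert a raw counting stat into a rate per 60 minutes of ice time."""
    if toi_minutes <= 0:
        raise ValueError("TOI must be positive to compute a rate")
    return raw_count / toi_minutes * 60.0


# Hypothetical players: very different totals, identical per-minute rate.
first_liner = per_60(raw_count=100, toi_minutes=1600)
fourth_liner = per_60(raw_count=30, toi_minutes=480)
print(first_liner, fourth_liner)
```

Both calls return the same rate, which is the point of the conversion: the fourth-liner's lower total is entirely explained by lower ice time.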

Position-Group Z-Scoring

Signals are standardized within position groups (forwards, defensemen, goalies separately). A defenseman's hit rate is compared to other defensemen, not to forwards. This produces a z-score: how many standard deviations a player is from the average at their position.

Formula
z = (player_value − position_mean) / position_std_dev

A z-score of +1.5 means the player is 1.5 standard deviations above average for their position group. A z-score of 0 means exactly average.
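A small sketch of position-group standardization, using only the Python standard library. The record layout (`name`, `pos_group`, `value`) and the use of the population standard deviation are assumptions for illustration; the pipeline's actual column names and choice of estimator are not stated here.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_position(players):
    """Return name -> z-score, standardizing each player within their position group."""
    groups = defaultdict(list)
    for p in players:
        groups[p["pos_group"]].append(p["value"])

    z_scores = {}
    for p in players:
        values = groups[p["pos_group"]]
        mu, sd = mean(values), pstdev(values)
        # A group with no spread carries no signal, so everyone is "average".
        z_scores[p["name"]] = 0.0 if sd == 0 else (p["value"] - mu) / sd
    return z_scores


# Hypothetical hit rates per 60: defensemen run higher as a group.
players = [
    {"name": "F1", "pos_group": "F", "value": 2.0},
    {"name": "F2", "pos_group": "F", "value": 4.0},
    {"name": "F3", "pos_group": "F", "value": 6.0},
    {"name": "D1", "pos_group": "D", "value": 8.0},
    {"name": "D2", "pos_group": "D", "value": 10.0},
]
z = zscore_by_position(players)
```

In this toy data D1 has a higher raw rate than every forward but still lands below average, because he is only compared to other defensemen.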

Reliability Weighting

Players with small sample sizes (few games, low ice time) are "shrunk" toward a neutral average. This prevents a player who scored on his only shot attempt from getting a 20 Wristshot rating. The more ice time a player has, the more we trust the data.

Formula
reliability = (TOI / soft_threshold)^0.75   // capped at 1.0
final_score = (reliability × data_score) + ((1 − reliability) × neutral_midpoint)

Soft thresholds: Forwards ~200 min, Defensemen ~300 min. Full trust: Forwards ~428 min, Defensemen ~604 min.
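The two-step reliability formula can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: `soft_threshold` is simply the ice time at which the cap engages in this sketch, and the 400-minute threshold and 10.5 neutral midpoint used in the example are illustrative values rather than the pipeline's tuned settings.

```python
def reliability(toi_minutes: float, soft_threshold: float, exponent: float = 0.75) -> float:
    """Trust in a player's data: rises with ice time, capped at 1.0."""
    ratio = max(toi_minutes, 0.0) / soft_threshold
    return min(1.0, ratio ** exponent)


def shrink_to_neutral(data_score: float, rel: float, neutral_midpoint: float) -> float:
    """Blend the data-driven score with a neutral midpoint, weighted by reliability."""
    return rel * data_score + (1.0 - rel) * neutral_midpoint


# Hypothetical low-sample player whose raw data suggests an 18.
rel = reliability(toi_minutes=100, soft_threshold=400)  # assumed threshold
shrunk = shrink_to_neutral(data_score=18.0, rel=rel, neutral_midpoint=10.5)  # assumed midpoint
print(round(rel, 3), round(shrunk, 2))
```

With a quarter of the threshold ice time, reliability is about 0.35, so most of the final score comes from the neutral midpoint rather than the eye-catching raw number.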

Normal-CDF Mapping (vNext Improvement)

The old pipeline used percentile-rank mapping, which produced unrealistic "flat" distributions — roughly equal numbers of players at every rating from 1 to 20. The vNext pipeline uses z-score → normal CDF mapping, which creates natural bell curves. Most players cluster around 10–14, with very few earning extreme ratings.

Formula
percentile = Φ(z_score)              // normal CDF
rating = round(1 + percentile × 19)  // maps 0–1 to 1–20 scale

This is one of the most impactful vNext changes. It means an 18 rating is genuinely rare and represents truly elite performance, not just "top 15%."
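The mapping formula translates almost directly into Python using the standard library's `NormalDist`. The function name and the final clamp to 1–20 are additions for this sketch; the z-scores fed in are assumed to have already been through reliability weighting.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_to_rating(z_score: float) -> int:
    """Map a z-score onto the 1-20 EHM scale via the normal CDF."""
    percentile = _STANDARD_NORMAL.cdf(z_score)
    rating = round(1 + percentile * 19)
    # Defensive clamp (an assumption of this sketch, not a documented step).
    return max(1, min(20, rating))


for z in (-1.5, 0.25, 1.0, 1.5):
    print(z, z_to_rating(z))
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, which only matters for values that land exactly on a .5 boundary.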

Bayesian Shrinkage

For percentage-based metrics (like shooting percentage by shot type), raw percentages from small samples are unreliable. Bayesian shrinkage blends the player's observed rate with the league average, weighted by sample size.

Formula
smoothed_pct = (player_goals + prior_rate × prior_shots) / (player_shots + prior_shots)
// More shots → player's own rate dominates
// Fewer shots → pulls toward league average

Example: A player who scored 1 goal on 2 wrist shots (50%) is shrunk heavily toward the league average (~12%). A player who scored 15 on 120 (12.5%) barely moves.
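The example above can be worked through in a few lines of Python. The prior weight of 50 shots is an assumption chosen for illustration (the pipeline's actual prior strength is not stated here); the 12% league rate comes from the example.

```python
def shrunk_pct(goals: float, shots: float, prior_rate: float, prior_shots: float) -> float:
    """Blend a player's observed conversion rate with a league-average prior."""
    return (goals + prior_rate * prior_shots) / (shots + prior_shots)


LEAGUE_RATE = 0.12   # league-average wrist shot conversion from the example
PRIOR_SHOTS = 50     # assumed prior weight, for illustration only

small_sample = shrunk_pct(goals=1, shots=2, prior_rate=LEAGUE_RATE, prior_shots=PRIOR_SHOTS)
large_sample = shrunk_pct(goals=15, shots=120, prior_rate=LEAGUE_RATE, prior_shots=PRIOR_SHOTS)
print(round(small_sample, 4), round(large_sample, 4))
```

Under this assumed prior, the 1-for-2 shooter drops from 50% to roughly 13.5%, while the 15-for-120 shooter stays within a fraction of a point of his observed 12.5%.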

Elite Tail Boost (PA-Gated)

Franchise-level talents (PA ≥ 180) receive a gentle boost into the 17–20 range for key offensive and cognitive skills — but only if their data already shows above-average performance. This ensures McDavid and MacKinnon separate from very good players without inflating everyone.

Logic
// Only applies to players already above average
// Only applies to offensive-lean roles
// Scales from PA 180 (partial) to PA 200 (full strength)
if PA ≥ 180 AND score > average AND role is offensive:
    boost_strength = (PA − 180) / 20   // 0.0 at 180, 1.0 at 200
    score += tail_boost × boost_strength
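A runnable Python version of the gating logic might look like the sketch below. The size of `tail_boost` is not documented, so it is left as a caller-supplied parameter; the cap on `boost_strength` at 1.0 is an added safety guard, and the sample numbers are hypothetical.

```python
def elite_tail_boost(score: float, pa: int, average: float,
                     is_offensive_role: bool, tail_boost: float) -> float:
    """Apply the PA-gated boost only to above-average players in offensive roles."""
    if pa >= 180 and score > average and is_offensive_role:
        # 0.0 at PA 180, 1.0 at PA 200; capped as a guard (assumption).
        boost_strength = min(1.0, (pa - 180) / 20)
        return score + tail_boost * boost_strength
    return score


# Hypothetical inputs: tail_boost=2.0 and average=10.5 are illustrative values.
print(elite_tail_boost(16.0, pa=190, average=10.5, is_offensive_role=True, tail_boost=2.0))
print(elite_tail_boost(16.0, pa=170, average=10.5, is_offensive_role=True, tail_boost=2.0))
```

The first call receives half of the boost because PA 190 sits halfway between 180 and 200; the second is untouched because the PA gate is not met.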

Data Sources

The vNext pipeline ingests data from 20+ independent analytical sources across 60+ individual files. Each source brings something different to the table — some measure what a player does (counting stats), some measure how well they do it (expected goals models), some isolate their individual impact from their teammates (RAPM), and some provide direct physical measurements (EDGE tracking).

vNext Change

The Natural Stat Trick dependency was fully eliminated in this release. All NST signals were replaced by equivalent or superior columns from MoneyPuck and Evolving Hockey, removing a redundant data layer and tightening the analytical foundation.

Source · Coverage · What It Provides · Columns · Feeds

MoneyPuck (Skaters + Goalies)
Expected goals models (xGF, xGA, xG±), shot locations, rebound creation, on/off-ice differentials across 5v5, 5v4, and 4v5 strength states. Also provides raw shot-level event data for per-player xG aggregation and average shot distance.
Columns: 54+ · Feeds: Phases 2–5, 7, 8

Evolving Hockey (Skaters + Goalies)
The deepest analytics ecosystem in the pipeline. Individual rates, on-ice rates, relative rates, and score-venue adjusted rates, plus the full GAR (Goals Above Replacement), xGAR (expected GAR), and RAPM decompositions. RAPM isolates a player's individual impact from teammates, opponents, zone starts, and score state.
Columns: 67+ · Feeds: Phases 2–5, 7, 8

NHL EDGE Tracking (Skaters + Goalies)
Real measured physical data from puck and player tracking cameras. Skating speeds (top speed, burst counts), skating distances (per shift, per game), shot speeds (release velocity), shot locations, zone time shares, and goalie-specific metrics (save % by shot location, lateral movement).
Columns: 6 categories · Feeds: Phases 4, 5, 7, 8

LB Hockey (Skaters + Goalies)
Graded context metrics including WAR components, Chance Suppression, Retrievals, D-Zone Pressure Mitigation, Transition play, Entry Volume, Puck Management, and Zone Offense/Defense ratings. Provides a "scouting-like" analytical layer that captures skills traditional stats miss.
Columns: 25+ · Feeds: Phases 2–5, 7, 8

HockeyStatCards (Skaters + Goalies)
GameScore-based player ratings, offensive/defensive ratings, and usage data (zone starts, competition quality). Provides an independent composite view of player quality.
Columns: 15+ · Feeds: Phases 5, 7, 8

NHL.com Statistics (Skaters + Goalies)
Official NHL data: penalties, faceoffs, miscellaneous stats (hits, blocks, takeaways, giveaways), shots by type, puck possession, goalie advanced stats, rest splits, saves by strength, shootout stats, and started vs. relieved data.
Columns: ~40 · Feeds: Phases 2–7

Hockey ELO Ratings (Skaters)
Faceoff-specific ELO ratings that account for opponent quality and situational context, a more sophisticated measure than raw faceoff win percentage.
Columns: 1 · Feeds: Phase 6

Elite Prospects (Skaters + Goalies)
Scouting-intelligence style tags (Sniper, Playmaker, Power Forward, etc.) that provide an independent human observation layer on top of analytics. Used as soft supplemental signals for role suggestion.
Columns: Tags · Feeds: Phase 8

AHLTracker (AHL Only)
The primary (and largely only) data source for AHL players. Provides scoring (5v5, PP), on-ice metrics (5v5, PP, SH), advanced stats, and penalties. More limited than the NHL data ecosystem: no xG models, no RAPM, no tracking data.
Columns: ~175 · Feeds: AHL Pipeline

Career History (Skaters + Goalies)
Multi-year, multi-league career data from the EHM database export (~43,700 season records across 1,871 players). Provides playoff vs. regular season splits, season-to-season production variance, and career trajectory signals.
Columns: Derived · Feeds: Phase 5b

Modeling Phases

Each phase is responsible for a specific domain of hockey skill. Phases run sequentially: each reads the output of the previous phase and adds its own _model columns.

Phase 2: Physicality
WorkRate · Hitting · Fighting · Aggression · Dirtiness · Agitation · Bravery · Sportsmanship · Temperament
The Simple Version

Phase 2 answers: "How does this player engage physically?" It measures behavioral tendencies — how often and how willingly a player hits, fights, plays through contact, and battles in front of the net. These aren't performance ratings (that's later phases) — they're about what type of game a player plays.

Attributes Modeled

WorkRate Hitting Fighting Aggression Dirtiness Agitation Bravery Sportsmanship Temperament

Key Data Signals

NHL.com Hits/60 and Hits Taken/60
LB Hockey: Play Through Contact
LB Hockey: Net-Front Play grade
LB Hockey: Hits Absorbed/60
NHL.com Penalties (fighting majors, roughing)
Evolving Hockey RAPM net impact
MoneyPuck on-ice xG differential
NHL.com Penalty type breakdown (minor/major/misconduct)
LB Hockey: D-Zone Pressure Mitigation

How It Works

Every attribute follows the same pattern: select observable behavioral signals (events, not outcomes), convert to per-60 rates, normalize within position groups, apply reliability weighting, apply role-based context, then map to the 1–20 EHM scale with conservative bounds.

Hitting is driven primarily by hit rates (Hits/60) — a pure frequency measure. A player who throws 10 hits per 60 minutes will rate higher than one who throws 3, regardless of their other skills. The formula also considers hits taken (players who absorb contact are typically physical players) and LB Hockey's Net-Front Play grade.

Bravery received the biggest upgrade in vNext, with 26+ new signals from LB Hockey and Evolving Hockey. It now captures Play Through Contact (willingness to maintain puck control under pressure), Hits Absorbed/60 (absorbing contact rather than avoiding it), and D-Zone Pressure Mitigation. An NHL-wide lift constant was added to better reflect that simply playing in the NHL requires a baseline level of bravery.

WorkRate combines engagement metrics (RAPM net impact, on-ice involvement) with effort signals. It's designed to capture the "motor" — players who are consistently engaged shift after shift.

Aggression and Dirtiness separate clean physical play from chippy/nasty play. Aggression is driven by overall PIM/60 rate; Dirtiness rises with the misconduct and match-penalty share of those PIMs. Sportsmanship and Temperament work in the inverse direction — players with low penalty rates and clean profiles rate higher in these attributes. Agitation captures drawing-penalty tendency from penalty-drawn metrics where available.

Design Note

Phase 2 attributes are behavioral tendencies, not performance multipliers. Per the EHM research notes, they influence the frequency and likelihood of actions — a high Hitting rating means the player attempts more hits, not that his hits are more effective. This is why Phase 2 uses event rates rather than outcome metrics.

A Note on Strength

Strength is a real EHM attribute, but it's not modeled in this pipeline. It's set via a separate algorithm built directly into the EHM Editor — primarily driven by player size and weight — and applied as a downstream step rather than through the analytical modeling phases. The pipeline doesn't override the Editor's Strength values.

Role Context

Roles matter in Phase 2 more than most phases. An Enforcer has Fighting and Aggression as KEY attributes (floor: 14) with Hitting and Bravery as ESSENTIAL (floor: 13). A Playmaker doesn't have physicality attributes as key, so they follow the standard distribution. This ensures role-appropriate ratings without artificially inflating everyone.

Phase 3: Defensive Skills
Positioning · Checking · Pokecheck · DefensiveRole
The Simple Version

Phase 3 answers: "How effectively does this player defend?" Not effort or willingness (that's Phase 2) — this is skill execution. Can this player close passing lanes? Does he suppress chances? Is he in the right position? The vNext overhaul replaced proxy stats with direct defensive impact measures.

Attributes Modeled

Positioning Checking Pokecheck DefensiveRole

Key Data Signals

LB Hockey: Chance Suppression
LB Hockey: Retrievals grade
LB Hockey: D-Zone Pressure Mitigation
Evolving Hockey RAPM xGA/60
Evolving Hockey Giveaway/Takeaway rates
MoneyPuck on-ice xGoals Against
MoneyPuck Corsi differentials
NHL.com Blocked Shots/60
NHL.com Takeaways/60

How It Works

Positioning — the most important defensive attribute per the research notes — is now anchored by LB Hockey's Chance Suppression (direct measure of preventing quality chances against), Evolving Hockey's RAPM xGA/60 (isolated individual defensive impact), and Evolving Hockey giveaway/takeaway rates. A shutdown defenseman who suppresses chances and mitigates zone pressure is properly recognized even if he doesn't rack up blocked shots.

Checking captures body-checking effectiveness — driven by hit rates and contact engagement signals from NHL.com plus zone-time context from EDGE deployment data. Pokecheck captures stick-on-puck defensive execution, driven by takeaway rates, blocked shots, and chance suppression. DefensiveRole reflects deployment trust and willingness to commit to defensive play, anchored by penalty-kill TOI share and short-handed deployment.

Phase 3 applies a global +2 lift to skater defensive skill attributes. This reflects the research notes guidance that NHL-level players should generally have respectable defensive fundamentals — the floor of NHL defensive ability is higher than the EHM scale's midpoint.

vNext Change

NST-derived xGA/60 and CA/60 were fully replaced by MoneyPuck equivalents (on-ice xGoals Against and Corsi differentials). LB Hockey Chance Suppression, Retrievals, and D-Zone Pressure Mitigation were added as first-class signals — these directly measure defensive outcomes that traditional stats only approximate.

Phase 4: Skating & Conditioning
Acceleration · Pace · Agility · Balance · Stamina · NaturalFitness
The Simple Version

Phase 4 answers: "How does this player move, and how long can he keep it up?" This is where NHL EDGE tracking data really shines — we're using actual measured skating speeds and distances from the puck tracking cameras, not estimates. When the data says McDavid hit 24.3 mph, that's a real measurement, not a proxy.

Attributes Modeled

Acceleration Pace Agility Balance Stamina NaturalFitness

Key Data Signals

EDGE: Top skating speed & percentile
EDGE: Speed bursts ≥22 mph count
EDGE: Average skating speed
EDGE: Total skating distance per game
EDGE: Distance per shift
TOI-based workload metrics
BMI z-scores (Agility and Balance)

How It Works

Acceleration and Pace are directly driven by EDGE tracking data. Top speed feeds Pace (how fast can this player go?), while speed bursts (the number of times a player hits ≥22 mph) feed Acceleration (how explosive are their first few strides?). These are the most data-pure attributes in the entire pipeline: direct physical measurements rather than statistical inferences.

Stamina and NaturalFitness combine EDGE skating distance data (total distance per game, distance per shift) with traditional workload metrics (TOI, shifts per game). A player who covers more ground per shift and maintains consistent shift lengths deep into games rates higher on conditioning.

Agility and Balance are derived from BMI z-scores calculated from actual player height and weight data — and they pull in opposite directions. Lighter, smaller players trend toward higher Agility (more nimble); heavier, larger players trend toward higher Balance (harder to knock off the puck). This is why the pipeline hard-fails if height/weight data is missing. Per the EHM research notes, skating attributes don't significantly affect goalie performance, so goalies receive conservative ratings here with a goalie-specific Agility floor for athletically elite netminders.
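The opposite-direction behavior of Agility and Balance, and the hard failure on missing measurements, can be sketched like this. The metric units (cm, kg), the function names, and the decision to simply flip the sign of the BMI z-score are assumptions for illustration; the pipeline's actual scaling into 1–20 ratings is not shown.

```python
from statistics import mean, pstdev


def bmi(height_cm, weight_kg):
    """Body mass index in kg/m^2; raises if either measurement is missing."""
    if not height_cm or not weight_kg:
        raise ValueError("height and weight are required for Agility/Balance")
    return weight_kg / (height_cm / 100.0) ** 2


def agility_balance_signals(players):
    """players: name -> (height_cm, weight_kg) for ONE position group."""
    bmis = {name: bmi(h, w) for name, (h, w) in players.items()}
    values = list(bmis.values())
    mu, sd = mean(values), pstdev(values)
    signals = {}
    for name, value in bmis.items():
        z = 0.0 if sd == 0 else (value - mu) / sd
        # Same z-score, opposite signs: low BMI helps Agility, high BMI helps Balance.
        signals[name] = {"agility_z": -z, "balance_z": z}
    return signals


# Hypothetical measurements.
example = agility_balance_signals({"lighter": (178, 80), "heavier": (193, 105)})
```

Raising an exception instead of substituting a default mirrors the hard-fail behavior: a silently imputed height or weight would push both attributes toward the middle without anyone noticing.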

EDGE Data Coverage

EDGE tracking data is available for most NHL players but not all — some players (especially those called up mid-season or with very few games) may be missing. When EDGE data is unavailable, Phase 4 falls back to TOI-based proxy metrics with a reliability penalty. The edge_bulk_collect.py script and patch merge utility exist specifically to maximize EDGE coverage.

Phase 5: Cognitive & Offensive Skills
Anticipation · Decisions · Creativity · Passing · Movement · Stickhandling · Wristshot · Slapshot · Deking · Deflections · Flair · OffensiveRole · PassTendency
The Simple Version

Phase 5 answers: "How does this player create offense and make plays?" This is the biggest and most analytically complex phase — it shapes playmaking instincts, attacking awareness, shooting technique, and on-ice creativity. This phase received the most dramatic vNext improvements: z-score mapping, Bayesian shrinkage, and the elite tail boost all landed here.

Attributes Modeled

Anticipation Decisions Creativity Passing Movement Stickhandling Wristshot Slapshot Deking Deflections Flair OffensiveRole PassTendency

Key Data Signals

Evolving Hockey: Primary Assists/60
Evolving Hockey: Individual xG/60
Evolving Hockey: relative xGF/60 and xGA/60
RAPM: xGF/60 (offensive impact)
RAPM: xGA/60 (defensive context)
LB Hockey: Transition & Entry Volume
LB Hockey: Puck Management & OZ Pressure
LB Hockey: Zone Offense/Defense ratings
MoneyPuck: Shot type conversion rates
EDGE: Shot speed (release velocity)
EDGE: Shot location (dangerous area %)
HockeyStatCards: Offensive/Defensive ratings

How It Works

Cognitive attributes (Anticipation, Decisions, Creativity) are the hardest to model from stats alone, because you can't directly measure "hockey sense" or "vision." Creativity in EHM is effectively the vision/playmaking-IQ attribute, and the pipeline uses proxy signals to capture it: expected goals differentials (if a player consistently creates more xGF than average, they're making good decisions), transition play grades (players who successfully carry the puck through the neutral zone demonstrate anticipation), and relative on-ice metrics (players who make their teammates better show creative playmaking). When Evolving Hockey and LB Hockey both flag a player as elite in transition and zone offense, that's a Creativity signal even when the box score doesn't show it.

Passing is anchored to distribution impact: Primary Assists/60, assist ratios, and LB Hockey's Puck Management and OZ Pressure grades. It's deliberately separated from shot volume — a great passer creates quality looks for teammates, which shows up in assist rates and team expected goals rather than personal shot counts.

Shooting attributes (Wristshot, Slapshot) use Bayesian-shrunk shot-type conversion rates from MoneyPuck, supplemented by EDGE shot speed data. This is where the Bayesian shrinkage improvement matters most — the old pipeline would give a player who scored on 1 of 2 wrist shots a massive Wristshot rating. Now, low-sample players are shrunk toward the league average, and only players with substantial evidence of elite finishing earn 18+ ratings.

Flair is offensive-only and specifically tied to franchise-level talent (PA ≥ 180). It is not modeled for goalies.

vNext Changes

Z-score mapping replaced percentile-rank mapping, creating realistic bell curves instead of flat distributions. Bayesian shrinkage was added for shot-type percentages to prevent small-sample inflation. Elite tail boosts (PA-gated) ensure franchise talents separate from very good players. Sprint 4a wired in 20+ new signals from Evolving Hockey individual rates, RAPM, and LB Hockey context grades.

Guardrails

Phase 5 has the most extensive guardrail system in the pipeline:

Key/Essential floors — attributes flagged as KEY for a player's role have a floor of 14; ESSENTIAL attributes floor at 13.
PA floors — even non-Key/Essential attributes have minimum ratings scaled to PA (a high-PA player won't have a 3 in anything).
Defensive-archetype soft cap — shutdown specialists are capped at 16 for pure offensive attributes like Passing and Creativity, preventing noise from creating false elite passers.
Elite tail boost — PA-gated, role-gated, only for players already above average.
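The floor and cap guardrails listed above can be combined into one small function. The order of operations (cap first, then floors) and the caller-supplied `pa_floor` are assumptions of this sketch; the document gives the floor and cap values but not the PA-to-floor table or the exact sequencing.

```python
ROLE_FLOORS = {"KEY": 14, "ESSENTIAL": 13}
DEFENSIVE_ARCHETYPE_CAP = 16


def apply_guardrails(rating: int, tier: str = None, pa_floor: int = 1,
                     defensive_archetype: bool = False,
                     pure_offensive_attr: bool = False) -> int:
    """Apply the soft cap, then role and PA floors, then clamp to the 1-20 scale."""
    if defensive_archetype and pure_offensive_attr:
        rating = min(rating, DEFENSIVE_ARCHETYPE_CAP)
    rating = max(rating, ROLE_FLOORS.get(tier, 1), pa_floor)
    return max(1, min(20, rating))


print(apply_guardrails(11, tier="KEY"))                                        # role floor applies
print(apply_guardrails(18, defensive_archetype=True, pure_offensive_attr=True))  # soft cap applies
print(apply_guardrails(3, pa_floor=6))                                         # hypothetical PA floor
```

Because the Key and Essential floors (14 and 13) sit below the defensive-archetype cap (16), the two rules cannot contradict each other regardless of ordering; the order only matters for a PA floor above 16.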

Phase 5b: Career-Informed Mental Attributes
Consistency · ImportantMatches · Pressure · Determination · Adaptability · Professionalism
The Simple Version

Phase 5b answers: "What does this player's career history tell us about their character?" Single-season analytics can't tell you if a player elevates in the playoffs or crumbles under pressure. But a decade of career data — comparing playoff production to regular season, measuring season-to-season consistency, tracking career trajectory through league levels — can. This is an entirely new phase for vNext.

New for vNext

Phase 5b did not exist in previous releases. Mental attributes like Consistency, Important Matches, Pressure, and Determination were previously left at database defaults or set manually. These attributes are performance multipliers in the EHM engine — Consistency directly controls what percentage of games a player performs at their CA level. Getting them right has enormous gameplay impact.

Attributes Modeled

Consistency ImportantMatches Pressure Determination Adaptability Professionalism

How It Works

Consistency is modeled from detrended season-to-season PPG variance (skaters) or SV% variance (goalies). The "detrended" part is important — we remove the expected production arc (players improve through their early 20s and decline in their mid-30s) so we're measuring true inconsistency, not normal career development. Seasons are compared only within the same league tier to avoid cross-tier noise.

Critical EHM Mechanic

Consistency directly controls game-to-game performance: Rating 5 = performs to CA in only 25% of games. Rating 10 = 50%. Rating 15 = 75%. Rating 20 = 100%. A 160 CA player with Consistency 5 will produce like a fringe NHLer most nights. The pipeline defaults NHL regulars to 12–14, reserving sub-10 for documented extreme cases.

Important Matches measures the playoff delta: how much a player's production changes in the postseason versus the regular season. A player who goes from 0.8 PPG in the regular season to 1.1 PPG in the playoffs gets a positive delta. Importantly, the EHM scale treats 10 as "performs the same in big games"; above 10 means they elevate, not just maintain. Deltas are tier-weighted (NHL playoffs worth more than AHL playoffs) and require a minimum of 5 playoff GP.
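A minimal sketch of the playoff delta with its games-played gate follows. The tier weights are not documented, so `tier_weight` is a hypothetical parameter defaulting to 1.0, and returning `None` for players under the minimum is a design choice of this example.

```python
from typing import Optional


def playoff_delta(regular_ppg: float, playoff_ppg: float, playoff_gp: int,
                  tier_weight: float = 1.0, min_playoff_gp: int = 5) -> Optional[float]:
    """Weighted change in points per game from regular season to playoffs.

    Returns None when the playoff sample is below the minimum, so the caller
    can leave the attribute at its neutral value instead of trusting noise.
    """
    if playoff_gp < min_playoff_gp:
        return None
    return (playoff_ppg - regular_ppg) * tier_weight


print(playoff_delta(0.8, 1.1, playoff_gp=12))  # the player from the example
print(playoff_delta(0.8, 1.1, playoff_gp=3))   # too few playoff games
```

The 0.8 to 1.1 PPG player from the text comes out with a delta of about +0.3, while the same production over three playoff games is ignored.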

Pressure mirrors Important Matches for skaters (with additional Cup bonus and award signals). Goalies already have Pressure modeled in Phase 7, so Phase 5b skips them.

Determination combines Important Matches (45%), Elite Longevity signals (35%), and career path grinding (20%). Stanley Cup winners receive scaled bonuses: 1 Cup = +2, 2 Cups = +3, 3+ Cups = +4. Career award winners (Hart, Norris, Vezina, etc.) get cumulative bonuses with diminishing returns per award type.
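The weights and Cup bonuses above can be expressed in a few lines of Python. This sketch assumes the three inputs are already on a common scale and that the Cup bonus is added on top of the blended value; award bonuses are omitted because their sizes are not given.

```python
def cup_bonus(cups: int) -> int:
    """Scaled Stanley Cup bonus: 1 Cup = +2, 2 Cups = +3, 3 or more = +4."""
    if cups <= 0:
        return 0
    return min(cups, 3) + 1


def determination_score(important_matches: float, elite_longevity: float,
                        career_grind: float, cups: int = 0) -> float:
    """Blend the three career signals (45/35/20) and add the Cup bonus."""
    base = 0.45 * important_matches + 0.35 * elite_longevity + 0.20 * career_grind
    return base + cup_bonus(cups)


# Hypothetical component scores on a shared 1-20 style scale.
print(determination_score(12.0, 14.0, 10.0, cups=2))
```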

Adaptability is modeled from geographic diversity (number of distinct regions played in) and nationality signals. A European player who has played in multiple leagues across continents rates higher than someone who has only played in one system.

Professionalism uses career PIM rate and longevity as primary signals, correlated with the Sportsmanship attribute. Players with long careers and clean disciplinary records rate higher.

Phase 5b Also Adjusts Existing Attributes

Beyond creating new _model columns, Phase 5b applies bounded nudges (±3 max) to existing attributes based on career signals:

Anticipation & Decisions nudged by Consistency signal (consistently good players demonstrate sustained hockey sense)
WorkRate nudged by Consistency + IM signal (reliable grinders who show up every game)
Bravery nudged by Important Matches signal (players who elevate in playoffs demonstrate courage)
Teamwork nudged by Consistency signal (±2 max)
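A bounded nudge is just a clamp applied before the adjustment is added. The sketch below assumes the raw nudge has already been computed from the career signal; how that raw value is derived is not covered here, and the final clamp to 1–20 is an added safeguard.

```python
def apply_nudge(rating: int, raw_nudge: int, max_abs: int = 3) -> int:
    """Add a career-signal nudge to an existing rating, limited to ±max_abs."""
    nudge = max(-max_abs, min(max_abs, raw_nudge))
    return max(1, min(20, rating + nudge))


print(apply_nudge(12, raw_nudge=5))             # limited to +3
print(apply_nudge(12, raw_nudge=5, max_abs=2))  # Teamwork-style ±2 limit
```

Keeping the limit as a parameter lets the tighter ±2 bound for Teamwork reuse the same function.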

Phase 6: Faceoffs
Faceoffs
The Simple Version

Phase 6 is intentionally simple: faceoff ability is a standalone mechanical skill. A great faceoff man might be a terrible skater or a defensive liability — the dot is its own world. The phase uses Faceoff ELO ratings (which account for opponent quality) with position-specific ranges and a logistic curve for realistic separation.

How It Works

The primary signal is Faceoff ELO from hockeyeloratings.com — a more sophisticated measure than raw FO% because it accounts for who you're taking faceoffs against. A player who wins 52% against elite faceoff opponents is more impressive than one who wins 55% against weak competition.

Position-specific ranges ensure appropriate ratings:

Centers: 9–20 range, logistic curve mapping from ELO
Wingers: 4–15 range, softened shrinkage toward 7.2 baseline
Defensemen: 1–6 tight range (they rarely take faceoffs)

WorkRate and Stamina z-scores from earlier phases provide small modifiers — the idea being that faceoff consistency requires sustained effort and conditioning.
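To make the logistic mapping for centers concrete, here is a sketch that squeezes an ELO value into the 9–20 range. The `midpoint` and `scale` parameters are hypothetical placeholders, since the actual ELO scale and curve steepness are not documented; wingers and defensemen use different handling and are not covered.

```python
import math

CENTER_RANGE = (9, 20)


def center_faceoff_rating(elo: float, midpoint: float = 1500.0, scale: float = 100.0) -> int:
    """Map a faceoff ELO to the 9-20 center range with a logistic (S-shaped) curve."""
    low, high = CENTER_RANGE
    fraction = 1.0 / (1.0 + math.exp(-(elo - midpoint) / scale))  # 0..1
    return round(low + fraction * (high - low))


# Hypothetical ELO values around the assumed midpoint.
for elo in (1300, 1450, 1550, 1700):
    print(elo, center_faceoff_rating(elo))
```

The S-shape is what gives "realistic separation": ratings change quickly near the middle of the ELO distribution and flatten out toward the 9 and 20 ends.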

Phase 7: Goaltending
Reflexes · Positioning · Glove · Blocker · Recovery · Rebounds · OneOnOnes · Agility · Pressure · Stamina · WorkRate
The Simple Version

Phase 7 answers: "How good is this goalie, and what are they good at?" Goalie ratings are now the most analytically grounded they've ever been, drawing from five independent goalie-specific data ecosystems. The challenge with goalies is that small samples can swing wildly — a goalie who faces 10 high-danger chances and stops 9 looks elite, but 10 chances is noise. The entire phase is built around Bayesian smoothing and reliability weighting to prevent this.

Attributes Modeled

Reflexes Positioning Glove Blocker Recovery Rebounds OneOnOnes Agility Pressure Stamina WorkRate

Five Data Ecosystems

🏒
NHL.com Advanced

Sv%, GAA, SA/60, quality starts, rest splits, saves by strength, shootout stats, started vs. relieved data.

📈
MoneyPuck Goalies

GSAx (Goals Saved Above Expected), high-danger Sv%, expected saves model — the single best measure of goalie quality in modern analytics.

🎯
Evolving Hockey GAR

Goalie-specific Goals Above Replacement — how many goals this goalie prevented compared to a replacement-level netminder.

📡
EDGE Goalie Tracking

Save % by shot location (left, right, high danger zones), lateral movement metrics, positioning data from tracking cameras.

LB Hockey Goalie WAR

Wins Above Replacement with component breakdowns — another independent evaluation of goalie quality.

Goalie-Specific Math

Phase 7 uses specialized mathematical tools beyond the standard pipeline foundations:

Four-leg reliability — goalie reliability is computed from GP, starts, TOI, and a shots-against proxy (SA/60 × TOI). All four legs are blended to determine how much to trust the data. A goalie with 50 starts and 1,500+ shots against gets full trust; a goalie with 8 starts in relief gets heavy shrinkage.

Bayesian save% smoothing — for segment-specific save percentages (high-danger, glove side, blocker side), raw percentages are blended with the goalie's own overall Sv% as a prior. With 200–300 prior shots of weight, small-sample segments don't swing wildly.
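The save percentage smoothing can be sketched as below. The prior weight of 250 shots is picked from the middle of the 200–300 range mentioned above; the function name and the sample goalie numbers are illustrative.

```python
def smoothed_segment_sv(segment_saves: float, segment_shots: float,
                        overall_sv: float, prior_shots: float = 250) -> float:
    """Blend a segment save % with the goalie's own overall Sv% as the prior."""
    return (segment_saves + overall_sv * prior_shots) / (segment_shots + prior_shots)


# Hypothetical goalie: 9 saves on 10 high-danger shots, .905 overall.
raw = 9 / 10
smoothed = smoothed_segment_sv(segment_saves=9, segment_shots=10, overall_sv=0.905)
print(raw, round(smoothed, 4))
```

With only 10 shots in the segment, the smoothed value stays almost exactly at the goalie's overall .905, so a 9-for-10 stretch neither inflates nor punishes the rating.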

"Shrink to average+" — when evidence is weak, attributes shrink toward 0.55 (not 0.50). This reflects that NHL goalies as a pool should not cluster as "below average" — they're already the best in the world. The target is "average NHL goalie," which is a very good goalie.

Tail-only boost — after shrinkage, a gentle "tail-only boost" allows elite goalies to reach 19–20 ratings on core skills without inflating the entire population. Values below a pivot point barely move; values above get a lift proportional to how far above they already are.
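The "shrink to average+" and tail-only boost steps might be sketched as follows, working on a 0–1 score scale. The `pivot` and `gain` values are hypothetical, and this simplified version leaves values at or below the pivot completely unchanged rather than "barely" moving them.

```python
def shrink_to_average_plus(score: float, reliability: float, target: float = 0.55) -> float:
    """Pull weak-evidence scores toward 0.55 rather than the 0.50 midpoint."""
    return reliability * score + (1.0 - reliability) * target


def tail_only_boost(score: float, pivot: float = 0.80, gain: float = 0.5) -> float:
    """Lift scores above the pivot in proportion to their distance above it."""
    if score <= pivot:
        return score
    return min(1.0, score + gain * (score - pivot))


# Hypothetical goalies: one with little evidence, one established and elite.
backup = tail_only_boost(shrink_to_average_plus(0.30, reliability=0.2))
starter = tail_only_boost(shrink_to_average_plus(0.90, reliability=1.0))
print(round(backup, 3), round(starter, 3))
```

The low-reliability backup ends up near the 0.55 target instead of his noisy 0.30, while the fully trusted elite starter is the only one who gets any lift from the boost.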

Specific Attribute Logic

Positioning (the most important goalie attribute per research notes) is driven by EDGE shot-location save percentages, GSAx, and overall positioning metrics. Reflexes feeds from high-danger save percentage and GSAx. Glove and Blocker use EDGE left/right save percentage data with Bayesian smoothing. Stamina uses rest splits (how goalies perform on back-to-backs). OneOnOnes is fed by shootout save percentage. WorkRate for goalies uses workload metrics (starts, SA/60, games played pace) rather than the skater engagement formula.

vNext Change

The "uniform ladder" distribution bug was fixed — the old rank-based mapping produced roughly equal numbers of goalies at every rating level. Normal-CDF shaping now creates realistic distributions. Goalie Agility floor raised to ~12 with an 18–20 tail for the most athletic netminders. Flair confirmed as irrelevant for goalies.

Phase 7b Goalie Career Mentals
Consistency · ImportantMatches · Pressure · Determination · Leadership · Loyalty · Ambition · Professionalism · Adaptability
The Simple Version

Phase 7b is the goalie equivalent of Phase 5b: it uses multi-year career history to model mental and character attributes for goalies specifically. It runs after Phase 7 (goalie analytics) so that career-informed mentals layer on top of analytics-driven attributes, the same way skaters get analytics first (P2–P5) and career mentals afterwards (P5b).

New for vNext

Phase 7b did not exist in previous releases. Goalie mental attributes were set to defaults or inherited from the generic goalie model. The new phase mirrors the skater Phase 5b architecture but with goalie-specific signal handling — SV% coefficient of variation for Consistency, playoff SV% delta for Important Matches, and a Pressure blend that combines Phase 7 analytics (60%) with career signals (40%).

Attributes Modeled (Goalies Only)

Consistency · ImportantMatches · Pressure · Determination · Leadership · Loyalty · Ambition · Professionalism · Adaptability

How It Differs from Skater Mentals (5b)

Consistency uses SV% coefficient of variation rather than PPG CV. Goalie SV% is the appropriate metric, but it requires careful league filtering — many leagues don't report saves/GA reliably, so only data from confirmed-reliable leagues is used.

Important Matches uses playoff SV% delta (playoff SV% minus regular season SV%). A positive delta means the goalie raised their game in the postseason. Cup winners receive scaled bonuses, and Conn Smythe winners get additional recognition.

Pressure is unique among goalie mentals — it blends Phase 7's analytics-derived Pressure (60%) with career signals (40%). This reflects that goalie pressure is partly captured by current-season shot volume and situation data, and partly by career playoff history. A PA floor ensures high-potential goalies don't get underrated.

Leadership is conservatively modeled for goalies — captaincy is extremely rare for netminders, so the rating is capped at 14 unless the goalie has actual C/A history in their career data.
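
The Pressure blend and the Leadership cap described above are simple enough to sketch directly. The 60/40 weights and the cap of 14 come from the text; the function names are hypothetical, and because the text does not say how the PA floor is derived, it is passed in here as a ready-made value.

```python
LEADERSHIP_CAP_WITHOUT_LETTER = 14  # cap stated in the text


def blend_goalie_pressure(phase7_pressure, career_pressure, pa_floor=None):
    """60% Phase 7 analytics, 40% career signals, with an optional PA floor."""
    blended = 0.60 * phase7_pressure + 0.40 * career_pressure
    if pa_floor is not None:
        blended = max(blended, pa_floor)  # floor derivation is not documented here
    return blended


def cap_goalie_leadership(modeled_leadership, has_letter_history):
    """Hold goalies at 14 unless their career data shows C/A history."""
    if has_letter_history:
        return modeled_leadership
    return min(modeled_leadership, LEADERSHIP_CAP_WITHOUT_LETTER)
```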

Loyalty and Ambition are modeled in Phase 7b (unlike Phase 5b where they remained unmodeled for skaters), using tenure ratios, team changes, international breadth, and draft pedigree signals from the EP career data.

Phase 8 CA Suggestion & Role Engine
CA · Tier · Role Suggestion · EP Tags
The Simple Version

Phase 8 is the "quality inspector" at the end of the assembly line. After all individual attributes have been modeled, it steps back and asks: "Does this player's overall rating make sense?" It compares the full attribute profile to what the player's current CA implies, suggests adjustments, and also evaluates whether the player's assigned role matches their analytics profile. This is entirely new for vNext.

New for vNext

Phase 8 did not exist in previous releases. CA and roles were manually set. The new engine uses composite z-scores, position-pool percentiles, a 28-tier CA system, and an archetype scoring engine with 30+ skater role types and 3 goalie styles.

CA Suggestion Pipeline

Processing Order
Composite Z-scores → Percentile (per position pool) → Tier Assignment → Spread Within Tier → GP Confidence Blend → Box Score Production Floor (F only) → Career Floor (all positions) → Star Power Floor (named list) → Tier Label Re-derivation → CA Delta → Write to Master

The system uses a rate-first approach — per-60 and per-game metrics rather than counting stats, so injured players (like a Brayden Point or Connor Bedard with limited games) aren't penalized for missing time. Their per-minute impact drives the CA suggestion.

Deployment Floors

Players trusted by NHL coaching staffs receive minimum CA ratings regardless of analytics:

• Goalies with 30+ starts → minimum CA 130
• Goalies with 20+ starts → minimum CA 128
• Skaters with regular NHL deployment get position-specific floors

This reflects the reality that a fourth-liner playing 12 minutes a night in the NHL is there for a reason — physicality, penalty killing, locker room presence — that analytics may not fully capture.
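
The goalie floors listed above can be sketched as a small lookup. The two thresholds are taken from the list; the function name is hypothetical, and the skater floors are left out because their position-specific values are not given here.

```python
# (minimum starts, minimum CA), highest threshold first
GOALIE_START_FLOORS = [(30, 130), (20, 128)]


def apply_goalie_deployment_floor(suggested_ca, starts):
    """Raise a goalie's suggested CA to the floor earned by his start count."""
    for min_starts, floor_ca in GOALIE_START_FLOORS:
        if starts >= min_starts:
            return max(suggested_ca, floor_ca)
    return suggested_ca
```

The floor only ever raises a suggestion: a 40-start goalie already suggested at 150 stays at 150.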

Star Power System

A named list of franchise-caliber players receives explicit CA floors to prevent analytics from underrating historically elite talent during a down stretch. A player like Connor McDavid with a 200 CA shouldn't drop to 170 because of a 30-game sample during an odd season.

Skater Role Suggestion Engine

The engine scores every skater against 30+ archetype fingerprints — each defined by a set of weighted attribute expectations. For example, a "Sniper" archetype expects high Wristshot, Slapshot, and OffensiveRole with moderate Creativity; a "Grinder" expects high WorkRate, Hitting, and Checking with defensive orientation.

Elite Prospects tags provide an independent scouting layer. When EP scouts call someone a "Power Forward" and the analytics agree it's a reasonable fit, the system trusts that human observation. Tags add 5–15 point bonuses to matching archetype scores, with confidence scaling by tag count (1 tag = 60%, 2 tags = 85%, 3+ tags = 100%). Conflicting tags are resolved using goal share as a tiebreaker.
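
One way to read the confidence scaling above is as a multiplier on the tag bonus. The 60/85/100% steps and the 5–15 point range come from the text; multiplying the two together is an assumption, and both function names are hypothetical.

```python
def tag_confidence(tag_count):
    """Confidence by number of supporting EP tags: 1 = 60%, 2 = 85%, 3+ = 100%."""
    if tag_count <= 0:
        return 0.0
    if tag_count == 1:
        return 0.60
    if tag_count == 2:
        return 0.85
    return 1.0


def scaled_tag_bonus(base_bonus, tag_count):
    """Scale a 5-15 point archetype bonus by tag-count confidence (assumed rule)."""
    if not 5 <= base_bonus <= 15:
        raise ValueError("base bonus is expected to fall in the 5-15 point range")
    return base_bonus * tag_confidence(tag_count)
```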

Goalie Role Suggestion

Goalies have 3 possible EHM roles: Butterfly, Mixed, and Hybrid. The system scores each style using analytics signatures (positioning metrics, lateral movement, EDGE data) and EP goalie style tags (Athletic Goalie, Butterfly Goalie, Positional Goalie, etc.). EP tags receive a +15 bonus to the matching style score, reflecting that playing style is best observed by scouts rather than inferred from stats.

EP goalie tags also drive small attribute nudges: Athletic Goalie → Agility +1, Reflexes +1, Recovery +1. Positional Goalie → Positioning +1, Anticipation +1, Decisions +1. These are bounded scouting-intelligence nudges — small directional adjustments that analytics can't fully capture.
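
The nudges above map naturally onto a lookup table. The tag names and +1 values are from the text; clamping to the 1–20 attribute scale and the function name are assumptions made for this sketch.

```python
GOALIE_TAG_NUDGES = {
    "Athletic Goalie": {"Agility": 1, "Reflexes": 1, "Recovery": 1},
    "Positional Goalie": {"Positioning": 1, "Anticipation": 1, "Decisions": 1},
}


def apply_tag_nudges(attributes, tags):
    """Return a copy of the attributes with EP tag nudges applied, kept within 1-20."""
    nudged = dict(attributes)
    for tag in tags:
        for name, delta in GOALIE_TAG_NUDGES.get(tag, {}).items():
            if name in nudged:
                nudged[name] = max(1, min(20, nudged[name] + delta))
    return nudged


example = apply_tag_nudges(
    {"Agility": 14, "Reflexes": 20, "Recovery": 12, "Positioning": 15},
    ["Athletic Goalie"],
)
```

Tags without a nudge entry (for example "Butterfly Goalie") pass through unchanged in this sketch.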

CA Tier System (v2.1)

The CA tier system maps analytics-derived composite scores to specific Current Ability values. It's designed to create realistic separation between tiers while respecting the EHM engine's expectations for how a player at each CA level should perform.

Skater Tiers — Forwards

Tier | CA | Description
Franchise (High) | 200 | Generational talent, the absolute best in the world
Franchise (Med) | 190 | Perennial Hart Trophy contender
Franchise (Low) | 182 | Franchise cornerstone, top-5 at position
Elite (High) | 174 | All-Star caliber, top-line driver
Elite (Low) | 168 | Consistent first-line / top-pair impact
Top 6 (High) | 164 | Strong top-6 forward, quality 2nd liner
Top 6 (Mid) | 156 | Solid top-6 contributor
Top 6 (Low) | 148 | Fringe top-6 / strong 3rd liner
Bottom 6 (High) | 144 | Quality bottom-6, strong role player
Bottom 6 (Mid) | 140 | Reliable bottom-6 forward
Bottom 6 (Low) | 134 | Fourth-liner, limited but trusted
Replacement (High) | 128 | NHL/AHL tweener, 13th forward
Replacement (Mid) | 124 | Emergency callup, AHL regular
Replacement (Low) | 120 | AHL regular, NHL depth

Design Note

The v2.1 tier system deliberately has no "Top 9 Forward" label. Analysis showed this tier was misleading — in practice, the distinction between a low Top 6 player and a high Bottom 6 player is better captured by the continuous spread within the 134–164 CA range. Defenseman tiers mirror forward tiers with appropriate label changes (Top Pair, Bottom Pair).

Career Floor System

The career floor prevents analytics from underrating established players during a single bad season. Maximum drops from established CA are age-relaxed:

Established Level | Base Max Drop | Age 33+ | Age 35+ | Age 37+
Franchise (≥182) | −8 | −10 | −12 | −14
Elite (≥170) | −12 | −15 | −18 | −21
Top 6 / Top Pair (≥150) | −16 | −20 | −24 | −28
Established (≥130) | −20 | −25 | −30 | −35

Example: A Franchise player (CA 190) under age 33 can drop to 182 minimum, no matter how bad the analytics say this season was. After 37, that floor loosens to 176, allowing for legitimate late-career decline.
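
The table and example above translate into a short lookup. The bands and drops are copied from the table; the function name is hypothetical, and returning `None` for players below CA 130 is an assumption, since the table does not cover them.

```python
# (minimum established CA, max drop at: base, age 33+, age 35+, age 37+)
CAREER_FLOOR_BANDS = [
    (182, (8, 10, 12, 14)),   # Franchise
    (170, (12, 15, 18, 21)),  # Elite
    (150, (16, 20, 24, 28)),  # Top 6 / Top Pair
    (130, (20, 25, 30, 35)),  # Established
]


def career_floor(established_ca, age):
    """Lowest CA the analytics may suggest for an established player."""
    if age >= 37:
        column = 3
    elif age >= 35:
        column = 2
    elif age >= 33:
        column = 1
    else:
        column = 0
    for min_ca, drops in CAREER_FLOOR_BANDS:
        if established_ca >= min_ca:
            return established_ca - drops[column]
    return None  # below 130: no career floor defined in the table
```

`career_floor(190, 28)` gives 182 and `career_floor(190, 37)` gives 176, matching the worked example.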

Role Suggestion Engine

Roles matter in EHM — they affect which attributes the engine prioritizes for a player's performance calculations, which deployment slots the AI coach considers, and how the player develops over time. A misassigned role can cost a player substantial effective rating points.

How Skater Roles Are Assigned

The engine scores every skater against 30+ archetype fingerprints. Each archetype defines a "signature" — a set of attribute expectations with weights. The player's modeled attributes are compared against each archetype's signature, producing a fit score. The highest-scoring archetype becomes the suggested role.
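
A toy version of the fit-score idea, assuming the score is a weighted average of the attributes an archetype cares about. The two signatures borrow attribute names from the Sniper and Grinder descriptions earlier in this section, but every weight below is made up for illustration, and the real engine's scoring formula may differ.

```python
# Hypothetical signatures: attribute -> weight (illustrative values only)
ARCHETYPES = {
    "Sniper": {"Wristshot": 3.0, "Slapshot": 3.0, "OffensiveRole": 3.0, "Creativity": 1.5},
    "Grinder": {"WorkRate": 3.0, "Hitting": 3.0, "Checking": 3.0},
}


def fit_score(attributes, signature):
    """Weighted average of the attributes named in an archetype signature."""
    total_weight = sum(signature.values())
    weighted = sum(attributes.get(name, 0) * weight for name, weight in signature.items())
    return weighted / total_weight


def suggest_role(attributes):
    """Score every archetype and return (best role, all scores)."""
    scores = {role: fit_score(attributes, sig) for role, sig in ARCHETYPES.items()}
    return max(scores, key=scores.get), scores


shooter = {"Wristshot": 18, "Slapshot": 16, "OffensiveRole": 17, "Creativity": 12,
           "WorkRate": 10, "Hitting": 8, "Checking": 9}
role, scores = suggest_role(shooter)
```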

Archetypes span the full spectrum of hockey roles:

🎯
Offensive Archetypes

Sniper, Playmaker, Playmaker (Finesse), Offensive, Power Forward, Dangler, and their sub-variants. These emphasize Wristshot, Slapshot, Passing, Stickhandling, Creativity, and OffensiveRole.

🛡️
Defensive Archetypes

Defensive, Defensive (Finesse), Two-Way, Two-Way (Physical), Checking, Shutdown. These emphasize Positioning, Checking, Pokecheck, DefensiveRole, and WorkRate.

💪
Physical Archetypes

Grinder, Enforcer, Power Forward (Physical), Energy. These emphasize Hitting, Aggression, Fighting, Bravery, and WorkRate.

⚖️
Balanced Archetypes

All Around, Standard, Utility. These balance offensive and defensive attributes without strong emphasis in either direction.

Elite Prospects Tag Integration

EP tags serve as scouting intelligence on top of the analytics. When EP's scouting team labels a player with a specific type tag, the system uses it in two ways:

Direct role mapping — tags like "Sniper," "Power Forward," or "Defensive Center" map directly to an EHM archetype. If the analytics confirm it's a reasonable fit (within 12 points of the best score), the EP-suggested role wins.
Soft score boosts — tags that imply a style without naming a specific role (e.g., "Heavy Shooter," "Offensive Forward") add 5–15 point bonuses to matching archetype scores.

Conflicting tags (e.g., "Sniper" + "Playmaker" on the same player) are resolved using goal share as the primary signal — if a player scores more than 45% of their points as goals, the scorer tag wins.
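
A sketch of the two decision rules above. The 12-point margin and the 45% goal-share cutoff come from the text; the function names are hypothetical, treating "within 12 points" as inclusive is an assumption, and so is falling back to the playmaker tag when a player has no points.

```python
EP_ACCEPT_MARGIN = 12.0    # EP role wins if within 12 points of the best score
GOAL_SHARE_CUTOFF = 0.45   # share of points scored as goals


def resolve_tag_conflict(goals, points, scorer_tag="Sniper", passer_tag="Playmaker"):
    """Pick between conflicting tags using goal share."""
    if points <= 0:
        return passer_tag  # assumption: no points means no evidence for the scorer tag
    return scorer_tag if goals / points > GOAL_SHARE_CUTOFF else passer_tag


def pick_role(archetype_scores, ep_role=None):
    """Prefer the EP-mapped role when the analytics consider it a reasonable fit."""
    best_role = max(archetype_scores, key=archetype_scores.get)
    if ep_role in archetype_scores:
        gap = archetype_scores[best_role] - archetype_scores[ep_role]
        if gap <= EP_ACCEPT_MARGIN:
            return ep_role
    return best_role
```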

Deployment Compatibility

The engine includes a deployment compatibility layer that prevents role suggestions from breaking EHM's AI coaching logic. For example, tagging a power play specialist as "defensive" would make OffensiveRole irrelevant to the AI coach, meaning the player would never see PP time. The system adjusts scores after archetype scoring to ensure suggested roles are compatible with real deployment patterns.

NHL/AHL Merge Logic

When a player appears in both the NHL and AHL modeled outputs (think a callup who played 25 games in each league), the pipeline needs to decide which set of ratings to use. This sounds simple but turned out to be one of the most critical — and most buggy — parts of the pipeline.

Critical Bug Fixed in vNext

The previous merge script had a silent bug: the AHL GP column had been renamed during an earlier schema cleanup (from ahl_onice_5v5__games_played to ahl_onice__games_played), so the lookup no longer matched and all AHL GP resolved as zero. This meant NHL data always won for overlap players, regardless of actual games played. The bug was invisible because the "correct" answer (NHL wins) happened to be right most of the time by coincidence.
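
This class of bug is easy to reproduce with plain dictionaries. The two column names are the ones quoted above; everything else (the helper names, the row layout) is hypothetical, and the real merge script may resolve columns differently.

```python
OLD_AHL_GP = "ahl_onice_5v5__games_played"  # name before the schema cleanup
NEW_AHL_GP = "ahl_onice__games_played"      # name after the schema cleanup


def read_gp_lenient(row, column):
    """The failure mode: a missing column quietly becomes 0 games played."""
    return int(row.get(column, 0))


def read_gp_strict(row, column):
    """Fail loudly when the expected GP column is absent."""
    if column not in row:
        raise KeyError(f"GP column {column!r} not found; available: {sorted(row)}")
    return int(row[column])


row = {"player": "Example Skater", NEW_AHL_GP: "30"}
silent_zero = read_gp_lenient(row, OLD_AHL_GP)  # stale name, no error raised
actual_gp = read_gp_strict(row, NEW_AHL_GP)
```

A strict lookup turns a renamed column into an immediate failure instead of a plausible-looking zero.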

Quality-Preferred Merge (vNext Default)

Even with the bug fixed, the old "highest GP wins" logic was wrong in principle. A player with 13 NHL games has data from 20+ independent analytical models (MoneyPuck xG, Evolving Hockey GAR/RAPM, EDGE tracking, HockeyStatCards, LB Hockey WAR). A player with 30 AHL games has data from 4–7 AHLTracker categories with no xG models, no isolated impact metrics, and no tracking data. The NHL sample is objectively superior in analytical depth.

Quality-Preferred Decision Logic
// Skater overlap:
IF NHL_GP ≥ 10 → NHL wins
ELIF NHL_GP > 0 AND NHL_GP ≥ AHL_GP → NHL wins (tiebreak)
ELSE → AHL wins

// Goalie overlap:
IF NHL_GP ≥ 5 → NHL wins
ELIF NHL_GP > 0 AND NHL_GP ≥ AHL_GP → NHL wins (tiebreak)
ELSE → AHL wins

Validation against the current data: 137 overlap players → 117 NHL wins, 20 AHL wins (18 skaters with 8–9 NHL GP, 2 goalies with 4 NHL GP). Every overlap decision is logged with the reasoning for easy auditing.
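
The decision logic can be written as one small Python function. The thresholds (10 for skaters, 5 for goalies) and branch order follow the logic shown above; the function name and the wording of the logged reason are illustrative.

```python
def overlap_winner(nhl_gp, ahl_gp, is_goalie):
    """Return (winning source, reason) for a player present in both outputs."""
    threshold = 5 if is_goalie else 10
    if nhl_gp >= threshold:
        return "NHL", f"NHL GP {nhl_gp} >= {threshold}"
    if nhl_gp > 0 and nhl_gp >= ahl_gp:
        return "NHL", f"tiebreak: NHL GP {nhl_gp} >= AHL GP {ahl_gp}"
    return "AHL", f"NHL GP {nhl_gp} is under {threshold} and loses the tiebreak"


# A skater with 13 NHL games and 30 AHL games keeps his NHL ratings.
winner, reason = overlap_winner(13, 30, is_goalie=False)
```

Returning the reason alongside the winner keeps every overlap decision auditable.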

AHL Pipeline

The AHL pipeline follows the same principles as the NHL pipeline but with a critical constraint: the data is much thinner. Where NHL players have 20+ independent analytical sources, AHL players have essentially one — AHLTracker.com. There are no expected goals models, no RAPM, no EDGE tracking data, and no WAR frameworks at the AHL level.

This constraint drives every design decision in the AHL pipeline. The system must produce plausible AHL-level ratings without letting limited data create false precision or accidentally NHL-grade attributes.

Six-Layer Guardrail System

The AHL skater model uses an aggressive multi-layered approach to prevent unrealistic attribute inflation:

1. Reliability shrink — uses GP (AHL master has no TOI) with sqrt scaling so low-sample players don't swing wildly
2. Role contracts — define per-role Key/Essential/Irrelevant importance, applied in score space
3. PA banding — enforces AHL-specific ceilings and floors to create separation even when league data is limited
4. AHL technical ceiling enforcement — squish + evidence-gated caps + hard caps prevent AHL technicals from rolling NHL-grade
5. League-specific caps — ensure AHL ratings stay below NHL equivalents for comparable percentile performance
6. GP-proportional weighting — prevents players with minimal AHL time from getting full-confidence ratings
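
As an illustration of the first guardrail, a square-root reliability curve on games played might look like the following. The sqrt shape is from the list above; the 60-game full-trust point and the function name are hypothetical.

```python
import math


def gp_reliability(games_played, full_trust_gp=60):
    """Reliability in [0, 1] that grows with the square root of games played."""
    if games_played <= 0:
        return 0.0
    return min(1.0, math.sqrt(games_played / full_trust_gp))
```

With these placeholder numbers a 15-game player gets 0.5 reliability rather than 0.25, so small samples are trusted less than full seasons but are not discarded outright.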

AHL Goalie Model

AHL goalies are even more data-limited than skaters — the primary inputs are basic Sv%, GAA, GP, and limited strength-state splits from AHLTracker. The model uses the same Bayesian smoothing and shrink-to-average+ approach as Phase 7, but with wider confidence intervals and more conservative rating bounds.

AHL Phase 3: CA & Role Suggestion

New for vNext

The AHL pipeline now has its own dedicated CA suggestion and role engine, mirroring NHL Phase 8 but adapted for AHL data availability. This ensures AHL players get analytically-informed CA suggestions and role assignments rather than relying entirely on manual settings.

AHL Phase 3 produces CA suggestions for both skaters and goalies, role suggestions for skaters (12 archetypes — a streamlined subset of the NHL's 30+), and goalie style suggestions (Butterfly/Mixed/Hybrid) with EP tag integration. The same Elite Prospects scouting tags used in the NHL pipeline flow through to AHL players who have coverage.

The key difference from NHL Phase 8 is that AHL composite scores are built from fewer signal pillars — the absence of xG models, RAPM, and tracking data means the system relies more heavily on traditional rate stats and PA banding for tier placement. AHL-specific tier tables ensure ratings stay calibrated for the minor-league context.

Utility & QA Scripts

Beyond the core modeling phases, the pipeline includes several utility and quality assurance scripts that support data collection, pre-processing, validation, and visualization.

Data Collection & Pre-Processing

📡
edge_bulk_collect.py

Bulk-collects NHL EDGE tracking data from public endpoints across all teams. Supports team mode (full roster), player mode (individual patches for missing data), and all-teams mode (baseline collection). Produces merge-ready CSV files for all six EDGE metric categories with rate-limited API calls and deduplication.

🔗
merge_edge_patch_into_all_teams.py

Cleanly inserts missing player EDGE data into the ALL_TEAMS baseline files. Uses full-replacement merge logic — patched player IDs completely replace any existing rows, preventing duplicate or stale data. Essential for players added mid-season or with initially missing tracking data.
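
Full-replacement merging can be sketched in a few lines. This is a simplified stand-in working on lists of dictionaries; the `player_id` key and the function name are assumptions, not the script's actual interface.

```python
def merge_patch(baseline_rows, patch_rows, id_key="player_id"):
    """Drop every baseline row for a patched player, then append the patch rows."""
    patched_ids = {row[id_key] for row in patch_rows}
    kept = [row for row in baseline_rows if row[id_key] not in patched_ids]
    return kept + list(patch_rows)


baseline = [{"player_id": 1, "top_speed": 20.1}, {"player_id": 2, "top_speed": 21.0}]
patch = [{"player_id": 2, "top_speed": 22.5}, {"player_id": 3, "top_speed": 19.8}]
merged = merge_patch(baseline, patch)
```

Because all of a patched player's old rows are removed first, stale values cannot survive next to the new ones.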

📋
career_signals_prep.py

Derives career signals from Elite Prospects raw player stats (EP2EHM format). Produces ~55 columns including captaincy history, tournament-level international detail, detrended PPG variance, playoff deltas, loyalty/ambition signals, and award bonuses. Consumed by Phase 5b and Phase 7b for career-informed mental modeling.

🏷️
ep_style_integration.py

Processes Elite Prospects player style tags and maps them to EHM archetype affinities. Handles tag conflict resolution (e.g., "Sniper" + "Playmaker" resolved via goal share), confidence scaling by tag count, and direct role overrides. Produces the EP tag data consumed by Phase 8 and AHL Phase 3.

🌐
capwages_depth_chart_collect.py

Collects current depth chart data from CapWages.com for all 32 NHL teams. Provides independent deployment context (line assignments, PP/PK units) used for validating role assignments and deployment floor decisions.

Pipeline Orchestration

⚙️
run_pipeline.py

Orchestrates the full pipeline execution. Runs the correct sequence of scripts with proper input/output chaining and stdin piping for interactive prompts. Supports running NHL only, AHL only, or both tracks plus final merge. Includes --dry-run mode for testing and --start-from for resuming after failures.

Execution Order
NHL Track: Build → P2 → P3 → P4 → P5 → P5b → P6 → P7 → P7b → P8
AHL Track: Build → Skaters → Goalies → Phase 3 (CA/Role)
Final: NHL Master + AHL Master → Quality-Preferred Merge → Combined Output
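
To show how resuming might work, here is a small sketch built on the NHL track order above and the two flags named in the script description. The stage labels are taken from the execution order; the function and its behaviour are illustrative and not the orchestrator's real implementation.

```python
import argparse

NHL_TRACK = ["Build", "P2", "P3", "P4", "P5", "P5b", "P6", "P7", "P7b", "P8"]


def plan_nhl_stages(argv):
    """Return (stages to run, dry_run flag) for a list of CLI arguments."""
    parser = argparse.ArgumentParser(description="Sketch of a resumable stage plan")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--start-from", choices=NHL_TRACK, default=NHL_TRACK[0])
    args = parser.parse_args(argv)
    # Resume by slicing the ordered stage list at the requested stage.
    stages = NHL_TRACK[NHL_TRACK.index(args.start_from):]
    return stages, args.dry_run


stages, dry_run = plan_nhl_stages(["--start-from", "P7", "--dry-run"])
```

Keeping the stage order in one list means a resume can never run phases out of sequence.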

Quality Assurance & Visualization

🔍
qa_attribute_diff_viewer.py

Generates an interactive HTML viewer showing every player's database attribute values side-by-side with modeled values. Color-coded delta cells, filterable by Role, Club, and NHL/AHL source, with search-by-name. Now also integrated directly into the final merge script for automatic generation. Mental attributes (Phase 5b/7b) are displayed in a separate group.

📊
qa_role_attribute_crosstab.py

Diagnostic QA that cross-tabulates role archetypes against rating buckets (Low/Mid/High/Elite) for every modeled attribute. Identifies role types landing in unexpected ranges — Grinders with Elite Wristshot, Snipers with Low Passing — that indicate potential role mismatches or formula issues. Outputs console summaries plus full CSV crosstabs and outlier reports.

📋
generate_depth_charts.py

Reads the final merged master and generates a self-contained interactive HTML depth chart for all 32 NHL teams. Players sorted by CA within positional columns (LW/C/RW, LD/RD, G) with click-to-expand attribute panels, color-coded 1–20 values, Evolving Hockey GAR summaries, and team color theming. Optionally includes AHL farm system players when provided with AHL master data.

Draft Class System

Every NHL draft class in the database, from the upcoming 2026 class through early scouting on 2030, is built using a structured, multi-source scouting pipeline. It assigns Potential Ability (PA) tiers based on real-world prospect rankings, position-aware fill rules, and a League Signal Bonus system that incorporates CHL draft history, junior league rights, and USNTDP status.

The Simple Version

Every draft class needs the right mix of franchise talents, first-rounders, mid-rounders, and filler prospects — with realistic position balance. We cross-reference multiple scouting sources (NHL Central Scouting, Elite Prospects, Scott Wheeler, HFBoards community), validate each prospect's eligibility against the September 15 birth cutoff, then assign PA tiers using a composite ranking that rewards players appearing on multiple lists. Players drafted by CHL teams or selected for the USNTDP get an additional credibility boost.

Scouting Source Hierarchy

Sources are ranked in strict descending priority. When filling tier slots, players backed by higher-priority sources are always selected first. Players appearing on multiple independent lists receive a composite score boost — if both Central Scouting and Elite Prospects rank you, that's significantly more credible than a single-source mention.

Priority | Source | Description
1 | CSS (Central Scouting) | NHL Central Scouting midterm/final rankings. Separate lists for North American skaters, North American goalies, International skaters, and International goalies. The gold standard.
2 | EP (Elite Prospects) | EP Consolidated rankings including consensus lists, top prospect lists, and other notable mentions. Broad international coverage.
3 | HFB (HFBoards) | Community scouting mentions, ranked lists, and discussion threads. User-curated lists carry extra weight.
4 | Lines (DobberProspects/Lines) | Draft-eligible prospect rankings from independent scouting content.
5 | HP (Hockey Prospects) | Additional prospect ranking coverage.
6 | NZ (NHL Numbers/Zone) | Analytics-leaning prospect evaluation.
7 | TSN (The Scouting News) | Largest single-source database with letter grades (A+ through D). Provides the bulk of the mid-to-late round population.
8 | HTML-DB (Database Default) | Players already in the EHM database. Restricted: never placed at -8 or above; displaced by TSN for ≤2027 classes.

PA Tier Structure

Each draft class follows hard position-distribution requirements across six tiers. The PA values use EHM's negative-PA system, where the game engine resolves each negative value into a randomized range at game start — this creates natural variance where some -19 prospects become stars and others become solid NHLers.

PA | Label | Range Resolved | Per Class | Description
-10 | Franchise | 170–200 | 2 | Generational/franchise talent. Tight range ensures these prospects reliably become elite NHLers.
-20 | Wide Elite | 150–200 | 5 | Could be generational or merely elite; the wider range models true top-5 draft uncertainty.
-19 | High 1st Round | 130–180 | 35 | High first-round picks with star upside. Ceiling allows franchise outcomes, floor keeps them as NHL regulars.
-8 | 1st Round | 130–160 | 26 | Solid first-rounders. Lower ceiling than -19 reflects "safe pick" profile: good NHL player, not a star.
-17 | Mid Rounds | 90–140 | 342 | The bulk of the class. Range allows both bust (90 = minor leaguer) and surprise breakout (140 = quality NHLer).
-12 | Filler | 40–100 | All remaining | Undrafted/unranked prospects. Max out at AHL level, providing realistic draft depth.

Non-filler total: 410 players per class. Position targets are hard requirements — the fill algorithm enforces exact counts per position at each tier (e.g., -19 must have exactly 4G, 6LD, 6RD, 6LW, 7C, 6RW). PA is always resolved as the maximum of the range, per project standards.

Tier Assignment Process

The fill algorithm works top-down with position awareness:

1. Eligibility filter — remove players born after September 15 of the primary birth year (e.g., for 2026 draft: born after Sept 15, 2008 → goes to 2027 class). This caught 106 players in the 2026 class audit alone, including top 2027 prospects like Dima Zhilkin and Jaxon Jacobson.
2. Deduplication — merge accent/spelling variants (e.g., "Björck" + "Bjorck" → single entry with combined source tags).
3. Force-assign franchise players (user-specified, e.g., McKenna/Stenberg = -10 for 2026).
4. Composite ranking — sort by source priority, with TSN grade and random tiebreaking for same-priority players.
5. Position-aware fill — for each tier, fill each position's slots from the sorted pool until targets are met. Goalies are promoted from deeper in the pool since EP Consolidated notoriously underranks netminders.
6. Validate HTML-DB restrictions and assign all remaining to -12 filler.
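
Steps 1 and 2 above lend themselves to a short sketch. It assumes class placement depends only on the September 15 cutoff and that duplicates differ only by accents or letter case, so true spelling variants would need extra handling. All names here are hypothetical.

```python
import unicodedata
from datetime import date


def first_eligible_draft_year(birth_date):
    """Born on or before Sept 15 -> birth year + 18; born later -> birth year + 19."""
    on_or_before_cutoff = (birth_date.month, birth_date.day) <= (9, 15)
    return birth_date.year + (18 if on_or_before_cutoff else 19)


def name_key(name):
    """Accent-insensitive, case-insensitive key for matching name variants."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def dedupe(prospects):
    """Merge (name, sources) entries that share a key, combining their source tags."""
    merged = {}
    for name, sources in prospects:
        key = name_key(name)
        if key in merged:
            merged[key][1].update(sources)
        else:
            merged[key] = (name, set(sources))
    return list(merged.values())


combined = dedupe([("Björck", {"CSS"}), ("Bjorck", {"EP"})])
```

A player born on September 16, 2008 resolves to the 2027 class, which is the case the eligibility audit was catching.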

League Signal Bonus

Beyond scouting rankings, the system incorporates an independent signal: how professional hockey organizations value each prospect. A player rated A- by TSN who also went Round 1 in the OHL Priority Selection has significantly more institutional validation than an identically-graded player with no draft history.

Each player receives one league signal tier — the highest applicable, no stacking:

Tier | Criteria | Bonus | Impact
USNTDP | USA Hockey NTDP Juniors | +3 | National-level selection by USA Hockey, equivalent to CHL Rd 1
CHL Draft Rd 1 | OHL/WHL/QMJHL/Import Rd 1 | +3 | A CHL franchise invested a premium pick on this player
CHL Draft Rd 2–3 | OHL/WHL/QMJHL/Import Rd 2–3 | +2 | Meaningful draft capital spent
CHL Draft Rd 4–6 | OHL/WHL/QMJHL Rd 4–6 | +1.5 | Mid-round CHL draft investment
CHL Draft Rd 7+ | OHL/WHL/QMJHL Rd 7+ | +1 | Late-round CHL selection
Major Junior Rights | OHL/WHL/QMJHL/USHL rights (not drafted) | +1 | A junior team listed this player on their rights roster
AAA Junior Rights | BCHL/AJHL/NAHL/CCHL/OJHL etc. | +0.5 | Lower-tier junior validation

The bonus adds to the player's effective source count, which feeds the composite ranking algorithm. A TSN-only player (1 source) with a CHL Rd 1 draft effectively becomes a 4-source player — likely bumping from -12 filler to -17 mid-round territory. This is not a direct PA adjustment; it flows through the existing ranking pipeline as additional source credibility.
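
The "highest applicable, no stacking" rule can be sketched as a max over a lookup table. The bonus values are copied from the table above (tier labels use plain hyphens here); the function name is hypothetical.

```python
LEAGUE_SIGNAL_BONUS = {
    "USNTDP": 3.0,
    "CHL Draft Rd 1": 3.0,
    "CHL Draft Rd 2-3": 2.0,
    "CHL Draft Rd 4-6": 1.5,
    "CHL Draft Rd 7+": 1.0,
    "Major Junior Rights": 1.0,
    "AAA Junior Rights": 0.5,
}


def effective_source_count(source_count, signals):
    """Add only the single highest league signal bonus to the source count."""
    bonus = max((LEAGUE_SIGNAL_BONUS[s] for s in signals), default=0.0)
    return source_count + bonus
```

A TSN-only player with a CHL Rd 1 selection comes out at 4.0, matching the example above, and holding a second, weaker signal changes nothing.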

Draft Class Coverage

🏆
2026 Class

Fully scouted with CSS midterm rankings, EP consolidated, Scott Wheeler, and multiple community sources. 2,538 total prospects, 410 non-filler. Updated with CSS Mid-Season Players to Watch and EP Players to Watch lists. 106 late-2008 births correctly moved to 2027.

📋
2027–2028 Classes

Early notables set from EP scouting lists and emerging community consensus. Top 4 tiers populated (-10 through -8); mid-round fill from TSN grades with League Signal Bonus differentiation. Position-flexible forward assignments for undeclared positions.

🔭
2029–2030 Classes

Early scouting phase. 2029 has top 4 tiers set with Prospect Generator populating remaining slots. 2030 is a complete 50/50 non-filler prospect database (529 total players) with tier-based color-coded import Excel. Both delivered via full 2026–2030 dark-themed HTML dashboard (1,656 players).

Nationality Targets

Each draft class targets approximately 66% North American / 33% European+Other representation, reflecting real-world NHL draft composition. For international players not naturally surfacing through rankings, an international scoring formula identifies Europeans who deserve inclusion:

International Surfacing Formula
Score = IntlApps × 8 + (CA − 30) × 0.8 + league_tier_bonus
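
As a worked instance of the formula, the snippet below evaluates it for made-up inputs. The coefficients come from the formula above; the example values and the size of the league tier bonus are invented, since the bonus table is not given here.

```python
def international_score(intl_apps, ca, league_tier_bonus=0.0):
    """Score = IntlApps x 8 + (CA - 30) x 0.8 + league_tier_bonus."""
    return intl_apps * 8 + (ca - 30) * 0.8 + league_tier_bonus


# Hypothetical prospect: 5 international appearances, CA 80, bonus of 10.
example_score = international_score(5, 80, league_tier_bonus=10)
```

Here the appearances contribute 40, the CA term contributes 40, and the bonus adds 10, for a score of 90.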

Qualifying internationals swap into the ranked pool, replacing the weakest CHL/USHL tail entries to maintain the target ratio.

Known Limitations & Roadmap

Current Limitations

📉
AHL Data Depth

AHL ratings are constrained by the single-source AHLTracker data. No xG models, no RAPM, no tracking data. AHL ratings will always be less precise than NHL ratings until better data sources emerge.

🌍
European Leagues

The pipeline currently covers only NHL and AHL. European leagues (SHL, Liiga, DEL, NL) with robust analytics data are natural extension targets but require league-specific calibration.

🧠
Mental Attributes

For skaters, Ambition, Loyalty, and Leadership remain at database defaults. These personality-type attributes lack reliable data proxies and are better left to game engine randomization for realistic variety. Goalies are the exception: Phase 7b models them from career data.

📊
Single-Season Snapshot

The pipeline primarily models current-season performance. Players having career-worst or career-best seasons will be rated accordingly, moderated only by career floors and deployment minimums.

What's Next

Diff tracking between runs — a tool to show exactly what changed for each player between data refreshes, making targeted manual adjustments easier
European league models — extending the pipeline to SHL, DEL, Liiga, and NL using the established architecture
Additional AHL data sources — investigating Elite Prospects for TOI supplementation
GitHub integration — all scripts and data documentation published for community review and contribution
Phase 5b v2 — enhanced mental modeling using Elite Prospects career data (captaincy, awards, draft history) for Leadership, Loyalty, and Ambition