NHL play for contract audit

Intro

This page reports the checks used to test whether the loyalty-discount finding is real and stable. The Findings page states what the result is, the Methods page states how it was produced, and this Audit page shows how hard the result was pushed.

Falsification test

The test keeps the model unchanged but randomly reassigns the contract grouping 2,000 times, re-estimating the same-team coefficient on each draw. If random groupings could reproduce the same-team effect, the model structure itself might be creating the signal.

Observed same-team coefficient: +0.703. Null mean from random relabeling: +0.001. Empirical p-value: <0.0005 (0 of 2,000 random draws were as extreme as the real estimate; with 2,000 draws the smallest resolvable p-value is about 1/2,001).

Takeaway: a random grouping does not reproduce the same-team advantage, so the effect is not an artifact of the model form.
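
The relabeling logic can be sketched as follows. This is a minimal illustration, not the audit's code: the function names are hypothetical, the data are synthetic, and a simple difference in means stands in for the same-team coefficient, since the actual model lives on the Methods page. The add-one p-value convention used here is also an assumption.

```python
import random
from statistics import mean


def same_team_effect(outcomes, same_team):
    """Stand-in statistic (assumption): mean outcome gap, same-team minus others."""
    kept = [y for y, s in zip(outcomes, same_team) if s]
    other = [y for y, s in zip(outcomes, same_team) if not s]
    return mean(kept) - mean(other)


def permutation_test(outcomes, same_team, n_draws=2000, seed=0):
    """Shuffle the grouping labels n_draws times and compare to the real estimate."""
    rng = random.Random(seed)
    observed = same_team_effect(outcomes, same_team)
    labels = list(same_team)
    null = []
    for _ in range(n_draws):
        rng.shuffle(labels)  # random relabeling; outcomes stay in place
        null.append(same_team_effect(outcomes, labels))
    # Count draws at least as extreme as the observed value (two-sided).
    extreme = sum(abs(v) >= abs(observed) for v in null)
    # Add-one convention: the p-value can never be exactly zero.
    p_value = (extreme + 1) / (n_draws + 1)
    return observed, mean(null), extreme, p_value


# Synthetic example data (made up for illustration only).
demo_rng = random.Random(1)
demo_same_team = [i % 2 == 0 for i in range(200)]
demo_outcomes = [demo_rng.gauss(0.7 if s else 0.0, 1.0) for s in demo_same_team]
observed, null_mean, extreme, p_value = permutation_test(demo_outcomes, demo_same_team)
print(f"observed={observed:+.3f} null_mean={null_mean:+.3f} "
      f"extreme={extreme} p={p_value:.4f}")
```

Under this convention, zero extreme draws out of 2,000 gives p = 1/2,001, which is the floor for a test of this size.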

Source: output/tables/audit_falsification.csv

Subsample robustness

The model was re-estimated inside position, trajectory, tier, and era slices. Holds is TRUE when the subgroup coefficient keeps the full-sample sign and has p<0.05.

Subgroup	Value	n	Coefficient	Std. error	p-value	Holds
position	defensemen	602	+0.977	0.215	<0.0001	TRUE
position	forwards	1,105	+0.526	0.138	0.0001	TRUE
trajectory	declining	389	+0.392	0.239	0.1018	FALSE
trajectory	insufficient_history	566	+0.561	0.280	0.0458	TRUE
trajectory	rising	543	+0.712	0.192	0.0002	TRUE
trajectory	stable	209	+0.531	0.294	0.0724	FALSE
tier	fringe	416	+0.528	0.252	0.0368	TRUE
tier	middle	539	+0.829	0.187	<0.0001	TRUE
tier	top	434	+0.916	0.241	0.0002	TRUE
tier	unknown*	318	+0.486	0.278	0.0814	FALSE
era	early_2012_2021	971	+0.641	0.178	0.0003	TRUE
era	late_2022_2024	736	+0.527	0.161	0.0011	TRUE

Selection-relevant read: the same-team coefficient remains positive and significant in all three classified tiers (fringe, middle, top) and in the trajectory groups with enough precision (rising and insufficient_history). That pattern is evidence against a pure selection story, but it does not amount to full causal identification.

Non-holding cells are shown directly: declining trajectory, stable trajectory, and the residual unknown tier (see note). In all three the coefficient stays positive, but the p-value exceeds 0.05.

* Unknown is a residual category, not a player type. A contract lands here when its walk-year usage could not be placed in a tier, for example when walk-year position group or walk-year time on ice per game is missing and it cannot be ranked against same-season position peers. The non-holding result for the unknown row reflects that mixed, unclassified group. It is not a real tier where the effect fails.
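
The Holds rule defined above the table can be written out as a short check. This is a sketch under stated assumptions: the function names are hypothetical, and the p-value helper uses a normal approximation to coefficient / standard error, which may differ slightly from the reference distribution behind the reported p-values.

```python
import math

# Full-sample same-team coefficient reported in the falsification section.
FULL_SAMPLE_COEF = 0.703


def two_sided_p(coef, se):
    """Two-sided p-value from a normal approximation (assumption) to coef / se."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def holds(coef, p_value, full_coef=FULL_SAMPLE_COEF, alpha=0.05):
    """TRUE when the subgroup keeps the full-sample sign and has p < alpha."""
    same_sign = (coef > 0) == (full_coef > 0)
    return same_sign and p_value < alpha


# Three rows copied from the table: (label, coefficient, std. error, reported p).
rows = [
    ("trajectory/declining", 0.392, 0.239, 0.1018),
    ("trajectory/insufficient_history", 0.561, 0.280, 0.0458),
    ("trajectory/stable", 0.531, 0.294, 0.0724),
]
for label, coef, se, reported_p in rows:
    approx_p = two_sided_p(coef, se)
    print(f"{label}: approx p={approx_p:.4f}, holds={holds(coef, reported_p)}")
```

Because the rule is a hard threshold at 0.05, rows such as insufficient_history (p=0.0458) and stable (p=0.0724) land on opposite sides of it despite similar coefficients.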

Source: output/tables/audit_subsample_robustness.csv

Outlier sensitivity

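The model was re-fit after trimming the tails of the outcome distribution to test whether a small number of extreme values drives the result. The n column is consistent with the stated fraction being removed from each tail.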

Trim fraction	n	Coefficient	p-value
0.00	1,707	+0.703	<0.0001
0.01	1,671	+0.593	<0.0001
0.05	1,535	+0.503	<0.0001

Takeaway: the same-team effect survives removal of the tails. The estimate shrinks but stays positive and statistically strong.
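
A symmetric trim of this kind can be sketched as below. The function name is hypothetical and the exact trimming rule is not stated in the source. Rounding the per-tail count up is an assumption; it happens to match the reported n values (1,707 minus 2 x 18 is 1,671, and 1,707 minus 2 x 86 is 1,535), but a quantile cutoff with a strict inequality could give the same counts.

```python
import math


def trim_tails(values, frac):
    """Drop the lowest and highest `frac` share of values (symmetric trim)."""
    if not 0 <= frac < 0.5:
        raise ValueError("frac must be in [0, 0.5)")
    ordered = sorted(values)
    # Per-tail count, rounded up (assumption); round() guards against float fuzz.
    k = math.ceil(round(len(ordered) * frac, 9))
    return ordered[k:len(ordered) - k]


# Placeholder outcomes: only the count matters for this illustration.
outcomes = list(range(1707))
for frac in (0.00, 0.01, 0.05):
    print(frac, len(trim_tails(outcomes, frac)))
```

In a real re-fit the trim would select rows by outcome value and keep the remaining covariates aligned; this sketch only shows the counting logic.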

Source: output/tables/audit_outlier_sensitivity.csv

Data integrity

Coverage tables from Phase 5 confirm the sample inputs used here: eligible walk-year n=1,624 and eligible overpay n=1,760.

Retention distribution in the panel is same_team 1,507, new_team 857, entry 728, unknown 83. This unknown count is also residual, not a retention category. Signing-year counts are explicit in the coverage table, with sparse edges in 2012 to 2014 and a partial 2025.
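
For scale, the retention counts above can be turned into shares of the panel. This is plain arithmetic on the four reported counts; the variable names are illustrative only.

```python
# Retention counts as reported in the paragraph above.
retention_counts = {
    "same_team": 1507,
    "new_team": 857,
    "entry": 728,
    "unknown": 83,
}

total = sum(retention_counts.values())
shares = {label: count / total for label, count in retention_counts.items()}

print(f"total contracts in panel: {total:,}")
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

The four counts sum to 3,175, so the residual unknown group is under 3 percent of the panel.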

Sources: output/tables/coverage_eligible_sample_sizes.csv, output/tables/coverage_retention_distribution.csv, output/tables/coverage_contract_count_by_signing_year.csv

Interpretation limit

For the selection caveat and the boundaries on interpretation, see the "What the model cannot see" discussion on the Methods page.