Methodology
Three legs of the forecast
The VoteROI.com Mood Forecast is not a mood-only model. "Mood" is one of three legs, and the race forecast rests on all three:
- Structure (fundamentals). The partisan lean of the state, incumbency and the incumbent's prior personal vote, and candidate-level challenger money (the Democratic share of two-party pre-election Senate receipts). These are the slow-moving facts of a race.
- Behavioral mood. The cross-sectional state-mood index built from publicly reported behavioral data. This is the layer the name points to. It moves the structural baseline; it does not replace it.
- The thermostat (long-run swing). A deep-estimated midterm correction: the president's party tends to lose ground in midterms, applied as a calibrated national swing.
Structure anchors the forecast, mood adjusts it, and the thermostat applies the long-run midterm regularity. Where a large mood signal runs against a decisive structural and incumbency lean, the race page says so plainly and treats the call with caution.
Core Premise
Polling measures stated opinion. We measure revealed behavior. People can lie to a stranger on the phone; they cannot lie to their unemployment filing, their voter registration record, their vehicle miles, or their gasoline receipt. The composite index and the forecast model are built entirely on publicly reported behavioral data, calibrated against historical election outcomes.
Composite Index (descriptive layer)
The state composite mood score for each state is built from 13 behavioral dimensions normalized cross-sectionally against the other fifty states. The score is descriptive: it tells you the current mood reading, not a prediction. The "National Weather Service for civic emotions" voice belongs to this layer.
The dimensions
| Dimension | What it measures | Representative source |
|---|---|---|
| Mobility | Vehicle miles, transit ridership, household movement | FHWA TVT, DOL |
| Security | Fear, threat preparation | FBI UCR, FBI NICS |
| Economy | Unemployment, gas prices, initial claims | BLS LAUS, EIA, DOL |
| Spending | Retail and household economic commitment | BLS CES, Census ACS |
| CivicTrust | Voter registration, civic participation | Census CPS Voting Supplement, MIT EL turnout |
| DomesticStability | Household formation, migration persistence | IRS SOI migration, Census ACS |
| Isolation | Withdrawal, alienation | CDC mortality (suicide), Census ACS |
| Risk | Speculation, bet-on-future behavior | Multiple (composite) |
| Escape | Stress and fatigue retreat | CDC overdose, Google Trends |
| Information | Search interest, civic-anxiety queries | Google Trends |
| Information_candidates | Candidate-quality signal from polling/coverage | VoteHub aggregates |
| Pulse (reserved) | Dimension reserved for high-cadence stress signals | (currently empty, see caveats) |
Normalization (cross-sectional z-score)
Each indicator reading is normalized against the cross-section of the other fifty states, not against its own time series. This is a deliberate switch from the earlier v1.0 24-month rolling z-score because:
- Election outcomes are state-versus-state contests. The relevant baseline is "how anxious is Wisconsin relative to Michigan right now," not "how anxious is Wisconsin relative to its own 2-year average."
- Cross-sectional z is invariant to multiplying every state's reading by the same positive constant, so a denominator swap that rescales all states uniformly (e.g., a change of turnout source) does not silently re-rank states.
For slow-cadence dimensions (CivicTrust, DomesticStability, Isolation, Risk, Escape) we use latest-wins within a 60-day window rather than averaging, so a stale reading does not dilute a fresh one.
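The two rules above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the function names, the sample readings, and the choice of population standard deviation are all assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import mean, pstdev

def cross_sectional_z(readings):
    """Z-score each state's reading against all states at the same moment.

    `readings` maps state -> raw value. Population std is an assumption.
    """
    mu = mean(readings.values())
    sigma = pstdev(readings.values())
    if sigma == 0:
        return {state: 0.0 for state in readings}
    return {state: (value - mu) / sigma for state, value in readings.items()}

def latest_wins(observations, as_of, window_days=60):
    """Return the newest value dated inside the window, or None if none exists.

    `observations` is a list of (date, value) pairs; no averaging is done.
    """
    start = as_of - timedelta(days=window_days)
    in_window = [(d, v) for d, v in observations if start <= d <= as_of]
    if not in_window:
        return None
    return max(in_window, key=lambda pair: pair[0])[1]

# Hypothetical raw readings for three states.
raw = {"WI": 4.0, "MI": 6.0, "PA": 5.0}
z = cross_sectional_z(raw)

# Rescaling every state by the same positive constant leaves z unchanged.
z_scaled = cross_sectional_z({s: v * 10 for s, v in raw.items()})

# Two readings inside the 60-day window: the fresher one is used as-is.
slow = [(date(2026, 4, 10), 0.40), (date(2026, 5, 20), 0.55)]
current = latest_wins(slow, as_of=date(2026, 6, 1))
```

Because the older reading is discarded rather than averaged in, a stale April value cannot pull the May value back toward it.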
Forecast Model (experimental layer)
The race-level forecast is L2-regularized logistic regression mapping a 15-feature state vector to P(Democrat wins). It is labeled experimental everywhere it appears and always carries its current Brier score on a held-out test split. The forecast layer is the only place predictive language is permitted.
The 15 features
| # | Feature | What it carries |
|---|---|---|
| 1 | state_lean | Continuous partisan baseline from average of 2020 and 2024 presidential margin (D-R), scaled to [-1, +1] |
| 2 | Mobility (z) | Cross-sectional z of state Mobility composite |
| 3 | Security (z) | Cross-sectional z of Security composite |
| 4 | Economy (z) | Cross-sectional z of Economy composite |
| 5 | Spending (z) | Cross-sectional z of Spending composite |
| 6 | CivicTrust (z) | Cross-sectional z of CivicTrust composite |
| 7 | DomesticStability (z) | Cross-sectional z of DomesticStability composite |
| 8 | Isolation (z) | Cross-sectional z of Isolation composite |
| 9 | Risk (z) | Cross-sectional z of Risk composite |
| 10 | Escape (z) | Cross-sectional z of Escape composite |
| 11 | Information (z) | Search and engagement signal |
| 12 | Information_candidates (z) | Candidate-quality / polling-aggregate residual |
| 13 | incumbent_party_d | Indicator: +1 if D incumbent on the ballot, -1 if R, 0 if open or independent |
| 14 | Economy_x_incumbent | Interaction: Economy * incumbent_party_d (penalizes incumbent of party in power when economy is bad) |
| 15 | personal_vote_spread | D candidate's prior-cycle margin minus prior-cycle presidential margin, for incumbents |
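Features 13 to 15 are derived rather than measured, so a short sketch may help. It follows the table's definitions; the function name, the argument names, the convention that both margins are D minus R, and the choice to zero out personal_vote_spread when there is no incumbent are assumptions for illustration.

```python
def incumbency_features(economy_z, incumbent_party,
                        d_prior_margin=None, pres_prior_margin=None):
    """Build features 13-15 for one race (illustrative, not the production encoder).

    incumbent_party: "D", "R", or anything else for open/independent.
    Margins are assumed to be D-minus-R shares on the same scale.
    """
    # Feature 13: +1 for a D incumbent, -1 for R, 0 for open or independent.
    incumbent_party_d = {"D": 1.0, "R": -1.0}.get(incumbent_party, 0.0)

    # Feature 14: the Economy z-score signed by who holds the seat.
    economy_x_incumbent = economy_z * incumbent_party_d

    # Feature 15: prior-cycle margin minus prior-cycle presidential margin,
    # only defined when an incumbent is running (assumed 0.0 otherwise).
    has_margins = d_prior_margin is not None and pres_prior_margin is not None
    if incumbent_party_d != 0.0 and has_margins:
        personal_vote_spread = d_prior_margin - pres_prior_margin
    else:
        personal_vote_spread = 0.0

    return {
        "incumbent_party_d": incumbent_party_d,
        "Economy_x_incumbent": economy_x_incumbent,
        "personal_vote_spread": personal_vote_spread,
    }

# Hypothetical race: weak economy (z = -1.2), D incumbent who ran 6 points
# ahead of the presidential margin last cycle.
example = incumbency_features(-1.2, "D", d_prior_margin=0.10, pres_prior_margin=0.04)
```

With a weak economy the interaction is negative for a Democratic incumbent and positive for a Republican one, which is how a single coefficient can penalize whichever party holds the seat.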
Display dimensions vs model features
All 13 behavioral dimensions are computed daily and surfaced on the dashboard, state detail panels, and the public API. A strict subset is fed to the calibrated logistic regression that produces P(Dem) per race. Two dimensions are intentionally display only:
- Information is shown for state-level context (Google Trends fixed-terms search interest, indicator #92) but the live ingestor has only ~30 days of history. The bare-Information dimension was effectively absent from historical composite_scores when the published model was trained, so its coefficient there would be noise. Graduating it to feature status requires backfilling the indicator across the train and test cycles and then re-training.
- Pulse (VoteHub state-race polling) is also display only. State-race polls cluster in a small set of competitive contests, so any given week sees fewer than 51 states, and historical race-week coverage is too sparse to train a logistic coefficient honestly. Shown on the dashboard, excluded from the forecast model.
The model's behavioral feature vector consists of state_lean, ten behavioral dimensions (Mobility, Security, Economy, Spending, CivicTrust, DomesticStability, Isolation, Risk, Escape, Information_candidates), incumbent_party_d, an Economy x incumbent interaction, personal_vote_spread, and a challenger two-party fundraising-share term (with a senate-money-present flag). On top of that block the live model is two-speed: it adds a deep-estimated national midterm thermostat term, then a selective Platt recalibration that holds a prediction unsharpened when the structural and mood signals disagree. Adding Information or Pulse later requires both a backfill and a fresh calibrated model_version.
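The selective recalibration step can be pictured roughly as follows. This is a sketch under stated assumptions, not the served implementation: standard Platt scaling maps a probability p to sigmoid(a * logit(p) + b), and here "disagreement" is assumed to mean that the structural and mood contributions to the logit have opposite signs. The parameter values a and b and all names are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def selective_platt(p, structural_logit, mood_logit, a=1.5, b=0.0):
    """Apply Platt scaling only when structure and mood point the same way.

    a > 1 sharpens probabilities away from 0.5. When the two signals
    disagree in sign (assumed rule), the raw probability is returned as-is.
    """
    if structural_logit * mood_logit < 0:
        return p  # held unsharpened
    return sigmoid(a * logit(p) + b)

# Both signals favor the Democrat: the 70% reading is sharpened.
agree = selective_platt(0.70, structural_logit=0.8, mood_logit=0.2)

# Mood runs against structure: the 70% reading is left alone.
disagree = selective_platt(0.70, structural_logit=0.8, mood_logit=-0.2)
```

The design intent, as described above, is that confidence is only increased when independent legs of the forecast corroborate each other.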
Calibration
Coefficients are fit by gradient descent with per-feature L2 regularization. The live serving weights are fit on the 2022 and 2024 senate and governor races. The wider corpus of race observations across cycles 2014, 2016, 2018, 2020, 2022, and 2024 is the backtest and Brier-reporting set: every cycle is scored out-of-sample to track accuracy over time.
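For readers who want to see the fitting procedure in miniature, here is a sketch of batch gradient descent on a logistic loss with a separate L2 penalty per feature. The learning rate, epoch count, penalty values, and toy data are assumptions chosen for the example; the production hyperparameters are not stated here.

```python
import math

def sigmoid(x):
    # Numerically stable for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def predict(w, b, x):
    """P(Democrat wins) for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic_l2(X, y, l2, lr=0.5, epochs=2000):
    """Minimize mean log-loss + 0.5 * sum_j l2[j] * w[j]**2 by gradient descent.

    l2 holds one penalty strength per feature; the intercept is unpenalized.
    """
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        for j in range(k):
            w[j] -= lr * (grad_w[j] / n + l2[j] * w[j])
        b -= lr * grad_b / n
    return w, b

# Toy data: column 0 plays the role of state_lean, column 1 a mood z-score.
X = [[0.8, 0.1], [0.5, -0.3], [-0.6, 0.2], [-0.9, -0.1], [0.2, 0.4], [-0.3, -0.5]]
y = [1, 1, 0, 0, 1, 0]

w_light, b_light = fit_logistic_l2(X, y, l2=[0.01, 0.01])
w_heavy, b_heavy = fit_logistic_l2(X, y, l2=[1.0, 1.0])
```

Raising a feature's penalty shrinks its weight toward zero, which is the lever that keeps a thinly supported dimension from dominating a fit on a few hundred races.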
How the headline Brier is computed: it is not a single 2024 holdout. The reported number is a pooled out-of-sample score under a leave-one-cycle-out (prior-window) policy: each scored cycle is predicted by an identically specified model that did not train on that cycle, pooled across the 2014 to 2024 backtest and four pre-election cutoffs. Per-cycle Brier across 2014-2024 is the backtest record. The live model lr-2026-06-10-chal-recal-twospeed reports a pooled out-of-sample Brier of 0.119.
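A pooled leave-one-cycle-out score can be sketched as below. Two simplifications are assumed: each held-out cycle is predicted by a model trained on all other cycles (the exact prior-window training rule is not reproduced), and a base-rate predictor stands in for the real logistic model. All names and the toy cells are hypothetical.

```python
def pooled_loco_brier(cells, fit, predict):
    """Pooled out-of-sample Brier under leave-one-cycle-out.

    cells: list of (cycle, features, outcome) with outcome in {0, 1}.
    fit(train) -> model, where train is a list of (features, outcome).
    predict(model, features) -> probability.
    """
    cycles = sorted({cycle for cycle, _, _ in cells})
    squared_errors = []
    for held_out in cycles:
        train = [(x, y) for c, x, y in cells if c != held_out]
        test = [(x, y) for c, x, y in cells if c == held_out]
        model = fit(train)  # never sees the held-out cycle
        squared_errors.extend((predict(model, x) - y) ** 2 for x, y in test)
    # Pool across cycles: one mean over every scored cell.
    return sum(squared_errors) / len(squared_errors)

# Stand-in model: predict the training base rate for every race.
def fit_base_rate(train):
    return sum(y for _, y in train) / len(train)

def predict_base_rate(model, features):
    return model

# Toy race-cutoff cells; features are unused by the stand-in model.
cells = [
    (2020, None, 1), (2020, None, 0),
    (2022, None, 1), (2022, None, 1),
    (2024, None, 0), (2024, None, 0),
]
score = pooled_loco_brier(cells, fit_base_rate, predict_base_rate)
```

Pooling squared errors cell by cell, rather than averaging per-cycle scores equally, means larger cycles carry proportionally more weight in the headline number.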
Brier history
| Version | Date | Test Brier | What changed |
|---|---|---|---|
| v0 (baseline) | 2026-05-26 | 0.247 | Coin-flip baseline is 0.250. v0 used Economy + state_lean only. |
| v3 / v4 | 2026-05-29 | 0.155 | Added Mobility, Security, Information dimensions. |
| v5 | 2026-05-29 | 0.181 | Added candidate fit feature (broke calibration; rolled back). |
| v8 / v9 | 2026-05-30 | 0.137 | Fixed Information_candidates clip on uncontested seats. |
| v10 | 2026-05-31 | 0.135 | Added personal_vote_spread (incumbent personal-vote feature). |
| v11c | 2026-06-01 | 0.114 (retired) | Apparent improvement was a broken-data ceiling. Inverse-signal CivicTrust registration_rate formula meant the model trained on noise that happened to correlate with outcomes. |
| v11d | 2026-06-01 | 0.135 | CivicTrust formula corrected (voters presumed registered per CPS skip pattern); turnout source switched from CPS delta to MIT EL absolute; state_lean made continuous from presidential margins; backtest corpus extended back to 2014. The 0.021 Brier regression from v11c (0.114 to 0.135) is the inverse-signal artifact going away. |
| A-complete (historical) | 2026-06-04 | 0.137 | Open-seat encoding (retiring=true treated as incumbent_party_d=0.5 and Economy_x_incumbent=0) applied symmetrically at training and inference, with the full state_incumbents backfill (49 confirmed retirements 2014-2026). Race universe expanded from 54 to 59 with the four missing 2026 governor races (AR, CT, HI, ID) and IL senate 2026 added. NC senate moved from 28.1% to 32.5% (Tillis retiring, no longer encoded as defended R incumbency). The +0.002 Brier vs v11d is a methodology cost the 2024 split could not recover; the open-seat encoding is correct for the six 2026 retirements the prior model mis-encoded. Superseded by the two-speed model below. |
| two-speed (current) | 2026-06-11 | 0.119 | Current serving model. Adds a challenger two-party fundraising-share feature and a deep-estimated national midterm thermostat term to the behavioral block, then a selective Platt recalibration that holds a prediction unsharpened when the structural and mood signals disagree. The 0.119 is a pooled leave-one-cycle-out out-of-sample Brier across the 2014-2024 backtest and four pre-election cutoffs (not a single 2024 holdout); the served weights are fit on 2022 and 2024. |
This table is the historical version ladder. The Test Brier column is each version's reported score at the time it shipped; the current serving model is the two-speed row, whose 0.119 is a pooled leave-one-cycle-out out-of-sample number (see Calibration above), not a single-split test.
For each prior version, the trained coefficients remain queryable via API; /changelog lists the responsible commit.
Honest Caveats
- Pulse dimension is empty. It is reserved for high-cadence stress signals (CDC near-real-time mortality syndromic, ER admissions, EMS call volume) but those state-level feeds are not yet ingested. The current model does not depend on Pulse; the dimension is a placeholder.
- ME gap on personal-brand incumbents. Susan Collins consistently runs 8 to 12 pp ahead of presidential R margin in Maine; our personal_vote_spread feature captures most of this but the model still slightly under-prices her. The same issue applies to brand-defying incumbents in tossup states.
- FL under-calibration history. Prior versions (v9, v10) under-priced the Republican lean in Florida because the state's demographic shift outpaced the model's assumptions. v11d widened the backtest corpus back to 2014, which surfaces more of this; we will know how well after the 2026 cycle.
- WI history. Prior versions over-priced Republican strength in Wisconsin because the broken CivicTrust formula gave high-turnout states (which lean D in WI's case) low civic-trust scores. v11d corrects this; WI governor P(D) moved from 31% (wrong) to 63% (matching Evers-incumbent consensus).
- No polls. The model uses zero polling inputs by design. This is its purpose, not a bug; the index complements polling rather than replaces it. Where polls and behavior diverge sharply, that gap is itself signal.
- Probabilistic, not deterministic. A 75% P(D) is not a call; it is a 75% probability. Even a perfectly calibrated model will see its 75% favorites lose about 25% of the time, by design.
Forecasting Accuracy (Brier Backtest)
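The Brier score is the mean squared error between the forecast probability and the 0/1 outcome, so it rewards forecasts that are both calibrated and decisive. A perfect predictor scores 0. A constant 50% (coin-flip) forecast scores 0.25. Lower is better. Cited Brier values throughout this site refer to the pooled leave-one-cycle-out out-of-sample score (see Calibration above) from model lr-2026-06-10-chal-recal-twospeed, trained 2026-06-11.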
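The two reference points are easy to verify directly. A minimal sketch, with a hypothetical function name and made-up outcomes:

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

outcomes = [1, 0, 1, 1]  # 1 = Democrat won (made-up results)

perfect = brier_score([1.0, 0.0, 1.0, 1.0], outcomes)  # 0.0
coin_flip = brier_score([0.5] * 4, outcomes)           # 0.25
confident_wrong = brier_score([0.0, 1.0, 0.0, 0.0], outcomes)  # 1.0, the worst case
```

The coin-flip value is 0.25 regardless of the outcomes, which is why it serves as the no-skill baseline.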
Training and Test Split
Trained on 2022 + 2024 Senate and Governor races (n=464 race-cutoff cells across four pre-election cutoffs). Held out 2026 (n=236). For each race we reconstruct the state-level dimension Z-scores at minus-90, minus-60, minus-30, and minus-7 days before the election using only raw readings dated on or before that cutoff. No leakage.
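The no-leakage rule amounts to a date filter applied before any normalization. A minimal sketch, assuming readings are stored as (date, value) pairs; the names and sample values are hypothetical:

```python
from datetime import date, timedelta

CUTOFF_DAYS = (90, 60, 30, 7)  # days before the election

def readings_as_of(readings, election_day, days_before):
    """Keep only raw readings dated on or before the cutoff."""
    cutoff = election_day - timedelta(days=days_before)
    return [(d, v) for d, v in readings if d <= cutoff]

def cutoff_snapshots(readings, election_day):
    """One leakage-free view of the data per pre-election cutoff."""
    return {days: readings_as_of(readings, election_day, days)
            for days in CUTOFF_DAYS}

# Hypothetical raw readings for one indicator in one state.
readings = [
    (date(2024, 8, 1), 4.1),
    (date(2024, 9, 15), 4.3),
    (date(2024, 10, 20), 4.0),
    (date(2024, 11, 1), 3.9),  # inside the final week: never visible
]
snapshots = cutoff_snapshots(readings, election_day=date(2024, 11, 5))
```

Each race then yields four cells, one per cutoff, which is why cell counts in the backtest are multiples of four.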
Model
L2-regularized logistic regression over 9 behavioral dimensions, a structural state-lean prior, an incumbent-party indicator (D / R / open), and an Economy x incumbent interaction term. The prediction is the probability the Democratic candidate wins.
The interaction feature was added in v4 to capture the textbook 'incumbent reward / blame' dynamic the prior model could not see (state_lean is a static structural prior, not who currently holds the seat). The positive coefficient on Economy x incumbent means a stronger economy boosts the incumbent's party, and a weak economy hurts them, independent of which party that is.
Learned Coefficients
| Feature | Weight | What the sign says |
|---|---|---|
| state_lean | +0.619 | Dem-leaning states predict Dem wins |
| Economy | -0.383 | Stronger economy favors Republicans (direct effect, after incumbent interaction) |
| CivicTrust | +0.353 | Higher civic participation favors Dems |
| DomesticStability | +0.319 | Stable households favor Dems |
| Isolation | -0.134 | Isolated populations favor Republicans |
| Security | -0.116 | Favors Republicans |
| Spending | -0.062 | Higher household spending favors Republicans |
| Mobility | -0.003 | Effectively zero; nominally, higher mobility favors Republicans |
Brier by Cycle
| Cycle | Brier | n cells |
|---|---|---|
| 2014 (train) | 0.117 | 288 |
| 2016 (train) | 0.116 | 184 |
| 2018 (train) | 0.167 | 284 |
| 2020 (train) | 0.078 | 184 |
| 2022 (train) | 0.110 | 280 |
| 2024 (train) | 0.105 | 184 |
Brier by Office
| Office | Brier | n cells |
|---|---|---|
| governor | 0.141 | 568 |
| senate | 0.104 | 836 |
Last calibrated: 2026-06-11. Full methodology in docs/brier_backtest_v1.md.
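As a consistency check, the per-cycle and per-office tables should both roll up to the headline figure if the pooled score is a cell-weighted mean (an assumption about the pooling rule). The sketch below uses only the rounded values published in the two tables above, so agreement is expected to three decimals at best.

```python
# (Brier, n cells) copied from the Brier by Cycle table.
by_cycle = {
    2014: (0.117, 288),
    2016: (0.116, 184),
    2018: (0.167, 284),
    2020: (0.078, 184),
    2022: (0.110, 280),
    2024: (0.105, 184),
}

# (Brier, n cells) copied from the Brier by Office table.
by_office = {
    "governor": (0.141, 568),
    "senate": (0.104, 836),
}

def cell_weighted_brier(table):
    """Weight each row's Brier by its number of race-cutoff cells."""
    total_cells = sum(n for _, n in table.values())
    return sum(brier * n for brier, n in table.values()) / total_cells

pooled_from_cycles = cell_weighted_brier(by_cycle)
pooled_from_offices = cell_weighted_brier(by_office)
total_cells = sum(n for _, n in by_cycle.values())
```

Both roll-ups cover the same 1,404 cells and round to 0.119, matching the pooled out-of-sample figure reported for the live model.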
National Mood Index
The VoteROI National Mood Index aggregates the 50 state composite scores (plus DC) into a single number on a -50 R / +50 D scale, where 0 is a dead tie. The headline figure shown on /forecast is the population-weighted reading; an electoral-vote-weighted lens is published as a secondary view.
| Weighting | Role | Weight basis | What it captures |
|---|---|---|---|
| Population-weighted | Headline | 2020 Census apportionment | Where Americans live. The default "how does the country feel" reading. |
| EV-weighted | Secondary lens | 2024 Electoral College allocation | Where presidential elections are actually decided. Useful for handicapping 2028, less so for everyday mood. |
| Competitive-weighted | Internal diagnostic | 2026 competitive race count per state | Where current-cycle action concentrates. Surfaced in the API but not in the hero readout. |
| Ensemble | Internal diagnostic | Equal-weight mean of the three | Held in the API for backward compatibility with v1.0 consumers. |
Each state contributes its composite score minus 50 (the temporal baseline), weighted by the chosen scheme from the table above. The sign for the state is +1 if the state's relevant 2026 incumbent is Democratic, -1 if Republican, 0 if independent or unknown. The per-state pull is then (composite - 50) * sign * weight. A high mood reading in a Democratic-incumbent state reads as Democratic reward (positive on the index); the same high mood reading in a Republican-incumbent state reads as Republican reward (negative). This captures the textbook incumbent-reward and incumbent-blame dynamic at state granularity rather than averaging it out.
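The per-state pull formula can be worked through with a toy example. Dividing the summed pulls by the total weight is an assumption made here so the result stays on the -50 / +50 scale; the state labels, composites, and weights are invented.

```python
def state_pull(composite, sign, weight):
    """(composite - 50) * sign * weight, as defined above."""
    return (composite - 50) * sign * weight

def national_index(states):
    """Weighted aggregate of per-state pulls.

    Normalizing by total weight is an assumption of this sketch.
    """
    total_weight = sum(s["weight"] for s in states)
    total_pull = sum(state_pull(s["composite"], s["sign"], s["weight"])
                     for s in states)
    return total_pull / total_weight

# Three hypothetical states; sign is +1 for a D incumbent, -1 for R.
states = [
    {"name": "A", "composite": 60, "sign": +1, "weight": 10},  # pull +100
    {"name": "B", "composite": 55, "sign": -1, "weight": 5},   # pull  -25
    {"name": "C", "composite": 40, "sign": +1, "weight": 5},   # pull  -50
]
index = national_index(states)  # (100 - 25 - 50) / 20 = 1.25, i.e. D +1.25
```

State C shows the blame side of the convention: a below-baseline mood under a Democratic incumbent pulls the index toward Republicans.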
The headline convention (population-weighted) reflects everyday civic mood. The electoral-vote lens is reported alongside because the same mood translates differently into a presidential outcome depending on which states feel it most. When the two diverge, that gap is itself information.
vs. Kalshi KPOW
Both VoteROI National Mood Index and the Kalshi American Power Index (KPOW) aim to be a single-number readout of where US political power sits. They use different signals and answer different questions:
| | VoteROI National Mood Index | Kalshi KPOW |
|---|---|---|
| Input | Behavioral data (FEC, BLS, Census, CDC, FHWA, etc.) | Prediction-market prices (Kalshi) |
| Refresh | Daily, after composites recompute | Real-time, tick by tick |
| Reads | State-level fundamentals (the "why") | Market consensus (the "what") |
| Scale | -50R / +50D, signed by federal incumbent | -50R / +50D, market consensus mapping |
| Strength | Captures structural shifts before markets price them; works without an active liquid market | Captures news shocks in real time; price discovery in liquid contracts |
| Weakness | Mood-to-partisan translation needs an incumbent-sign convention; lagged by composite cadence | Reflexive (reads its own market); requires liquid contracts to exist |
The two indices are complementary, not competing. KPOW tells you where the market thinks power is. VoteROI tells you where the behavioral fundamentals say it ought to be heading. When they disagree, that gap is itself information.
What We Are Not
- We do not generate AI commentary. Editorial copy is human-written. See /editorial.
- We do not make predictions outside the labeled forecast layer.
- We do not call races. We report probabilities. A 75% probability is not a call.
- We are not affiliated with any candidate, party, PAC, or government entity. See conflict-of-interest disclosure for the AM PAC relationship.
Model Versioning and Audit
Every composite score, forecast, and fit calculation carries a model_version field. The current version is the two-speed model (model_version lr-2026-06-10-chal-recal-twospeed), calibrated June 2026. Methodology changes are documented in /changelog and any change to a coefficient or feature definition is logged before going live. Prior versions remain queryable so independent reviewers can reproduce older calls.
Methodology version: two-speed (lr-2026-06-10-chal-recal-twospeed). Questions: press@voteroi.com.