Reference · methodology
How the Skill Score works
One number, 0 to 100. It answers: how often does this wallet call things right, with harder calls counting more and recent calls counting more than old ones?
01 · The formula
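For every prediction the AI judge has decided, we compute one number, the prediction's contribution to the score: its difficulty weight multiplied by its recency factor.
Add these up, hits and attempts separately, across every alias the same wallet has ever used, then run the Wilson lower bound at 95% confidence on the totals.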
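Written out, the per-prediction contribution and the wallet-level Wilson step might look like the sketch below. The symbols are assumptions introduced for illustration: $d_i$ is the difficulty weight, $a_i$ the age in days, $h_i \in \{0, 1\}$ whether the judge ruled a hit, and $z \approx 1.96$ the 95% normal quantile. The scaling by 100 is inferred from the 0 to 100 range rather than confirmed by the source.

```latex
w_i = d_i \cdot 2^{-a_i / 180}, \qquad
H = \sum_i w_i h_i, \qquad
N = \sum_i w_i, \qquad
\hat{p} = \frac{H}{N}

\text{score} = 100 \cdot
\frac{\hat{p} + \dfrac{z^2}{2N} - z \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{N} + \dfrac{z^2}{4N^2}}}
     {1 + \dfrac{z^2}{N}}
```

Both sums run over every decided prediction from every alias of the wallet, so $N$ is generally not an integer.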
02 · How hard was the call?
When the AI judge decides hit or miss, it also rates how hard the call was. Obvious calls (already true on the day they were locked) count for nothing, so spamming safe predictions cannot raise a score.
| Difficulty | Weight | Meaning |
|---|---|---|
| Obvious | 0.0× | Already true on the day it was locked. Doesn't move the score. |
| Easy | 0.3× | Likely outcome — a safe macro guess or near-term price call. |
| Real call | 1.0× | Genuine uncertainty — could go either way. |
| Bold call | 2.0× | Going against the consensus. The riskiest, worth the most. |
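The table can be read as a simple lookup. A minimal sketch, assuming hypothetical names (`Difficulty`, `DIFFICULTY_WEIGHT`, `difficultyWeight`); only the four multipliers come from the table itself:

```typescript
// Hypothetical type and constant names; the multipliers are the ones in the table.
type Difficulty = "obvious" | "easy" | "real" | "bold";

const DIFFICULTY_WEIGHT: Record<Difficulty, number> = {
  obvious: 0.0, // already true when locked: contributes nothing
  easy: 0.3,    // likely outcome
  real: 1.0,    // genuine uncertainty
  bold: 2.0,    // against the consensus
};

// Returns the multiplier applied to a prediction of the given difficulty.
function difficultyWeight(d: Difficulty): number {
  return DIFFICULTY_WEIGHT[d];
}
```

Because the obvious tier multiplies by zero, such a prediction adds nothing to either the hit total or the attempt total.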
03 · Recent calls count more
Old hits fade. A prediction's contribution to the score halves every 180 days. This is the value uses for its own forecasting tournaments.
| Age | Weight |
|---|---|
| today | 1.000× |
| 90 days ago | 0.707× |
| 180 days ago | 0.500× |
| 360 days ago | 0.250× |
| 540 days ago | 0.125× |
| 720 days ago | 0.063× |
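The weights in the table follow from a plain exponential half-life. A minimal sketch, assuming `SKILL_HALF_LIFE_MS` (named elsewhere on this page) equals 180 days in milliseconds; `recencyFactor` and `MS_PER_DAY` are hypothetical names:

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
// Assumed value: the page states a 180-day half-life.
const SKILL_HALF_LIFE_MS = 180 * MS_PER_DAY;

// Hypothetical helper: weight of a prediction decided `ageMs` milliseconds ago.
function recencyFactor(ageMs: number): number {
  const clamped = Math.max(0, ageMs); // treat future timestamps as "today"
  return Math.pow(0.5, clamped / SKILL_HALF_LIFE_MS);
}

// Print the factor for each age listed in the table.
for (const days of [0, 90, 180, 360, 540, 720]) {
  console.log(`${days} days: ${recencyFactor(days * MS_PER_DAY).toFixed(3)}`);
}
```

The 90-day row is the square root of one half, which is where 0.707 comes from.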
Why fading matters: it makes running many fake accounts expensive. If a wallet creates 10 aliases and abandons 8 of them, the old wins on the abandoned 8 keep fading until they barely matter — pressure to either keep every alias active or watch the score drop.
04 · Worked example
17 decided predictions, computed end-to-end against the real lib/leaderboard constants. Change SKILL_HALF_LIFE_MS in the code and the numbers below update automatically — this card stays in sync with the math.
05 · Who can rank
To appear on the ranked leaderboard, a wallet needs:
- At least 3 decided predictions across all its aliases
- At least 2 harder calls (Real call or Bold call difficulty), which stops anyone from ranking off a single lucky hit
Wallets below the bar show up in the "Provisional" section instead.
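The two thresholds above could be checked along these lines. This is a sketch under assumptions: the type, constant, and function names are hypothetical, and it reads the second rule as counting predictions rated Real call or Bold call.

```typescript
type Difficulty = "obvious" | "easy" | "real" | "bold";

// Hypothetical shape of one decided prediction, pooled across a wallet's aliases.
interface DecidedPrediction {
  difficulty: Difficulty;
  hit: boolean;
}

const MIN_DECIDED = 3;    // at least 3 decided predictions
const MIN_HARD_CALLS = 2; // at least 2 rated real or bold

// True when the wallet qualifies for the ranked leaderboard,
// false when it belongs in the Provisional section.
function isRanked(predictions: DecidedPrediction[]): boolean {
  const hardCalls = predictions.filter(
    (p) => p.difficulty === "real" || p.difficulty === "bold",
  ).length;
  return predictions.length >= MIN_DECIDED && hardCalls >= MIN_HARD_CALLS;
}
```

Hits and misses both count toward the thresholds here, since the rules speak of decided predictions rather than correct ones.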
06 · One wallet, one score — aliases don't help
We compute the score from the sum of weighted hits and attempts across every alias one wallet operates. A wallet with 2 lucky aliases and 8 abandoned losing aliases sees its score dragged down by the 8 losers. Creating extra aliases stops being a strategy.
Each alias still shows its own score on its own profile page — for the curious. But the ranking number is always at the wallet level. The wallet is the anchor.
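Pooling by wallet rather than by alias might be sketched as follows. All names (`WeightedPrediction`, `Totals`, `totalsByWallet`) are hypothetical, and each `weight` is assumed to already hold the difficulty weight multiplied by the recency factor.

```typescript
// Hypothetical record for one decided prediction.
interface WeightedPrediction {
  wallet: string; // the anchor used for ranking
  alias: string;  // display name; ignored when pooling
  weight: number; // difficulty weight × recency factor, computed elsewhere
  hit: boolean;
}

interface Totals {
  hits: number;     // sum of weights over hits
  attempts: number; // sum of weights over all decided predictions
}

// Sums weighted hits and attempts per wallet, across all of its aliases.
function totalsByWallet(predictions: WeightedPrediction[]): Map<string, Totals> {
  const totals = new Map<string, Totals>();
  for (const p of predictions) {
    const t = totals.get(p.wallet) ?? { hits: 0, attempts: 0 };
    t.attempts += p.weight;
    if (p.hit) t.hits += p.weight;
    totals.set(p.wallet, t);
  }
  return totals;
}
```

Since the alias never enters the key, misses on an abandoned alias enlarge the same attempt total that the wallet's successful aliases draw on.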
07 · Frequently asked
Can I see my raw (no-fade) score separately?
Why 180 days specifically?
Does making a new alias help my score?
What about predictions decided years ago — do they count?
Where on the site will I see this score?
/api/wallet/[publisher]/stats.
How is difficulty decided?
08 · Why Wilson, not raw hit-rate?
A wallet with 3 hits out of 3 calls has a 100% hit rate, but that's a tiny sample. The Wilson lower bound at 95% confidence gives a statistically careful answer that grows with sample size: a 3-for-3 profile scores around 30, while a 100-for-100 scores near 96. Forecasters earn the high score by being right consistently, not by getting lucky once.
The formula is Wilson (1927), most famously used for product ratings on Reddit and Yelp. We extend it to non-integer counts (difficulty weights × recency factors produce decimal values) — at hackathon scale the basic formula is close enough; a fully rigorous version would use a .
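The Wilson lower bound applied to non-integer totals could be sketched as below. The function names and the scaling by 100 are assumptions, and `1.96` is the usual approximation of the 95% normal quantile; this is the basic formula with fractional inputs, not a more rigorous variant.

```typescript
const Z_95 = 1.96; // approximate two-sided 95% normal quantile

// Lower end of the Wilson score interval for `hits` successes out of
// `attempts` trials. Both arguments may be fractional weighted sums.
function wilsonLowerBound(hits: number, attempts: number, z: number = Z_95): number {
  if (attempts <= 0) return 0; // no evidence yet
  // Clamp to [0, 1] in case floating-point sums drift slightly out of range.
  const p = Math.min(1, Math.max(0, hits / attempts));
  const z2 = z * z;
  const centre = p + z2 / (2 * attempts);
  const margin = z * Math.sqrt((p * (1 - p)) / attempts + z2 / (4 * attempts * attempts));
  return Math.max(0, (centre - margin) / (1 + z2 / attempts));
}

// Hypothetical wrapper: scale the bound to the 0 to 100 range.
function skillScore(hits: number, attempts: number): number {
  return 100 * wilsonLowerBound(hits, attempts);
}
```

With every weight equal to 1, this sketch gives about 44 for a 3-for-3 profile and about 96 for 100-for-100. Difficulty and recency weights below 1 shrink the effective sample, so a small profile can land lower than the unit-weight figure.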
This Skill Score is one of four V4 anti-gaming pieces. The others:
- By-wallet leaderboard: where the score is ranked
- Wallet provenance footer: shows every alias one wallet owns
- Trust badges: Single · Multi · Churner · Spam