Methodology

How QGI works

QGI reads open data against history. We learn the signal shapes that tend to run ahead of the events that matter, then, across 180+ countries and territories, we measure how closely each country is tracing those shapes right now, and show you the real historical cases it most resembles. Evidence you can open, not a verdict.

The data lake, by the numbers

Updated Apr 29, 2026

Countries & territories: 180+ (scored every release)

Indicators: 109 (across 3 providers)

Browse all indicators →

Indicator providers

World Bank: 69 indicators

V-Dem: 35 indicators

FAOSTAT: 5 indicators

From a pattern to a recipe

Start with a real one. QGI’s pattern detector found that Portugal and Spain moved almost in lockstep across 26 indicators (a Pearson correlation near 0.87) for the half-century after both shed authoritarian rule in the mid-1970s. Each indicator that co-moves between two countries is a single co-movement we call an SCDI; the bundle of them is a pattern.

Portugal ↔ Spain · 26 SCDIs · r ≈ 0.87 · 1976 onward
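To make the idea of a single co-movement concrete, here is a minimal sketch of how one indicator could be tested for a country pair. It is an illustration, not QGI's pipeline code: the function names, the year-to-value dict layout, and the direction of the offset are all assumptions. The 0.5 threshold is the one given in the glossary on this page.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a flat series counts as no co-movement
    return cov / sqrt(vx * vy)

def offset_correlation(series_a, series_b, offset):
    """Correlate A's value in year t with B's value in year t + offset.

    series_a and series_b map year -> value (hypothetical layout).
    Returns None when fewer than two aligned years exist.
    """
    years = sorted(y for y in series_a if (y + offset) in series_b)
    if len(years) < 2:
        return None
    return pearson([series_a[y] for y in years],
                   [series_b[y + offset] for y in years])

def is_scdi(series_a, series_b, offset, threshold=0.5):
    """True when the offset-aligned correlation meets the threshold."""
    r = offset_correlation(series_a, series_b, offset)
    return r is not None and r >= threshold
```

Called once per indicator, country pair, and offset, a check like this yields the individual co-movements that are then bundled into a pattern.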

One pattern is not yet a recipe. QGI gathers many patterns whose countries went on to share the same kind of event, lines them up against the verified record, and distils the movement they had in common. That distilled, canonical shape is a recipe: the fingerprint that tends to run ahead of, say, a democratic transition, a currency crisis, or a coup. The two methods below are the two ways QGI does this at scale.

Explore the patterns we have found →

Five terms used on this page

Event: A historical incident QGI tracks against a country's indicator trajectory: for example, Brazil's 2018 presidential election, the 2008 Icelandic banking collapse, the 2011 Bahraini uprising. Every event is verified against a public source.
Indicator: A single time-series QGI ingests: GDP per capita, V-Dem polyarchy score, Gini coefficient, FAOSTAT food-price index. Currently 109 indicators across 3 sources.
SCDI: Single Correlating Dyadic Index. The atomic unit of co-movement: when country A's trajectory on one indicator matches country B's, year-aligned at offset N, with Pearson correlation ≥ 0.5. Billions of these get computed per pipeline run.
Pattern: A repeated co-movement across multiple indicators, when several SCDIs link the same country pair at the same time offset. Patterns are stronger evidence than any single SCDI.
Recipe: The canonical shape distilled from multiple Patterns. A recipe says: this country's current trajectory resembles the trajectory of a country that went on to experience a given kind of event. Recipes are the basis for every country reading on the live site.
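The step from SCDIs to Patterns described in the terms above can be sketched as a simple grouping. This is a hedged illustration: the `SCDI` record, the `min_indicators` cut-off, and the use of a mean correlation as the pattern's summary are assumptions made for the example, not details stated on this page.

```python
from collections import defaultdict
from typing import NamedTuple

class SCDI(NamedTuple):
    """Hypothetical record for one co-movement."""
    country_a: str
    country_b: str
    indicator: str
    offset: int
    r: float

def group_into_patterns(scdis, min_indicators=2):
    """Bundle SCDIs that share a country pair and time offset.

    A bundle becomes a pattern only when it spans at least
    min_indicators distinct indicators (assumed cut-off).
    """
    buckets = defaultdict(list)
    for s in scdis:
        buckets[(s.country_a, s.country_b, s.offset)].append(s)

    patterns = {}
    for key, members in buckets.items():
        indicators = sorted({m.indicator for m in members})
        if len(indicators) >= min_indicators:
            patterns[key] = {
                "indicators": indicators,
                "mean_r": sum(m.r for m in members) / len(members),
            }
    return patterns
```

Keying on the pair and the offset together reflects the definition above: several indicators have to line up for the same two countries at the same lag before the bundle counts as evidence.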

Two ways we get the insights

Everything QGI does rests on one idea: history rhymes, and the rhyme shows up in the data before the headline. Two methods turn that idea into a reading you can act on.

The Cascade method

Indicators: 109 public, open-data signals

SCDIs: signals that move together across countries

Patterns: countries whose paths rhyme

SCDIs: the signal relationships

Verified events: curated & fact-checked

Recipes: the shape that runs ahead of an event

The Trajectory Engine

A recipe's shape: from the Cascade method

A country's recent signals: how it's moving now

Resemblance score: how closely it's tracing the recipe

The Cascade method

Builds risk shapes from the ground up. Starting from 109 public indicators and decades of curated history, it finds the signal relationships that recur across countries, then the country pairs whose whole trajectories rhyme. Fold those relationships together with verified, fact-checked events and you distil a recipe: the shape a country’s signals tend to trace in the years before a given kind of event.

The Trajectory Engine

Puts those recipes to work. Each quarter it takes a recipe’s shape and a country’s most recent signals and measures how closely the country is tracing that shape now. The result is a resemblance reading, not a forecast: a structured way to say what today rhymes with, alongside the real cases it most echoes.

Inside the Trajectory Engine

For readers who want the machinery, here is what the Trajectory Engine actually does per recipe. Nothing below is a black box; it is a deliberately conservative pipeline built to avoid fooling itself.

Per-recipe model

Each event type gets its own supervised model trained on the countries that did and did not experience it. With a healthy positive base we use gradient-boosted trees (XGBoost); where positives are scarce we fall back to regularised logistic regression, so a thin recipe cannot pretend to a complexity its data will not support.
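The fallback rule can be written down in a few lines. The sketch below is only an illustration of the decision: the page does not say how many positives count as a healthy base, so `min_positive_for_trees` and its default of 30 are hypothetical, as are the function name and the returned labels.

```python
def choose_model_family(n_positive, min_positive_for_trees=30):
    """Pick a model family from the number of positive cases.

    min_positive_for_trees is a made-up cut-off for illustration;
    the real threshold is not stated in the text.
    """
    if n_positive < 0:
        raise ValueError("n_positive must be non-negative")
    if n_positive >= min_positive_for_trees:
        return "gradient_boosted_trees"
    return "regularised_logistic_regression"
```

Making the choice a function of the positive count alone keeps the rule auditable: a thin recipe always gets the simpler model, whatever else is true of its data.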

Greedy indicator selection

Indicators are chosen by a forward walk: start from none, repeatedly add the single indicator that most improves cross-validated separation, and stop when the gains plateau. The walk also picks the window length, in years, over which the trajectory is read. A typical recipe ends up resting on a handful of indicators, not all 109.
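A forward walk of this kind can be sketched generically. The version below assumes a caller-supplied `score` function that returns the cross-validated separation for a list of indicators (including the empty list), and a hypothetical `min_gain` that defines "plateau". It leaves out the window-length search for brevity.

```python
def forward_select(candidates, score, min_gain=0.01, max_features=None):
    """Greedy forward selection.

    candidates: iterable of indicator names.
    score: callable taking a list of names and returning a number
           where higher is better (assumed to be a cross-validated metric).
    min_gain: smallest improvement worth another indicator (assumption).
    Returns (selected_names, final_score).
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)

    while remaining and (max_features is None or len(selected) < max_features):
        # Try each remaining indicator on top of the current set.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top = max(trials, key=lambda t: t[0])
        if top_score - best < min_gain:
            break  # gains have plateaued
        selected.append(top)
        remaining.remove(top)
        best = top_score

    return selected, best
```

Because each step re-scores every remaining candidate, the cost grows with the square of the candidate count, which is manageable for a catalogue of 109 indicators.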

Resemblance scoring

A country’s recent trajectory on the selected indicators is standardised (z-scored against its own history) and compared to the recipe’s canonical shape by cosine similarity, weighted by each indicator’s contribution (SHAP). The closer the match, the higher the resemblance score.
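As a rough illustration of the scoring step, the sketch below z-scores each indicator against the country's own history, then takes a weighted cosine similarity with the recipe shape. The data layout and function names are assumptions, and so is the way the weights enter: here each indicator gets one non-negative weight (for instance a mean absolute SHAP value), which is only one plausible reading of "weighted by each indicator's contribution".

```python
from math import sqrt
from statistics import mean, pstdev

def zscore_recent(history, window):
    """Z-score the last `window` values against the full history."""
    if len(history) < window:
        raise ValueError("history is shorter than the window")
    mu, sigma = mean(history), pstdev(history)
    recent = history[-window:]
    if sigma == 0:
        return [0.0] * len(recent)  # flat history carries no signal
    return [(v - mu) / sigma for v in recent]

def weighted_cosine(a, b, w):
    """Cosine similarity with a non-negative weight per component."""
    dot = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    na = sqrt(sum(wi * ai * ai for wi, ai in zip(w, a)))
    nb = sqrt(sum(wi * bi * bi for wi, bi in zip(w, b)))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)

def resemblance(country_history, recipe_shape, weights):
    """Compare a country's recent trajectory with a recipe shape.

    country_history: indicator -> list of yearly values (oldest first).
    recipe_shape: indicator -> list of z-scores, one per window year.
    weights: indicator -> contribution weight (absolute value is used).
    """
    a, b, w = [], [], []
    for indicator, shape in recipe_shape.items():
        a.extend(zscore_recent(country_history[indicator], len(shape)))
        b.extend(shape)
        w.extend([abs(weights[indicator])] * len(shape))
    return weighted_cosine(a, b, w)
```

Cosine similarity compares direction rather than magnitude, so in this sketch a country tracing the same shape at a smaller amplitude still scores highly.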

Validation, honestly

Every recipe is scored by leave-one-country-out cross-validation, so it is judged on countries it never trained on, with 95% confidence intervals from 2,000 bootstrap resamples. We also check calibration, run-to-run reliability, temporal confounding, and fold stability. Recipes that clear the bar are marked Validated; those that only discriminate are Directional; those that fail are shown as Experimental, never dressed up as more.
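For readers unfamiliar with bootstrap intervals, here is a minimal percentile-bootstrap sketch. What exactly is resampled in QGI's validation is not stated on this page; this example assumes a list of per-country scores and uses the mean as the statistic. The function name and the fixed seed are illustrative choices.

```python
import random
from statistics import mean

def bootstrap_ci(values, stat=mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)  # fixed seed so reruns agree
    n = len(values)
    stats = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = round((1 - level) / 2 * n_resamples)
    hi_idx = min(n_resamples - 1, round((1 + level) / 2 * n_resamples) - 1)
    return stats[lo_idx], stats[hi_idx]
```

A wide interval from a call such as `bootstrap_ci(per_country_scores)` signals that the headline metric depends heavily on which countries happen to be in the sample.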

We use leave-one-country-out rather than random splits on purpose: random test sets leak regional structure and flatter the model. A recipe has to generalise to a country it has never seen, or it does not earn a Validated mark.
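The splitting scheme itself is simple to express. The sketch below assumes each training row is a dict with a `country` key, which is a hypothetical layout chosen for the example.

```python
def leave_one_country_out(rows):
    """Yield (held_out_country, train_rows, test_rows) folds.

    Every row for the held-out country goes to the test set, so the
    model never sees any year of that country during training.
    """
    countries = sorted({r["country"] for r in rows})
    for held_out in countries:
        train = [r for r in rows if r["country"] != held_out]
        test = [r for r in rows if r["country"] == held_out]
        yield held_out, train, test
```

A random split would scatter one country's years across both sets, letting the model recognise the country rather than the shape. Grouping by country removes that shortcut.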

What we read

Two catalogues sit underneath every reading: the indicators we track, and the events we hold them against. Both are public, and both are organised so you can see exactly what goes in.

Indicators

109 open-data indicators from the World Bank, V-Dem and FAOSTAT, spanning the economy, governance, rights, conflict and society.

A filled dot means the indicator currently feeds at least one recipe; a grey dot means it is tracked but not yet used by one.

Browse all indicators →

Events

Every reading is held against real, verified events, organised into a three-layer taxonomy from broad master groups down to specific event types.

Every verified event is mapped to one of 71 canonical leaf categories, organised in 8 master groups and 52 sub-categories.

Counts are verified events only. Categories with zero events have not yet had any verified events curated against them.

Browse all verified events →

How an event earns its place

A recipe is only as trustworthy as the events behind it. Here is how every event earns its place before it can shape a single reading.

Three independent stages sit between finding an event on the web and adding it to the record. The same event, the 2008 Icelandic banking collapse, is shown at every stage.

Stage 1: Fetch

Pull the candidate event from a public source.

An LLM agent does a guided web search across Wikipedia, IMF, HRW, UN, government archives, and news of record. The raw text is captured along with the URL.

Raw fetch from Wikipedia

Country: Iceland

Year: 2008

Body: "In October 2008, the three major Icelandic banks ... were placed in receivership ..."

Stage 2: Curate

Reduce to a structured event matching QGI's schema.

The agent extracts a single canonical title, a year, a category from QGI's 71-category taxonomy, and a one- or two-sentence description. It has no web access during this stage; this is pure structural cleanup.
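A curated record of this kind could be modelled as a small validated type. The sketch below is hypothetical: the class name, the `country` field, and the checks are assumptions, and `KNOWN_CATEGORIES` holds only the one category named on this page as a stand-in for the full 71-category taxonomy.

```python
from dataclasses import dataclass

# Stand-in for the real taxonomy; only this category appears on the page.
KNOWN_CATEGORIES = {"banking_and_financial_crisis"}

@dataclass(frozen=True)
class CuratedEvent:
    """Hypothetical shape of an event after the Curate stage."""
    country: str
    year: int
    title: str
    category: str
    description: str

    def __post_init__(self):
        # Reject records that do not fit the schema.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if self.category not in KNOWN_CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not isinstance(self.year, int):
            raise ValueError("year must be an integer")
```

Validating at construction time means a malformed record fails before it reaches verification, rather than being caught (or missed) later.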

After E1–E5 cleanup rules applied

"year": 2008,

"title": "Banking System Collapses in Three Days",

"category": "banking_and_financial_crisis",

"description": "Glitnir, Landsbanki, Kaupthing into receivership ..."

Stage 3: Verify

Independent web-search pass confirms every event.

A separate LLM agent re-checks each curated event with one or two web queries. It records a verdict (TRUE / FALSE), a one- or two-sentence reason, and the cited source URL. Only TRUE-verdict events land in the data lake.

Verdict written to events/verification/iceland.json

"verdict": "TRUE",

"reason": "Glitnir nationalized 2008-09-29; Landsbanki + Kaupthing 10-07/09 ...",

"source": "https://en.wikipedia.org/wiki/2008–2011_Icelandic_financial_crisis"

Verified events: 10,459 (after the gate)

Pass rate: 96.5%, computed as TRUE / (TRUE + FALSE)

Distinct source URLs: 7,378 (no fabricated links)

Source domains: 785 (Wikipedia among them)

An event is only kept once it has passed verification with a true verdict and a cited public source. Events that fail, or that no source can confirm, are dropped.
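The gate and the pass-rate figure can be sketched as follows. The field names `verdict` and `source` follow the verdict example shown on this page; the function name is hypothetical. Counting a TRUE verdict without a source towards the pass rate but not towards the kept set is an assumption of this sketch.

```python
def apply_gate(verdicts):
    """Keep only verified events and report the pass rate.

    verdicts: list of dicts with "verdict" ("TRUE" or "FALSE") and "source".
    Returns (kept_records, pass_rate), where pass_rate is
    TRUE / (TRUE + FALSE), or None when nothing was decided.
    """
    kept = [
        v for v in verdicts
        if v.get("verdict") == "TRUE" and v.get("source")
    ]
    n_true = sum(1 for v in verdicts if v.get("verdict") == "TRUE")
    n_false = sum(1 for v in verdicts if v.get("verdict") == "FALSE")
    decided = n_true + n_false
    pass_rate = n_true / decided if decided else None
    return kept, pass_rate
```

Records with any verdict other than TRUE are dropped, so an event that no source could confirm never reaches the data lake.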

A worked example

Take the currency-crisis recipe. From the indicators that moved together in the run-up to past currency crises (reserves drawing down, short-term debt climbing, confidence in the currency slipping), the Cascade method distils the shape those crises shared beforehand.

Each quarter the Trajectory Engine asks: which of today’s 180+ countries are tracing that same shape? It ranks every country by resemblance and, for the closest ones, shows the historical cases they most echo, each with the real event that followed and a public source.

You see the working at every step: the signals moving, the canonical shape they are measured against, and the precedents behind the reading.

See the currency-crisis recipe

Browse every recipe →

How confident we are

Not every kind of event leaves the same fingerprint in the data. Some recipes stand out sharply from everything else; others are fainter. Every recipe carries a confidence mark for how clearly its shape separates from the noise, and that mark travels with it, appearing on the recipe itself and on every country reading.

We lead with the cases we are most confident about and keep the fainter ones honestly labelled. You will never see a number presented as more certain than it is.

How to read a score

It is resemblance, not probability. A country near the top of a recipe is tracing a shape that has run ahead of that kind of event before. It is not a prediction that the event will happen.
Higher means look closer. Read a strong score as a reason to open the evidence (the signals, the canonical shape, the historical analogues), not as a verdict.
Weaker matches are flagged. When a resemblance is thin or rests on a small base of cases, we say so on the reading itself.
Every reading is datable. Scores are refreshed every quarter and stamped with their date, so you always know how current a reading is.