Why our income classes are anchored to World Bank poverty lines, not invented thresholds

Every income distribution dataset has to draw lines somewhere. Marginalised, low, middle, high — where exactly does one class end and the next begin? A lot of providers answer that question with a number that looks reasonable but isn't grounded in anything external. We didn't want to be one of them.

Our four income classes are anchored directly to the World Bank's own international poverty lines — the same thresholds used in official poverty statistics published for every country in the world. That means when we say a household is "marginalised," we mean it by the same definition the World Bank itself uses for lower-middle-income-country poverty. Nothing invented, nothing proprietary about the boundary itself — only the methodology for distributing population across those boundaries is ours.

The four thresholds, and where they come from

Our four classes map onto the World Bank's $2.15, $3.65 and $6.85 international poverty lines (the extreme-poverty, lower-middle-income and upper-middle-income thresholds), not an internally invented scale.

In 2017 purchasing-power-parity terms, the World Bank's three global poverty lines are $2.15/day (extreme poverty), $3.65/day (the line typically used for lower-middle-income countries), and $6.85/day (the line typically used for upper-middle-income countries). The lines are revised periodically as new price data are adopted. Our marginalised/low boundary sits at $3.65/day, the same figure the World Bank itself uses to measure poverty in lower-middle-income economies, a group that includes many African countries.

This matters because it means our marginalised class isn't a marketing-friendly label dressed over an arbitrary cutoff — it's the same dollar line a World Bank economist would draw on the same data.

Why this matters for how you use the data

If you're building a segmentation strategy (pricing tiers, distribution footprint, product positioning), you need classes that mean something stable across countries and over time. A threshold that shifts depending on which country you're looking at, or that doesn't correspond to any external standard, makes cross-country comparison unreliable. Anchoring to the World Bank's own published lines means our marginalised class in Rwanda means the same thing as our marginalised class in Tunisia: directly comparable, no hidden adjustment.

This is also why we deliberately kept four classes rather than the finer 6-to-10-band breakdowns some other sources use. Clients building segmentation strategy consistently tell us granular bands create more confusion than insight — four classes map directly onto how go-to-market and pricing decisions actually get made.

A worked example — validating Angola

Methodology claims are only useful if they hold up against real numbers. Here's an actual validation pass we ran on Angola, a useful test case because its 2018 Gini coefficient (51.3, per World Bank) sits almost exactly on our own modelled Gini for 2019 (50.83) — a clean comparison with the input held nearly constant.

Source	Marginalised share (<$3.65/day)	Gini coefficient
World Bank (2018, official survey data)	52.9%	51.3
Pan Africa Data (2019, modelled)	59–69%	50.83

With an almost identical Gini input, our modelled marginalised share runs higher than the World Bank's directly surveyed figure: a gap of roughly 6 to 16 percentage points (59% and 69% against 52.9%), depending on which urban/rural split is examined. That gap is real, and it's worth being transparent about it rather than pretending the model is perfectly calibrated. It reflects a known limitation of lognormal-distribution income models generally: they can run conservative, meaning more pessimistic about poverty, in economies where the actual income distribution isn't a clean lognormal shape.
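For readers who want to see how a lognormal income model turns a Gini coefficient into a poverty share, here is a minimal sketch of the textbook relationship. It is not our production model. It assumes incomes are exactly lognormal, takes the Gini as a fraction (0.5083 rather than 50.83), and uses a purely hypothetical mean income of $5.00/day; the function names are illustrative only.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def lognormal_params(gini: float, mean_income: float) -> tuple[float, float]:
    """Return (mu, sigma) of a lognormal with the given Gini and mean.

    Uses the standard lognormal identity G = 2 * Phi(sigma / sqrt(2)) - 1.
    """
    if not 0.0 < gini < 1.0:
        raise ValueError("gini must be a fraction strictly between 0 and 1")
    sigma = sqrt(2) * _STD_NORMAL.inv_cdf((gini + 1) / 2)
    # The mean of a lognormal is exp(mu + sigma^2 / 2), so solve for mu.
    mu = log(mean_income) - sigma**2 / 2
    return mu, sigma


def share_below_line(gini: float, mean_income: float, line: float) -> float:
    """Share of people with daily income below `line` under the lognormal model."""
    mu, sigma = lognormal_params(gini, mean_income)
    return _STD_NORMAL.cdf((log(line) - mu) / sigma)


# Hypothetical illustration: the mean income of $5.00/day is an assumption,
# not a figure from our database or from the World Bank.
mu, sigma = lognormal_params(gini=0.5083, mean_income=5.00)
share = share_below_line(gini=0.5083, mean_income=5.00, line=3.65)
print(f"implied median income: ${exp(mu):.2f}/day")
print(f"modelled share below $3.65/day: {share:.1%}")
```

Because the whole distribution is pinned down by just two numbers (mean and Gini), any real-world departure from the lognormal shape shows up directly as a difference between the modelled and surveyed headcount.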

The direction of that gap matters. A model that errs modestly on the pessimistic side makes a safer mistake than one that overstates prosperity: for due diligence and risk-assessment use cases especially, understating how wealthy a market looks is the less costly error. We're actively working on tightening this calibration further using the World Bank's published poverty headcounts as ground truth across a wider set of countries, rather than relying on any single internal benchmark.

What "validated against the World Bank" actually means here

We don't claim our model perfectly reproduces World Bank survey results — no model does, since survey data captures real, messy, non-lognormal income distributions that a mathematical model can only approximate. What we do claim is that our class boundaries are the same dollar figures the World Bank itself publishes, and that we test our outputs against those same published figures as part of an ongoing validation process, country by country, rather than asserting accuracy without checking.
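A country-level check of this kind can be sketched in a few lines. The example below is an illustration only, not our validation pipeline: the class and method names are hypothetical, and the only data used are the Angola figures quoted in the worked example (52.9% surveyed, 59% to 69% modelled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadcountCheck:
    """Compare a modelled poverty-share range with a surveyed figure."""

    country: str
    surveyed_pct: float       # World Bank surveyed share below $3.65/day
    modelled_low_pct: float   # lower end of the modelled share
    modelled_high_pct: float  # upper end of the modelled share

    def gap_range(self) -> tuple[float, float]:
        """Gap in percentage points (modelled minus surveyed), low and high."""
        return (
            round(self.modelled_low_pct - self.surveyed_pct, 1),
            round(self.modelled_high_pct - self.surveyed_pct, 1),
        )

    def direction(self) -> str:
        """Describe which way the model errs relative to the survey."""
        low, high = self.gap_range()
        if low > 0:
            return "model more pessimistic than survey"
        if high < 0:
            return "model more optimistic than survey"
        return "survey figure falls inside modelled range"


angola = HeadcountCheck("Angola", surveyed_pct=52.9,
                        modelled_low_pct=59.0, modelled_high_pct=69.0)
print(angola.gap_range())   # (6.1, 16.1)
print(angola.direction())   # model more pessimistic than survey
```

Reporting both the size and the direction of the gap keeps the comparison honest: a country where the model is more optimistic than the survey would be flagged just as clearly as one where it is more pessimistic.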

If a number in our database looks surprising for a country you know well, that's worth investigating — and we'd want to know. Our validation process is continuous, not a one-time check before launch.

Beta is live — public launch Friday 19 June 2026

Income distribution data for all 54 African countries, methodology anchored to World Bank's own poverty lines, validated and transparent about its limitations.

