Methodology

Africa-specific models, not generic extrapolation

Pan Africa Data applies a tiered forecasting framework built specifically for African market dynamics — accounting for structural informality, commodity dependence, data scarcity and market volatility that generic models designed for developed economies ignore.

World Bank WDI
Proprietary forecast models
Africa-specific methodology
Contents
  01 Data sources and licensing
  02 Data pipeline and quality framework
  03 Tiered forecasting framework
  04 Limitations and transparency
01
Data sources and licensing

Pan Africa Data aggregates, cleans and structures data from the World Bank World Development Indicators — the most comprehensive freely licensed macroeconomic and demographic dataset covering all 54 African countries. We do not create the underlying measurements. What we provide is the aggregation, cleaning, gap-filling, Africa-specific forecasting methodology and structured API delivery layer.

For forecast years beyond the World Bank's published data, we apply our proprietary tiered forecasting models (described in section 03), anchored to publicly available macroeconomic projections and structural relationships calibrated for African market dynamics.

| Source | Dataset | Coverage | Licence | Update frequency |
|---|---|---|---|---|
| World Bank (data.worldbank.org) | World Development Indicators (WDI): 90+ macroeconomic, demographic, poverty, financial inclusion and infrastructure indicators | 54 African countries, 2000–2024 | CC-BY 4.0: free to use and redistribute with attribution | Quarterly |
| Pan Africa Data (proprietary forecast models) | Africa-specific tiered forecasting models for years beyond World Bank published data: Gini mean reversion, GDP-anchored consumption, poverty elasticity and trend models | 54 African countries, forecasts 2025–2035 | Proprietary: Pan Africa Data (Pty) Ltd | Quarterly, aligned to World Bank refresh |
Attribution requirement: When using data obtained through Pan Africa Data in publications, reports or presentations, you must include attribution to the World Bank World Development Indicators as the underlying source, as specified in the metadata returned with each API record. The World Bank data is licensed under CC-BY 4.0. Pan Africa Data is not endorsed by or affiliated with the World Bank Group.
02
Data pipeline and quality framework

Every record in the Pan Africa Data database carries a value type flag that tells you exactly what kind of data point you are working with. This is not a quality score — it is a precise description of how that value was obtained or derived. We believe transparent labelling is more useful than opaque quality indices.

🟢
M — Measured
Direct from the World Bank World Development Indicators. Original source values — not modified, interpolated or adjusted in any way. The highest confidence data type.
Highest confidence
🔵
I — Interpolated
Linear interpolation between two measured values. Applied to survey-based indicators (Gini, poverty headcount, financial inclusion) where surveys are conducted every 3–5 years, creating gaps between measured points. Interpolated values are only generated when both endpoints are measured values.
Medium confidence
🟡
CF — Carried Forward
Last known measured value extended forward for years where publication lag exists or where a survey has not yet been updated. Applied conservatively — only where the indicator is structurally stable and unlikely to have changed significantly. Gini coefficients measured within the last 5 years are carried forward rather than modelled.
Medium confidence
🟠
TE — Trend Extrapolation
Historical trend projected forward using our Africa-specific models (see section 03). Unlike simple linear extrapolation, our TE values apply fundamentals-based models that account for GDP growth, regional mean reversion and structural factors specific to African markets. Applied to forecast years 2025–2035 where external projections are not available.
Model-based estimate
🔴
P — External Projection
Projections for GDP growth, inflation, current account and related macroeconomic aggregates sourced from authoritative international institutions. These projections are published periodically and extend approximately 5 years forward. They are used directly in our models without modification.
Authoritative projection
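As a rough illustration of how the M, I and CF flags described above could be assigned, the sketch below fills gaps in a single survey-based series. It is not Pan Africa Data's pipeline code: the function name `fill_gaps`, the dictionary layout and the `last_year` parameter are assumptions made for this example, and the screening for "structurally stable" indicators before carrying forward is left out.

```python
from typing import Dict, Optional, Tuple


def fill_gaps(
    series: Dict[int, Optional[float]], last_year: int
) -> Dict[int, Tuple[float, str]]:
    """Return year -> (value, flag) with flags M, I or CF (hypothetical layout)."""
    # Years with a value are treated as measured; None means no survey that year.
    measured = {year: v for year, v in series.items() if v is not None}
    years = sorted(measured)
    out: Dict[int, Tuple[float, str]] = {y: (measured[y], "M") for y in years}

    # I: linear interpolation, only between two measured endpoints.
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = measured[y0], measured[y1]
        for y in range(y0 + 1, y1):
            frac = (y - y0) / (y1 - y0)
            out[y] = (v0 + frac * (v1 - v0), "I")

    # CF: extend the last measured value up to last_year.
    if years:
        for y in range(years[-1] + 1, last_year + 1):
            out[y] = (measured[years[-1]], "CF")
    return out


# Example: a Gini series surveyed in 2015 and 2019, filled through 2021.
filled = fill_gaps({2015: 40.0, 2017: None, 2019: 44.0}, last_year=2021)
```

In this example 2016 to 2018 are flagged I, and 2020 to 2021 are flagged CF; the measured points are passed through unchanged.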

Our data pipeline runs quarterly following each World Bank data refresh cycle. The pipeline fetches updated source data, recalculates gap-filled values where new measured data has become available, and re-runs forecast models with updated inputs. All historical measured values are preserved — we never overwrite source data with model outputs.
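One way to picture the "never overwrite source data" rule is as a merge with fixed priorities. The sketch below is an assumption about how such a merge might look, not the actual pipeline: `merge_refresh` and its three inputs are hypothetical names, and each mapping uses the illustrative layout year -> (value, flag).

```python
from typing import Dict, Tuple

Record = Tuple[float, str]  # (value, value type flag)


def merge_refresh(
    stored: Dict[int, Record],
    fresh_measured: Dict[int, float],
    modelled: Dict[int, Record],
) -> Dict[int, Record]:
    """Combine one indicator's values after a quarterly refresh."""
    # Lowest priority: gap-filled and forecast values from the re-run models.
    merged: Dict[int, Record] = dict(modelled)

    # Measured values already in the database always beat model outputs.
    for year, (value, flag) in stored.items():
        if flag == "M":
            merged[year] = (value, "M")

    # Highest priority: the latest source values, including source revisions.
    for year, value in fresh_measured.items():
        merged[year] = (value, "M")
    return merged
```

With this ordering a model output can replace an older estimate, but only new source data can change a year that is flagged M.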

03
Tiered forecasting framework

Africa presents forecasting challenges that generic models — designed for data-rich developed economies — handle poorly. Survey-based indicators are measured irregularly. History is short. Structural breaks are common. Markets are shaped by commodity prices, political cycles and demographic transitions that don't follow the same patterns as OECD economies.

Rather than applying a single extrapolation approach to all indicators, we use a five-tier framework that matches the forecasting method to the nature of each indicator.

| Tier | Method | Indicators | Rationale |
|---|---|---|---|
| Tier 1: Direct source | Direct source data | GDP growth, inflation, current account balance, government debt | These indicators have authoritative projections from international institutions. Where available, we use these directly without modification. |
| Tier 2: Structural | Structural relationship model | Household consumption expenditure, poverty headcount ($2.15, $3.65, $6.85), income shares | These indicators have well-established relationships with GDP growth and inequality. We model them using those relationships rather than extrapolating the indicator itself. |
| Tier 3: Mean reversion | Rules-based mean reversion | Gini coefficient, income quintile shares | Inequality indicators are structurally sticky but do drift toward regional averages over time. We apply controlled mean reversion anchored to regional peers. |
| Tier 4: S-curve | Logistic adoption model | Internet penetration, mobile subscriptions, bank account ownership, access to electricity | Technology and infrastructure adoption follows S-curve patterns: slow start, rapid adoption, plateau at saturation. Linear extrapolation overstates growth near saturation. |
| Tier 5: Capped trend | Trend extrapolation with caps | FDI net inflows, trade volumes, remittances | These indicators are more volatile and less amenable to structural modelling. We apply trend extrapolation with hard caps (±30% of the 5-year average) to prevent implausible projections. |
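To make the Tier 5 idea concrete, here is a minimal sketch of trend extrapolation with a hard cap. It reads "±30% of 5-year average" as clamping each projected value to within 30% of the trailing five-year mean; that reading, the least-squares slope and the name `capped_trend` are all assumptions for illustration, not the proprietary model.

```python
from typing import List


def capped_trend(history: List[float], horizon: int, cap: float = 0.30) -> List[float]:
    """Project `horizon` years ahead from annual values (oldest first)."""
    if len(history) < 5:
        raise ValueError("need at least five annual observations")
    recent = history[-5:]
    avg = sum(recent) / 5

    # Least-squares slope over the last five observations (x = 0..4, mean 2).
    numerator = sum((x - 2) * (v - avg) for x, v in enumerate(recent))
    slope = numerator / 10  # sum of (x - 2)^2 for x = 0..4 is 10

    # Band around the five-year average; sorted() keeps it valid if avg < 0.
    low, high = sorted((avg * (1 - cap), avg * (1 + cap)))

    projections = []
    for step in range(1, horizon + 1):
        raw = recent[-1] + slope * step
        projections.append(min(max(raw, low), high))  # clamp to the band
    return projections


# A steadily rising series: the trend continues until it hits the upper cap.
path = capped_trend([100, 110, 120, 130, 140], horizon=3)
```

Here the five-year average is 120, so projections are clamped to the range 84 to 156; the raw trend (150, 160, 170) is cut off at the cap from the second year onward.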
Proprietary model detail: The specific parameters, elasticities, regional calibrations and model logic underpinning our Tier 2–5 forecasts are proprietary to Pan Africa Data (Pty) Ltd. This methodology overview describes the framework and approach at a high level. For questions about our forecasting approach, contact info@panafricadata.com.
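Since the actual parameters are not published, the following is only a generic illustration of what "controlled mean reversion" (Tier 3) can look like: each year the indicator closes a fixed fraction of its gap to a regional anchor. The function name, the `speed` parameter and the numbers in the example are invented for this sketch.

```python
from typing import List


def mean_revert(start: float, anchor: float, speed: float, years: int) -> List[float]:
    """Partial-adjustment path: each year closes `speed` of the gap to `anchor`."""
    if not 0 <= speed <= 1:
        raise ValueError("speed must be between 0 and 1")
    path = []
    value = start
    for _ in range(years):
        value += speed * (anchor - value)  # move part of the way to the anchor
        path.append(value)
    return path


# Hypothetical inputs: a Gini of 60 drifting toward a regional average of 40.
path = mean_revert(start=60.0, anchor=40.0, speed=0.5, years=2)  # [50.0, 45.0]
```

A small `speed` keeps the indicator "sticky", while the anchor prevents the projection from drifting without bound, which is the behaviour the Tier 3 rationale describes.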
Why not just use linear extrapolation? Linear extrapolation assumes the future looks like the recent past — a poor assumption for African markets where structural change, commodity cycles and demographic transitions create non-linear dynamics. A country like Ethiopia growing at 8% annually cannot sustain that rate indefinitely; a country like South Africa with persistently high inequality is unlikely to see rapid Gini improvement without structural change. Our models embed these realities.
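The contrast with linear extrapolation is easiest to see for a saturating indicator such as the Tier 4 adoption series. The sketch below compares a logistic path with a straight line from the same starting point; the ceiling, growth rate and annual gain are illustrative assumptions, not calibrated values.

```python
import math
from typing import List


def logistic_path(current: float, ceiling: float, rate: float, years: int) -> List[float]:
    """Continue a logistic (S-curve) adoption path from the current level."""
    if not 0 < current < ceiling:
        raise ValueError("current must lie strictly between 0 and ceiling")
    # Position on the curve implied by today's level (log-odds of adoption).
    offset = math.log(current / (ceiling - current))
    return [
        ceiling / (1 + math.exp(-(offset + rate * t)))
        for t in range(1, years + 1)
    ]


def linear_path(current: float, annual_gain: float, years: int) -> List[float]:
    """Naive straight-line projection for comparison."""
    return [current + annual_gain * t for t in range(1, years + 1)]


# Hypothetical case: 80% penetration today, 100% ceiling, ten-year horizon.
s_curve = logistic_path(current=80.0, ceiling=100.0, rate=0.3, years=10)
straight = linear_path(current=80.0, annual_gain=4.0, years=10)
```

The logistic path keeps rising but never exceeds the ceiling, whereas the straight line reaches 120% after ten years, an impossible value for a penetration rate.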
04
Limitations and transparency

We believe transparency about limitations is as important as the methodology itself. Clients making investment decisions or policy recommendations need to understand what our data can and cannot tell them.

⚠️
Survey data is not annual
Household surveys (Gini, poverty headcount, financial inclusion) are conducted every 3–5 years in most African countries. Values between survey years are interpolated or modelled estimates, not measurements. Always check the value type flag (M, I, CF, TE) before drawing conclusions.
⚠️
Forecasts are not predictions
Our forecast values (TE, P) represent the most likely trajectory given current information and our modelling assumptions. They do not account for political shocks, conflict, pandemics, commodity price collapses or other structural breaks. All forecasts should be treated as scenarios, not certainties.
⚠️
Data coverage varies by country
Data availability and quality vary significantly across the 54 countries. Large economies (South Africa, Nigeria, Kenya, Egypt) have more measured data points and higher-confidence forecasts than smaller or conflict-affected economies (South Sudan, Eritrea, Somalia). Check the data quality score for each country before use.
⚠️
Revisions are normal
Source data is revised by the World Bank as new information becomes available. Our quarterly refresh incorporates these revisions. Historical values in our database may therefore change between refresh cycles — this is correct behaviour, not an error.
⚠️
PPP comparisons have limits
PPP conversion factors are updated periodically through the International Comparison Program (ICP). Changes in the PPP base year can significantly affect cross-country income comparisons. We use the most recent International Comparison Program round and note the base year in our metadata.
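In practice, the advice above to check the value type flag usually means separating measured values from estimates and forecasts before analysis. A minimal sketch, assuming each record is a dictionary with a `value_type` field (the field name and record layout are hypothetical, not the documented API schema):

```python
from typing import Dict, List

ESTIMATED_FLAGS = {"I", "CF"}   # interpolated or carried forward
FORECAST_FLAGS = {"TE", "P"}    # model-based or external projection


def split_by_value_type(records: List[dict]) -> Dict[str, List[dict]]:
    """Group records into measured, estimated and forecast buckets."""
    groups: Dict[str, List[dict]] = {"measured": [], "estimated": [], "forecast": []}
    for record in records:
        flag = record["value_type"]
        if flag == "M":
            groups["measured"].append(record)
        elif flag in ESTIMATED_FLAGS:
            groups["estimated"].append(record)
        elif flag in FORECAST_FLAGS:
            groups["forecast"].append(record)
        else:
            raise ValueError(f"unknown value type flag: {flag!r}")
    return groups
```

An analysis that must rest on survey evidence alone would then use only the `measured` bucket, while scenario work could draw on all three.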
Questions about our methodology? We welcome scrutiny and feedback. If you identify an error, a better approach or a limitation we haven't addressed, please contact us at info@panafricadata.com. We update this methodology page when our models change.
