| airport-codes | README.md | datapackage.json | scripts/process.py; scripts/airport-codes-flow.py | Reference table of IATA/ICAO airport codes from OurAirports (ourairports.com) | 2025-03-30 | 2026-03-02 | true | Local data is 337 days behind - upstream has 1846 more records (84654 vs 82808 local); daily GH Actions cron has not updated the data since March 2025 | done |
| archive-fivethirtyeight | README.md | not present | scattered per-dataset scripts (e.g. bechdel/analyze-bechdel.R) | Archive of 160+ FiveThirtyEight datasets covering sports forecasts, politics, culture and social issues | 2025-02-24 | 2025-02-25 | true | In sync with upstream but FiveThirtyEight ceased publishing - last update 2025-02-25 (~12 months ago); no further updates expected | done |
| atp-world-tour-tennis-data | README.md | datapackage.json | python/ | ATP World Tour tennis data: tournaments, match scores, stats, rankings and player overviews scraped from atptour.com | 2017-11-20 | 2022-11-20 | true | Local copy is 5 seasons behind upstream (covers only through 2017); upstream itself not updated past 2022; live ATP site returns HTTP 403 | done |
| bond-yields-gov-long-term | README.md | datapackage.json | scripts/process.py | Long-term US government bond yields (10-year Treasury monthly averages) sourced from FRED | 2025-02 | 2026-02 | true | 12 months behind - local ends 2025-02; FRED has data through 2026-02; script has hardcoded dates from last run (Feb 2025) | done |
| bond-yields-uk-10y | README.md | datapackage.json | scripts/bond_uk_flow.py | 10-year nominal UK government bond yields from Bank of England - annual (1984-present) and quarterly (1963-present) | 2025-03-31 | 2025-12-31 | true | Quarterly is 3 quarters behind (missing Q2-Q4 2025); annual is 1 year behind (missing 2025); both BoE endpoints reachable | done |
| bond-yields-us-10y | README.md | datapackage.json | scripts/ | Monthly 10-year US Treasury bond yields from Federal Reserve H.15 (April 1953-present) | 2026-01-01 | 2026-01 | false | Up to date - local matches upstream (latest: 2026-01 at 4.21%); Feb 2026 not yet published by Federal Reserve | done |
| breast-cancer | README.md | datapackage.json | scripts/main.py | Static 1988 Ljubljana breast cancer dataset - 286 instances with 10 categorical clinical attributes; frozen benchmark used in ML research | 2018-01-09 | 2020-11-20 | true | Local (272 rows) diverged from upstream (286 rows) since Jan 2018 deduplication; not re-synced in 8+ years though dataset itself is frozen/no new records | done |
| browser-stats | README.md | datapackage.json | scripts/process.py | Monthly web browser market share (Chrome/Firefox/IE/Safari/Opera) from W3Schools server logs (2002-present) | 2016-05 | 2025-09 | true | ~9 years and 4 months behind - local ends 2016-05; upstream has data through 2025-09; script uses Python 2 syntax and unmaintained libs | done |
| carbondoomsday | README.md | none | src/data/co2ppm.csv.sh | Observable Framework CO2 dashboard - daily Mauna Loa CO2 PPM readings back to 1958; data fetched at build time from datahub.io | N/A (no committed data) | 2025-08-09 | true | No local data committed; datahub.io feed last updated 2025-08-09 (6+ months stale); NOAA authoritative source has data through 2026-02-28 | done |
| cash-surplus-deficit | README.md | datapackage.json | scripts/process.py | Cash surplus/deficit as % of GDP from World Bank WDI Archives (1972-2014; ~190 countries); retired/frozen indicator | 2014 | 2014 | false | Up to date - local matches upstream exactly; indicator retired to frozen archive (WDI DS57) with no data beyond 2014 | done |
| cervical-cancer | README.md | datapackage.json | scripts/main.py | Static cohort of 858 patients from Hospital Universitario de Caracas - risk factors for cervical cancer (UCI ML dataset #383) | 2025-01-29 | 2024-03-10 | false | Up to date - local (858 rows last synced 2025-01-29) matches upstream exactly; static historical dataset with no ongoing feed | done |
| cirp-survey-of-freshmen | README.md | datapackage.json | scripts/process.py | CIRP 2014 Freshman Survey - student opinions and institution info extracted from UCLA/HERI annual PDF monograph | 2014 | 2024 | true | 10 survey years behind - only covers Fall 2014; HERI publishes annually with Fall 2024 available (Feb 2025); extraction script hardcoded to 2014 PDF layout | done |
| clinical-trials-us | README.md | none | extract.js | US clinical trials from ClinicalTrials.gov - XML records for human participant studies; ~139848 records as of 2013 | 2013-01-31 | 2026-03-02 | true | 13+ years behind - local pulled Feb 2013 (~139848 studies); ClinicalTrials.gov now has 573913 studies updated daily; legacy ct2 XML bulk download retired | done |
| co2-fossil-by-nation | README.md | datapackage.json | scripts/file.py | Per-country annual CO2 emissions from fossil fuels and cement (1751-present) from CDIAC-FF at Appalachian State University | 2020 | 2022 | true | 2 years behind - local ends at 2020; CDIAC-FF has published nation.1751_2021.xlsx (Oct 2025) and nation.1751_2022.xlsx (Nov 2025) | done |
| co2-fossil-global | README.md | datapackage.json | none | Annual global CO2 emissions from fossil fuels and cement (1751-present) from CDIAC NDP-030 | 2010 | 2023 | true | 13 years behind - local ends 2010; GCB2024 has data through 2023; original CDIAC URLs dead (HTTP 530); no update scripts | done |
| co2-ppm | README.md | datapackage.json | scripts/process.sh | Monthly and annual atmospheric CO2 (ppm) from NOAA GML - Mauna Loa (1958-present) and global average marine surface sites | 2025-11 | 2026-01 | true | co2-mm-mlo.csv is empty (header only - data missing); upstream MLO has data through 2026-01; FTP source reachable; other 5 files are current | done |
| co2-ppm-daily | README.md | datapackage.json | scripts/co2-ppm-daily-flow.py | Daily CO2 PPM from Scripps Institution of Oceanography - Mauna Loa in-situ measurements (1958-present) | 2025-08-09 | 2025-11-29 | true | ~112 days behind - local ends 2025-08-09; Scripps upstream has valid data through 2025-11-29; GH Actions automation stopped ~Aug 2025 | done |
| cofog | README.md | datapackage.json | scripts/ | UN Classification of the Functions of Government (COFOG 1999) - 188 hierarchical government function codes with multilingual descriptions | N/A | N/A | false | Not stale - stable reference classification; COFOG 1999 is still the only official version; all 188 codes match upstream exactly | done |
| commodity-prices | README.md | datapackage.json | scripts/process.py | Monthly prices for ~53 commodities and 10 indices from IMF Primary Commodity Price System (1980/1990-present) | 2025-09-01 | 2026-01-01 | true | 4 months behind - local ends 2025-09; IMF upstream has through 2026-01; source2 .ashx URL broken (redirects to PDF now) | done |
| corruption-perceptions-index | README.md | data/datapackage.json | script/corruption-perceptions-index-dataflows.py | Transparency International CPI - country corruption perceptions rankings (1995-2017; 252 countries) | 2017 | 2025 | true | 8 years behind - local ends 2017; TI published CPI 2025 on 2026-02-04; legacy script URLs broken (HTTP 403/connection refused) | done |
| country-codes | README.md | datapackage.yml | scripts/ | Comprehensive country codes combining ISO 3166/4217, ITU, UNTERM, CLDR, GeoNames, M49 and more (~250 countries) | 2026-01-01 | 2026-03-02 | true | ~60 days behind - Feb 2026 CI run missing; CLDR pinned to Sep 2024 commit (2 major versions behind); GeoNames updated 2026-03-02; SEC EDGAR returns 403 | done |
| country-list | README.md | datapackage.yml | none | ISO 3166-1-alpha-2 English country names and 2-letter codes - 249 entries as of Dec 2012 | 2012-12 | 2025-08 | true | Stale since 2012 - at least 3 outdated names (Cape Verde → Cabo Verde 2013; Czech Republic → Czechia 2016; Macedonia → North Macedonia 2019); 911 ISO updates published since; no update scripts | done |
| covid-19 | README.md | datapackage.json | scripts/ | COVID-19 confirmed cases/deaths/recoveries worldwide from JHU CSSE - country and US county/state level time series | 2022-04-16 | 2023-03-09 | true | 11 months behind upstream; JHU CSSE repo permanently archived (stopped 2023-03-10); no future updates possible; dataset is concluded historical archive | done |
| cpi | README.md | datapackage.json | scripts/cpi2datapackage.py | Annual Consumer Price Index (CPI) for most countries from World Bank indicator FP.CPI.TOTL.ZG | 2024 | 2024 | true | Local snapshot from 2025-06-05 is ~9 months old; upstream (updated 2026-02-24) now has 219 countries with 2024 data vs 123 locally - 96 countries missing | done |
| cpi-change | README.md | datapackage.json | scripts/process.py | Annual CPI percent changes for US food categories (1974-present) from USDA ERS Food Price Outlook - 16 categories | 2024 | 2024 | false | Data content up to date - local matches upstream (both end at 2024; 2025 not yet published); datapackage.json metadata stale (claims 1974-2016/2017) | done |
| cpi-gb | README.md | datapackage.json | scripts/process.py | UK CPI and RPI inflation from ONS (1800-present) - monthly from 1947 and annual; CDKO and CDSI series | 2026-01 | 2026-01 | false | Up to date - local matches upstream ONS release of 2026-02-18; next release scheduled 2026-03-25 | done |
| cpi-us | README.md | datapackage.json | scripts/process.py | Monthly US CPI-U from BLS (Jan 1913-present) - index value and month-over-month inflation rate; series CUUR0000SA0 | 2023-12-01 | 2026-01-01 | true | ~25 months behind - local ends Dec 2023; BLS upstream has through Jan 2026; needs re-run with BLS_API_KEY | done |
| crime-uk | README.md | none | scripts/scrape.js | UK monthly crime statistics from data.police.uk - only static forces list committed; actual crime records not stored in repo | 2010 | 2026-01 | true | No crime records committed; only 2010 population data; upstream has data through Jan 2026 (15+ year gap); scraping script uses deprecated Node APIs | done |
| currency-codes | README.md | datapackage.json | scripts/runall.sh | ISO 4217 currency codes - current (Table A.1) and historic (Table A.3) from SIX Group on behalf of ISO | 2024-09-01 | 2026-01-01 | true | ~16 months behind - local archive XMLs from 2024-06-25 / 2024-09-01; upstream SIX Group published 2026-01-01 | done |
| dac-and-crs-code-lists | README.md | datapackage.json | scraper.py | OECD DAC/CRS aid reporting code lists - 16 CSVs parsed from OECD XLS/XML at webfs.oecd.org | 2025-01-08 | 2026-02-27 | true | ~13 months behind - local last updated 2025-01-08; upstream XLS updated 2026-02-27; XML updated 2025-03-12 | done |
| edgar | README.md | none | scripts/edgar.py | SEC EDGAR documentation scaffold - describes access to filing indices and DERA XBRL financial datasets; no actual data downloaded | 2016-Q4 | 2026-02-27 | true | No data ever committed; documentation references 2016 Q4 as recent; EDGAR live through 2026-02-27; old DERA ZIP URL pattern deprecated (HTTP 404) | done |
| employment-us | README.md | datapackage.json | scripts/process.py | US employment and unemployment rates (1940-present) from BLS CPS table cpsaat01 - annual civilian noninstitutional population statistics | 2024 | 2025 | true | 1 year behind - local ends 2024; BLS upstream has 2025 annual averages (4.3% unemployment); last commit 2025-04-04 | done |
| eu-emissions-trading-system | README.md | datapackage.json | scripts/process.py | EU ETS greenhouse gas emissions and allowances by country/sector/year (2005-2024) from EEA/EUTL | 2024 | 2024 | false | Data content current (both end at 2024); but update pipeline broken - download URL returns HTTP 404; README says 2005-2014 (should be 2005-2024) | done |
| euribor | README.md | datapackage.json | scripts/scrap_euribor.py | Monthly Euribor benchmark interest rates (1999-present) across 8 maturities sourced from euribor-rates.eu | 2026-02-02 | 2026-02-02 | true | 3m/6m/12m files missing Jan-Feb 2026 data; 1m and 1w files are current; upstream has rates for all 5 maturities through 2026-02-02 | done |
| exchange-rates | README.md | datapackage.json | scripts/main.py; exchange_rates_flow.py | Foreign exchange rates (~35 currencies vs USD) from FRED - daily/monthly/annual granularities | 2018-10-12 | 2026-02-27 | true | ~7 years behind - local ends 2018-10-12; FRED upstream through 2026-02-27; FRED .txt URLs now redirect to HTML login (scripts broken) | done |
| exchange-rates-usd | README.md | not present | run.py | Historical USD exchange rates from Federal Reserve FRED H.10 - consolidated CSV from bulk FRED download | none (no data committed) | 2026-02-27 | true | No data ever committed; original source ZIP returns HTTP 403; script is Python 2 only; Fed H.10 release is live through 2026-02-27 | done |
| expenditure-on-research-and-development | README.md | datapackage.json | scripts/process.py | Gross Domestic Expenditure on R&D by country and funding source from UNESCO UIS (1996-present) | 2016 | 2023 | true | 7 years behind - local ends 2016; UNESCO UIS Feb 2026 release has data through 2023; original URL now redirects to new portal | done |
| fertility | README.md | datapackage.json | scripts/main.py | Static UCI ML dataset #244 - 100 male fertility diagnosis instances with lifestyle/medical attributes (donated 2013) | 2018-01-03 | 2013-01-16 | false | Not stale - static one-time research dataset; no upstream updates since 2013 donation; local matches upstream exactly | done |
| finance-vix | README.md | datapackage.yaml | Makefile | CBOE VIX daily time-series (open/high/low/close) from CBOE CDN (1990-present) - updated daily via curl | 2026-02-27 | 2026-02-27 | false | Up to date - local matches upstream; 2026-02-27 is last trading day (today is Monday 2026-03-02) | done |
| gdp | README.md | datapackage.json | scripts/process.py | Country/regional/world GDP in current USD from World Bank indicator NY.GDP.MKTP.CD (1960-present) | 2023 | 2024 | true | 1 year behind - local ends 2023; World Bank has 2024 data (updated 2026-02-24); datapackage.json shows 2026-02-24 but CSV was not updated | done |
| gdp-uk | README.md | datapackage.json | scripts/process.sh | UK Real GDP (1948-present) from ONS - GVA chained volume measures seasonally adjusted; annual and quarterly | 2024-Q3 | 2025-Q3 | true | data/data.csv is empty (header only); archive has 2024-Q3; ONS upstream has 2025-Q3 (released 2025-12-22); script writes to non-existent source/ dir | done |
| gdp-us | README.md | datapackage.json | scripts/process.py | US GDP nominal and real - annual (1929-present) and quarterly (1947-present) from BEA in current and chained 2017 dollars | 2024-10-01 | 2025-10-01 | true | 4 quarters behind quarterly (ends Q4 2024; BEA has Q4 2025); 2 years behind annual (ends 2023; BEA has 2025); BEA XLS URLs broken (return HTML) | done |
| genome-sequencing-costs | README.md | datapackage.json | scripts/process.py | NHGRI DNA sequencing cost per megabase and per genome (2001-present) from National Human Genome Research Institute | 2022-05 | 2022-05 | true | Local matches upstream but both stale - NHGRI has not published newer data since May 2022; ~3.8 years behind today | done |
| gini-index | README.md | datapackage.json | scripts/process.py | GINI Index from World Bank WDI (SI.POV.GINI) - long-format CSV with country/year/value | 2024 | 2024 | true | WDI updated 2026-02-24 (~8mo after last local fetch 2025-07-01); upstream has more 2024 country records (13 vs 2 locally); CI automation stalled | done |
| glacier-mass-balance | README.md | datapackage.json | scripts/process.py | Average cumulative mass balance of reference glaciers worldwide (1945-present) in meters water equivalent from EPA/WGMS | 2023 | 2023 | true | Original EPA source URL returns HTTP 404 (site reorganized); EPA replaced indicator with Arctic-only scope; WGMS has 2024 data (most negative year on record) not yet in pipeline | done |
| global-temp | README.md | datapackage.json | scripts/process.py | Global temperature anomalies from GISTEMP (NASA/1880-present) and HadCRUT5 (Met Office/1850-present) - monthly and annual | 2024-07 | 2026-01 | true | GISTEMP missing 2024-2025 annual and 25 months monthly; HadCRUT5 missing 2025 annual and 7 months monthly; all upstream URLs reachable | done |
| collective | README.md | N/A | N/A | DataHub community documentation wiki - Markdown notes, blog posts and codex guides; no structured dataset files | 2026-02-18 | 2026-02-18 | false | Not a dataset repo - documentation/wiki only; in sync with upstream (latest commit 2026-02-18) | done |
| london-population | README.md | datapackage.json | scripts/population.py | London population | 2027-01-01 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403); https://data.london.gov.uk/download/projections/b9daabe6-6dd3-4082-adcf-a6b458ef4945/central_trend_2016_base.xlsx (HTTP 403) | done |
| london-transport | README.md | data/datapackage.json | scripts/london_public_transport_journeys.py | London public journeys by type of transport | 2024-07-20 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-underground-report | README.md | data/datapackage.json | scripts/london_underground.py | London underground performance | 2018-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-unemployment | README.md | data/datapackage.json | scripts/london_unemployment.py | London unemployment rate | 2024-11-01 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| lymph | README.md | datapackage.json | scripts/main.py | Lymph | | 1988-12-31 | true | No local date detected, while upstream has data through 1988-12-31 | done |
| media-types | README.md | datapackage.json | scripts/process.py | List of Media Types, Media Subtypes, and their extensions. | 2018-12-31 | 2018-12-31 | false | Local matches upstream at 2018-12-31 | done |
| membership-to-copyright-treaties | README.md | datapackage.json | scripts/process.py | Membership to Copyright Treaties | 2025-11-22 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| nasdaq-listings | README.md | datapackage.json | scripts/process.py | Nasdaq Listings | 2027-12-31 | 2027-12-31 | false | Local matches upstream at 2027-12-31 | done |
| natural-gas | README.md | datapackage.json | scripts/process.py | Natural gas prices | 2026-02-23 | 2026-02-25 | true | Local latest 2026-02-23 is 2 days behind upstream 2026-02-25 | done |
| nyse-other-listings | README.md | datapackage.json | scripts/process.py | NYSE and Other Listings | 2027-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| oil-prices | README.md | datapackage.json | oil_prices_flow.py; Makefile | Brent and WTI Spot Prices | 2026-02-23 | 2026-02-25 | true | Local latest 2026-02-23 is 2 days behind upstream 2026-02-25 | done |
| openflights | README.md | not present | none | Code base for [OpenFlights](http://openflights.org), a tool that lets you map your flights around the world | 2011-09-30 | | true | Could not compute upstream max date from reachable sources; probe failures: http://openflights.org/data.html (HTTP 404) | done |
| openml-datasets | not present | not present | none | No description found | 2027-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| opented | README.md | not present | scripts/extract.py; scripts/scrape.js | No description extracted (README begins with a datahub.io badge) | | 2011-12-31 | true | No local date detected, while upstream has data through 2011-12-31 | done |
| owid-datasets | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2027-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/core/owid-datasets (HTTP 500) | done |
| pharmaceutical-drug-spending | README.md | datapackage.json | scripts/population.py; scripts/process.py | Pharmaceutical Drug Spending by countries | 2023-12-31 | 2023-12-31 | false | Local matches upstream at 2023-12-31 | done |
| political_geography | README.md | not present | Makefile | No description extracted (README begins with a datahub.io badge) | | | true | Could not compute upstream max date from reachable sources; probe failures: https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py (HTTP 404) | done |
| polls-us-presidential | README.md | not present | none | Only "open" source i could find: | 2017-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| population | README.md | datapackage.json | scripts/process.py | Population figures for countries, regions (e.g. Asia) and the world. Data comes originally from World Bank and has been converted into standard CSV. | 2023-12-31 | 2026-02-24 | true | Local latest 2023-12-31 is 786 days behind upstream 2026-02-24 | done |
| population-city | README.md | datapackage.json | scripts/process.py | Yearly city population time series for females, males, and both sexes, collected by the United Nations Statistics Division and published by UNdata on 22 Dec 2014 (next UNdata update estimated Jun 2015; periodicity: biannual). Contains two CSV datasets: unsd-citypopulation-year-both.csv (2.4 MB) and unsd-citypopulation-year-fm.csv (3.7 MB); the final 222 lines of both contain the original notes. | 2025-02-24 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.un.org/Handlers/DownloadHandler.ashx (HTTP 500) | done |
| population-global-historical | README.md | datapackage.json | scripts/consolidate.py | Global Historical Population | 2000-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| population-growth-estimates-and-projections | README.md | datapackage.json | population_estimates_flow.py | Population Growth | 2027-12-31 | 2026-01-06 | false | Local latest 2027-12-31 is ahead of inferred upstream 2026-01-06 (check source parsing) | done |
| population-reference-bureau | README.md | not present | scripts/__init__.py; scripts/settings.py | No description extracted (README begins with a datahub.io badge) | 2019-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| ppp | README.md | datapackage.json | scripts/process.py; Makefile | Data are sourced from the World Bank, International Comparison Program database. One dataset is provided: PPP conversion factor, GDP (LCU per international $). | 2024-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://api.worldbank.org (HTTP 404); http://www.worldbank.org}country (network error: [Errno -2] Name or service not known) | done |
| primary-tumor | README.md | datapackage.json | scripts/main.py | Primary tumor | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| product-data | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2027-12-31 | 2024-07-10 | false | Local latest 2027-12-31 is ahead of inferred upstream 2024-07-10 (check source parsing) | done |
| publicbodies | README.md | data/datapackage.json | scripts/deploy/prepare_build.py; scripts/import/br/import_br.py; scripts/import/it/import_it.py; scripts/maintenance/domain_to_url.py; scripts/maintenance/se/generate_ids.py; scripts/maintenance/se/simpleslugger.py; scripts/migrate/process.py | A database of public bodies (or organizations) around the world, such as government departments, ministries etc. | | 2026-02-18 | true | No local date detected, while upstream has data through 2026-02-18 | done |
| race-and-ethnicity-codes-us | README.md | datapackage.json | none | This data set contains Race/Ethnicity codes. | 2016-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| reference-staging | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2011-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/core/reference-staging (HTTP 404) | done |
| rio2016 | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2002-09-12 | 2016-12-31 | true | Local latest 2002-09-12 is 5224 days behind upstream 2016-12-31 | done |
| s-and-p-500 | README.md | datapackage.json | scripts/date_utils.py; scripts/process.py; scripts/test_data.py; scripts/test_process.py; scripts/update_from_fred.py; Makefile | Standard and Poor's (S&P) 500 Index Data including Dividend, Earnings and P/E Ratio | 2026-02-11 | 2026-02-27 | true | Local latest 2026-02-11 is 16 days behind upstream 2026-02-27 | done |
| s-and-p-500-companies | README.md | datapackage.yaml | scripts/scrape.py; Makefile | No description extracted (README begins with a datahub.io badge) | 2025-07-23 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.spindices.com/indices/equity/sp-500 (HTTP 403); http://data.okfn.org/data/s-and-p-500 (network error: [Errno -2] Name or service not known) | done |
| s-and-p-500-companies-financials | README.md | datapackage.json | scripts/constituents-financials.py; scripts/constituents.py; scripts/validate.py | S&P 500 Companies with Financial Information | | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.spindices.com/indices/equity/sp-500 (HTTP 403); http://www.spindices.com/documents/additional-material/sp-500-eps-est.xlsx?force_download=true (HTTP 403); http://us.spindices.com/idsexport/file.xls?hostIdentifier=48190c8c-42c4-46af-8d1a-0cd5db894797&selectedModule=Constituents&selectedSubModule=ConstituentsFullList&indexId=340 (network error: [Errno -2] Name or service not known) | done |
| sea-level-rise | README.md | datapackage.json | scripts/ftp_download.py; scripts/process.py; Makefile | This data contains cumulative changes in sea level for the world’s oceans since 1880, based on a combination of long-term tide gauge measurements and recent satellite measurements. | 2023-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www3.epa.gov/climatechange/images/indicator_downloads/sea-level_fig-1.csv (network error: timed out); http://www3.epa.gov/climatechange/science/indicators/oceans/sea-level.html (network error: timed out); http://www.springerlink.com/content/h2575k28311g5146/ (network error: [Errno -2] Name or service not known) | done |
| seismic-bumps | README.md | datapackage.json | scripts/main.py | Seismic bumps | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| seshat | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2016-12-31 | 2024-06-30 | true | Local latest 2016-12-31 is 2738 days behind upstream 2024-06-30 | done |
| smdg-master-terminal-facilities-list | README.md | datapackage.json | scripts/process.py | Code list for terminal facilities built as an extension to UN/LOCODE, sourced from https://github.com/smdg-org/Terminal-Code-List | 2025-12-31 | 2026-12-31 | true | Local latest 2025-12-31 is 365 days behind upstream 2026-12-31 | done |
| socrata-opendata | README.md | not present | none | No description extracted (README begins with a datahub.io badge) | 2017-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://opendata.socrata.com/ (redirected to login) | done |
| speed-dating | README.md | datapackage.json | scripts/main.py | Speed dating | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| threatened-species | README.md | datapackage.yaml | scripts/data.py; Makefile | No description extracted (README begins with a datahub.io badge) | | | true | Could not compute upstream max date from reachable sources; probe failures: https://apiv3.iucnredlist.org/api/v3 (HTTP 525) | done |
| tic-tac-toe | README.md | datapackage.json | scripts/main.py | Tic Tac Toe Endgame | | | true | Could not compute upstream max date from reachable sources; probe failures: https://archive.ics.uci.edu/ml/datasets/Tic-Tac-Toe+Endgame (HTTP 404) | done |
| top-level-domain-names | README.md | datapackage.json | scripts/process.py | This Data Package contains the delegation details of top-level domains | 2012-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| uk-sic-2007-condensed | README.md | datapackage.json | none | UK Condensed SIC 2007 | 2007-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.ons.gov.uk/ons/site-information/information/creative-commons-license/index.html (HTTP 405); https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/376462/condensedSICList.pdf (HTTP 410); https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| unece-package-codes | README.md | datapackage.json | none | Coded representations of the package type names used in International Trade (UNECE/CEFACT Trade Facilitation Recommendation No.21) | 2002-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://unece.org/trade/uncefact/cl-recommendations (HTTP 403); http://opendatacommons.org/guide/#sthash.97PSVxmh.dpuf (HTTP 404) | done |
| unece-units-of-measure | README.md | datapackage.json | none | UNECE Units of measure | | 2026-01-30 | true | No local date detected, while upstream has data through 2026-01-30 | done |
| unicode-characters | README.md | datapackage.json | scripts/process.py | UnicodeData.txt is a reference file from the Unicode Consortium containing metadata for every Unicode character, including code points, names, categories, and properties, used to support Unicode-compliant text processing. | | 2025-08-16 | true | No local date detected, while upstream has data through 2025-08-16 | done |
| un-locode | README.md | datapackage.json | scripts/download_loc.py; scripts/integrate.py; scripts/prepare.py; scripts/prepare_edition_mdb.sh; Makefile | UN-LOCODE Codelist | 2024-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.unece.org/cefact/locode/welcome.html (HTTP 403); https://www.unece.org/cefact/codesfortrade/codes_index.html (HTTP 403); https://unece.org/trade/cefact/UNLOCODE-Download (HTTP 403) | done |
| usa-education-budget-analysis | README.md | datapackage.json | scripts/process.py | United States of America education budget analysis | 2027-12-31 | 2023-12-31 | false | Local latest 2027-12-31 is ahead of inferred upstream 2023-12-31 (check source parsing) | done |
| world-cities | README.md | datapackage.json | scripts/process.py | List of the world's major cities (above 15,000 inhabitants) | | 2000-12-31 | true | No local date detected, while upstream has data through 2000-12-31 | done |
| world-development-indicators | README.md | not present | scripts/extractFeaturedWorldBankDatasets.py; scripts/gen-indicators-list.sh; scripts/get.py | No description found | 2024-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/core/world-development-indicators/$given_folder/$dirname (HTTP 404); https://api.worldbank.org/v2/indicator/%s?format=json (HTTP 400); https://api.worldbank.org/v2/en/indicator/%s?downloadformat=csv (HTTP 400) | done |
| world-happiness-report | README.md | not present | none | [View on datahub.io](https://datahub.io/core/world-happiness-report) | | 2024-07-05 | true | No local date detected, while upstream has data through 2024-07-05 | done |
| world-religion-projections | README.md | datapackage.json | none | This dataset contains the estimated religious composition of 198 countries and territories for 2010 | 2020-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| world-wealth-and-income-database | README.md | datapackage.json | none | The World Top Incomes Database | 2012-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://g-mond.parisschoolofeconomics.eu/topincomes/ (network error: [Errno -2] Name or service not known); http://g-mond.parisschoolofeconomics.eu/topincomes (network error: [Errno -2] Name or service not known); https://commondatastorage.googleapis.com/ckannet-storage/2012-01-07T150817/data.csv (HTTP 403) | done |
| zopa | README.md | datapackage.json | none | ZOPA Market Interest Rate and Risk Data | 2011-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://uk.zopa.com/ZopaWeb/public/lending/meet-dave.html (network error: [Errno -5] No address associated with hostname) | done |
| collective | README.md | not present | none | We are a collective creating high-quality open data. We curate, publish, and maintain datasets together -- and build the tooling to support that. We also just hang out and [chat about data stuff](https://discord.gg/8KvAeFV). | | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/notes (HTTP 404) | done |
| commons | README.md | not present | none | The awesome section presents collections of high quality datasets organized by topic. | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| continent-codes | README.md | datapackage.json | none | List of continent codes | 2016-03-25 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| core-datasets | README.md | datapackage.json | none | Registry of published datasets in the Core Datasets Project | 2018-04-01 | | true | Could not compute upstream max date from reachable sources; probe failures: http://frictionlessdata.io/guides/tabular-data-package/ (HTTP 404); https://github.com/datasets/registry/issues/new (redirected to login) | done |
| crunchbase-data | readme.md | not present | none | This data was extracted from the December 4, 2015 [Crunchbase Data Export](http://info.crunchbase.com/about/crunchbase-data-exports/). | 2016-12-31 | 2026-02-27 | true | Local latest 2016-12-31 is 3345 days behind upstream 2026-02-27 | done |
| crunchcrawl | README.md | not present | none | [View on datahub.io](https://datahub.io/core/crunchcrawl) | 2024-12-31 | 2026-03-02 | true | Local latest 2024-12-31 is 426 days behind upstream 2026-03-02 | done |
| datacatalogs.org | not present | not present | process.py | No description found | | | true | Could not compute upstream max date from reachable sources; probe failures: http://api.geonames.org/searchJSON?maxRows=1&username=%s&q= (HTTP 401); http://datacatalogs.org/api/search/dataset?q=&limit=500&all_fields=1 (HTTP 404); http://datahub.io/api/data/39317285-d0e8-4dad-9e5d-f064100132c9 (HTTP 404) | done |
| datapackage-factory-openMV | README.md | not present | none | [View on datahub.io](https://datahub.io/core/datapackage-factory-openMV) | 2012-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| datasets.github.com | not present | not present | none | No description found | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| dermatology | README.md | datapackage.json | scripts/main.py | Dermatology | | 1998-12-31 | true | No local date detected, while upstream has data through 1998-12-31 | done |
| diagnosed-diabetes-prevalence | README.md | datapackage.json | none | This dataset contains the number and percentage of diabetes patients in the US during 2013, grouped by ZIP code. Although the prevalence and incidence of diabetes have increased in the United States in recent decades, no studies have systematically examined long-term, national trends in the prevalence and incidence of diagnosed diabetes. The prevalence of diabetes increased substantially between 2000 and 2007, mainly because there are more patients with a new | 2016-01-26 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| eeg-eye-state | README.md | datapackage.json | scripts/main.py | EEG Eye State | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| emojis | README.md | datapackage.json | scripts/helpers/__init__.py; scripts/helpers/helpers.py; scripts/process.py | UTS #51 Unicode Emoji | 2025-12-31 | 2025-12-31 | false | Local matches upstream at 2025-12-31 | done |
| fips-10-4 | README.md | datapackage.json | scripts/process.py | Region codes of countries, dependencies, areas of special sovereignty, and their principal administrative divisions | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| five-thirty-eight-datasets | README.md | not present | none | [View on datahub.io](https://datahub.io/core/five-thirty-eight-datasets) | 2027-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| football-datasets | README.md | not present | scripts/package.py; scripts/process.py | [View on datahub.io](https://datahub.io/collections/football) | 2021-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/core/football-datasets (HTTP 404) | done |
| gcat-artificial-space-objects | README.md | not present | none | https://planet4589.org/space/gcat/ | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| geo-admin1-us | README.md | datapackage.json | none | Natural Earth Polygons for the states in the United States of America | | 2026-03-03 | true | No local date detected, while upstream has data through 2026-03-03 | done |
| geo-boundaries-us-110m | README.md | datapackage.json | none | geodata data package providing geojson polygons for United States' Internal Administrative Boundaries | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| geo-boundaries-world-110m | README.md | datapackage.json | none | geodata data package providing geojson polygons for all the world's countries | | 2026-03-03 | true | No local date detected, while upstream has data through 2026-03-03 | done |
| geo-countries | README.md | datapackage.json | Makefile | geodata data package providing geojson polygons for all the world's countries | | 2026-03-03 | true | No local date detected, while upstream has data through 2026-03-03 | done |
| geodata | README.md | not present | none | [View on datahub.io](https://datahub.io/core/geodata) | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| geoip2-ipv4 | README.md | datapackage.json | scripts/process.py | IPv4 geolocation | | | true | Could not compute upstream max date from reachable sources; probe failures: https://download.maxmind.com/geoip/databases/GeoLite2-Country-CSV/download?suffix=zip (HTTP 401) | done |
| geo-ne-admin1 | README.md | datapackage.json | scripts/process.py | Polygons for the largest administrative subdivisions in every country | | 2026-03-03 | true | No local date detected, while upstream has data through 2026-03-03 | done |
| geo-nuts-administrative-boundaries | README.md | datapackage.json | scripts/process.py | geodata data package providing geojson polygons and shp for administrative European NUTS levels 1, 2 and 3 | | | true | Could not compute upstream max date from reachable sources; probe failures: https://datahub.io/core/geo-nuts-administartive-boundaries (HTTP 404); http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units (HTTP 404); http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data (HTTP 404) | done |
| giveth | README.md | not present | none | Giveth.io is a blockchain-based giving platform. This is about analysing how much money they have given out and when -- and how much they spend to run the platform. | | | true | Could not compute upstream max date from reachable sources; probe failures: https://blog.giveth.io/the-galactic-giving-round-results-are-here-c73c62abd2f8 (HTTP 403) | done |
| giving-data | README.md | not present | none | An open dataset mapping who gives what to whom in global philanthropy. | 2024-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| global-temp-anomalies | README.md | datapackage.json | scripts/process.py; Makefile | Data are sourced from Carbon Dioxide Information Analysis Center (CDIAC). Four different series are provided: Global Annual Temperature Anomalies (Land) 1880-2014, Global Annual Temperature Anomalies (Land and Ocean) 1880-2014, Hemispheric Temperature Anomalies (Land + Ocean) 1880-2014 and Annual Temperature Anomalies (Land + Ocean) for three latitude bands that cover 30%, 40% and 30% of the global area, respectively, 1900-2014. | 2015-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.ncdc.noaa.gov/ghcnm/ (probe error: The read operation timed out); http://www.ncdc.noaa.gov/oa/climate/research/ushcn/ (probe error: The read operation timed out); http://cdiac.ornl.gov/trends/temp/hansen/hansen.html#trends (network error: [Errno -2] Name or service not known) | done |
| glwd | README.md | datapackage.json | none | geodata data package providing geojson polygons and shapefiles for Global Lakes and Wetlands Database | | | true | Could not compute upstream max date from reachable sources; probe failures: https://worldwildlife.org/pages/global-lakes-and-wetlands-database (HTTP 403); http://worldwildlife.org/pages/global-lakes-and-wetlands-database (HTTP 403) | done |
| gold-prices | README.md | datapackage.json | scripts/process.py | Gold Prices | 2025-07-01 | | true | Could not compute upstream max date from reachable sources; probe failures: https://nma.org/wp-content/uploads/2016/09/historic_gold_prices_1833_pres.pdf (redirected to login) | done |
| harmonized-system | README.md | datapackage.json | scripts/convert.py | [View on datahub.io](https://datahub.io/core/harmonized-system) | 2022-12-31 | 2026-02-28 | true | Local latest 2022-12-31 is 1155 days behind upstream 2026-02-28 | done |
| hepatitis | README.md | datapackage.json | scripts/main.py | Hepatitis | | 1988-12-31 | true | No local date detected, while upstream has data through 1988-12-31 | done |
| historical-adoption-of-technology | README.md | datapackage.json | scripts/process.py; Makefile | Historical Adoption of Technology | 2016-06-13 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.nber.org/data/chat/FinalCHAT_72909.csv (HTTP 404) | done |
| household-income-us-historical | README.md | datapackage.json | household_us_flow.py | Households as of March of the following year. Income in current and 2016 CPI-U-RS adjusted dollars. | 2018-12-31 | 2019-09-10 | true | Local latest 2018-12-31 is 253 days behind upstream 2019-09-10 | done |
| house-prices-fr | README.md | not present | none | [View on datahub.io](https://datahub.io/core/house-prices-fr) | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| house-prices-global | README.md | datapackage.json | scripts/process.py | Contains data for 59 countries at a quarterly frequency (real series are the nominal price series deflated by the consumer price index), both in levels and in growth rates (i.e. four series per country). These indicators have been selected from the detailed data set to facilitate access for users and enhance comparability. The BIS has made the selection based on the Handbook on Residential Property Prices and the experience and metadata of central b | 1967-12-31 | 2027-12-31 | true | Local latest 1967-12-31 is 21915 days behind upstream 2027-12-31 | done |
| house-prices-uk | README.md | datapackage.json | scripts/process.py | House Prices in the UK since 1952 | 2025-05-01 | 2026-03-03 | true | Local latest 2025-05-01 is 306 days behind upstream 2026-03-03 | done |
| house-prices-us | README.md | datapackage.json | scripts/convert_to_final_data.py; scripts/data_fetch_and_process.py; Makefile | US House Price Index (Case-Shiller) | 2024-07-01 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.spindices.com/documents/methodologies/methodology-sp-cs-home-price-indices.pdf (HTTP 404); http://www.spindices.com/index-family/real-estate/sp-case-shiller (HTTP 403); https://api.stlouisfed.org/fred/series/observations?series_id= (HTTP 400) | done |
| ICC-Incoterms | README.md | datapackage.json | none | Current Incoterms used in international sale of goods as defined by the International Chamber of Commerce | 2020-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://opendatacommons.org/guide/#sthash.97PSVxmh.dpuf (HTTP 404); https://www.gov.uk/incoterms-international-commercial-terms/overview (HTTP 404) | done |
| imf-weo | README.md | datapackage.json | scripts/process.py | IMF World Economic Outlook Database | 2020-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.imf.org/external/ns/cs.aspx?id=29 (HTTP 403); http://www.imf.org/external/ns/cs.aspx?id=28 (HTTP 403); http://www.imf.org/external/pubs/ft/weo/2015/01/weodata/index.aspx (HTTP 403) | done |
| IMO-IMDG-Codes | README.md | datapackage.json | none | For the purposes of this Code, dangerous goods are classified in different classes, to subdivide a number of these classes and to define and describe characteristics and properties of the substances, material and articles which would fall within each class or division. General provisions for each class or division are given. Individual dangerous goods are listed in the Dangerous Goods List, with the class and any specific requirements. In accordan | 1978-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.imo.org/blast/mainframe.asp?topic_id=158 (HTTP 404); http://opendatacommons.org/guide/#sthash.97PSVxmh.dpuf (HTTP 404) | done |
| inflation | README.md | datapackage.json | scripts/process.py; Makefile | Annual inflation by GDP deflator and consumer prices | 2024-12-31 | 2026-02-24 | true | Local latest 2024-12-31 is 420 days behind upstream 2026-02-24 | done |
| interest-rates-gb | README.md | datapackage.json | scripts/convert.py; scripts/scrape.py | Bank of England Interest Rate | 2025-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| investor-flow-of-funds-us | README.md | datapackage.json | scripts/process.py | US Investor Flow of Funds into Investment Classes (Bonds, Equities etc) | 2024-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.ici.org/research/stats (HTTP 404); http://www.ici.org/info/flows_data_%s.xls (HTTP 400); https://www.ici.org/system/files/{year}-{month:02d}/etf_flows_data_{year}.xls (HTTP 404) | done |
| ISO-Container-Codes | README.md | datapackage.json | none | List of ISO 6346 Container Type Codes | 2025-12-31 | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| labs | README.md | not present | none | A repo for data wrangling experiments. | | | true | Could not infer upstream freshness date from source payloads; manual review needed | done |
| land-matrix | README.md | datapackage.json | scripts/combine_data.py; scripts/process.py; process.py | A dataset containing information about land deals, including target country, region, investors, crops, and deal size in hectares. | 2027-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://landmatrix.org/data-downloads/ (HTTP 404); https://landmatrix.org/api/legacy_export/ (probe error: The read operation timed out); http://datahub.io/dataset/land-matrix/resource/f46a9192-cf2b-410e-b4f7-dc80538e5541 (HTTP 404) | done |
| language-codes | README.md | datapackage.json | scripts/language-codes.sh | ISO Language Codes (639-1 and 639-2) and IETF Language Types | | 2025-10-25 | true | No local date detected, while upstream has data through 2025-10-25 | done |
| lme-large-marine-ecosystems | README.md | datapackage.json | none | geodata data package providing geojson polygons and shapefiles for LME (Large Marine Ecosystems) | 2013-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: http://www.lme.noaa.gov/index.php?option=com_content&view=article&id=177&Itemid=75 (network error: [Errno -2] Name or service not known) | done |
| london-air-quality | README.md | datapackage.json | none | Air quality | 2018-12-01 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); https://data.london.gov.uk/dataset/london-average-air-quality-levels (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-crime | README.md | datapackage.json | none | London Recorded crime rates | 2016-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); https://data.london.gov.uk/dataset/recorded_crime_rates (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-gva | README.md | datapackage.json | none | London Gross Value Added (GVA) | 2014-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-life-expectancy | README.md | datapackage.json | scripts/london-life-expectancy.py | London life expectancy | 2018-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://data.london.gov.uk/ (HTTP 403); https://data.london.gov.uk/dataset/life-expectancy-birth-and-age-65-borough (HTTP 403); http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/ (HTTP 403) | done |
| london-median-housing-affordability | README.md | datapackage.json | scripts/home_affordability.py | London home affordability | 2024-12-31 | | true | Could not compute upstream max date from reachable sources; probe failures: https://www.gov.uk/government/statistics/house-price-indexes-for-england-and-wales-quarterly-data-2016-to-present (HTTP 404) | done |