
Gera US Air Health Index — Methodology

A transparent, reproducible formula. Every value in the June 2026 release traces directly to the EPA AQS AirData pre-generated annual files for 2023, downloadable without a key from aqs.epa.gov.

The formula

pollution_score = 40·norm(PM2.5_mean) + 35·norm(ozone_mean_ppb) + 25·norm(days_over_100)

GUSAHI = 100 − pollution_score

where norm(x) = (x − min_x) / (max_x − min_x) across the 458 included counties

PM2.5 receives the highest weight (40%) as the air pollutant most strongly linked to long-term cardiovascular and respiratory mortality at annual concentrations. Ozone (35%) drives acute respiratory episodes and contributes to chronic lung disease. Days over AQI 100 (25%) captures the frequency of acute pollution events that affect the most sensitive groups.

Normalisation ranges (June 2026)

Min-max normalisation ranges across 458 counties — EPA AQS 2023
Metric               | Weight | Min (cleanest) | Max (most polluted)
Annual mean PM2.5    | 40%    | 2.3 µg/m³      | 15.7 µg/m³
Mean 8-hour ozone    | 35%    | 21.5 ppb       | 51.6 ppb
Days with AQI > 100  | 25%    | 0 days         | 118 days
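
As a rough illustration of how the formula and these ranges combine, the sketch below hard-codes the weights and the min/max values from the table above. The names RANGES, WEIGHTS, norm and gusahi are illustrative choices, not identifiers from any published Gera code, and the ranges are the rounded table values, so results can differ from published scores in the last decimal.

```python
# Normalisation ranges (min, max) and weights copied from the table above.
RANGES = {
    "pm25": (2.3, 15.7),            # annual mean PM2.5, µg/m³
    "ozone_ppb": (21.5, 51.6),      # mean 8-hour ozone, ppb
    "days_over_100": (0.0, 118.0),  # days with AQI > 100
}
WEIGHTS = {"pm25": 40.0, "ozone_ppb": 35.0, "days_over_100": 25.0}


def norm(value, lo, hi):
    """Min-max normalisation: 0 at the cleanest county, 1 at the most polluted."""
    return (value - lo) / (hi - lo)


def gusahi(pm25, ozone_ppb, days_over_100):
    """Return 100 minus the weighted pollution score, rounded to one decimal."""
    values = {"pm25": pm25, "ozone_ppb": ozone_ppb, "days_over_100": days_over_100}
    pollution_score = sum(
        WEIGHTS[metric] * norm(values[metric], *RANGES[metric]) for metric in WEIGHTS
    )
    return round(100.0 - pollution_score, 1)


# A county at the minimum of every range scores 100; one at every maximum scores 0.
print(gusahi(2.3, 21.5, 0))
print(gusahi(15.7, 51.6, 118))
```

Because the score is relative to the min and max of the included counties, the same raw concentrations would yield a different GUSAHI if the set of counties, and therefore the ranges, changed.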

Reproduce it yourself — step by step

  1. Download the EPA AQS AirData pre-generated files (no API key required)

    Fetch two files from https://aqs.epa.gov/aqsweb/airdata/: (1) annual_aqi_by_county_2023.zip — provides "Unhealthy for Sensitive Groups Days", "Unhealthy Days", "Very Unhealthy Days", and "Hazardous Days" per county; (2) annual_conc_by_monitor_2023.zip — provides per-monitor arithmetic means for PM2.5 and ozone. Both are US federal government works downloadable without a key.

  2. Compute per-county PM2.5 annual mean

    From annual_conc_by_monitor_2023.csv, filter to rows where Parameter Name = "PM2.5 - Local Conditions", Sample Duration = "24-HR BLK AVG", and Pollutant Standard = "PM25 Annual 2012". Average the Arithmetic Mean across all qualifying monitors within each (State Name, County Name) pair. Unit: µg/m³ (LC).

  3. Compute per-county 8-hour ozone annual mean

    From annual_conc_by_monitor_2023.csv, filter to rows where Parameter Name = "Ozone", Sample Duration = "8-HR RUN AVG BEGIN HOUR", and Pollutant Standard = "Ozone 8-hour 2015". Average the Arithmetic Mean (reported in ppm) across all qualifying monitors per county, then multiply by 1000 to convert to ppb.

  4. Compute per-county days with AQI > 100

    From annual_aqi_by_county_2023.csv, sum the four columns: "Unhealthy for Sensitive Groups Days" + "Unhealthy Days" + "Very Unhealthy Days" + "Hazardous Days". This gives the total number of days in 2023 where the AQI exceeded 100.

  5. Join and exclude counties missing any metric

    Join on (State, County). Counties where any of the three metrics (PM2.5 mean, ozone mean, days > 100) is absent — because the county has no qualifying monitors — are excluded entirely. Never impute. Monitors from "Country Of Mexico" are also excluded. This results in 458 valid county rows.

  6. Min-max normalise each metric across the included counties

    For each metric m in {PM2.5_mean, ozone_mean_ppb, days_over_100}, compute: norm_m = (value_m − min_m) / (max_m − min_m). This rescales each metric to [0, 1] across the 458 counties, where 0 = lowest pollution / best value and 1 = highest pollution / worst value. The normalisation ranges used in the June 2026 release are shown in the normalisation ranges table earlier on this page.

  7. Compute the weighted pollution score and GUSAHI

    pollution_score = 40 × norm(PM2.5_mean) + 35 × norm(ozone_mean_ppb) + 25 × norm(days_over_100). GUSAHI = 100 − pollution_score. PM2.5 carries the highest weight (40%) as the primary driver of long-term respiratory and cardiovascular harm at annual concentrations. Ozone (35%) drives acute respiratory effects and peak-day events. Days over AQI 100 (25%) captures acute-episode frequency. The result is in [0, 100]; higher = cleaner air. Round to one decimal place.

  8. Assign national ranks

    Sort all 458 counties by GUSAHI descending. Rank 1 = cleanest air; rank 458 = worst air. Ties are broken alphabetically by state then county name.
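
The steps above (from filtering through ranking) can be sketched in plain Python, assuming the two CSVs have already been read into lists of dictionaries, for example with csv.DictReader. This is a minimal sketch, not Gera's production code: the function names county_means and build_index are hypothetical, the "State" and "County" column names for the AQI file follow the join key named in step 5, and ranking on the rounded GUSAHI value is an assumption about how ties arise.

```python
from collections import defaultdict

WEIGHTS = {"pm25": 40.0, "ozone_ppb": 35.0, "days_over_100": 25.0}
DAY_COLUMNS = [
    "Unhealthy for Sensitive Groups Days",
    "Unhealthy Days",
    "Very Unhealthy Days",
    "Hazardous Days",
]
# (Parameter Name, Sample Duration, Pollutant Standard) -> (metric, unit scale)
FILTERS = {
    ("PM2.5 - Local Conditions", "24-HR BLK AVG", "PM25 Annual 2012"): ("pm25", 1.0),
    ("Ozone", "8-HR RUN AVG BEGIN HOUR", "Ozone 8-hour 2015"): ("ozone_ppb", 1000.0),
}


def county_means(monitor_rows):
    """Steps 2-3: average qualifying monitors per (state, county) and metric."""
    samples = defaultdict(list)
    for row in monitor_rows:
        key = (row["Parameter Name"], row["Sample Duration"], row["Pollutant Standard"])
        if key not in FILTERS or row["State Name"] == "Country Of Mexico":
            continue
        metric, scale = FILTERS[key]  # ozone is scaled from ppm to ppb
        county = (row["State Name"], row["County Name"])
        samples[(county, metric)].append(float(row["Arithmetic Mean"]) * scale)
    return {key: sum(vals) / len(vals) for key, vals in samples.items()}


def build_index(monitor_rows, aqi_rows):
    """Steps 4-8: join, drop incomplete counties, normalise, score, rank."""
    means = county_means(monitor_rows)
    table = {}
    for row in aqi_rows:
        county = (row["State"], row["County"])
        pm25 = means.get((county, "pm25"))
        ozone = means.get((county, "ozone_ppb"))
        if pm25 is None or ozone is None:
            continue  # step 5: never impute, exclude the county entirely
        days = sum(int(row[col]) for col in DAY_COLUMNS)  # step 4
        table[county] = {"pm25": pm25, "ozone_ppb": ozone, "days_over_100": days}
    if not table:
        return []

    ranges = {}
    for metric in WEIGHTS:  # step 6: min and max over the included counties
        column = [metrics[metric] for metrics in table.values()]
        ranges[metric] = (min(column), max(column))

    scored = []
    for county, metrics in table.items():  # step 7
        pollution = 0.0
        for metric, weight in WEIGHTS.items():
            lo, hi = ranges[metric]
            if hi > lo:  # guard against a zero-width range
                pollution += weight * (metrics[metric] - lo) / (hi - lo)
        scored.append((county, round(100.0 - pollution, 1)))

    # Step 8: GUSAHI descending, ties broken by state then county name.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [(rank, state, county, score)
            for rank, ((state, county), score) in enumerate(scored, start=1)]
```

Keying every intermediate result on the (state, county) pair keeps the join in step 5 a simple dictionary lookup and makes the tie-break in step 8 fall out of ordinary tuple ordering.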

Validation examples (June 2026)

Sample counties — real EPA values + GUSAHI (June 2026)
County          | State | PM2.5       | Ozone    | Days > 100 | GUSAHI
Bayamon         | PR    | 4.0 µg/m³   | 21.5 ppb | 0          | 94.8 / 100
Honolulu        | HI    | 4.3 µg/m³   | 26.6 ppb | 1          | 87.9 / 100
Skagit          | WA    | 5.3 µg/m³   | 27.3 ppb | 0          | 84.2 / 100
Riverside       | CA    | 9.4 µg/m³   | 50.4 ppb | 116        | 20.5 / 100
San Bernardino  | CA    | 8.7 µg/m³   | 51.3 ppb | 118        | 21.1 / 100
Tulare          | CA    | 10.9 µg/m³  | 51.0 ppb | 82         | 22.6 / 100
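
Recomputing a score from the displayed values can land about 0.1 away from the published GUSAHI (Bayamon works out to roughly 94.9 from the table figures, against 94.8 published). One plausible explanation, stated here as an assumption rather than a confirmed detail of the pipeline, is that the table shows PM2.5 and ozone rounded to one decimal while the index uses unrounded means. The sketch below bounds the GUSAHI that is consistent with each displayed row under that assumption; gusahi_bounds and HALF_STEP are illustrative names, and the normalisation ranges are themselves the rounded published values.

```python
RANGES = {
    "pm25": (2.3, 15.7),
    "ozone_ppb": (21.5, 51.6),
    "days_over_100": (0.0, 118.0),
}
WEIGHTS = {"pm25": 40.0, "ozone_ppb": 35.0, "days_over_100": 25.0}
# Assumed display rounding: one decimal for concentrations, exact day counts.
HALF_STEP = {"pm25": 0.05, "ozone_ppb": 0.05, "days_over_100": 0.0}


def pollution_score(values):
    """Weighted min-max score, clipping inputs to the published ranges."""
    total = 0.0
    for metric, weight in WEIGHTS.items():
        lo, hi = RANGES[metric]
        clipped = min(max(values[metric], lo), hi)
        total += weight * (clipped - lo) / (hi - lo)
    return total


def gusahi_bounds(pm25, ozone_ppb, days_over_100):
    """Return (lowest, highest) GUSAHI consistent with the displayed inputs."""
    shown = {"pm25": pm25, "ozone_ppb": ozone_ppb, "days_over_100": days_over_100}
    dirtiest = pollution_score({m: v + HALF_STEP[m] for m, v in shown.items()})
    cleanest = pollution_score({m: v - HALF_STEP[m] for m, v in shown.items()})
    return 100.0 - dirtiest, 100.0 - cleanest


# (PM2.5, ozone ppb, days > 100, published GUSAHI) from the table above.
SAMPLES = {
    "Bayamon, PR": (4.0, 21.5, 0, 94.8),
    "Honolulu, HI": (4.3, 26.6, 1, 87.9),
    "Skagit, WA": (5.3, 27.3, 0, 84.2),
    "Riverside, CA": (9.4, 50.4, 116, 20.5),
    "San Bernardino, CA": (8.7, 51.3, 118, 21.1),
    "Tulare, CA": (10.9, 51.0, 82, 22.6),
}

for name, (pm25, ozone, days, published) in SAMPLES.items():
    low, high = gusahi_bounds(pm25, ozone, days)
    print(f"{name}: published {published}, consistent range {low:.2f} to {high:.2f}")
```

If a published value fell outside its range, that would point to something other than display rounding, such as a different monitor filter or a different set of included counties.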

Data source and licence

All underlying data is published by the U.S. Environmental Protection Agency (EPA) through the AQS AirData pre-generated files (the 2023 annual files, as used in the June 2026 release). These are U.S. federal government works in the public domain under 17 U.S.C. § 105. No API key, registration, or fee is required to download them. Gera does not claim any copyright over the derived index values; the formula and data are published here so any reader can reproduce them.

The GUSAHI is distinct from the EPA Air Quality Index (AQI): the AQI is a daily index whose reported value reflects whichever pollutant scores worst that day; the GUSAHI is an annual composite health-context score combining PM2.5, ozone, and acute-episode frequency across an entire year, normalised relative to other US counties.

Frequently asked questions

Why does GUSAHI invert the score (higher = cleaner)?
The choice to make higher = cleaner is intentional: users searching for "good air quality" areas expect a higher score to mean better conditions, consistent with star ratings, health scores, and livability indices. The raw pollution_score (higher = dirtier) is an intermediate calculation; GUSAHI = 100 − pollution_score is the published, user-facing number. Both are derived identically from EPA data.
Why use PM25 Annual 2012 standard for PM2.5?
The PM25 Annual 2012 standard (12 µg/m³ annual mean) was the regulatory standard in force throughout 2023. The 2024 NAAQS revision tightened it to 9 µg/m³, but the 2023 monitoring data was collected under the 2012 standard. Filtering on the 2012 standard captures the broadest set of monitors with complete annual data. We compare values to the 9 µg/m³ (2024) standard in the commentary.
Are all US counties included?
No. The June 2026 GUSAHI covers 458 US counties that have qualifying EPA AQS monitors for all three metrics. Many rural counties and some states have limited or no AQS monitoring. Counties with any missing metric are excluded entirely — never estimated — and shown as "insufficient data" in the interface.
How often is the GUSAHI updated?
EPA AQS publishes new annual summaries each year. Gera re-computes and re-dates the GUSAHI each time a new EPA annual dataset is available (typically mid-year, for the prior calendar year). The current version is based on 2023 data and was last updated 2026-06-20.
Can I reproduce the GUSAHI myself?
Yes. Download the two EPA AQS AirData ZIPs from aqs.epa.gov/aqsweb/airdata/ (no key required). Follow the steps on this page: filter concentration data to the right parameter and standard, average across monitors per county, join with the AQI day-count file, exclude counties with missing values, apply min-max normalisation, compute the weighted sum, and subtract from 100. The formula is deterministic — you will arrive at the same GUSAHI values published here.
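
For the download step, a minimal sketch using only the Python standard library might look like the following. It assumes each ZIP sits directly under the AirData base URL with the file names given in step 1; file_url and download_and_extract are hypothetical helper names, and the "airdata" output directory is an arbitrary choice.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

BASE_URL = "https://aqs.epa.gov/aqsweb/airdata/"
FILES = ["annual_aqi_by_county_2023.zip", "annual_conc_by_monitor_2023.zip"]


def file_url(name):
    """Build the download URL, assuming files live directly under BASE_URL."""
    return BASE_URL + name


def download_and_extract(name, dest="airdata"):
    """Fetch one ZIP into memory, unpack it into dest, and list its members."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(file_url(name)) as response:
        payload = response.read()
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Example usage (performs network requests, so it is left commented out):
# for name in FILES:
#     print(download_and_extract(name))
```

The extracted CSVs can then be read with csv.DictReader and passed through the filtering, joining, and scoring steps described on this page.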


Contains public sector information published by the U.S. Environmental Protection Agency (EPA), which is in the U.S. public domain (federal government work, 17 U.S.C. § 105). Source: EPA AQS AirData, Annual AQI by County + Annual Concentration by Monitor, 2023 data (released 2024), as used in the June 2026 GUSAHI release.

Informational and educational only. Not a substitute for professional medical advice; consult a clinician to interpret these results for your own health.