{"reviews":[{"accuracy":"Partially Accurate","accuracy_score":0.68,"actual_outcome":"Disruption cost index +4.1% vs baseline (+28% error); heat exceedances +70% (lower in South Asia due to adaptation investment, higher in Sub-Saharan Africa)","actual_units":"%","actual_value":4.1,"benchmark_error_pct":59,"benchmark_source":"IPCC AR5 BAU central projection","benchmark_value":5.1,"ce_predicted":3.2,"ce_predicted_units":"%","error_direction":"under","error_pct":28,"evaluation_date":"2024-01-15","forecast_date":"2014-03-01","forecast_window":"2014 \u2192 2024","id":"FR-001","metric":"Physical disruption cost index (%)","model_change_made":"Regional transmission coefficients disaggregated to 6 climate zones; adaptation pathways added as endogenous input variable in v2.0","model_family":"Climate","model_version":"CE v1.2 (CMIP5 engine)","outcome":"Observed disruption increased, but regional volatility remained more uneven than the original scenario implied.","prediction":"Annualized physical disruption cost index +3.2% above 2010\u20132014 baseline; heat threshold exceedances +85% in tropical manufacturing zones","reason_for_miss":"Regional heterogeneity of adaptation investment offset projected heat losses in South Asia; Sub-Saharan Africa underestimated","summary":"The model directionally captured hotter operating conditions and more frequent disruption, but overstated the uniformity of regional stress.","title":"CMIP5 high-emissions manufacturing stress case","why":"Use ensembles and regional transmission layers rather than reading a global scenario as an asset-level forecast.","why_accurate":"The physical direction of travel was right: heat and disruption pressure rose consistently with CMIP5 RCP8.5 projections.","why_inaccurate":"The scenario compressed regional heterogeneity \u2014 Vietnam, Bangladesh, and Thai manufacturing zones invested in passive cooling and water recycling at rates not captured by the global hazard 
model."},{"accuracy":"Inaccurate","accuracy_score":0.18,"actual_outcome":"Global growth \u22123.1% in 2020 (WB actuals); recovery +5.9% in 2021 (sharp rebound distorted by $12T in global fiscal stimulus)","actual_units":"%","actual_value":-3.1,"benchmark_error_pct":209,"benchmark_source":"IMF WEO October 2019 central","benchmark_value":3.4,"ce_predicted":2.3,"ce_predicted_units":"%","error_direction":"over","error_pct":235,"evaluation_date":"2022-06-01","forecast_date":"2019-10-01","forecast_window":"2019 \u2192 2021","id":"FR-002","metric":"Global GDP growth rate 2020 (%)","model_change_made":"Structural break detection added in v2.5; pandemic tail scenario introduced; policy overhang persistence parameter added (calibrated to BIS 2022 post-pandemic study)","model_family":"Economic","model_version":"CE v2.1 (IMF-anchored)","outcome":"Growth collapsed then rebounded sharply while inflation and logistics shocks persisted longer than expected \u2014 a compound shock with policy overhang that no pre-2020 model anticipated.","prediction":"Global growth +2.3% in 2020; sector recovery +1.8% in 2021. No compound-shock scenario included.","reason_for_miss":"No pandemic risk pathway existed in the tail scenario library; compound structural-break logic was absent","summary":"Pre-shock macro baselines missed the scale and speed of pandemic-era distortion. This was an acknowledged out-of-sample event \u2014 no structural break was modeled.","title":"Early-pandemic macro baseline miss","why":"Stress regimes must be explicit and not treated as tails that can be ignored. Compound shock frameworks are essential for institutional use.","why_accurate":"Policy-sensitive sub-models adapted faster once the shock path was made explicit as a scenario input in H2 2020.","why_inaccurate":"Baseline assumptions were not built for compound global disruption and policy overhang.
A pandemic was not in the tail risk library."},{"accuracy":"Partially Accurate (wrong direction \u2014 technology faster than assumed)","accuracy_score":0.51,"actual_outcome":"Utility-scale solar LCOE ~$25\u201330/MWh by 2025 (IRENA 2024 actuals). Technology beat projection by 40\u201350% \u2014 a favourable miss but a miss nonetheless.","actual_units":"$/MWh","actual_value":27,"benchmark_error_pct":122,"benchmark_source":"IEA WEO 2016 Stated Policies Scenario","benchmark_value":60,"ce_predicted":50,"ce_predicted_units":"$/MWh","error_direction":"over","error_pct":85,"evaluation_date":"2025-01-15","forecast_date":"2016-06-01","forecast_window":"2016 \u2192 2025","id":"FR-003","metric":"Utility-scale solar LCOE ($/MWh, 2025)","model_change_made":"Technology learning rates recalibrated from BNEF and IRENA actuals in v3.6; Wright's Law engine upgraded with annual back-cast validation gate in v3.7","model_family":"Energy / Integrated Assessment","model_version":"CE v2.3 (Wright's Law engine v1)","outcome":"Solar LCOE fell faster than any 2016-era projection. Wright's Law learning rates for solar PV have been empirically ~40% since 2010, vs the 25% assumed in CE v2.3.","prediction":"Utility-scale solar LCOE below $50/MWh by 2025 (62% probability); assumed 25% Wright's Law learning rate","reason_for_miss":"Assumed solar learning rate (25%) was significantly below the empirical rate (~40%); supply chain investment and policy incentives were not endogenous to the model","summary":"Long-run transition direction was right; the model underestimated the pace of solar cost decline by roughly 100%. This was a favourable forecast miss \u2014 technology beat projections.","title":"Solar LCOE trajectory vs legacy IAM projection","why":"Do not confuse long-run welfare logic with operating-sequence realism.
Technology learning rates must be empirically grounded and regularly recalibrated.","why_accurate":"Strategic direction of technology substitution and cost-curve trend held correctly.","why_inaccurate":"Wright's Law learning rate for solar was underestimated (40% actual vs 25% assumed). Short-run manufacturing scale-up and policy incentives (IRA, EU Solar Manifesto) not captured."},{"accuracy":"Accurate","accuracy_score":0.87,"actual_outcome":"Florida: 35% fewer admitted carriers 2018\u20132025; Louisiana: 40% premium increase with 5 major insurer exits (Citizens exposure +$400B). Directionally within predicted range.","actual_units":"% reduction","actual_value":35,"benchmark_error_pct":30,"benchmark_source":"Swiss Re sigma 2018 NatCat outlook","benchmark_value":27,"ce_predicted":35,"ce_predicted_units":"% reduction","error_direction":"none","error_pct":8,"evaluation_date":"2025-03-01","forecast_date":"2018-09-01","forecast_window":"2018 \u2192 2025","id":"FR-004","metric":"Insurance carrier count reduction (Florida, 2018\u20132025, %)","model_change_made":"Regulatory intervention delay parameter added in v3.5; State-level insurance regulatory friction scenarios added; Note: CAT model is prototype-grade \u2014 not actuarially certified","model_family":"Climate-Economy","model_version":"CE v3.0 (CAT model v1)","outcome":"Premium pressure, carrier withdrawal, and availability constraints all intensified in exposed regions at rates consistent with the scenario. 
Regulatory friction temporarily offset some contraction in 2019\u20132021.","prediction":"Atlantic coastal insurance availability contraction 30\u201340% by 2025; primary premium increases 35\u201355% in highest-exposure zones","reason_for_miss":"Regulatory friction (Citizens expansion, AOB reform timing) not captured; delayed rather than avoided the predicted outcome","summary":"Forecasts linking hazard concentration to insurance repricing and capital pressure were directionally strong and within the predicted range for primary markets.","title":"Atlantic coastal insurance market contraction","why":"Industry transmission analysis (linking hazard to financial outcomes) is the difference between generic risk mapping and actionable forecasting.","why_accurate":"The model linked climate exposure to financial transmission channels (carrier capital adequacy, reinsurance pricing) rather than stopping at hazard maps.","why_inaccurate":"Florida assignment-of-benefits reform (2023) and Citizens Property Insurance expansion temporarily buffered private market exit pace 2020\u20132022, creating a 2-year lag vs projection."}]}
