Short-term forecasting of cyanobacterial bloom area in Lake Chaohu based on temporal cumulative effects and random forest
CSTR:
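Authors:
Affiliations:

1. Institute of Lake Ecological Environment, Anhui Provincial Lake Chaohu Administration; 2. College of Urban and Environmental Sciences, Northwest University; 3. State Key Laboratory of Lake and Watershed Science for Water Security, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences

Author biography:

Corresponding author:

CLC number:

Funding: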


Integrating temporal cumulative effects into the random forest model for short-term forecasting of cyanobacterial bloom area in Lake Chaohu
Author:
Affiliation:

College of Urban and Environmental Sciences, Northwest University

Fund Project:

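    Abstract (Chinese version):

    Cyanobacterial blooms have become a global environmental problem threatening lake ecological security and drinking water safety; timely and accurate forecasting of bloom outbreaks helps managers take countermeasures in advance and reduce disaster risk. To address the application limits of traditional mechanistic models, which have many parameters and are computationally complex, this study takes Lake Chaohu as its study area and builds a machine learning forecasting framework that combines monitoring data with remote sensing information. By integrating multi-station meteorological and water quality monitoring data with satellite remote sensing data, it analyzes the temporal cumulative effects of meteorological and water quality variables on cyanobacterial blooms. On this basis, using the random forest model, it builds a forecasting model that accounts for the temporal cumulative effects of the variables (cumulative-variable model) and one that uses only single-day observations (single-day-variable model), producing 1~7 d forecasts of bloom area. Finally, the game-theory-based interpretability algorithm SHapley Additive exPlanations (SHAP) is introduced to reveal the contributions of the main influencing factors and their nonlinear threshold patterns. The results show that: (1) the cumulative effect period of meteorological factors (air temperature, humidity, rainfall, air pressure) is about 15~30 d, longer than that of water quality factors (nitrogen, phosphorus, dissolved oxygen; 1~10 d); (2) the cumulative-variable model forecasts more accurately (coefficient of determination R² = 0.66~0.75) than the single-day-variable model (R² = 0.56~0.63), with the 1 d forecast performing best (R² = 0.75, root mean square error (RMSE) = 49.37 km²); (3) the key threshold conditions include mean air temperature > 23 ℃, maximum wind speed < 4 m/s, rainfall > 200 mm, nitrogen-to-phosphorus ratio < 15, pH > 8.5, and dissolved oxygen < 8.9 mg/L. The proposed method needs only routine monitoring data to forecast cyanobacterial blooms in the short term, offering a transferable technical route and decision support for managing eutrophic lakes.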
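
    The abstract does not state how the cumulative effect period of each variable was determined. As a minimal sketch of one plausible approach, the block below computes trailing-window means of a driver variable and selects the window length whose mean correlates most strongly with bloom area. The function names `cumulative_mean` and `best_window`, the use of Pearson correlation as the selection criterion, and the synthetic data are all assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def cumulative_mean(series, window):
    """Trailing mean over the current day and the (window - 1) days before it."""
    values = np.asarray(series, dtype=float)
    out = np.full(values.shape, np.nan)  # first window-1 days lack a full window
    csum = np.concatenate(([0.0], np.cumsum(values)))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def best_window(driver, bloom_area, windows):
    """Pick the window whose trailing mean has the largest |Pearson r| with bloom area."""
    bloom_area = np.asarray(bloom_area, dtype=float)
    scores = {}
    for w in windows:
        feature = cumulative_mean(driver, w)
        ok = ~np.isnan(feature) & ~np.isnan(bloom_area)  # drop days without data
        scores[w] = abs(np.corrcoef(feature[ok], bloom_area[ok])[0, 1])
    return max(scores, key=scores.get), scores

# Synthetic demonstration only: the "bloom" series is constructed to respond
# to the 20-day trailing mean of a random "air temperature" series.
rng = np.random.default_rng(0)
air_temp = rng.normal(size=1000)
bloom = cumulative_mean(air_temp, 20) + rng.normal(scale=0.01, size=1000)

window, scores = best_window(air_temp, bloom, range(1, 31))
print("selected cumulative window (days):", window)
```

    On real monitoring data the correlation curve is usually flatter than in this constructed case, so the selected window is better read as an approximate period than as a precise value.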

    Abstract:

    Cyanobacterial blooms have emerged as a global environmental challenge threatening lake ecosystem security and drinking water safety. Timely prediction of bloom outbreaks is critical for implementing preventive measures and reducing disaster risks. To overcome the limitations of conventional mechanism-driven models, including their numerous parameters and computational complexity, this study established a machine learning framework that integrates multi-source monitoring data and remote sensing observations for Lake Chaohu. By integrating multi-site meteorological and water quality measurements with satellite-derived time-series data, we investigated the temporal cumulative effects of meteorological and water quality variables on cyanobacterial blooms. Based on the Random Forest (RF) model, two forecasting models were developed: one considering the temporal cumulative effects of variables (cumulative-variable model) and the other using only single-day observations (single-day-variable model), to produce 1–7 day (d) forecasts of bloom coverage area. Additionally, SHapley Additive exPlanations (SHAP) analysis was applied to decode the model's decision-making mechanisms, revealing feature contributions and nonlinear threshold behaviors.
    The results showed that: (1) Meteorological variables (air temperature, humidity, precipitation, and air pressure) exhibited longer cumulative effect durations (15~30 days) than water quality variables (nitrogen, phosphorus, and dissolved oxygen; 1~10 days); (2) Cumulative-variable models demonstrated superior predictive accuracy (R² = 0.7~0.8) over single-day-variable models (R² = 0.4~0.6), with the 1-day-ahead forecast performing best (R² = 0.79, RMSE = 35.36 km²); (3) Critical thresholds were identified at average temperature above approximately 23 °C, maximum wind speed below approximately 4 m/s, precipitation above approximately 200 mm, nitrogen-to-phosphorus ratio below approximately 15, pH above 8.5, and dissolved oxygen below approximately 8.9 mg/L. The proposed method enables high-precision short-term forecasting using multi-station monitoring data, holding promise for providing a transferable decision support framework for eutrophic lake management.
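
    The 1–7 day forecasting setup described above can be sketched as follows, under stated assumptions: one random forest is trained per lead time, features on day t are paired with bloom area on day t + lead, and the split is chronological so that evaluation uses only later days. The abstract does not say whether the authors trained one model per lead time or how they split the data, so this design, the helper names `ar1` and `fit_lead_time_model`, the hyperparameters, and the synthetic data are all illustrative. The scores printed here have no relation to the values reported in the abstract, and the SHAP step is not sketched.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

def ar1(n, phi, rng):
    """Autocorrelated synthetic driver, so future values are partly predictable."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

def fit_lead_time_model(X, y, lead, train_frac=0.8):
    """Fit a random forest mapping features on day t to bloom area on day t + lead."""
    X_now, y_future = X[:-lead], y[lead:]
    split = int(len(X_now) * train_frac)  # chronological split, no shuffling
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_now[:split], y_future[:split])
    pred = model.predict(X_now[split:])
    r2 = r2_score(y_future[split:], pred)
    rmse = float(np.sqrt(mean_squared_error(y_future[split:], pred)))
    return model, r2, rmse

# Synthetic stand-ins for cumulative-window features and bloom area (km^2).
rng = np.random.default_rng(42)
n_days = 600
X = np.column_stack([ar1(n_days, 0.9, rng), ar1(n_days, 0.9, rng)])
area = np.clip(200 + 10 * X[:, 0] - 6 * X[:, 1]
               + rng.normal(scale=5, size=n_days), 0, None)

results = {}
for lead in range(1, 8):
    _, r2, rmse = fit_lead_time_model(X, area, lead)
    results[lead] = (r2, rmse)
    print(f"lead {lead} d: R2 = {r2:.2f}, RMSE = {rmse:.2f} km2")
```

    Training a separate model per lead time keeps each forecast a direct mapping from current observations, at the cost of fitting seven models instead of one.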

History
  • Received: 2025-04-07
  • Last revised: 2026-04-29
  • Accepted: 2025-10-29
  • Published online: 2025-12-18
  • Publication date:
Copyright: Journal of Lake Sciences (《湖泊科学》), Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences. All Rights Reserved