| A Study on the Comparison of Deep Learning-Based Imputations for Green Algae and Water Quality Data |
|---|
|
Journal: 위기관리 이론과 실천
Authors: 이승연, 김성훈, 이충성, 류제완
Publication date: 2025-02-28
|
|
This study compared several imputation techniques and assessed how well they handle missing values in green algae and water quality data. Using data from the Daecheong Dam area, a total of 83 weekly datasets from April 2004 to December 2023 were collected and analyzed, covering the key determinants of green algae blooms: cyanobacteria cell count, chlorophyll-a concentration, water temperature, and total phosphorus. Missing values were artificially introduced into each key variable for periods of 2, 4, and 8 weeks and then imputed using linear interpolation, k-nearest neighbors (kNN), BRITS, and NAOMI. Performance evaluation based on root mean square error (RMSE) and mean absolute percentage error (MAPE) showed that the best imputation method depended on the characteristics of each variable and the length of the missing period. For cyanobacteria cell count and chlorophyll-a, kNN consistently performed best, whereas for variables with relatively low variability or distinct linear patterns, such as water temperature and total phosphorus, linear interpolation was the most effective. These results underscore the importance of choosing an imputation technique that matches the characteristics of the data when addressing missing values. |
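
The evaluation protocol described in the abstract (mask a known stretch of values, impute it, then score the imputed values against the withheld truth) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the other variables as kNN features, the value of `k`, and the toy series are all assumptions made for illustration, and BRITS and NAOMI are omitted because they require trained deep learning models.

```python
import math

def mask_gap(series, start, length):
    """Copy the series and hide `length` consecutive values starting at `start`."""
    out = list(series)
    for i in range(start, start + length):
        out[i] = None
    return out

def linear_interpolate(series):
    """Fill each interior run of None with a straight line between its observed neighbours."""
    out = list(series)
    n, i = len(out), 0
    while i < n:
        if out[i] is None:
            j = i
            while j < n and out[j] is None:
                j += 1
            if i == 0 or j == n:
                raise ValueError("gap touches the series boundary")
            left, right, span = out[i - 1], out[j], j - (i - 1)
            for k in range(i, j):
                out[k] = left + (right - left) * (k - (i - 1)) / span
            i = j
        else:
            i += 1
    return out

def knn_impute(target, features, k=3):
    """Fill None entries with the mean target of the k observed rows
    whose feature vectors are closest (Euclidean distance)."""
    observed = [i for i, v in enumerate(target) if v is not None]
    out = list(target)
    for i, v in enumerate(target):
        if v is None:
            nearest = sorted(observed, key=lambda j: math.dist(features[i], features[j]))[:k]
            out[i] = sum(target[j] for j in nearest) / len(nearest)
    return out

def rmse(truth, pred, idx):
    """Root mean square error over the masked positions only."""
    return math.sqrt(sum((truth[i] - pred[i]) ** 2 for i in idx) / len(idx))

def mape(truth, pred, idx):
    """Mean absolute percentage error (in percent); assumes no true value is zero."""
    return 100.0 * sum(abs((truth[i] - pred[i]) / truth[i]) for i in idx) / len(idx)

# Toy weekly values, invented for illustration (not data from the study).
water_temp = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
chl_a = [5.0, 6.5, 9.0, 14.0, 13.0, 21.0, 19.5, 25.0]

gap = range(3, 5)  # a 2-week artificial gap
masked = mask_gap(chl_a, gap.start, len(gap))

by_line = linear_interpolate(masked)
by_knn = knn_impute(masked, [[t] for t in water_temp], k=2)

print("linear RMSE/MAPE:", rmse(chl_a, by_line, gap), mape(chl_a, by_line, gap))
print("kNN    RMSE/MAPE:", rmse(chl_a, by_knn, gap), mape(chl_a, by_knn, gap))
```

Scoring only the masked positions keeps the comparison fair, since every method reproduces the observed values exactly. Repeating the run with gap lengths of 2, 4, and 8 weeks for each variable would mirror the comparison grid the abstract describes.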