AI and ML Solutions Driving Modern Farming and Urban Innovation

A Water Quality Prediction and Assessment Model using Machine Learning Classifiers

Author(s): M. Vaishali*, M. Monika, M. Atish and Lalitha

Pp: 153-168 (16)

DOI: 10.2174/9798898812102125030014


Abstract

Access to safe, clean water is vital to the health of humans and all other species, and it is crucial for environmental sustainability. With the emergence of machine learning, predictive models can contribute significantly to assessing and managing water quality. This chapter proposes a methodology that predicts water quality by applying several machine learning classifiers to a dataset of diverse parameters, such as pH, dissolved oxygen, turbidity, and pollutant levels, collected from multiple water sources. The data were first preprocessed to remove missing values and outliers. Feature engineering was then used to identify the parameters most relevant to water quality. Several popular classifiers, including Random Forest, Support Vector Machines, Decision Trees, and XGBoost, were evaluated and compared on their performance in predicting water quality. The trained models were validated and tested using cross-validation to assess their generalizability and robustness. The findings indicate that the proposed method forecasts water quality levels accurately. XGBoost, in particular, showed superior performance, with high accuracy and minimal overfitting. Feature importance analysis also revealed the key factors influencing water quality, providing valuable insights for policymakers and environmentalists.


Keywords: Cross-validation, Machine learning classifiers, ROC curve, Water quality assessment.
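
The preprocessing and cross-validation steps named in the abstract can be illustrated with a minimal, standard-library sketch. The abstract does not state which outlier rule or fold scheme the authors used, so the 1.5 × IQR fence, the contiguous (unshuffled) folds, the function names, and the toy readings below are all assumptions made purely for illustration, not the chapter's actual method or data.

```python
import statistics

def drop_missing(rows):
    """Keep only rows in which every value is present (not None)."""
    return [r for r in rows if all(v is not None for v in r.values())]

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outliers(rows, columns, k=1.5):
    """Drop rows whose value in any listed column lies outside its fences."""
    bounds = {c: iqr_bounds([r[c] for r in rows], k) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Invented toy readings (assumption): one row has a missing pH value and
# one row has an implausibly high turbidity value.
samples = [
    {"ph": 7.1, "turbidity": 3.2},
    {"ph": 6.9, "turbidity": 2.8},
    {"ph": None, "turbidity": 3.0},
    {"ph": 7.3, "turbidity": 3.5},
    {"ph": 7.0, "turbidity": 40.0},
    {"ph": 7.2, "turbidity": 3.1},
    {"ph": 6.8, "turbidity": 2.9},
]

clean = remove_outliers(drop_missing(samples), ["ph", "turbidity"])
folds = list(k_fold_indices(len(clean), k=2))
```

In practice each train/test split would be used to fit and score a classifier such as Random Forest or XGBoost, and for classification a shuffled, stratified split is usually preferable to the contiguous folds shown here.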