Paper: http://ceur-ws.org/Vol-2882/paper51.pdf
Amel Ksibi, Amina Salhi, Ala Alluhaidan and Sahar A. El-Rahman : Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Providing air pollution information to individuals enables them to understand the air quality of their living environments. Thus, the association between people’s wellbeing and the properties of the surrounding environment is an essential area of investigation. This paper proposes Air Quality Prediction through harvesting public/open data and leveraging them to get the Personal Air Quality index. These are usually incomplete. To cope with the problem of missing data, we applied the KNN imputation method. To predict Personal Air Quality Index, we apply a voting regression approach based on three base regressors which are Gradient Boosting regressor, Random Forest regressor, and linear regressor. Evaluating the experimental results using the RMSE metric, we got an average score of 35.39 for Walker and 51.16 for Car.
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach
1. Insights for wellbeing:
predicting personal air quality index
using regression approach
Prepared by:
Amel Ksibi1,Amina Salhi1, Ala Alluhaidan1, Sahar A. ELRahman1,2
1 College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
2 Electrical Engineering Department, Faculty of Engineering-Shoubra, Benha University, Cairo, Egypt
3. MOTIVATION
Providing air pollution
information to individuals
enables them to
understand the air quality
of their living
environments.
Open data including
weather data and air
pollution data collected
over the city, have been
investigated widely for
general population.
There is a shortage on
determining the quality of
air at personal scale
Research question is
whether we can use only
data from open sources to
predict the personal air
pollution data.
4. MEDIAEVAL2020:
INSIGHTS FOR WELLBEING
CHALLENGE
Task 1: Personal Air Quality Prediction with public/open data
Predicting the value of personal air pollution data (PM2.5, O3, and NO2) using only:
• weather data (wind speed, wind direction, temperature, humidity)
• air pollution data (PM2.5, O3, and NO2)
from public/open data sources (e.g., stations, website).
5. PROPOSED SOLUTION
Data pre-processing
• Solving missing data using
KNN imputation method
• Time Features extraction
• Features selection
Learning voting regression model
• Gradient Boosting regressor
• Random Forest regressor
• linear regressor
6. EXPERIMENTAL RESULTS
Table 1: Official results of the submitted run
PM2.5 RMSE NO2
RMSE
O3 RMSE AQI RMSE
AVG walkers 35.34 25.98 12.08 35.39
AVG car 40.93 25.02 35.98 51.16
• The performance of learning model depends on the way the data are collected:
o results from data collected by a walker outperform the results from data collected by a car.
7. CONCLUSION & FUTURE WORK
• This was our first participation on MediaEval challenges.
• Our solution was based on KNN imputation method and voting
regression approach.
• The used features are time and location features since weather features
did not improve the results within the training dataset.
• As future work, we will investigate more on:
• time features (seasons, periods of day,…) to consolidate the performances of
our regression model
• Lifelog images mainly on estimating the level of the haze, smoke and greenness
within the photos.
Hinweis der Redaktion
Good evening;
My name is Amel Ksibi from Pnu university.
I will present our topic which is Insights for wellbeing: Predicting Personal Air Quality Index using Regression Approach
Our presentation consists of motivation, challenge, proposed solution, experients and results and finally conclusion and future work
Air pollution remains a crucial subject of study since it has an intensive impact on public health.
So, Providing air pollution information to individuals enables them to understand the air quality of their living environments.
Over the past of 40 years, air quality prediction has been of interest. however, all these previous studies focused only on determining the air pollutants values at city scale for the general population.
So, There is a shortage on determining the quality of air at personal scale. Thus, the research question is whether we can use only data from open sources to predict the personal air pollution data.
This question has been proposed by MediaEval 2020 Insight for Wellbeing: Multimodal personal health lifelog data analysis challenge.
We participate in the first task that aims to predict the value of personal air pollution data using only wather data and air pollution data from public data sources.
Our proposed solution consists of two steps: data preprocessing and learning voting regression model.
Concerning the first step, we applied the knn imputation method the solve the problem of missing data , then we perform time features extraction to extract the month, day and hour features. After that we try different combination of features to determine the impact of each feature type. We found that weather data did not improve the performance of the learning model.
Concerning the second step , we lean a voting regression model based on three base regressors which are : gradient boosting regressor, random forest regressor and linear regressor.
The obtained results are shown in this table, Ae we see the estimating O3 is the esiaset task while estimating the PM2.5 is the hardest.
Also, we can note that the results from data collected by a walker outperform the results from data collected by a car. So the performance of the learning model depends on the way the data are collected.
As future work,
We will investigate more on:
time features (seasons, periods of day,…) to consolidate the performances of our regression model
And also, we will investigate on Lifelog images mainly on estimating the level of the haze, smoke and greenness within these photos.