Air Quality Analysis and Prediction

Prerequisites Description

This project analyzes an air quality dataset that contains the responses of a gas multisensor device deployed in the field in an Italian city. Exploratory data analysis, regression analysis, and model selection are performed in Python using Pandas, NumPy, Seaborn, and Scikit-learn. Each Python notebook includes markdown cells explaining each part of my analysis.

Dataset

This dataset contains 9358 instances of hourly averaged responses from an array of 5 metal oxide chemical sensors embedded in an Air Quality Chemical Multisensor Device. The device was located in the field in a significantly polluted area, at road level, within an Italian city. Data were recorded from March 2004 to February 2005 (one year), representing the longest freely available recordings of responses from air quality chemical sensor devices deployed in the field. More about this dataset.
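As a rough illustration of how such hourly records could be loaded with Pandas, here is a minimal sketch. It assumes a semicolon-separated file with decimal commas, `Date` values in day/month/year order, `Time` values written as `HH.MM.SS`, and `-200` used as a missing-value marker; these layout details are assumptions about the published file, not taken from the project notebooks. The helper name `load_air_quality` and the two-row in-memory sample are hypothetical and only there to make the example runnable.

```python
import io

import numpy as np
import pandas as pd


def load_air_quality(source):
    """Read the raw file and return a DataFrame indexed by hourly timestamp.

    Assumed layout (not confirmed by the project): ';' separator,
    ',' decimal mark, and -200 marking a missing reading.
    """
    df = pd.read_csv(source, sep=";", decimal=",")
    # Drop columns and rows that are completely empty (e.g. trailing separators).
    df = df.dropna(axis=1, how="all").dropna(axis=0, how="all")
    # Combine the Date and Time columns into a single timestamp.
    timestamp = pd.to_datetime(
        df["Date"] + " " + df["Time"], format="%d/%m/%Y %H.%M.%S"
    )
    df = df.drop(columns=["Date", "Time"]).set_index(timestamp)
    df.index.name = "timestamp"
    # Turn the assumed missing-value marker into NaN so statistics ignore it.
    return df.replace(-200, np.nan)


# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    "Date;Time;CO(GT);T;RH;AH\n"
    "10/03/2004;18.00.00;2,6;13,6;48,9;0,7578\n"
    "10/03/2004;19.00.00;-200;13,3;47,7;0,7255\n"
)
frame = load_air_quality(sample)
print(frame)
```

With the real file, the same function would be called with a path instead of the in-memory sample.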

Dataset Features Regression Tasks

The task is to predict air quality given the other measurements in the dataset. The dataset has the columns Date, Time, CO(GT), PT08.S1(CO), NMHC(GT), C6H6(GT), PT08.S2(NMHC), NOx(GT), PT08.S3(NOx), NO2(GT), PT08.S4(NO2), PT08.S5(O3), T, RH, and AH. The target variable for the regression is the temperature (T).
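The regression and model selection steps could look roughly like the sketch below, which compares two Scikit-learn regressors by cross-validated R² with T as the target. The choice of candidate models, the median imputation, the feature subset, and the synthetic stand-in data are all assumptions made so that the example runs on its own; they are not the models or results from the project notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the real measurements): random features named
# after a few dataset columns, with T built as a noisy linear combination.
rng = np.random.default_rng(0)
n_rows = 300
features = ["PT08.S1(CO)", "PT08.S4(NO2)", "PT08.S5(O3)", "RH", "AH"]
X = pd.DataFrame(rng.normal(size=(n_rows, len(features))), columns=features)
y = pd.Series(
    2.0 * X["AH"] - 1.5 * X["RH"] + rng.normal(scale=0.1, size=n_rows),
    name="T",
)
# Blank out some readings to mimic missing sensor values.
X.iloc[::25, 0] = np.nan

# Candidate models, each wrapped in a pipeline that imputes missing values.
candidates = {
    "linear": make_pipeline(
        SimpleImputer(strategy="median"), StandardScaler(), LinearRegression()
    ),
    "forest": make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestRegressor(n_estimators=50, random_state=0),
    ),
}

# Model selection: mean R^2 over 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
print(scores, "best:", best_name)
```

Because the real records are ordered in time, a time-aware splitter such as `TimeSeriesSplit` may be a better choice for `cv` than the default folds used here.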

Why Predict Air Quality

Being able to model, predict, and monitor air quality is becoming increasingly relevant, especially in urban areas, because of the critical impact of air pollution on citizens’ health and the environment. Accurate forecasting helps people plan ahead, reducing both the health effects and the associated costs.

Results