Post

PhenomDetect: Detection of Air Hazards in the U.S. - SVM, Random Forest, Gradient Boost, XGBoost, KNN, LSTM, GRU, Tableau

This project is the work of Team ‘Vesper’, which participated in the NASA International Space Apps Challenge 2020. Our team took on the automatic air hazard detection challenge because we were inspired by the idea of building a tool that could potentially save many lives by automatically analyzing data from a variety of sources and putting that analysis into the hands of key decision-makers, as well as the general public.


Earthquake Prediction Dashboard - Spark, Tableau, MongoDB

The objective is to predict earthquakes from historical data and report the results. A machine learning model is trained on worldwide earthquake records from 1965 to 2016. The data includes the geographical location and magnitude of each earthquake (23.5k samples). The model predicts earthquake magnitudes for 2017. Finally, a dashboard visualizes the predictions alongside a historical analysis of the data.


Weather Prediction - Bidirectional LSTM

This project predicts weather (i.e., minimum and maximum temperature) from historical data. The dataset includes hourly readings of pressure, humidity, temperature, wind speed, and wind direction for 36 cities from 2012 to 2017. The dataset is preprocessed to engineer the attributes used to predict the minimum and maximum temperature in Toronto. Data from 2012-2016 is used as the training set, and the model attempts to predict the minimum and maximum temperatures for 2017.
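As a rough illustration of the preprocessing described above, the sketch below collapses hourly temperature readings into daily min/max targets and splits them by year. It is a minimal example on synthetic data, not the project's actual pipeline; the function names `daily_min_max` and `split_by_year` and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_min_max(hourly):
    """Collapse (timestamp, temperature) pairs into {date: (min, max)}."""
    by_day = defaultdict(list)
    for ts, temp in hourly:
        by_day[ts.date()].append(temp)
    return {day: (min(vals), max(vals)) for day, vals in sorted(by_day.items())}


def split_by_year(daily, test_year):
    """Days before test_year go to training; days in test_year are held out."""
    train = {d: v for d, v in daily.items() if d.year < test_year}
    test = {d: v for d, v in daily.items() if d.year == test_year}
    return train, test


# Synthetic hourly readings spanning the 2016/2017 boundary (hypothetical values).
start = datetime(2016, 12, 31, 0)
hourly = [(start + timedelta(hours=h), float(h % 24)) for h in range(48)]

daily = daily_min_max(hourly)
train, test = split_by_year(daily, test_year=2017)
```

Splitting by calendar year rather than at random keeps the evaluation honest for time series, since the model never sees readings from the period it is asked to predict.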


Predicting Food Preparation Time (SkipTheDishes) - Doc2Vec

The objective is to predict food preparation time from the ordered food items and their quantities. This is a data challenge organized by SkipTheDishes, Canada’s leading and largest food delivery company. Each order includes up to 10 food items along with their quantities. The data has 20 features and 80,000 samples. Doc2Vec is applied to vectorize the food item names as part of feature engineering.


House Price Prediction - Regression

The objective of this project is to predict house prices from different features. The dataset includes 1460 instances and 80 features. The following algorithms are applied to selected features from the data: Linear Regression, Decision Tree, SVM, Random Forest, AdaBoost, GradientBoost, and XGBoost. Feature selection is performed using Pearson correlation. Feature imputation, encoding, and scaling are performed. The best performance achieved is R-square = 0.
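Correlation-based feature selection of the kind mentioned above can be sketched as follows. This is a minimal, dependency-free example and not the project's code: the feature names, the toy values, and the 0.5 cutoff are all assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(col, target) for name, col in features.items()}
    return [name for name, r in scores.items() if abs(r) >= threshold]


# Toy data (hypothetical): only "area" tracks the price.
features = {
    "area": [50, 80, 120, 200],
    "listing_id": [3, 1, 4, 2],
    "constant": [1, 1, 1, 1],
}
price = [100, 160, 240, 400]

selected = select_features(features, price)
```

Using the absolute value of the coefficient keeps strongly negative predictors as well as positive ones. Pearson correlation only captures linear relationships, so a feature with a nonlinear effect on price could still be filtered out.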