This is the list of my projects.
Language of Proteins
Leveraging pre-trained protein language models to classify protein sequences based on their function.
Skills: machine learning, Python, Scikit-learn, FastAPI, Docker
Drug development and discovery is a time- and labor-intensive process that could be improved by next-generation protein sequencing techniques. Recent breakthroughs in Natural Language Processing (NLP) […]
Online Shopping Behavior
Data analysis of online shopping behavior using supervised, unsupervised, and semi-supervised learning.
Skills: machine learning, Python, Scikit-learn, Seaborn, Pandas, SHAP
Online shopping behavior is the process by which consumers search for, select, purchase, use, and dispose of goods and services over the internet. For an ecommerce platform, one of the most important questions is, […]
Bank Churn Prediction
A dashboard displaying the likelihood of customer churn based on demographic and banking information
Skills: machine learning, Python, Scikit-learn, Flask
Keeping an existing customer costs much less than acquiring a new one. In addition, positive word-of-mouth from existing customers leads to cheap, almost free customer acquisition. To improve the customer retention […]
T2D Risk Predictions
Predicting health risks for Type 2 Diabetes based on three A1C levels (no-diabetes, pre-diabetes, diabetes)
Skills: machine learning, Python, Scikit-learn, Seaborn
Type 2 diabetes occurs more commonly in middle-aged and elderly people and, if left uncontrolled, can cause serious health issues such as infections, kidney damage, vision loss and blindness, amputations and many […]
Tornado Impact
Data analysis and visualization of tornado events in the United States, 1996 – 2019
Skills: Python, Pandas, Matplotlib, Choropleth maps, GeoJSON
We are all interested in weather data. Living in Indiana, one of the states sometimes included in Tornado Alley, and frequently going through warnings and safety procedures, it is natural that we ask a lot of questions […]
Croatia – Population Trend
Analyzing the population trend in Croatia over the last three decades using the latest data, which include 2018
Skills: Python, Pandas, Matplotlib, Choropleth maps, Folium, Tableau
As an emigrant from Croatia myself, I am interested in Croatian migration trends over the last few decades. As one of the ex-Yugoslav republics, Croatia has faced three events in the last 30 years that could have had an impact on its population: […]
ETL for Movie Datasets
Perform ETL on two movie datasets and insert existing Facebook information to prepare them for production
Skills: Python, Pandas, MongoDB
The purpose of the project was to find two movie datasets and perform ETL (Extract, Transform, Load) on them to migrate them to a production database. We are keeping our data warehouse focused on the following subjects […]
Amazon Vine Reviews
Analyzing whether Vine reviews are free of bias and if they are truly trustworthy
Skills: Spark, PostgreSQL, Colab, AWS RDS
Many of Amazon’s shoppers depend on product reviews to make a purchase. The Amazon Vine program is an invitation-only club for a small percentage of elite, most trusted reviewers, selected by Amazon. The program aims to provide […]
Exploration of Exoplanets
Machine learning models capable of classifying candidate exoplanets from the NASA Kepler space telescope dataset
Skills: machine learning, Scikit-learn, TensorFlow
Over a period of nine years in deep space, the NASA Kepler space telescope has been out on a planet-hunting mission to discover hidden planets outside of our solar system. This measurement data has been collected along with what the classification […]
Citi Bike Analytics
Tableau Story: Performing data analysis for Citi Bike data (Jersey City from 2018 to 2020) to find interesting insights
Skills: Tableau, Python, Pandas
Citi Bike is New York City’s public bicycle sharing system that was launched in May 2013. It is the largest in the nation and currently serves the New York City boroughs of the Bronx, Brooklyn, Manhattan, and Queens, as well […]
Travel Tips Dashboard
Travel tips dashboard that helps users plan their next trip by suggesting flights and attractions and showing historical temperatures
Skills: JavaScript, Leaflet.js, Python, Flask, Plotly, SQLAlchemy, Mapbox API
If you live in the Indianapolis area and you plan to travel to one of the top 5 US destinations, this dashboard is for you. The dashboard will display minimum flight prices originating from Indianapolis, […]
Visualizing Seismic Data
Visualizing all earthquakes in the past 7 days and fault lines to illustrate the relationship between tectonic plates and seismic activity
Skills: JavaScript, Leaflet.js, D3.js, HTML/CSS, GeoJSON, Mapbox API
In this project we are visualizing all earthquakes in the past 7 days based on their longitude and latitude. Data on tectonic plates was added to the map to illustrate the relationship […]
Wine Classification
Classifying wines by type (red, white) and by quality (low, medium, high)
Skills: machine learning, Python, Scikit-learn, Seaborn
Wine is an alcoholic beverage made from fermented grapes. It is a seemingly simple beverage that becomes more complex the more you study it. The good thing is, it does not matter how much you know; nearly everyone can appreciate wine […]
Demographics and Health Risks
Interactive D3 visualization of U.S. demographics and health risks
Skills: JavaScript, D3.js, HTML5, CSS, Bootstrap
The purpose of this project was to compare demographic and health data by state using data from the US Census and CDC Behavioral Risk Factor. The included data set is based on 2014 ACS 1-year estimates […]