Sam_Portfolio

Boshen (Sam)’s Portfolios

Example data science portfolios

Project 1: Healthy/Sick Faces Classifier with OpenCV and Neural Network

This project integrates a convolutional neural network (CNN), OpenCV, and Flask. I wanted people to be able to check their health conveniently, even at home, so I built this app, which gives a predicted health condition statement based simply on how you look. After all, the face can tell you a lot about health. The technical design is below:

Project 2: Recommendation Engine Project

This project uses an IBM Watson Studio dataset to build a recommendation algorithm for users. The project has four parts.

Project 3: Data Pipeline and Data Processing with Spark and AWS

Project Structure

Read data from S3

Process data using Spark

Transform the data to create the five tables listed below:

Fact Table

songplays: records in log data associated with song plays, i.e. records with page NextSong. Columns: songplay_id, start_time, user_id, level, song_id, artist_id, session_id, location, user_agent

Dimension Tables

Load it back to S3

Write the tables to partitioned Parquet files in table directories on S3.
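
The songplays step above can be sketched in plain Python to show the filtering and column selection. This is only an illustration under assumptions: the project itself uses Spark, and the input field names (`ts`, `userId`, `sessionId`, `userAgent`, `song`, `artist`), the `song_lookup` mapping, and the counter-based `songplay_id` are hypothetical, not taken from the project.

```python
from itertools import count

# Column order of the songplays fact table, as listed above.
SONGPLAY_COLUMNS = [
    "songplay_id", "start_time", "user_id", "level", "song_id",
    "artist_id", "session_id", "location", "user_agent",
]


def build_songplays(log_records, song_lookup):
    """Keep only NextSong events and shape them into songplays rows.

    log_records: iterable of dicts with hypothetical field names.
    song_lookup: hypothetical dict mapping (song title, artist name)
                 to (song_id, artist_id).
    """
    ids = count(1)  # simple surrogate key; an assumption for this sketch
    rows = []
    for rec in log_records:
        # Only song-play events belong in the fact table.
        if rec.get("page") != "NextSong":
            continue
        # Unmatched songs keep None for both ids.
        song_id, artist_id = song_lookup.get(
            (rec.get("song"), rec.get("artist")), (None, None)
        )
        rows.append({
            "songplay_id": next(ids),
            "start_time": rec.get("ts"),
            "user_id": rec.get("userId"),
            "level": rec.get("level"),
            "song_id": song_id,
            "artist_id": artist_id,
            "session_id": rec.get("sessionId"),
            "location": rec.get("location"),
            "user_agent": rec.get("userAgent"),
        })
    return rows
```

In a Spark job the same logic would typically be a filter on the page column followed by a join and a column selection; the plain-Python version just makes the row shape easy to inspect.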

Project 4: NLP Health Q&A Chatbot

An NLP-based chatbot integrated with a neural network to generate answers to common health questions. Technologies include

Project 5: Modelpair

This is a Python package designed to compare the performance of different machine learning models on a dataset. Currently it can only handle CSV datasets in a certain format.

Project 6: Data Modeling and Pipeline

Project 7: FinalLists

FinalLists is a web application that helps users identify their ideal housing option. Users can generate personalized housing lists based on input parameters such as