Syllabus
| Saturday | Sunday |
Week 1 | Probability and Statistics with R Part 1 (2h) | Probability and Statistics with R Part 2 (2h) |
Mini Project 1: Banking Financial Data Manipulation with Pandas |
Week 2 | Python Machine Learning Eco-system (2h); Project 1 Q&A (1h) | Python Machine Learning Eco-system (2h) |
Mini Project 2: Data Cleansing Practice on Real Estate Data |
Week 3 | Supervised Learning: Classification (2h); Project 2 Q&A (1h) | Introduction to Data Application (2h) |
Mini Project 3: Bank Fraud Detection |
Week 4 | Supervised Learning: Regression (2h); Project 3 Q&A (1h) | Introduction to Big Data (2h) |
Mini Project 4: Insurance Claims Modeling |
Week 5 | Data Analysis using Hadoop Hive 1 (2h); Project 4 Q&A (1h) | Data Analysis using Hadoop Hive 2 (2h) |
Mini Project 5: Data Visualization with Education App User Datasets |
Week 6 | Unsupervised Learning: Dimension Reduction (2h); Project 5 Q&A (1h) | Advanced visualization & A/B Testing (2h) |
Mini Project 6: Traveling Company New User Bookings |
Week 7 | Unsupervised Learning: Clustering and Outlier Detection (2h); Project 6 Q&A (1h) | Data Visualization Part 1 (2h) |
Week 8 | Deep Learning Part 1 (2h) | Data Visualization Part 2 (2h) |
Week 9 | Deep Learning Part 2 (2h) | Data Processing using Spark SQL and DataFrame (2h) |
Week 10 | Machine Learning using Spark MLLib (2h) | Spark Graph Database (2h) |
Weeks 11-13 | Project 1: Recommendation System Project. This project uses Python, Jupyter notebooks, and SQL to develop a recommender system for games on Steam. Two APIs are used to collect the data, and multiple recommendation algorithms are used to suggest the most relevant games to new and old users alike. |
Weeks 14-16 | Project 2: FinTech (Financial Technology) Project. This project uses Python, Jupyter notebooks, and Flask to create an intelligent investment system using data from Lending Club. The final product is a robotic investor that can be utilized to assess the riskiness of future loans using XGBoost machine learning algorithms. |
Projects
FinTech (Financial Technology) Project
Lending Club, a peer-to-peer lending marketplace, typically lists hundreds of loan projects, which makes it difficult for investors to choose a profitable one. In our FinTech project, we will apply the data science knowledge from the courses to design an intelligent investment advisor that helps investors assess the value of different Lending Club projects and determine the best ones to invest in. When a new loan project arrives on the platform, our product will automatically analyze its parameters and screen for the best investment projects. We will also design a simple web page through which users interact with the product. In this project, you will:
- Process 1,320,000+ data entries and screen 100+ data features.
- Build the machine learning model with Gradient Boosted Regression Trees (GBRT).
- Design the web user interface to display our product.
- Deliver a product that evaluates loan projects and screens for the best investment projects on Lending Club.
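To give a feel for the GBRT step, here is a minimal sketch of gradient boosting with squared-error loss, written in plain Python. It is a simplification rather than the course's implementation: each "tree" is a depth-1 regression stump, and the function names, feature columns (interest rate, debt-to-income ratio), and toy numbers are all hypothetical, not Lending Club data. In practice a library such as XGBoost would be used instead.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: the one split that minimizes squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(row[j] for row in xs))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(xs, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(xs, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]  # assumes at least one feature has two distinct values


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_gbrt(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gbrt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical toy loans: [interest rate %, debt-to-income %] -> observed loss rate.
xs = [[5.0, 10.0], [7.0, 15.0], [10.0, 20.0], [14.0, 25.0], [18.0, 30.0], [22.0, 35.0]]
ys = [0.01, 0.02, 0.05, 0.10, 0.18, 0.25]

model = fit_gbrt(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((y - predict_gbrt(model, row)) ** 2 for row, y in zip(xs, ys)) / len(ys)
```

A new loan could then be scored with `predict_gbrt(model, [12.0, 22.0])`, and loans ranked by predicted risk to screen candidates. The small learning rate is a deliberate choice: each stump corrects only a fraction of the remaining error, which tends to make the ensemble less prone to overfitting.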
Steam Game Recommendation System Project
Recommendation systems have become very popular in recent years and are widely used on game platforms. In our game recommendation system project, built on the Steam platform, we will analyze users' game-playing history and design a recommendation system based on the popularity of the games. Users can also filter the results by selecting categories of interest to further refine the recommendations. Students will experience a complete, high-level product development process, including product definition, data crawling, data import, data analysis, recommendation system platform design, and effectiveness evaluation.
- Data collection from Steam Game Platform.
- 300+ feature processing and screening.
- Implement game recommendation based on the Popularity-Based Recommendation Algorithm.
- Design the user interaction interface to display our product.
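As an illustration of the popularity-based approach with category filtering, the sketch below ranks games by how many distinct users have played them, skips games the target user already has, and optionally keeps only games in the selected categories. The function name, data layout, and sample data are hypothetical stand-ins, not real Steam API output or the project's actual code.

```python
from collections import Counter


def recommend_by_popularity(play_history, game_categories, user,
                            categories=None, top_n=3):
    """Return the most-played games the user has not played yet."""
    # Popularity = number of distinct users who have played each game.
    popularity = Counter()
    for games in play_history.values():
        popularity.update(set(games))

    already_played = set(play_history.get(user, ()))
    wanted = set(categories) if categories else None

    candidates = []
    for game, count in popularity.items():
        if game in already_played:
            continue
        # Keep the game only if it shares at least one selected category.
        if wanted and not (wanted & set(game_categories.get(game, ()))):
            continue
        candidates.append((game, count))

    # Most popular first; ties broken by name so the order is deterministic.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [game for game, _ in candidates[:top_n]]


# Hypothetical toy data for illustration only.
play_history = {
    "alice": ["Portal 2", "Dota 2"],
    "bob": ["Dota 2", "Civilization VI"],
    "carol": ["Dota 2", "Portal 2", "Stardew Valley"],
    "dave": [],  # a new user with no history
}
game_categories = {
    "Portal 2": ["Puzzle"],
    "Dota 2": ["Strategy", "Multiplayer"],
    "Civilization VI": ["Strategy"],
    "Stardew Valley": ["Simulation"],
}

new_user_picks = recommend_by_popularity(play_history, game_categories, "dave")
strategy_picks = recommend_by_popularity(
    play_history, game_categories, "alice", categories=["Strategy"]
)
```

Because popularity does not depend on the target user's own history, this approach still produces suggestions for brand-new users such as `dave`, which is one reason it is a common starting point before more personalized algorithms.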
Schedule
Class Webinar Time: Sat & Sun 5 – 7 p.m. Pacific Time (final schedule subject to change; any changes will be announced ahead of time)
TA Office Hours & Q&A Session: Mon & Wed 5 – 7 p.m. Pacific Time
Local classroom available (Students in Los Angeles only)