Objective: Create a classification model for Spotify Listening Personas

Goals:

Work with real, up-to-date data
Store data in the cloud
Use relevant libraries
Choose appropriate algorithms (e.g. KNN, Decision Trees, Random Forest, Naive Bayes) and find the “best” model (How do I know what the best model is?)
Perform validation tests on the data
Outline the assumptions of the model
Be able to repeat the process on new data and upload it to the cloud (historical data cycle)
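
One possible way to approach the "best model" question above is to score every candidate with the same cross-validation folds and compare the mean scores. The sketch below is only an illustration: the data comes from scikit-learn's `make_classification` as a stand-in for the real song features, and the choice of accuracy as the metric, 5 folds, and the hyperparameters are all assumptions rather than decisions made in this plan.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): replace with the real feature matrix and labels.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# The candidate algorithms named in the goals. KNN is distance-based,
# so it gets a scaling step inside its pipeline.
candidates = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
}

# Use identical folds for every model so the comparison is like-for-like.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in candidates.items()
}

best_name = max(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean accuracy {score:.3f}")
print("Highest mean accuracy:", best_name)
```

Accuracy is only one definition of "best". If the classes turn out to be imbalanced, a metric such as `scoring="f1_macro"` may be a better fit, and a final held-out test set is still needed to confirm whichever model wins here.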

[x] Get data from Spotify Charts through API scraping and InfluxDB
[ ] Clean and prepare the data; apply feature scaling and regularization for the models
[ ] Explore and Visualize Data
[ ] Ask Questions About the Data
[ ] Use those questions to determine how to build the model; read Data 8: https://inferentialthinking.com/chapters/17/Classification.html
[ ] Build the model using K-means and K-NN
1. K-NN: use pandas queries to define a list of criteria and create a new column (‘Mood’) classifying the type of song: [Chill, Upbeat, Crazy, Slow, etc.]
  1. Then use K-NN to predict a song’s ‘Mood’ based on its features
2. K-means: cluster the songs into k different listening personas (first find the optimal k)
[ ] Create a dashboard for the model using Dash/Plotly and deploy it!
1. Allow new data to be read into the web app so that new predictions and metrics can be generated!
2. Data Pipeline: Data collection → Wrangling → Feed it into the model → Model results via Dashboard/App
[ ] Optimize Model?
[ ] Explain the importance of this project: How would this be beneficial to Spotify?
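
The K-NN and K-means steps in the checklist above could be sketched roughly as follows. Everything specific here is an assumption for illustration: the column names (`energy`, `valence`, `tempo`), the 0.5 thresholds behind each ‘Mood’, the random placeholder data, and the use of the silhouette score to pick k (the elbow method is another common option).

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Placeholder data (assumption): random values standing in for real song features.
rng = np.random.default_rng(0)
songs = pd.DataFrame({
    "energy": rng.uniform(0, 1, 400),
    "valence": rng.uniform(0, 1, 400),
    "tempo": rng.uniform(60, 200, 400),
})
features = ["energy", "valence", "tempo"]

# Step 1: rule-based 'Mood' column built from pandas queries.
# The thresholds are illustrative guesses, not tuned values.
songs["Mood"] = "Slow"  # default: low energy and low valence
songs.loc[songs.query("energy >= 0.5 and valence >= 0.5").index, "Mood"] = "Upbeat"
songs.loc[songs.query("energy >= 0.5 and valence < 0.5").index, "Mood"] = "Crazy"
songs.loc[songs.query("energy < 0.5 and valence >= 0.5").index, "Mood"] = "Chill"

# Step 2: K-NN predicts 'Mood' from the features.
X_train, X_test, y_train, y_test = train_test_split(
    songs[features], songs["Mood"],
    test_size=0.25, random_state=0, stratify=songs["Mood"],
)
scaler = StandardScaler().fit(X_train)  # fit on training data only
knn = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X_train), y_train)
knn_accuracy = knn.score(scaler.transform(X_test), y_test)
print(f"K-NN test accuracy: {knn_accuracy:.3f}")

# Step 3: K-means, trying several k and keeping the best silhouette score.
X_all = StandardScaler().fit_transform(songs[features])
silhouettes = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_all)
    silhouettes[k] = silhouette_score(X_all, labels)
best_k = max(silhouettes, key=silhouettes.get)

# Assign each song to one of the best_k clusters ("personas").
songs["Persona"] = KMeans(
    n_clusters=best_k, n_init=10, random_state=0
).fit_predict(X_all)
print("Chosen k:", best_k)
```

Because ‘Mood’ is derived from the same features that K-NN sees, a high accuracy here mostly shows that K-NN can recover the hand-written rules. It says little about whether those rules match how listeners would actually label the songs.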