Case Study: Job title predictions

Summary: An education technology startup was manually classifying hundreds of new users per day into one of thirty-five job classifications based on unstructured data (e.g., college major, skills, and cover letter). One employee was dedicated to the process full-time, and it had become a bottleneck on growth. I built a system that let that employee classify thousands of users per day and that, over time, had the potential to automate most of the classification process. The system consisted of a microservice that generated the predictions and a user interface in which the employee could quickly judge whether each prediction was correct. The UI also displayed relevant metadata (e.g., the confidence of each prediction) and was designed to improve the model's accuracy over time. The initial version of the project was a success, but the startup was acquired before the process could be fully automated.
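Illustration: The sketch below shows one way prediction confidence could drive the review workflow described above. It is not the production code. The names (Prediction, triage), the 0.90 threshold, the example job labels, and the choice to show the least confident predictions first are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical cutoff: predictions at or above it skip manual review.
AUTO_ACCEPT_THRESHOLD = 0.90


@dataclass
class Prediction:
    user_id: str
    # Maps each job classification to the model's probability for it.
    probabilities: Dict[str, float]

    @property
    def label(self) -> str:
        # The predicted classification is the most probable one.
        return max(self.probabilities, key=self.probabilities.get)

    @property
    def confidence(self) -> float:
        # Confidence is the probability of the predicted classification.
        return self.probabilities[self.label]


def triage(
    predictions: List[Prediction],
    threshold: float = AUTO_ACCEPT_THRESHOLD,
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted ones and a manual review queue.

    The review queue is sorted with the least confident predictions first
    (an assumed ordering, chosen so the reviewer sees the hardest cases first).
    """
    accepted = [p for p in predictions if p.confidence >= threshold]
    review = sorted(
        (p for p in predictions if p.confidence < threshold),
        key=lambda p: p.confidence,
    )
    return accepted, review


if __name__ == "__main__":
    # Made-up example data with two of the thirty-five classes.
    sample = [
        Prediction("u1", {"Data Analyst": 0.95, "Teacher": 0.05}),
        Prediction("u2", {"Data Analyst": 0.40, "Teacher": 0.60}),
    ]
    accepted, review = triage(sample)
    print("auto-accepted:", [(p.user_id, p.label) for p in accepted])
    print("needs review:", [(p.user_id, p.label, p.confidence) for p in review])
```

Raising the threshold sends more predictions to the reviewer; lowering it automates more of the process at the cost of more unreviewed errors.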

Tools: Python, Logistic Regression, Natural Language Processing, Imputation, TF-IDF, AWS, HTML/CSS/JavaScript, Bootstrap

Questions? me@dpmehta.com