I am currently working as a Data Scientist at Grainger Canada, deploying machine learning models for customer-facing applications and working towards productionizing LLMs. I have a Master's in Data Science and two degrees in Mathematics, and have worked as an Analytics Manager in the FinTech domain. I'm a Databricks Certified Machine Learning Associate.
I've developed machine learning models for recommending mutual fund schemes and used clustering methods to segment customers. Presently, I'm interested in MLOps and LLMs. I actively engage in Kaggle Competitions to discover novel methods to tackle varied data problems. My philosophy is to practice what I have learned and learn what I have not read before.
Data Scientist
Grainger Canada
Data Scientist
Grainger
Data Analytics Manager
SBI Funds Management Pvt. Ltd.
Data Analytics Manager
SBI Mutual Funds
SOLR
MLflow
Databricks
Streamlit
GitHub Copilot
Locust
Splunk
Grafana
CRM
Power BI
Google Colab
SQL Server
Google Analytics
Power BI
SQL Server
ETL
Flask
MLflow
Superset
Led the development and integration of end-to-end ML microservices, deploying production-ready models that drove Grainger Canada's web search using named entity recognition and text classification algorithms
Designed an LLM-based system to generate website product descriptions, fine-tuned on existing high-quality descriptions, with manual review for quality control and improvement
Refactored the category prediction fastText model by migrating it from an on-premises server to the cloud through MLflow and Databricks, integrating version control and ensuring reproducibility, leading to a 37% reduction in model training time and an 8% improvement in recall
Deployed an ML API that uses a GPT-3-powered model to assess parsed search queries from an established NER model and identify incorrect query labels, notably reducing the time taken to find them
Mentored a team of two junior data scientists and reduced the technical debt of existing ML pipelines by 80%
Executed health check scripts and stress testing for REST APIs using Locust, implemented efficient logging with Splunk, and utilized Grafana dashboards to monitor model drift and performance
Fine-tuned BERT to classify the sentiment of 20,000 user reviews, integrating the results into the internal CRM tool and thereby improving the support team's response time to negative feedback
Implemented clustering and segmentation for over 10 million investors and 37,000 brokers using K-Means and DBSCAN to send targeted in-app notifications, increasing app usage by 25%
Managed a team of two data analysts in writing efficient SQL queries and reducing time spent on building reports
Created an item-item collaborative filtering model to recommend products to customers in the broker platform for upselling and cross-selling schemes, leading to an 11% increase in digital sales
Built Power BI visualization reports with data from SQL Server & Google Analytics within an ETL framework
Built a reproducible machine learning pipeline in Google Colab that uses computer vision algorithms to detect head impact events with a combination of object-detection and activity-recognition models
Used YOLOv5 and 3D ResNet to reduce manual video analysis time by 60% while achieving a recall of 0.91