Rohit Rawat

Vetted Talent

I am currently working as a Data Scientist at Grainger Canada, deploying machine learning models for customer-facing applications and working towards productionizing LLMs. I have a Master's in Data Science and two degrees in Mathematics, and have worked as an Analytics Manager in the FinTech domain. I'm a Databricks Certified Machine Learning Associate.

I've developed machine learning models for recommending mutual fund schemes and used clustering methods to segment customers. Presently, I'm interested in MLOps and LLMs. I actively take part in Kaggle competitions to discover novel methods for tackling varied data problems. My philosophy is to practice what I have learned and learn what I have not read before.

  • Role

    Data Scientist

  • Years of Experience

    5 years

Skillsets

  • ML libraries - 5 Years
  • Hugging Face
  • RAG
  • Superset
  • Kubeflow
  • Databricks
  • Power BI
  • Supervised ML - 5 Years
  • Regression - 5 Years
  • FastAPI - 2 Years
  • LLMs - 2 Years
  • CNN - 1 Year
  • Classification - 5 Years
  • Vector databases - 2 Years
  • Prompt engineering - 2 Years
  • LangChain.js - 1 Year
  • Model evaluation - 5 Years
  • Feature Engineering - 5 Years
  • Hugging Face Transformers
  • Reinforcement Learning - 1 Year
  • REST APIs - 5 Years
  • OpenAI ChatGPT
  • PEFT
  • LoRA
  • spaCy
  • NLTK
  • Llama
  • Data preprocessing - 5 Years
  • BERT
  • Django/Flask - 5 Years
  • Computer Vision - 1 Year
  • TypeScript - 1 Year
  • Data Science - 5 Years
  • NoSQL - 5 Years
  • Microservices - 2 Years
  • AWS - 5 Years
  • Git
  • R
  • PostgreSQL
  • SQL - 5 Years
  • Python - 5 Years
  • NLP - 5 Years
  • Machine Learning - 5 Years
  • Jenkins - 1 Year
  • Kubernetes - 2 Years
  • TensorFlow - 5 Years
  • Snowflake - 2 Years
  • PyTorch - 5 Years
  • Natural Language Processing - 5 Years
  • MLflow - 5 Years
  • Deep Learning - 5 Years
  • Data Structures - 5 Years
  • CI/CD - 5 Years
  • Parquet
  • GitHub Actions
  • LangChain - 2 Years
  • OpenCV
  • MongoDB
  • Scikit-learn - 5 Years
  • Airflow
  • CircleCI
  • Automated Testing - 5 Years
  • Jira
  • Flask - 5 Years
  • Docker - 5 Years
  • C++
  • Helm
  • Bash
  • Spark

Vetted For

12 Skills
  • Data Scientist (Remote) AI Screening
  • 79%
  • Skills assessed: Communication Skills, Jira, Retrieval-Augmented Generation, Computer Vision, Deep Learning, PyTorch, TensorFlow, GitLab, Machine Learning, NLP, NoSQL, Python
  • Score: 71/90

Professional Summary

5 Years
  • Jul, 2022 - Present (2 yr 8 months)

    Data Scientist

    Grainger Canada
  • Jan, 2022 - Present (3 yr 3 months)

    Data Scientist

    Grainger
  • Jun, 2018 - Apr, 2021 (2 yr 10 months)

    Data Analytics Manager

    SBI Funds Management Pvt. Ltd.
  • Jan, 2018 - Dec, 2021 (3 yr 11 months)

    Data Analytics Manager

    SBI Mutual Funds

Applications & Tools Known

  • Solr
  • MLflow
  • Databricks
  • Streamlit
  • GitHub Copilot
  • Locust
  • Splunk
  • Grafana
  • CRM
  • Power BI
  • Google Colab
  • SQL Server
  • Google Analytics
  • ETL
  • Flask
  • Superset

Work History

5 Years

Data Scientist

Grainger Canada
Jul, 2022 - Present (2 yr 8 months)

        Led the development and integration of end-to-end ML microservices, deploying production-ready models that drove Grainger Canada's web search using named entity recognition and text classification algorithms

        Designed an LLM-based system to generate descriptions of website products, fine-tuned on existing high-quality descriptions and incorporating manual review for quality and improvement

        Refactored the category prediction fastText model by migrating it from an on-premises server to the cloud through MLflow and Databricks, integrating version control and ensuring reproducibility, leading to a 37% reduction in model training time and an 8% enhancement in recall

        Deployed an ML API that utilizes a GPT-3 powered model to assess parsed search queries from an established NER model to identify incorrect query labels, resulting in a notable reduction in the time taken

        Mentored a team of two junior data scientists and reduced the technical debt of existing ML pipelines by 80%

        Executed health check scripts and stress testing for REST APIs using Locust, implemented efficient logging with Splunk, and utilized Grafana dashboards to monitor model drift and performance

Data Scientist

Grainger
Jan, 2022 - Present (3 yr 3 months)

Data Analytics Manager

SBI Funds Management Pvt. Ltd.
Jun, 2018 - Apr, 2021 (2 yr 10 months)

          Fine-tuned BERT to classify the sentiment of 20,000 user reviews and integrated the results into the internal CRM tool, improving the support team's response time to negative feedback

          Implemented clustering and segmentation for over 10 million investors and 37,000 brokers using K-Means and DBSCAN to send targeted in-app notifications, increasing app usage by 25%

          Managed a team of two data analysts in writing efficient SQL queries and reducing time spent on building reports

          Created an item-item collaborative filtering model to recommend products to customers in the broker platform for upselling and cross-selling schemes, leading to an 11% increase in digital sales

          Built Power BI visualization reports with data from SQL Server & Google Analytics within an ETL framework

Data Analytics Manager

SBI Mutual Funds
Jan, 2018 - Dec, 2021 (3 yr 11 months)

Achievements

  • Top 2% (16/829) in Kaggle Team competition, WiDS Hackathon 2022 | Forecast Energy Consumption
  • Top 3% (87/3537, Solo Silver) in Kaggle competition, PetFinder.my | CNN | Predict Image Popularity
  • Received DST-INSPIRE scholarship (Top 1% in India ISC Exams) from Govt. of India

Major Projects

1 Project

Head Impact Detection in Sports Videos

May, 2022 - Jul, 2022 (1 month)

        Built a reproducible machine learning pipeline in Google Colab that detects head impact events using a combination of object-detection and activity-recognition models

        Used YOLOv5 and 3D ResNet to reduce manual video analysis time by 60%, achieving a recall of 0.91

Education

  • Master of Data Science

    University of British Columbia (2022)
  • M.Sc. Mathematics

    Indian Institute of Technology, Bombay (2018)
  • B.Sc. Mathematics

    Hindu College, University of Delhi (2016)

Certifications

  • Databricks Generative AI Fundamentals

  • Databricks Certified Machine Learning Associate

  • Databricks Lakehouse Fundamentals

  • Building Transformer-Based NLP Applications (NVIDIA)

  • Deep Learning Specialization (Coursera)

  • SQL Intermediate (HackerRank)

  • Machine Learning in Production (DeepLearning.AI)

Interests

  • Marathon running
  • Books