Raushan Ranjan

Research Scientist with over 2 years of experience specializing in Natural Language Processing (NLP) and Computer Vision. Currently contributing to a leading search engine’s AI project at Turing. Proven expertise in developing and deploying machine learning models, designing end-to-end data pipelines, and managing large-scale datasets. Adept at using a variety of tools and technologies, including Python, Spark, AWS, and multiple ML frameworks. Strong background in customer profiling systems and quality control through Named Entity Recognition. Holds a Master’s in Information and Communication Technology from DAIICT and a Bachelor’s in Computer Science Engineering from UPES. Skilled in leadership, communication, and team management.

  • Role

    Research Scientist

  • Years of Experience

    3 years

Skillsets

  • Machine Learning - 3 Years
  • Predictive Modeling - 3 Years
  • Recommendation Systems - 3 Years
  • Research - 2 Years
  • Landing Pages A/B Testing - 2 Years

Professional Summary

3 Years
  • Mar, 2024 - Present 6 months

    Data Scientist

    Turing
  • Jun, 2022 - Jan, 2024 1 yr 7 months

    Associate Data Scientist

    Ecom Express
  • Jun, 2021 - Jun, 2022 1 yr

    Data Science Research Assistant

    Smart City Lab, DAIICT

Applications & Tools Known

  • Python
  • NetBeans IDE
  • AWS Cloud
  • AWS (Amazon Web Services)
  • MySQL
  • Git
  • Visual Studio Code
  • Apache Spark
  • Amazon DynamoDB

Work History

3 Years

Data Scientist

Turing
Mar, 2024 - Present 6 months
    • Contributing to a leading search engine's cutting-edge AI project.
    • Create and curate data analytics datasets to support various projects.
    • Review and evaluate model performance to ensure accuracy and effectiveness.
    • Identify and analyze areas where models are underperforming or erroneous.
    • Prepare and present detailed reports on model performance and areas for improvement.

Associate Data Scientist

Ecom Express
Jun, 2022 - Jan, 2024 1 yr 7 months
    • Developed NLP and Computer Vision models from scratch, all with nearly 95% accuracy.
    • Designed and implemented an end-to-end customer profiling system that ingests, transforms, and extracts customer information spanning 9 years, with DynamoDB for storage and a dedicated API for efficient retrieval.
    • Contributed to the development and training of a multilingual GPT-2 model using address data.
    • Designed and implemented cross-platform data pipelines for projects, ensuring optimal speed and efficiency in data processing.

    Project 1: Customer and Seller Intelligence Profile Management System [ Spark, DynamoDB, Python, AWS (DynamoDB, Spark, Lambda) ]

    • Engineered a robust system for generating and managing customer and seller intelligence profiles by merging nine years of data.
    • Developed unique identifiers for sellers and customers, facilitating accurate profiling and analysis.
    • Leveraged Apache Spark to process and merge large volumes of data efficiently.
    • Transferred profiles seamlessly to DynamoDB for scalable storage and retrieval.
    • Implemented dynamic profile lookup mechanism at the manifest level, ensuring real-time access to profiles.

    Project 2: Named Entity Recognition for QC of Reverse Pickups [ NLP, Python, MySQL, SpaCy, HuggingFace, PyTorch, NumPy, Pandas, AWS (EC2, Sagemaker), FastAPI ]

    • Explored machine learning algorithms and developed NLP models on over 2,00,000 data points, achieving over 98% accuracy.
    • Reduced quality check errors by 11%.
    • Challenges faced: Lexical Ambiguity and Syntactic Ambiguity of tokens.

    Project 3: Delivery Center Prediction Model

    • Address, city, state, pin code -> GPT 2 -> Delivery center prediction.
    • Trained the GPT2 model from scratch using PAN India address text.
    • Reduced training time by 50% using different optimization strategies.
    • Improved model performance by 40% using ONNX Runtime inference.
    • Automated the training process using AWS SageMaker Pipelines, streamlining daily model training to avoid data drift issues.
    • Deployed the API on AWS Lambda, providing scalable and cost-efficient inference capabilities to handle 10-25 Lakh hits daily as per load.
    • Contributed to a reduction in misroutes from 6% to 2.8%.

    Project 4: Product Categorizer SKU Master

    • Created a text classifier model to classify shipments into 103 categories with over 95% accuracy.
    • Established a Stock Keeping Unit (SKU) Master on AWS DynamoDB to reduce model calls by 93%.

    Project 5: Pin code Prediction Model

    • Address, city, state -> Stacked BiLSTM -> Pin code prediction.
    • Predicted the pin code from the consignee's address, city, and state data.
    • Accurately predicted the pin code to assist the delivery center.

Data Science Research Assistant

Smart City Lab, DAIICT
Jun, 2021 - Jun, 2022 1 yr

    Project: Bone Segmentation on CT Scan Images using Deep Learning [ Image Processing, Deep Learning, U-Net, Keras, Python, NumPy ]

    • Contributed to the creation and study of tissue segmentation from CT scan and MRI images utilizing U-Net architecture and 3D template-based mapping.
    • Investigated several 2D and 3D image segmentation approaches.
    • Annotated CT scan images for bone segmentation masks, playing a key role in the creation of datasets for precise and detailed medical image data analysis.
    • Challenges faced: limited computing resources for 3D images and unavailability of accurate datasets.

Major Projects

3 Projects

Delivery Center Prediction Model

Ecom Express
May, 2023 - Present 1 yr 4 months
    • Address, city, state, pin code -> GPT 2 -> Delivery center prediction.
    • Trained the GPT2 model from scratch using PAN India address text.
    • Reduced training time by 50% using different optimization strategies.
    • Improved model performance by 40% using ONNX Runtime inference.
    • Automated the training process using AWS SageMaker Pipelines, streamlining daily model training to avoid data drift issues.
    • Deployed the API on AWS Lambda, providing scalable and cost-efficient inference capabilities to handle 10-25 Lakh hits daily as per load.
    • Contributed to a reduction in misroutes from 6% to 2.8%.

Named Entity Recognition for QC of Reverse Pickups

Jan, 2023 - Nov, 2023 10 months
    • Reduced quality check errors by 11%.
    • Developed multiple NLP models on over 2,00,000 data points using platforms like spaCy and HuggingFace, with 93% accuracy.

Bone Segmentation from CT Scan Images

Dhirubhai Ambani Institute of Information and Communication Technology
Dec, 2020 - May, 2022 1 yr 5 months

    The project I worked on involved the segmentation of bones from CT scan images to assist medical professionals with total knee replacement surgery. The project utilized various segmentation techniques, including U-Net 3D and 2D segmentation, to segment all bones at the knee joint. Python, TensorFlow, and deep learning were used to develop the segmentation models.

    Here are some details about the project:

    • The goal of the project was to assist medical professionals with total knee replacement surgery by accurately segmenting the bones at the knee joint.
    • To achieve this goal, I researched various segmentation techniques and settled on using U-Net 3D and 2D segmentation models due to their accuracy and efficiency.
    • The segmentation models were developed using Python, TensorFlow, and deep learning techniques.
    • CT scan images were used to train the segmentation models, with a focus on accurate bone segmentation at the knee joint.
    • Once the models were trained, they were used to segment bones in new CT scan images, providing medical professionals with accurate 3D and 2D images of the bones at the knee joint.
    • The segmentation models were evaluated for accuracy, sensitivity, and specificity to ensure their reliability and effectiveness in assisting with total knee replacement surgery.

Education

  • Master of Technology, Information and Communication Technology

    Dhirubhai Ambani Institute of Information and Communication Technology (2022)
  • Bachelor of Engineering, Computer Science Engineering with specialisation in cyber security and forensics

    University of Petroleum and Energy Studies (2018)

Interests

  • Watching Movies
  • Long Rides