Yateendra Tripathi

Results-driven, adaptable Data Scientist specializing in machine learning, with a track record of managing multiple priorities and delivering high-quality solutions. Proficient in NLP and machine learning techniques, with a focus on automating the processing of large volumes of text data. Recognized for using ML to analyze and extract insights from complex documents, such as bonds uploaded to the London Stock Exchange. Adept at developing and deploying unified models for classification and extraction, using SageMaker pipelines and hyperparameter tuning. Skilled in intelligence gathering, statistical analysis, and data mining, with strong attention to detail and written communication. A proactive problem solver with a passion for applying generative AI to information warfare. Served in the Armed Forces as a technical-entry officer, integrating the latest technology into existing frameworks and using ML to automate troop movement and supply management. Holds an M.Tech in Data Science and a B.Tech in Computer Science, along with certifications in machine learning, deep learning, cloud services, and more. Actively engaged in data science projects, including Eurobonds analysis at the London Stock Exchange and machine learning projects on Kaggle. A self-motivated professional with excellent organizational and time management skills.

  • Role

    Data Scientist

  • Years of Experience

    10 years

Skillsets

  • Jenkins - 3 Years
  • Algorithm implementation
  • Automated Testing - 5 Years
  • CI/CD - 3 Years
  • Data Structure - 6 Years
  • MLflow - 5 Years
  • Natural Language Processing - 6 Years
  • Reinforcement Learning - 4 Years
  • SageMaker - 2 Years
  • Snowflake - 2 Years
  • AWS - 3 Years
  • Intelligence gathering
  • SQL - 6 Years
  • Node.js - 2 Years
  • LangChain.js - 3 Years
  • Prompt engineering - 3 Years
  • Vector databases - 3 Years
  • TypeScript - 3 Years
  • Model tracking - 3 Years
  • Software Engineering - 10 Years
  • Docker - 2 Years
  • Java - 4 Years
  • Big Data - 3 Years
  • Linear Regression - R or Python - 5 Years
  • PostgreSQL - 5 Years
  • NLP - 5 Years
  • Deep Learning Frameworks - 2 Years
  • HuggingFace - 2 Years
  • TensorFlow - 4 Years
  • PyTorch - 4 Years
  • LLM - 2 Years
  • LLAMA - 2 Years
  • Machine Learning - 6.11 Years
  • Python Programming - 6.11 Years
  • Computer Vision - 2 Years
  • Deep Learning - 5 Years
  • Keras - 5 Years
  • Natural Language Processing (NLP) - 5 Years
  • Neural network architectures - 5 Years
  • Hadoop - 2 Years
  • Python - 6 Years
  • AI - 5 Years
  • Data Mining
  • Statistical analysis

Professional Summary

10 Years
  • Mar, 2024 - Present 6 months

    Lead Data Scientist

    Affine Analytics
  • Mar, 2023 - Mar, 2024 1 yr

    Data Scientist

    London Stock Exchange Group
  • Apr, 2017 - Mar, 2023 5 yr 11 months

    Captain/Technical Officer

    Indian Army
  • Mar, 2016 - Dec, 2016 9 months

    Application Developer

    Oracle India PVT LTD
  • Jan, 2014 - Feb, 2016 2 yr 1 month

    System Engineer

    TCS

Applications & Tools Known

  • AWS Cloud
  • Amazon SageMaker
  • Azure
  • Keras
  • PyTorch
  • Hugging Face
  • TensorFlow
  • Neural network architectures
  • Python
  • LLM
  • LLAMA
  • Generative AI
  • Machine Learning
  • ETL pipelines
  • ChatGPT
  • Snowflake
  • Spark

Work History

10 Years

Lead Data Scientist

Affine Analytics
Mar, 2024 - Present 6 months

    Spearheading a GenAI-centric project for a global MNC client, focusing on streamlining claims and critical operations through cutting-edge generative AI solutions.

    Roles & Responsibilities:

    • Leading the data science team - Lead the team of data scientists, from task assignment to code review.
    • Deliverables - Timely and smooth delivery of all deliverables from a data science standpoint.
    • POC and solution design - Responsible for building proofs of concept and designing the solution.
    • Review and analysis - Review of timelines and approaches, including testing (unit and functional), model evaluation, and model upgrades.

Data Scientist

London Stock Exchange Group
Mar, 2023 - Mar, 2024 1 yr

    As an NLP data scientist, my role was to automate the processing of large volumes of text data arriving in various forms. Companies registered with the London Stock Exchange upload bonds as PDFs, which LSEG then analyzes for insights. The complexity of these bonds varies from document to document. The use case is to automate retrieval of the required information using ML techniques; to date, this work has been done manually by SMEs. (Note: there are more than 400 fields.)

    Roles & Responsibilities:

    • Sentence splitter - Splitting the PDF into sentences based on coordinates.
    • Sentence classification - Fields that require class assignment based on text.
    • SageMaker pipeline for classification - Built a unified model to handle all classification fields and deployed it in the SageMaker pipeline.
    • Testing (unit and functional), model evaluation, and model upgrades.
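
A minimal sketch of coordinate-based sentence splitting, assuming word boxes shaped like the tuples returned by PyMuPDF's `page.get_text("words")`: `(x0, y0, x1, y1, word, block_no, line_no, word_no)`. The function names, the line tolerance `y_tol`, and the end-of-sentence rule are illustrative assumptions, not the production splitter.

```python
from typing import List, Tuple

# One word box: (x0, y0, x1, y1, word, block_no, line_no, word_no)
Word = Tuple[float, float, float, float, str, int, int, int]


def group_into_lines(words: List[Word], y_tol: float = 3.0) -> List[List[Word]]:
    """Group word boxes into visual lines using their top coordinate (y0)."""
    lines: List[List[Word]] = []
    for w in sorted(words, key=lambda w: (w[1], w[0])):
        # Same line if y0 is within y_tol of the first word on the current line.
        if lines and abs(w[1] - lines[-1][0][1]) <= y_tol:
            lines[-1].append(w)
        else:
            lines.append([w])
    # Within each line, read left to right.
    return [sorted(line, key=lambda w: w[0]) for line in lines]


def split_sentences(words: List[Word], y_tol: float = 3.0) -> List[str]:
    """Join words in reading order and cut after '.', '?' or '!' (naive rule)."""
    sentences: List[str] = []
    current: List[str] = []
    for line in group_into_lines(words, y_tol):
        for w in line:
            current.append(w[4])
            if w[4].endswith((".", "?", "!")):
                sentences.append(" ".join(current))
                current = []
    if current:  # trailing text without closing punctuation
        sentences.append(" ".join(current))
    return sentences


# Hypothetical word boxes, deliberately out of reading order.
sample_words: List[Word] = [
    (60.0, 115.0, 95.0, 125.0, "2030.", 0, 1, 1),
    (10.0, 100.0, 30.0, 110.0, "The", 0, 0, 0),
    (10.0, 130.0, 25.0, 140.0, "is", 0, 2, 0),
    (35.0, 101.0, 60.0, 111.0, "bond", 0, 0, 1),
    (10.0, 115.0, 25.0, 125.0, "in", 0, 1, 0),
    (65.0, 100.0, 110.0, 110.0, "matures", 0, 0, 2),
    (100.0, 116.0, 140.0, 126.0, "Coupon", 0, 1, 2),
    (30.0, 130.0, 65.0, 140.0, "fixed.", 0, 2, 1),
]

sentences = split_sentences(sample_words)
```

The punctuation rule is deliberately simple; abbreviations and decimal amounts in real bond documents would need extra handling.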

Captain/Technical Officer

Indian Army
Apr, 2017 - Mar, 2023 5 yr 11 months
    Joined the Armed Forces through technical entry, responsible for adopting the latest technology and integrating it into the existing framework.

    Roles & Responsibilities:

    • Analysis of satellite data using GIS.
    • Analysis of radar data (battlefield radar and WLR) using ML techniques to identify actionable intelligence.
    • Using machine learning to automate troop movement and supply management.
    • Research on 3D printing to replenish equipment and tools in high-altitude areas.

Application Developer

Oracle
Mar, 2016 - Dec, 2016 9 months
    • Participated in design and planning exercises for future software rollouts.
    • Worked closely with other team members in such tasks as troubleshooting and debugging.
    • Resolved system test and validation problems to provide normal program functioning.
    • Participated with clients in discussion meetings.
    • Designed and developed application scripts for test scenarios.
    • Updated technical documentation, product specifications, and technical training materials.

System Engineer

TCS
Jan, 2014 - Feb, 2016 2 yr 1 month
    Campus placed at TCS and retained as a trainer after completing training. Later joined the framework team, which developed frameworks for various solutions; the framework was used in the Cardiff City Council project and the Gujarat government project.

    Roles & Responsibilities:

    • Developed core framework features such as JDBC connection pools and methods to decouple server and database dependencies.
    • Tested and reviewed code in projects that used the framework.
    • Resolved issues and escalated problems with knowledgeable support and quality service.
    • Proposed technically feasible solutions for new system designs and suggested options for improving the performance of technical components.
    • Developed IT policies to comply with applicable laws.
    • Managed key information technology and compliance programs for proactive risk management.

Achievements

  • Received the Gold Award at LSEG for contributions to the Eurobonds project
  • Spearheading a GenAI-centric project for a global MNC client
  • Automating and processing the vast amount of text data using NLP techniques

Major Projects

6 Projects

Eurobonds

LSEG
Mar, 2023 - Present 1 yr 6 months
    Many PDF documents describing bonds are uploaded to the stock exchange. The complexity of these bonds varies from document to document. The use case is to automate retrieval of the required information using ML techniques; to date, this work has been done manually by SMEs. (Note: there are more than 400 fields.)

    The information retrieval is of two types:

    • Entity extraction - fields such as dates and amounts that can be extracted directly from the document by a custom NER model.
    • Sentence classification - fields that require class assignment based on text.

    My Responsibilities and contribution:

    • PDF processing: The PDFs must be split into sentences so that the sentence classifier can assign each sentence to the desired categories. I built the splitter using the PyMuPDF library.
    • SageMaker pipelines for classification and extraction: I owned the classification pipeline and chose SetFit, a few-shot deep learning model. Although the model produced very good results, the real challenge was the sheer number of fields in scope (>400). I therefore grouped several fields together and used SetFit's capabilities (differentiable head, head freezing) to handle multiple fields per model, with a logistic regression head per field producing the final classification.
    • Hyperparameter tuning: The dynamic nature of the use case called for a flexible approach to sentence classification, so I used the Optuna backend through the hyperparameter search method provided by the SetFit library.
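
The idea of one logistic regression head per field over a shared sentence representation can be illustrated without SetFit itself. The sketch below is a plain-Python stand-in under stated assumptions: the 2-D "embeddings", the field names `is_callable` and `is_secured`, and the training settings are all hypothetical, and the gradient-descent trainer replaces whatever head training the real pipeline used.

```python
import math
from typing import Dict, List, Sequence, Tuple

Head = Tuple[List[float], float]  # (weights, bias) of one binary head


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(X: Sequence[Sequence[float]], y: Sequence[int],
               lr: float = 0.5, epochs: int = 200) -> Head:
    """Fit one binary logistic-regression head with batch gradient descent."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(heads: Dict[str, Head], x: Sequence[float]) -> Dict[str, int]:
    """Run every per-field head on the same shared embedding."""
    out: Dict[str, int] = {}
    for field, (w, b) in heads.items():
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        out[field] = int(p >= 0.5)
    return out


# Toy shared embeddings (hypothetical) and one label vector per field.
embeddings = [[2.0, 2.0], [2.0, -2.0], [-2.0, 2.0], [-2.0, -2.0]]
labels = {
    "is_callable": [1, 1, 0, 0],  # hypothetical field, depends on dimension 0
    "is_secured": [1, 0, 1, 0],   # hypothetical field, depends on dimension 1
}

# The embedding is computed once; each field only adds a small head.
heads = {field: train_head(embeddings, y) for field, y in labels.items()}
result = predict(heads, [3.0, -1.0])
```

Sharing the representation keeps the cost of adding a field low, since only a small head is trained per field.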

GenAI-centric project

Mar, 2024 - Present 6 months
    Project focusing on streamlining claims and critical operations for a global MNC client through generative AI solutions.

Genetic Algorithm - Human VS AI killer Sudoku

BITS Pilani (M.Tech) classroom project
Jan, 2023 - Feb, 2023 1 month
    • The project was to build an AI agent that uses a genetic algorithm to play killer sudoku against a human. A random n×n grid is initialized and one killer number is selected; the human and the AI each get 3 chances to select two adjacent cells whose total equals the killer number.
    • I built the agent with an end-to-end genetic algorithm implementation.
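
A minimal sketch of how a genetic algorithm could search for such a move, assuming a move is encoded as `(row, col, direction)` and fitness is the negative distance between the pair's sum and the killer number. The encoding, operators, and parameters are illustrative assumptions, not the project's actual implementation.

```python
import random
from typing import List, Tuple

Move = Tuple[int, int, int]  # (row, col, direction): 0 = right neighbour, 1 = down neighbour
DIRS = ((0, 1), (1, 0))


def cells(move: Move) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return the two adjacent cells selected by a move."""
    r, c, d = move
    dr, dc = DIRS[d]
    return (r, c), (r + dr, c + dc)


def repair(move: Move, n: int) -> Move:
    """Clamp the first cell so that its neighbour stays inside the n x n grid."""
    r, c, d = move
    dr, dc = DIRS[d]
    return (min(r, n - 1 - dr), min(c, n - 1 - dc), d)


def fitness(move: Move, grid: List[List[int]], killer: int) -> int:
    """0 is a perfect move; more negative means further from the killer number."""
    (r1, c1), (r2, c2) = cells(move)
    return -abs(grid[r1][c1] + grid[r2][c2] - killer)


def evolve(grid: List[List[int]], killer: int, pop_size: int = 20,
           generations: int = 30, mutation_rate: float = 0.2, seed: int = 0) -> Move:
    rng = random.Random(seed)
    n = len(grid)

    def score(m: Move) -> int:
        return fitness(m, grid, killer)

    pop = [repair((rng.randrange(n), rng.randrange(n), rng.randrange(2)), n)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        if score(pop[0]) == 0:
            break
        nxt = pop[:2]  # elitism: carry the two best moves forward
        while len(nxt) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            # Crossover: row from a, column from b, direction from either.
            genes = [a[0], b[1], rng.choice((a[2], b[2]))]
            if rng.random() < mutation_rate:
                i = rng.randrange(3)  # mutate one gene
                genes[i] = rng.randrange(2 if i == 2 else n)
            nxt.append(repair((genes[0], genes[1], genes[2]), n))
        pop = nxt
    return max(pop, key=score)


grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3], [4, 5, 6, 7]]
best = evolve(grid, killer=9)
```

On a grid this small an exhaustive scan would be simpler; the genetic search only becomes interesting as n grows.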

Titanic - Machine Learning from Disaster

Kaggle
Nov, 2022 - Dec, 2022 1 month
    • Goal: build a predictive model, using passenger data, that answers the question "what sorts of people were more likely to survive?"
    • Used various ensemble and deep learning methods in this challenge to reach rank 1100 out of 14000 teams.
    • The project covers the basics of feature engineering and various machine learning algorithms with hyperparameter tuning.

NLP with disaster tweets

Kaggle
Nov, 2021 - Dec, 2021 1 month
    • Twitter has become an important communication channel in times of emergency.
    • The ubiquity of smartphones enables people to announce an emergency they're observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter.
    • But it's not always clear whether a person's words are actually announcing a disaster. The task is to build a machine learning model that predicts which Tweets are about real disasters and which ones aren't.

GovFramework

TCS
Apr, 2015 - Jan, 2016 9 months
    • The TCS government project group built a product named GovFramework, which can be leveraged to build any project, with dependencies added at run time.
    • As part of the framework team, I devised a method that uses property files to change the runtime dependencies.
    • I was awarded 11000 TCS gems for this contribution.
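
The property-file approach can be sketched as follows. This is an illustration in Python rather than the framework's own language, and the property keys, the `key = module.attribute` format, and the function names are hypothetical; GovFramework's actual file layout is not described here.

```python
import importlib
from typing import Any, Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines, skipping blanks and comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_dependency(props: Dict[str, str], key: str) -> Any:
    """Resolve 'package.module.attribute' to the object it names, at run time."""
    module_name, _, attr = props[key].rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Hypothetical property file: swapping an implementation means editing a
# value here, with no change to the calling code.
EXAMPLE_PROPERTIES = """
# which implementations to wire in
serializer = json.dumps
id.generator = uuid.uuid4
"""

props = parse_properties(EXAMPLE_PROPERTIES)
serialize = load_dependency(props, "serializer")
payload = serialize({"status": "ok"})
```

Because the caller only knows the property key, a different implementation can be selected per deployment by shipping a different property file.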

Education

  • M.Tech: Data Science

    BITS Pilani (2024)
  • B.Tech: Computer Science

    IERT, Allahabad (2013)

Certifications

  • Deep Learning

    Kaggle (Oct, 2010)
  • Python for data science

    Udemy (Sep, 2009)
  • Java

  • Feature engineering

  • Intermediate machine learning

  • AZ-900

  • SQL

  • Intro to GIS