Vetted Talent

Kundan Singh

Experienced Sr. Data Scientist and MLOps professional with 8+ years in the industry, specializing in Generative AI transformations, TensorFlow, and PyTorch. Proven track record of delivering impactful AI solutions and pioneering innovations in chatbot interactions, virtual assistants, and IOT applications. Skilled in open-source model integration, 3D innovation, and end-to-end pipeline development. Postgraduate in AI/ML from NIT Warangal, with a strong background in software development using Python.

Role
Senior Generative AI Engineer
Years of Experience
10.67 years

Skillsets

data-science - 5 Years
Python - 8 Years
Python - 5 Years
MySQL - 8 Years
Azure - 4 Years
AWS - 3 Years
TensorFlow - 3.5 Years
PyTorch - 3.5 Years
HuggingFace - 3 Years
LLM - 1 Year
LLAMA - 1.5 Years

Vetted For

18 Skills

Roles & Skills
Results
Details

Senior Generative AI Engineer - AI Screening
62%

Skills assessed: BERT, Collaboration, Data Engineering, Excellent Communication, GNN, GPT-2, graphs, Large Language Models, Natural Language Processing, Sagemaker, Deep Learning, neural network architectures, PyTorch, TensorFlow, machine_learning, Problem Solving Attitude, Python, Vertex AI
Score: 62/100

Professional Summary

10.67 Years

Feb, 2024 - Present (2 yr 3 months)
Senior Data Scientist
HP
Jan, 2022 - Jan, 2024 (2 yr)
Senior Data Scientist
R Systems
Mar, 2020 - Jan, 2022 (1 yr 10 months)
Data Scientist
Big Oh Tech
May, 2015 - Dec, 2018 (3 yr 7 months)
Senior PHP Developer and Python programmer
PhpYouth Software Solutions Pvt. Ltd
Jan, 2019 - Feb, 2020 (1 yr 1 month)
Jr. Data Scientist & Python Developer
Clavax

Applications & Tools Known

Tensorflow
PyTorch
LLM
LLAMA
NLP
Generative AI
Python
Data Engineering
GCP
Azure
AWS (Amazon Web Services)
Neural Network

Work History

10.67 Years

Senior Data Scientist

Feb, 2024 - Present (2 yr 3 months)

Senior Data Scientist

R Systems

Jan, 2022 - Jan, 2024 (2 yr)

Talking Chatbot Text-to-Video Transformation: Innovations in GEN AI

Developed a Generative AI application, hosted on an Azure VM, that enhances chatbot interactions by converting text responses into videos.
Leveraged several open-source models, including adaptations from Hugging Face, combined and fine-tuned for the task.
Used a multi-step approach, starting with converting text responses into high-quality audio using BARK, an open-source audio model.
Synchronized audio with facial expressions by extracting facial landmarks and applying dynamic lip syncing, eye blinking, etc.
Integrated the GFPGAN model as a face enhancer to improve the visual quality and expressions of the chatbot speaker.
Spearheading the development of a comprehensive Processing Explanation video that combines background visuals, dynamic speaker presence with PyTorch3D, and moving subtitles.
Using ChatGPT APIs for text summarization to generate concise content, complemented by Stable Diffusion 2.1 for creating background imagery.

Other Generative AI Research and Developments

Proficient in utilizing Llama 2 and various other LLMs, and in incorporating ChatGPT APIs.
Integrated MiniGPT-4 on a cloud platform, enabling users to ask questions about input images.
Used the diffusers and transformers libraries to build an image-to-image workflow with StableDiffusionImg2ImgPipeline, which updates images based on prompts.

Created multiple Virtual Assistants (native voice app available on both Android and iOS)

Used Dialogflow to receive the user's voice input and to convert output into speech for the user.
Used separate Rasa NLU and Action servers for each VA, with Flask as the endpoint for all virtual assistant communication.
Used Entity Extractors, Lookup tables, and Synonyms to improve results. Docker hosted the request endpoint.
Deployed on Google Cloud via Kubernetes clusters, with automated testing through GitHub pipelines.
Integrated into SkullCandy premium earphones, accessible via the Skull-IQ app as the virtual assistant named iHeart.

Created an end-to-end pipeline in Dataiku for production deployment with multi-user collaboration

Dataiku currently provides no direct way for multiple engineers to work simultaneously on different tasks and test them properly.
Created a pipeline that compares an engineer's trained model with the production model and emails a report to the admin.
If the admin is satisfied with the results, the admin can approve merging the changes and updating the model.
Improved development productivity by more than 20% with the same workforce.
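
The comparison-and-approval flow described above can be sketched in plain Python. This is only an illustration under stated assumptions: the names `ModelReport`, `build_admin_report`, and `promote` are hypothetical, accuracy is assumed as the comparison metric, and nothing here uses the Dataiku API or sends real email.

```python
from dataclasses import dataclass


@dataclass
class ModelReport:
    """Hypothetical summary of one model, scored on a shared held-out set."""
    name: str
    accuracy: float


def build_admin_report(candidate: ModelReport, production: ModelReport) -> str:
    """Build the text an admin would receive, comparing candidate and production."""
    delta = candidate.accuracy - production.accuracy
    verdict = "candidate is better" if delta > 0 else "candidate is not better"
    return (
        f"Candidate {candidate.name}: {candidate.accuracy:.3f}\n"
        f"Production {production.name}: {production.accuracy:.3f}\n"
        f"Difference: {delta:+.3f} ({verdict})"
    )


def promote(candidate: ModelReport, production: ModelReport,
            admin_approved: bool) -> ModelReport:
    """Return the model that should serve production after the admin's decision."""
    return candidate if admin_approved else production
```

The key design point is that the report never promotes a model by itself; only the explicit approval flag does.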

IoT-enabled smart refrigerator connected app

The model was trained with Azure Machine Learning on data produced by IoT devices and sensors.
The model predicts whether the device status is normal, critical, or another state.
Used Azure App Service to host Flask APIs serving live IoT data with status predictions and historical refrigerator sensor data.

Custom MLOps platform

Used ClearML as the base of our platform and deployed it on our AWS cloud platform as a service.
Deployed GitLab on the same server and integrated it with ClearML.
Created a GitLab CI/CD pipeline for model training, building, deployment, and testing in a new container, which sends a report to the admin via email.

Other work

Used various Hugging Face model pipelines and hosted different models on a cloud VM.
Explored a variety of MLOps and related tools, such as MLflow, Kubeflow, CML, DVC, etc.
Worked with various AutoML tools, including H2O AutoML, Auto-Keras, Auto-Sklearn, and AWS SageMaker Studio.
Experience with auto-EDA and data preprocessing tools such as Pandas Profiling, Sweetviz, AutoViz, DataPrep, etc.

Data Scientist

Big Oh Tech

Mar, 2020 - Jan, 2022 (1 yr 10 months)

Categorize products on the basis of product image and title:

Built a model combining a CNN and an RNN to categorize products.
Used transfer learning with VGG16 to extract features from the image, then flattened them.
Extracted context from the text input through an RNN with bidirectional LSTM layers.
Combined both models using a Concatenate layer to produce the categorical output.
The model was intended for a government e-commerce marketplace, to prevent fraud from listing a product in the wrong category.

Product part detection (Object detection):

Sellers previously provided incorrect information about products, describing components that did not exist in the final item.
Used YOLOv5 to build a custom computer vision model that detects an object's components from an image.
Labeled the images using the LabelImg library and trained on them with YOLOv5. Identified products with incorrect descriptions.

Jr. Data Scientist & Python Developer

Clavax

Jan, 2019 - Feb, 2020 (1 yr 1 month)

Created REST APIs in a microservices-based project using Flask, and worked with virtual machines and Docker containers.

Chatbot:

Using libraries such as spaCy, TextBlob, and NLTK, implemented several data pre-processing operations such as stemming, lemmatization, and stopword removal. In addition, conducted sentiment analysis.

To get the best-fit output, iterated on the input text data across Rasa's Intents, Entities, Actions, and Stories, then integrated it with Telegram.

Text Classification

Created a system that automatically fetches new emails through IMAP and classifies them into useful and non-useful categories.
Built a classifier using NLTK for pre-processing, tokenization, and embedding. Employed a multi-layer RNN model trained with the Adam optimizer.

Senior PHP Developer and Python programmer

PhpYouth Software Solutions Pvt. Ltd

May, 2015 - Dec, 2018 (3 yr 7 months)

Education

Masters in AI/ML
NIT Warangal (2021)
B.Tech in Computer Science
MDU Rohtak (2015)

AI-interview Questions & Answers

I have a total of 8.5 years of experience, with more than 5 years of relevant experience in data science. In June 2015, when I completed my B.Tech, I started working in Python. Later on, I completed several data science certifications from Coursera and moved from Python development to data science. I have also done my master's in artificial intelligence and machine learning from NIT Warangal. Currently, I am working with R Systems International Private Limited. My formal designation here is senior data scientist, but sometimes I work as a team lead as well.
Right now I am leading a project related to generative AI. Our end goal is an AI chatbot that interacts with end users in a video format, the same way two people talk on a video call. It has two parts: one is the large language model (NLP) part, and the other converts text into video. I lead the complete project, so I handle both the NLP side and the computer vision side. On the NLP side, we are using Llama 2 and fine-tuning it on different documentation for different scenarios.
So how exactly does it work? First, the end user speaks a question. The audio is converted into text, and that text is given to the LLM as input. The LLM's output is then passed to our AI model, which converts that text into video, and that video is the output shown to the end user. We can use any person's face: given some photos of that person, we convert them into an avatar. For converting text into audio, we used the BARK transformer model. We also used WaveNet, GANs, OpenCV, and other tools such as FFmpeg, which is easy to use there.
Apart from that, I have good experience with traditional machine learning algorithms, such as classification and regression, and with object detection and detecting anomalies in images and other data. Because I have worked in service-based organizations since the start of my journey as a data scientist, I have a wide range of experience.

Training a large language model on our own scenario or our own data. There are different techniques we can use, and they depend on the kind of training we are doing. We can use parameter-efficient fine-tuning (PEFT), or we can do complete training of all the weights of the large language model. What I normally prefer is that, instead of fine-tuning all the existing weights of the large language model, we train a small set of new weights. This technique is called PEFT. We can use LoRA for that, and we can use QLoRA, which is a newer technique. We use AutoTokenizer when we provide our own data to train the large language model. For example, we are using the Llama 2 model and fine-tuning it on our questions and answers. We have to prepare our data in the specific format the model expects and then provide that scenario-specific data for training. I think those are the things we can implement.
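
For reference, the idea of training a small set of new weights while the pre-trained weights stay frozen is what LoRA does. The formulation below is the standard LoRA notation, not something stated in the interview: W_0 is the frozen pre-trained weight matrix, B and A are the new low-rank matrices, r is the rank, and alpha is a scaling constant.

```latex
W = W_0 + \Delta W, \qquad \Delta W = \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)

\text{trainable parameters per adapted matrix: } r\,(d + k)
\quad \text{instead of} \quad d\,k
```

As an illustrative calculation, with d = k = 4096 and r = 8 the adapter has 8 x (4096 + 4096) = 65,536 trainable parameters against 16,777,216 in the full matrix, about 0.39%. QLoRA applies the same adapters on top of a base model whose frozen weights are stored in 4-bit precision.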

In detail, a method for optimizing the performance of a deep learning model without overfitting. Overfitting is a condition we can identify as follows. Suppose we have trained our model and it performs well on training data, say 93% accuracy, but on testing data it reaches only, say, 85%. This condition is called overfitting: the model performs very well on the training data but not on the testing data. To tackle it, there are different methods we can implement. One is to ensure our neural network is not too complicated; it should not be overly simple, but its complexity should not be on the higher side. We should use dropout layers after every few layers so that we do not run into the overfitting problem later. Additionally, we can try a leaky ReLU activation function and optimizers like Adam. Furthermore, we should perform data augmentation before sending our data for training. And we should make sure the data we test on is not completely different from the data the model was trained on; the training data and testing data should be related to each other. These are the kinds of methods we can implement.
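
To make the dropout step concrete, here is a minimal framework-free sketch of inverted dropout, the variant most deep learning libraries use. The function name `dropout` and its parameters are illustrative assumptions, not code from the interview.

```python
import random


def dropout(activations, rate, training, rng=None):
    """Inverted dropout: zero each activation with probability `rate` while
    training, and scale the survivors so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Because survivors are scaled by 1 / (1 - rate) during training, no rescaling is needed at inference time.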

During the development of a deep learning model, you have to implement a specific state-of-the-art architecture. From what I can see in the forward method, the potential issue in the given implementation of the transformer block lies in the attention mechanism. In a typical transformer block, the attention mechanism involves multi-head self-attention followed by feed-forward layers. However, the provided code lacks details on the actual implementation of the attention mechanism, specifically multi-head attention with the query, key, and value matrices, which is essential for the architecture to function properly.

Debugging a model that performs well on training data but poorly on unseen data is a common problem. The problem statement is that our model performs well on training data but poorly on unseen data, which means it is overfitting. We can implement different methods for this. Suppose it is a deep learning model. We can implement dropout and data augmentation. We can change the number of neurons per layer and reduce the number of layers. Apart from that, we can change the activation functions, for example to leaky ReLU, and change the optimizer. We can use early stopping so that the model is not trained further after a certain point. We can also implement regularization methods and normalization layers. For regression problems, if we see overfitting, we can use regularization methods such as Ridge or Lasso. Apart from that, we can tune our model parameters. Suppose the model uses decision trees: we can increase the number of trees and adjust the tree depth to address the problem with unseen data.
What I explained so far is how to resolve the problem; debugging also means identifying it. We can check the model's accuracy on our training data and on the data we are currently getting for predictions. For the testing data, we can check its score and compare it with the training score. If the difference between the scores is very large, we can say it is this problem.
Sometimes, after we have deployed the model, drift also occurs. It can be data drift or model drift. Using MLOps, we can detect this and get notifications. For example, whenever precision on test data falls below a certain threshold, we get a notification. We can implement that kind of thing as well.
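
The two checks described above, comparing train and test scores and alerting when precision falls below a threshold, can be sketched as follows. The function names and the default gap of 0.05 are assumptions chosen for illustration; a real threshold would come from the project.

```python
def looks_overfit(train_score, test_score, max_gap=0.05):
    """Flag overfitting when the train score exceeds the test score by more
    than `max_gap` (an assumed, project-specific tolerance)."""
    return (train_score - test_score) > max_gap


def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP); defined as 0.0 when nothing was predicted positive."""
    predicted_positive = true_positives + false_positives
    return true_positives / predicted_positive if predicted_positive else 0.0


def drift_alert(true_positives, false_positives, threshold):
    """Return True when live precision has dropped below the alert threshold."""
    return precision(true_positives, false_positives) < threshold
```

With the numbers from the earlier answer, `looks_overfit(0.93, 0.85)` flags the model because the 8-point gap exceeds the assumed 5-point tolerance.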

How do we approach selecting a loss function when designing a new deep learning model? The loss function depends on the kind of problem we are solving. Different loss functions can be implemented and checked. For a regression problem, the usual loss functions are MSE and RMSE. For a classification problem, cross-entropy or binary cross-entropy can be used. We can also create our own loss function. For example, when we build diffusion models, we sometimes create our own customized loss function.

In the process of training a generative model with TensorFlow, you are defining a loss function. Here is a snapshot of code: a custom loss function for a model that generates text. Why might this loss function be inappropriate, and what kind of loss function should be used for this task? As I said in my previous answer, we can create our own custom loss functions. The loss function provided calculates the mean difference between the true and predicted values. While this may work for some tasks, it is not suitable for generating text. Text-related tasks usually require a loss function that considers the probabilistic nature of language. A more suitable loss function for text generation is categorical cross-entropy, which measures the difference between the predicted probability distribution over words and the actual distribution. So for text-related tasks, I would say this is not a good loss function to use. Categorical cross-entropy helps the model learn to generate text by penalizing deviations from the actual word distribution.
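
The snippet from the question is not shown in the transcript, so the sketch below assumes it computed a mean signed difference between the one-hot target and the predicted probabilities. Under that assumption, plain Python is enough to show why it fails for text while categorical cross-entropy works. Both function names are illustrative.

```python
import math


def mean_difference(y_true, y_pred):
    """Assumed form of the questioned loss: mean of (true - predicted)."""
    return sum(t - p for t, p in zip(y_true, y_pred)) / len(y_true)


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and a predicted distribution.
    `eps` guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Toy vocabulary of three words; the correct next word is index 1.
target = [0.0, 1.0, 0.0]
good_prediction = [0.1, 0.8, 0.1]
bad_prediction = [0.8, 0.1, 0.1]
```

Because both the target and any predicted distribution sum to 1, the mean signed difference is zero (up to rounding) for the good and the bad prediction alike, so it gives no training signal. Cross-entropy is small for the good prediction and large for the bad one.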

Imagine you are using the Hugging Face Transformers library, and you encounter the following pseudocode. The code calls from_pretrained on a transformer model object, passing the model name. Upon running this code, you receive an error that the object has no such attribute. What might be the reason for this? The main reason I can identify just by looking at this is that the transformer model object has no attribute called from_pretrained. To resolve this issue, you should ensure you are using the correct class from Hugging Face. The Hugging Face Transformers library provides specific model classes, and you should use one of those, such as the BART model. So instead of the generic transformer model, we use the BART model class and call from_pretrained on it. That is the main reason for the error: the code just says transformer model, but you should specify which kind of transformer model you are actually using, like the BART model or any other.
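
A minimal sketch of the fix described in this answer, assuming the `transformers` package is installed and the public `facebook/bart-base` checkpoint is the intended model (the checkpoint choice and the helper name `load_bart` are assumptions; the original pseudocode is not shown in the transcript).

```python
def load_bart(model_name: str = "facebook/bart-base"):
    """Load a pre-trained BART encoder-decoder through its concrete class.

    `from_pretrained` is a classmethod on concrete model classes such as
    BartModel (and on the Auto* classes), so it must be called on one of those.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import BartModel

    return BartModel.from_pretrained(model_name)
```

When the architecture should be inferred from the checkpoint instead of hard-coded, `AutoModel.from_pretrained(model_name)` from the same library is the usual alternative.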

How do you handle the challenge of integrating state-of-the-art generative models into legacy systems? There are different challenges we face with generative AI models. Take a diffusion model we recently created, which generates images from text. The first challenge is that we have a limited amount of GPU memory available. When we run inference, the complete model is first loaded and kept in GPU memory. Suppose we have deployed our model behind a Flask service and we get 4 to 5 requests at a time. Running inference on 4 to 5 requests at once on a single GPU will surely give us an out-of-memory error. Suppose we are using a 16 GB GPU and the diffusion model already takes more than 12 GB of GPU memory just to hold the trained model. At inference time it may go from 12 GB to about 15 GB for a single request, which is its peak. If we send multiple requests, it will run out of memory. So we should queue our requests.
Apart from that, implementing MLOps for this is also a hard problem. We need sufficient machines, and we need to be able to upgrade our GPUs and create replicas across GPUs easily. Many times we also face library-related challenges. Suppose we are using PyTorch 2.0 with CUDA 11.8 and the model is giving good results. Then there are some changes or updates to Python packages: after a system update or installing a few libraries, the model starts producing errors, and we later find that a library has been updated and is no longer compatible with CUDA. These are the kinds of challenges we face. Let's go to the next question.
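
One way to queue requests so that only one inference runs on the GPU at a time is a single worker thread fed by a queue. The sketch below uses only the standard library; the class name `SingleWorkerQueue` and the `infer_fn` callable are hypothetical stand-ins for the real model call, and a web handler (for example a Flask view) would call `submit`.

```python
import queue
import threading


class SingleWorkerQueue:
    """Serialize inference calls: many callers may submit, one worker executes."""

    def __init__(self, infer_fn):
        self._infer_fn = infer_fn          # stand-in for the real GPU inference call
        self._jobs = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # The only place infer_fn is called, so peak GPU memory is that of one request.
        while True:
            payload, done, result = self._jobs.get()
            try:
                result["value"] = self._infer_fn(payload)
            except Exception as exc:       # hand the error back to the caller
                result["error"] = exc
            finally:
                done.set()
                self._jobs.task_done()

    def submit(self, payload, timeout=None):
        """Block until this request has been processed, then return its result."""
        done, result = threading.Event(), {}
        self._jobs.put((payload, done, result))
        if not done.wait(timeout):
            raise TimeoutError("inference did not finish in time")
        if "error" in result:
            raise result["error"]
        return result["value"]
```

This trades latency for stability: concurrent requests wait their turn instead of triggering an out-of-memory error. Scaling beyond one request at a time then means adding GPU replicas, each with its own worker.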

To identify and handle data anomalies in real time as new data is fed into the system for training your model. We can design our pipeline to handle various data formats, sources, and volumes. There should be real-time monitoring, meaning monitoring mechanisms that assess incoming data for anomalies as it enters the system. Before any data enters the system, we check whether it contains anomalies, and we should implement anomaly detection on the data. Apart from that, we should implement rules with thresholds, and a data handling and correction step before sending the data into our system and our model for training. After that, we should implement a feedback loop for model improvement. This feedback can be manual feedback from QA or other people, or rule-based feedback loops for different scenarios. We should also implement documentation and logging for our data and systems: comprehensive logs and documentation of detected anomalies, how they were handled, and how they affect model training. Apart from that, we should implement compliance and governance; there should be a data governance policy.
Suppose after some time, say one month, we see model drift or data drift. Then we decide to retrain our model on the previous data plus the new data that the previous model was not trained on. In that scenario, we also apply anomaly handling to this data. We should implement data clipping, and sometimes we should remove those data points. In real-life scenarios, suppose we are creating a classification model with two classes and imbalanced data: 10% of the data for one class and 90% for the other. Suppose a few data points in the minority class have anomalous values. Because we already have little data there, we should not delete those data points. But if a few data points in the majority class have anomalies, we can delete them. Normally, though, we should implement clipping with respect to what it
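
The class-aware handling described above (clip anomalies in the minority class, drop them in the majority class) can be sketched with the standard library. The 1.5 x IQR rule used to define an anomaly is an assumption chosen for illustration; the interview does not name a detection rule, and the function names are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) bounds using the common k * IQR rule (assumed here)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_value(value, low, high):
    """Clamp a single value into the closed interval [low, high]."""
    return min(max(value, low), high)


def handle_anomalies(rows, minority_label, k=1.5):
    """rows is a list of (value, label) pairs.

    In-range rows pass through. Out-of-range rows are clipped when they belong
    to the minority class (data there is scarce) and dropped otherwise.
    """
    low, high = iqr_bounds([value for value, _ in rows], k)
    cleaned = []
    for value, label in rows:
        if low <= value <= high:
            cleaned.append((value, label))
        elif label == minority_label:
            cleaned.append((clip_value(value, low, high), label))
        # Majority-class anomalies are intentionally dropped.
    return cleaned
```

For multi-feature data the same decision would be applied per feature or to an anomaly score, but the keep, clip, or drop logic stays the same.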

Propose an approach to fine-tune a GPT model specifically for a client's domain-specific language. First, we should prepare our data in the specific format accepted by the GPT-2 model. In terms of data preprocessing, we should clean our data and remove noise, irrelevant information, and inconsistencies. Then we should choose the variant of the pre-trained GPT-2 model we are going to fine-tune, and then carry out the fine-tuning. Here we should use transfer learning. We are going to use PEFT, so we do not train the complete model weights; we train only a small set of parameters on our data. We should also define a training strategy: adjusting the learning rate, batch size, and training epochs, and monitoring model performance through evaluation metrics that are specific to our client's domain. Apart from that, we should implement domain-specific prompting and sampling. We should run iterative training loops with domain-specific prompts and example inquiries, which helps the model produce outputs like the examples we provided in the training data. Apart from that, we can do validation and hyperparameter tuning.
We can also do evaluation and refinement. After we have trained the model on our client's domain data, we evaluate what kind of accuracy and results we are getting. If they are acceptable, good. If not, we create different scenarios and refine the model further. Then we can deploy the model and do further testing. We can also implement MLOps to retrain the model automatically under specific conditions, for example after a few months.

We have created a BERT model and fine-tuned it for sentiment analysis. To validate the output of a BERT-based sentiment analysis model and ensure its accuracy, we should calculate the accuracy of our model on our test dataset. During training, we have already done a train-test split and provided the training data to the BERT model. Then we calculate accuracy on the test data and compute further metrics. Which ones depends on the business case. In some business cases we prefer precision, in some recall, and sometimes we just focus on the F1 score. Sometimes we look at the confusion matrix and evaluate from that. Apart from that, we can implement cross-validation: we create different folds of our data, validate across the folds, and take the model with the best accuracy, which we can then deploy. Apart from that, we can implement continuous model training and deployment, which we normally use in MLOps. Suppose we trained our model and after two weeks we see drift in the data or the model. Then we can automatically fine-tune the model, compare its accuracy with the previous model, and auto-deploy it. These are the kinds of things we can implement. Let's go to the next question.
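
The metrics mentioned in this answer can all be derived from the confusion-matrix counts. The sketch below is framework-free; the function names and the toy labels are illustrative assumptions, and in practice the predictions would come from the fine-tuned BERT classifier.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn) for a binary task with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy sentiment labels for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]
y_pred = ["pos", "pos", "neg", "neg", "neg", "pos", "neg", "pos"]
tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="pos")
accuracy = (tp + tn) / len(y_true)
```

Which metric to optimize depends on the business case: precision when false positives are costly, recall when misses are costly, and F1 when both matter.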
