Danish Mushtaq

Big Data Consultant with 8+ years of experience, strong theoretical skills, and a passion for data platforms, machine learning and deep learning.

Skilled in both data engineering and DevOps, experienced with large projects and heterogeneous infrastructures.

Customer-oriented and structured way of working, focused on quality and maintainability. Highly motivated to work in a team, comfortable in big companies as well as in small teams.

  • Role

    Data Engineer

  • Years of Experience

    8 years

Skillsets

  • Terraform
  • SAS
  • AWS Glue
  • AWS Redshift
  • Cassandra
  • Apache Kafka
  • SQL
  • Cloud DevOps
  • Lambda Function
  • AWS CDK
  • AWS
  • Git
  • Spark
  • Snowflake
  • Open Source Tools
  • Data Engineering
  • PySpark
  • Azure
  • Python Programming
  • Data Science

Professional Summary

8 Years
  • May, 2022 - Jun, 2023 (1 yr 1 month)

    Azure Data Engineer

    Food Industry
  • Sep, 2021 - May, 2022 (8 months)

    AWS Big Data Engineer

    Fin Tech
  • Jan, 2020 - Sep, 2021 (1 yr 8 months)

    Big Data Team Lead

    E-commerce Aggregation
  • Apr, 2019 - Jan, 2020 (9 months)

    Lead Engineer

    Social Media
  • Dec, 2015 - Apr, 2019 (3 yr 4 months)

    Consultant for Fortune 500 Companies

    Pfizer, Gilead Life Sciences

Applications & Tools Known

  • AWS (Amazon Web Services)
  • Microsoft Azure

Work History

8 Years

Azure Data Engineer

Food Industry
May, 2022 - Jun, 2023 (1 yr 1 month)

    A large data management project required a multi-company collaboration to enable data transfer/analytics from multiple sources to multiple destinations:

    Ingest data from multiple sources like Data Lake, SQL DB, SQL Data Warehouse and SFTP

    Implementation of new business rules in Azure Databricks using Python, SQL and PySpark

    Development of Hive assets using Azure Databricks

    Technologies include:

    Microsoft Azure

    PySpark

    Python

    SQL

AWS Big Data Engineer

Fin Tech
Sep, 2021 - May, 2022 (8 months)

    Responsible for infrastructure creation as well as implementation of data requirements of different solutions:

    Creation of infrastructure as code

    Lambda and API development

    Glue workflows and Glue jobs development for data loading to Redshift

    Technologies include:

    AWS CDK (Cloud Development Kit) for infrastructure creation

    Python for Lambda functions that pull data from Redshift based on API request parameters

    AWS API Gateway

    AWS CodeCommit for version control of infrastructure as well as code

    Glue jobs and Glue workflows for data loading to Redshift

    Redshift for data storage/analytics

    AWS Managed Kafka

Big Data Team Lead

E-commerce Aggregation
Jan, 2020 - Sep, 2021 (1 yr 8 months)

    Responsible for development of a data platform from scratch:

    Design and implementation of the data platform on Amazon Web Services

    Data ingestion from a variety of data sources like Amazon, Facebook, Google Analytics, web apps, static files, databases, etc.

    Data quality framework using the AWS notification system

    Technologies include:

    Daton for data ingestion from Amazon Selling Partner, Facebook and Google Analytics

    AWS as cloud platform

    Lambda functions, Python for development of a custom setup for data ingestion

    AWS Glue, PySpark, dbt for data transformations

    DynamoDB for storing templates for data ingestion from static files

    Airflow for orchestration

    GitHub for version control of code

    AWS SageMaker for ad-hoc data analytics

Lead Engineer

Social Media
Apr, 2019 - Jan, 2020 (9 months)

    Responsible for design and development of a data platform for a social media marketing company:

    Data Ingestion from all major social media platforms

    Serverless data platform

    Concurrent data load for 100 clients daily

    Containerized code for attribution of sales to marketing channels

    Technologies include:

    AWS as the cloud platform

    Serverless AWS services (S3, Lambda, Step Functions, Aurora Serverless DB)

    PySpark for data transformation

    AWS Fargate for execution of containerized code hosted in AWS Elastic Container Registry

Consultant for Fortune 500 Companies

Pfizer, Gilead Life Sciences
Dec, 2015 - Apr, 2019 (3 yr 4 months)

    Large pharma data management projects with the goal of establishing platforms with a modern architecture. The main focus was the migration of legacy data, assuring data quality and transformation into various formats:

    Customer consulting with regard to loading / unloading interfaces

    Definition of requirements for transformation of legacy data

    Implementation of algorithms for data transformation

    Tool development for secure data transport

    Tool development for tests of data quality/interface implementation

    Technologies include:

    Standard Linux tools, such as awk, sed, grep, ...

    Python for in-depth data analysis

    AWS Redshift for Data Storage

Achievements

  • Food Industry: End-to-end data management solution built using Azure Data Factory and Azure Databricks
  • Fin Tech: Developed 10+ Lambda functions integrated with API Gateway to serve data to Power BI and other apps; developed 5+ event-driven Lambda functions triggered by S3 and integrated with Amazon Managed Kafka
  • E-commerce Aggregation: Designed the solution for data ingestion, analytics and warehousing; implemented Step Functions and Lambda functions for data ingestion from NetSuite
  • Social Media: Implemented orchestration of the end-to-end data pipeline using AWS Step Functions; defined the template for ETL scripts in Glue; implemented container execution on AWS Fargate
  • Pfizer, Gilead Life Sciences: Documented legacy processes written in SQL and SAS; designed a configurable data load framework to handle the Adult/Ped split for Pfizer's Prevnar 20 drug

Major Projects

3 Projects

Data Lake on Azure

Jun, 2022 - Apr, 2023 (10 months)

    A large data management project required a multi-company collaboration to enable data transfer/analytics from multiple sources to multiple destinations.

    • Ingest data from multiple sources like SQL DB, SQL Data Warehouse, SFTP and APIs
    • Implementation of new business rules in Azure Databricks using Python, SQL and Spark (Python, Scala)
    • Development of Hive assets for use in Dremio

Modern Data Platform for Marketing Attribution

Jun, 2021 - Jun, 2022 (1 yr)

    Responsible for design and development of a data platform for a social media marketing company

    Data Ingestion from all major social media platforms

    Serverless data platform

    Concurrent data load for 100 clients daily

    Containerized code for attribution of sales to marketing channels

Modern Data Platform for E-commerce

Jan, 2021 - Jun, 2021 (5 months)

    Development of data platform from scratch.

    • Design and implementation of data platform on Amazon Web Services
    • Data ingestion from a variety of data sources like Amazon, Shopify, and internal data sources like ERP

Education

  • B. Tech. Computer Science

    National Institute of Technology Srinagar (2011)

Certifications

  • AWS Certified - Data Analytics Specialty

  • AWS Certified - Solutions Architect

  • Snowflake SnowPro

  • Microsoft Azure Fundamentals