Big Data Consultant with 8+ years of experience, strong theoretical skills and a passion for data
platforms, machine learning and deep learning.
Skilled in both data engineering and DevOps, experienced with large
projects and heterogeneous infrastructures.
Customer-oriented and structured way of working, focused on quality
and maintainability. Highly motivated team player, comfortable in big
companies as well as in small teams.
Azure Data Engineer (Food Industry)
AWS Big Data Engineer (FinTech)
Big Data Team Lead (E-commerce Aggregation)
Consultant for Fortune 500 Companies (Pfizer, Gilead Life Sciences)
Lead Engineer (Social Media)
AWS (Amazon Web Services)
Microsoft Azure
A large data management project required a multi-company collaboration to enable data transfer/analytics from multiple sources to multiple
destinations:
Ingestion of data from multiple sources such as Data Lake, SQL DB, SQL Data Warehouse and SFTP
Implementation of new business rules in Azure Databricks using Python,
SQL and PySpark (see the sketch after the technology list below)
Development of Hive assets using Azure Databricks
Technologies include:
Microsoft Azure
PySpark
Python
SQL
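A minimal sketch of the kind of business-rule transformation described above, written in PySpark for Databricks; the table names (landing.sales_raw, curated.sales) and columns are hypothetical assumptions, not the project's actual schema.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical example: read raw data landed from SQL DB / Data Lake / SFTP,
    # apply a business rule, and publish a curated Hive table from Databricks.
    spark = SparkSession.builder.appName("business-rules").getOrCreate()

    raw = spark.read.table("landing.sales_raw")  # source table name is assumed

    curated = (
        raw
        # Example rule: keep completed orders only and normalise amounts to EUR.
        .filter(F.col("order_status") == "COMPLETED")
        .withColumn("amount_eur", F.col("amount") * F.col("fx_rate_to_eur"))
        .dropDuplicates(["order_id"])
    )

    # Persist as a managed Hive table for downstream analytics.
    curated.write.mode("overwrite").saveAsTable("curated.sales")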
Responsible for infrastructure creation as well as implementation of the data
requirements of different solutions:
Creation of infrastructure as code
Lambda and API development
Development of Glue workflows and Glue jobs for data loading to Redshift
Technologies include:
AWS CDK (Cloud Development Kit) for infrastructure creation
Python for Lambda functions that pull data from Redshift based on
API request parameters (see the sketch below)
AWS API Gateway
AWS CodeCommit for Version Control of Infrastructure as well as code
Glue Jobs and Glue workflows for data loading to Redshift
Redshift for data storage/analytics
Amazon MSK (AWS Managed Streaming for Apache Kafka)
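A minimal sketch of the API-driven data pull described above: a Lambda handler behind API Gateway that queries Redshift via the Redshift Data API. The SQL, table, parameter names and environment variables are illustrative assumptions.

    import json
    import os

    import boto3

    redshift_data = boto3.client("redshift-data")

    def handler(event, context):
        # API Gateway proxy integration passes query string parameters here.
        params = event.get("queryStringParameters") or {}
        region = params.get("region", "EU")

        # Submit the query asynchronously via the Redshift Data API.
        resp = redshift_data.execute_statement(
            ClusterIdentifier=os.environ["CLUSTER_ID"],
            Database=os.environ["DATABASE"],
            SecretArn=os.environ["SECRET_ARN"],
            Sql="SELECT order_id, amount FROM analytics.orders WHERE region = :region",
            Parameters=[{"name": "region", "value": region}],
        )
        # A real handler would poll get_statement_result and page through rows;
        # returning the statement id keeps the sketch short.
        return {"statusCode": 200, "body": json.dumps({"statementId": resp["Id"]})}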
Responsible for development of a data platform from scratch:
Design and implementation of data platform on Amazon Web Services
Data ingestion from a variety of sources such as Amazon, Facebook, Google
Analytics, web apps, static files, databases, etc.
Data quality framework using AWS notification services
Technologies include:
Daton for data ingestion from Amazon Selling Partner, Facebook and Google
Analytics
AWS as Cloud Platform
Lambda functions and Python for development of a custom data ingestion setup
AWS Glue, PySpark, dbt for data transformations
DynamoDB for storing templates for data ingestion from static files
Airflow for orchestration (see the sketch below)
GitHub for version control of code
AWS SageMaker for ad hoc data analytics
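A minimal sketch of the kind of Airflow orchestration used on such a platform: an ingestion task followed by dbt transformations. The DAG id, commands, paths and schedule are illustrative assumptions.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_ingest_and_transform",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # Hypothetical ingestion entry point for template-driven static files.
        ingest = BashOperator(
            task_id="ingest_static_files",
            bash_command="python /opt/platform/ingest.py --source static_files",
        )
        # Run dbt models once the raw data has landed.
        transform = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/platform/dbt",
        )
        ingest >> transform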
Responsible for design and development of a data platform for a social media marketing company:
Data Ingestion from all major social media platforms
Serverless data platform
Concurrent data load for 100 clients daily
Containerized code for attribution of sales to marketing channels
Technologies include:
AWS as the cloud platform
Serverless AWS services (S3, Lambda, Step Functions, Aurora Serverless DB)
PySpark for data transformation
AWS Fargate for execution of containerized code hosted in AWS Elastic
Container Registry (see the sketch below)
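A minimal sketch of launching the containerized attribution job on Fargate for one client, using boto3. Cluster, task definition, container name and networking values are illustrative assumptions; in the platform described above this step would typically be driven by Step Functions.

    import os

    import boto3

    ecs = boto3.client("ecs")

    def run_attribution_task(client_id: str) -> str:
        """Start one Fargate task that attributes sales to marketing channels."""
        resp = ecs.run_task(
            cluster=os.environ["ECS_CLUSTER"],
            launchType="FARGATE",
            taskDefinition=os.environ["ATTRIBUTION_TASK_DEF"],
            networkConfiguration={
                "awsvpcConfiguration": {
                    "subnets": os.environ["SUBNET_IDS"].split(","),
                    "assignPublicIp": "DISABLED",
                }
            },
            overrides={
                "containerOverrides": [
                    # "attribution" is an assumed container name in the task definition.
                    {"name": "attribution", "command": ["--client-id", client_id]}
                ]
            },
        )
        return resp["tasks"][0]["taskArn"]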
Large pharma data management projects with the goal of establishing platforms with a modern architecture. The main focus was the migration of legacy
data, assuring data quality and transformation into various formats:
Customer consulting with regard to loading / unloading interfaces
Definition of requirements for transformation of legacy data
Implementation of algorithms for data transformation
Tool development for secure data transport
Tool development for testing data quality and interface implementations (see the sketch below)
Technologies include:
Standard Linux tools, such as awk, sed, grep, ...
Python for in-depth data analysis
AWS Redshift for Data Storage
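A minimal sketch of the kind of data-quality check used when migrating legacy exports, in Python; the required columns and the rule on patient_id are hypothetical examples, not the actual interface definition.

    import csv
    import sys

    # Columns the target interface expects; purely illustrative.
    REQUIRED_COLUMNS = {"patient_id", "study_id", "visit_date"}

    def check_file(path: str) -> int:
        """Return the number of quality issues found in one legacy CSV export."""
        errors = 0
        with open(path, newline="") as fh:
            reader = csv.DictReader(fh)
            missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
            if missing:
                print(f"missing columns: {sorted(missing)}")
                return 1
            for lineno, row in enumerate(reader, start=2):
                if not row["patient_id"].strip():
                    print(f"line {lineno}: empty patient_id")
                    errors += 1
        return errors

    if __name__ == "__main__":
        sys.exit(1 if check_file(sys.argv[1]) else 0)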