Arpith

Vetted Talent

An accomplished Software Engineer with 11+ years of experience and a proven track record in leadership, I excel at driving operational improvements and enhancing customer satisfaction.

  • Role

    Full Stack Developer

  • Years of Experience

    12 years

Skillsets

  • Apache Flink - 3 Years
  • Cloud Infrastructure - 6 Years
  • Cloud DevOps - 1 Year
  • OO Design - 7 Years
  • Python - 9 Years
  • GitHub - 8 Years
  • MySQL - 8 Years
  • Postgres - 2 Years
  • Prometheus - 2 Years
  • Big Data - 3 Years
  • API - 3 Years
  • Java - 6 Years
  • big data, data science, and data analysis tools - 6 Years
  • Flink - 3 Years
  • Hadoop - 3 Years
  • GCP - 3 Years
  • Kafka - 4 Years
  • Spark - 3 Years
  • Django - 6 Years
  • Elasticsearch - 2 Years
  • Microservice - 4 Years
  • AWS - 6 Years
  • Kafka/Flink/Spark - 6 Years
  • Python/FastAPI - 8 Years

Vetted For

15 Skills
  • Staff Software Engineer - Payments Economics (AI Screening)
  • 70%
  • Skills assessed: Collaboration, Communication, Payments systems, service-to-service communication, Stakeholder Management, Architectural Patterns, Architecture, Coding, HLD, LLD, Problem Solving, Product Strategy, SOA, Team Handling, Technical Management
  • Score: 63/90

Professional Summary

12 Years
  • Aug, 2023 - Present (1 yr 8 months)

    Freelancer

    Freelance
  • Jan, 2021 - Jan, 2022 (1 yr)

    Staff Engineer

    Egnyte
  • Jan, 2018 - Jan, 2021 (3 yr)

    Lead Engineer

    Target India
  • Jan, 2009 - May, 2011 (2 yr 4 months)

    Software Developer

    HCL Technologies
  • Jan, 2013 - Feb, 2016 (3 yr 1 month)

    Big Data Developer

    Rackspace
  • Jan, 2016 - Feb, 2017 (1 yr 1 month)

    Software Engineer

    Kahuna Inc

Applications & Tools Known

  • PostgreSQL
  • Django REST framework
  • Apache Cassandra
  • Apache HBase
  • Druid

Work History

12 Years

Freelancer

Freelance
Aug, 2023 - Present (1 yr 8 months)
    • Orchestrated infrastructure scaling for GraphQL(1) servers, distributing traffic across multiple load balancers to ensure high availability during critical events.
    • Identified and addressed performance bottlenecks, resulting in a 60% reduction in daily average costs and enabling efficient serving of 2 million concurrent users with 80% fewer instances compared to the previous edition.
    1. GraphQL's graph traversal led to excessive database calls, significantly impacting performance. Used Dataloader to batch and cache database calls, reducing the number of queries made. Even with Dataloader, complex views still triggered multiple database calls and slow loading times, so implemented query caching using services like Apollo Engine to cache entire GraphQL query responses.
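
The batching-and-caching idea described in the note above can be sketched in a few lines of Python. This is only an illustration of the general Dataloader pattern, not the code used in the project: `BatchLoader`, `fetch_users`, and `demo` are hypothetical names, and the real Dataloader library has its own API.

```python
import asyncio


class BatchLoader:
    """Hypothetical Dataloader-style helper: batches and caches key lookups."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async callable: list of keys -> {key: value}
        self._cache = {}           # key -> Future, so repeated keys reuse one lookup
        self._pending = []         # keys waiting for the next batch
        self._tasks = set()        # keeps dispatch tasks referenced while they run

    def load(self, key):
        """Return a future for `key`; must be called from a running event loop."""
        if key in self._cache:
            return self._cache[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._cache[key] = future
        if not self._pending:
            # First key of a new batch: dispatch on the next event-loop iteration,
            # after the caller has finished requesting its other keys.
            task = loop.create_task(self._dispatch())
            self._tasks.add(task)
            task.add_done_callback(self._tasks.discard)
        self._pending.append(key)
        return future

    async def _dispatch(self):
        keys, self._pending = self._pending, []
        try:
            rows = await self._batch_fn(keys)  # one call for the whole batch
        except Exception as exc:
            for key in keys:
                self._cache.pop(key).set_exception(exc)
            return
        for key in keys:
            self._cache[key].set_result(rows.get(key))


async def demo():
    calls = []

    async def fetch_users(ids):  # stand-in for a single "WHERE id IN (...)" query
        calls.append(list(ids))
        return {i: {"id": i} for i in ids}

    loader = BatchLoader(fetch_users)
    users = await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))
    return users, calls
```

In `demo`, three `load` calls are served by one batch call with keys `[1, 2]`; the repeated key is answered from the cache.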

Staff Engineer

Egnyte
Jan, 2021 - Jan, 2022 (1 yr)
    • Designed and implemented the Egnyte Search connector using Apache Tika(1) for over 100 file types, enabling search engine indexing and content analysis.
    • Designed and implemented a migration tool that enabled the indexing of 100TB of AutoCAD files and OCR data for efficient search capability.
    • Implemented Data Deduplicator(2) using MD5 hashing to reduce costs, save storage space and improve system performance by eliminating duplicate content.

    1. The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types. All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more.
    2. Implemented data deduplication using cryptographic hashes such as MD5, which work across all file types, not only images. The same hashes are used for file integrity verification, detecting even the smallest alterations to a file.
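
A minimal sketch of hash-based deduplication as described in note 2, using Python's standard `hashlib`. The function names `md5_of_file` and `find_duplicates` are hypothetical and chosen for illustration; the actual Data Deduplicator is not shown in this profile.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Map the first file seen for each digest to later files with identical content."""
    first_seen = {}   # digest -> first path with that content
    duplicates = {}   # first path -> list of duplicate paths
    for path in paths:
        digest = md5_of_file(path)
        if digest in first_seen:
            duplicates.setdefault(first_seen[digest], []).append(path)
        else:
            first_seen[digest] = path
    return duplicates
```

Because the digest depends only on file bytes, the same routine applies to any file type. A change to a single byte produces a different digest, which is what makes the hash usable for integrity checks as well.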

Lead Engineer

Target India
Jan, 2018 - Jan, 2021 (3 yr)
    • Led the engineering team that built and deployed a real-time analytics dashboard using Apache Flink, with data stored in the time-series database Druid.
    • Designed and developed a high-performance real-time streaming pipeline using Apache Flink and Kafka, processing 4 billion events and 14TB of data per day.
    • Reduced infrastructure costs by 3X by replacing the underlying architecture previously built on Hadoop.
    • Built the initial version of Credit Card Fraud Detection using the Apache Flink DataStream API.

Software Engineer

Kahuna Inc
Jan, 2016 - Feb, 2017 (1 yr 1 month)
    • Built a multi-channel customer journey visualization platform using Google App Engine, enhancing user engagement and retention for marketers.
    • Built a microservice with a high-throughput API to provide ad placement on mobile devices, helping brands to identify, engage, and acquire consumers.

Big Data Developer

Rackspace
Jan, 2013 - Feb, 2016 (3 yr 1 month)
    • Deployed and administered Hadoop, Kafka, and Spark clusters with 20+ nodes using cloud orchestration based on OpenStack Heat on cloud providers such as Azure/GCP/AWS, similar to what the orchestration tool Ambari does.
    • Accelerated customer growth by 2X by reducing the time taken to provision new clusters by 5X with SaaS.

Software Developer

HCL Technologies
Jan, 2009 - May, 2011 (2 yr 4 months)
    • Built an in-house monitoring application, similar to Grafana, in Java.
    • Designed and implemented a metrics and alerting tool that captures multi-dimensional time-series data and integrates with PagerDuty.

Testimonial

Target

Samrakshini

LinkedIn Recommendation

Major Projects

3 Projects

Text extraction on all types of files

Egnyte
Jan, 2022 - May, 2022 (4 months)

    Strengths of Apache Tika:

    • Content Extraction: Apache Tika excels in extracting content from a diverse range of file formats, providing a unified interface for content analysis.
    • Metadata Retrieval: Efficiently retrieves metadata, offering valuable information about documents, including author, creation date, and more.
    • Language Detection: Provides language detection capabilities, aiding in understanding the linguistic context of documents.
    • Extensibility: Highly extensible, allowing users to add custom parsers for specific file formats or customize existing ones.

    Drawbacks and Challenges:

    • Extraction Data Size: Faces challenges with large data extraction, where performance may degrade for extensive documents, impacting processing speed.
    • Missing Mime Type Detection: Tika may encounter difficulties in accurately detecting mime types for certain file formats, leading to potential misclassification.
    • Language Detection Accuracy: While offering language detection, the accuracy may vary depending on the complexity of the document, potentially leading to misidentifications.
    • Resource Intensiveness: Processing resource-intensive files might strain system resources, affecting overall performance and responsiveness.

Realtime Analytics Platform

Target
Jan, 2020 - Aug, 2020 (7 months)

    Strengths of the Flink Project for Real-Time Analytics:

    • Agility: Flink's high-level API facilitates maintaining a single codebase for the entire search infrastructure process, and provides a framework for expressing complex business logic efficiently.
    • Consistency: Offers at-least-once semantics crucial for reflecting changes in databases, and is adaptable to exactly-once requirements for various use cases within the company.
    • Low Latency: Enables rapid updates in search results, ensuring timely reflection of changes like inventory availability. Suitable for dynamic scenarios where low latency is essential.
    • Cost Efficiency: Handles high throughput efficiently, resulting in significant cost savings for Alibaba's data processing needs.

    Challenges Faced and Optimization Strategies:

    • External Storage Bottleneck: Identified access to external storage such as HBase as a production bottleneck, and introduced Asynchronous I/O to address it, with plans to contribute to the community.
    • State Backends and Latency Optimization: Highlighted differences in latency between state backends (filesystem/hashmap vs. rocksdb), and provided insights into choosing a state backend based on state size and memory capacity.
    • Resource Allocation for Low Latency: Emphasized the importance of allocating enough resources to reduce latency, and recommended monitoring Flink metrics and scaling up or out based on job requirements.
    • Experimental Results: Shared experimental results for the WindowingJob, showcasing latency reductions with increased parallelism, and illustrated the impact of resource allocation on reducing the 99th percentile latency.
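
The asynchronous I/O idea behind the external storage fix can be illustrated outside Flink with plain `asyncio`: many lookups are kept in flight at once, up to a fixed capacity, instead of waiting on each one in turn. This is a conceptual sketch only; `enrich_all`, `lookup`, and `capacity` are hypothetical names, and Flink's own Async I/O operator is a separate Java/Scala API that is not reproduced here.

```python
import asyncio


async def enrich_all(events, lookup, capacity=100):
    """Enrich events via an async `lookup`, with at most `capacity` requests in flight."""
    semaphore = asyncio.Semaphore(capacity)

    async def enrich(event):
        async with semaphore:  # wait here when `capacity` lookups are already running
            return event, await lookup(event)

    # gather returns results in the same order as the input events
    return await asyncio.gather(*(enrich(event) for event in events))
```

With a blocking client, total time grows with the number of events times the lookup latency. With bounded concurrency it is closer to that figure divided by `capacity`, as long as the external store can absorb the load.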

Omni-Channel marketing platform

Kahuna
Jan, 2017 - May, 2017 (4 months)

    Strengths of the Omni-Channel Marketing Platform:

    • Multi-Channel Integration: Integrates seamlessly with various channels, including Yelp, offering a unified platform for marketing efforts.
    • Enhanced Visibility: Leverages Yelp's extensive user base to enhance visibility and reach a diverse audience across multiple channels.
    • Customer Engagement: Facilitates effective customer engagement by utilizing Yelp's features, such as reviews and ratings, to build trust and credibility.
    • Data Analytics: Incorporates robust data analytics capabilities, allowing businesses to gain insights into customer behavior and preferences.
    • Personalized Marketing: Enables personalized marketing strategies by leveraging Yelp data, tailoring messages to specific customer segments.

    Challenges and Considerations:

    • Rate Limiting for Push Notifications: Faces challenges with rate limiting when sending push notifications, requiring careful management to avoid exceeding service limits and ensuring effective communication.
    • Timezone Differences in Messages: Addresses timezone differences as a challenge, necessitating the implementation of strategies to ensure messages are delivered at optimal times across diverse geographical locations.
    • Coordination Across Channels: Manages coordination challenges when orchestrating marketing efforts across multiple channels, ensuring a cohesive and consistent brand message.
    • User Privacy and Permissions: Navigates the complexities of user privacy concerns and permissions, ensuring compliance with regulations and building trust among customers.
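
One common way to stay under a push provider's send limit is a token bucket. The sketch below assumes that approach for illustration; the profile does not say which technique the platform used, and `TokenBucket`, `rate`, and `capacity` are hypothetical names.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` sends and refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so the limiter can be tested
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def try_acquire(self, n=1):
        """Take `n` tokens if available; return False when the caller should back off."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= n:
            self._tokens -= n
            return True
        return False
```

A sender would call `try_acquire()` before each push and queue or delay the message when it returns `False`, which keeps the outgoing rate within the provider's limit while still permitting short bursts.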

Education

  • Bachelor of Engineering, Computer Science

    Visvesvaraya Technological University (2009)
  • Master's Degree in Cloud Computing

    University of Texas, Dallas (2013)

Certifications

  • Google Analytics