Arpith

Vetted Talent

An accomplished Software Engineer with 11+ years of experience and a proven track record in leadership, I excel at driving operational improvements and enhancing customer satisfaction.

  • Role

    Machine Learning Engineering Manager

  • Years of Experience

    12 years

Skillsets

  • Big Data
  • SageMaker
  • REST
  • PostgreSQL
  • ML models
  • Hadoop
  • GraphQL
  • Google Apps
  • Google App Engine
  • Druid
  • AWS - 6 Years
  • Apache Spark
  • Apache
  • Oozie
  • Azure
  • Python - 9 Years
  • Prometheus - 2 Years
  • Java - 6 Years
  • Kafka - 4 Years
  • Elasticsearch - 2 Years

Vetted For

15 Skills
  • Staff Software Engineer - Payments Economics (AI Screening)
  • 70%
  • Skills assessed: Collaboration, Communication, Payments systems, Service-to-service communication, Stakeholder Management, Architectural Patterns, Architecture, Coding, HLD, LLD, Problem Solving, Product Strategy, SOA, Team Handling, Technical Management
  • Score: 63/90

Professional Summary

12 Years
  • Feb, 2025 - Present 4 months

    Machine Learning Engineering Manager

    Bluevine
  • Jan, 2023 - Dec, 2024 1 yr 11 months

    Consultant Engineering Manager

    Consultant Engineering Manager
  • Jan, 2021 - Dec, 2022 1 yr 11 months

    Staff Engineer

    Egnyte
  • Jan, 2013 - Dec, 2016 3 yr 11 months

    Big Data Developer

    Rackspace
  • Jan, 2016 - Dec, 2017 1 yr 11 months

    Software Engineer

    Kahuna Inc
  • Jan, 2018 - Dec, 2021 3 yr 11 months

    Lead Engineer

    Target
  • Jan, 2009 - Dec, 2011 2 yr 11 months

    Software Developer

    HCL Technologies

Applications & Tools Known

  • PostgreSQL
  • Django REST framework
  • Apache Cassandra
  • Apache HBase
  • Druid

Work History

12 Years

Machine Learning Engineering Manager

Bluevine
Feb, 2025 - Present 4 months
    Led the development of advanced ML models for fraud detection, anomaly analysis, and streaming solutions.

    • Fraud and Anomaly Detection Models: Led development of ML models for fraud and anomaly detection, including account fraud, name mismatch, and fraud mismatch, significantly improving fraud prevention and operational security.
    • Analytics Engine Development: Led development of a rollup analytics engine for efficient aggregation, filtering, and computation of entity-specific data, boosting the speed and accuracy of business reporting.
    • Change Data Capture (CDC) Integration: Managed the streaming of MySQL changes into Kafka using Debezium, enabling real-time data streaming and ensuring consistency and high availability across distributed systems.
    • AWS SageMaker ML-Ops Framework: Architected and led an end-to-end ML-Ops framework on AWS SageMaker, overseeing model training, data processing, performance monitoring, and versioning, which streamlined ML workflows and accelerated model delivery.

Consultant Engineering Manager

Consultant Engineering Manager
Jan, 2023 - Dec, 2024 1 yr 11 months
    Led a team managing infrastructure for a GraphQL service.

    • Scaled the infrastructure for GraphQL servers behind multiple load balancers to ensure high availability.
    • Identified and addressed API performance bottlenecks, reducing average daily costs by 60%.
    • Enabled efficient serving of 500K concurrent users with 80% fewer instances.

Staff Engineer

Egnyte
Jan, 2021 - Dec, 2022 1 yr 11 months
    Engaged in backend engineering for content management systems.

    • Designed and implemented the Egnyte Search connector using Apache Tika for over 100 file types, enabling search engine indexing and content analysis.
    • Designed and implemented a migration tool that enabled the indexing of 100 TB of AutoCAD files and OCR data for efficient search.
    • Implemented a data deduplicator using MD5 hashing to reduce costs, save storage space, and improve system performance by eliminating duplicate content.
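
The MD5-based deduplication mentioned for this role can be illustrated with a minimal sketch. It is not the Egnyte implementation: the function names, the in-memory dictionaries, and the sample paths are all hypothetical, and a production system would stream large files in chunks and persist the digest index. MD5 serves here only as a content fingerprint, not as a security measure.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def md5_digest(content: bytes) -> str:
    """Return the hex MD5 digest of a blob (a content fingerprint, not a security hash)."""
    return hashlib.md5(content).hexdigest()


def deduplicate(
    files: Iterable[Tuple[str, bytes]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Keep one stored copy per distinct content.

    Returns (stored, aliases):
      stored  maps digest -> first path seen with that content
      aliases maps digest -> later paths whose bytes were identical
    """
    stored: Dict[str, str] = {}
    aliases: Dict[str, List[str]] = {}
    for path, content in files:
        digest = md5_digest(content)
        if digest in stored:
            # Duplicate content: record a reference instead of storing it again.
            aliases.setdefault(digest, []).append(path)
        else:
            stored[digest] = path
    return stored, aliases


# Hypothetical sample data: two files share identical bytes.
sample = [
    ("a/report.pdf", b"hello"),
    ("b/copy.pdf", b"hello"),
    ("c/notes.txt", b"world"),
]
stored, aliases = deduplicate(sample)
```

With this sample, `stored` holds two entries and `aliases` records `b/copy.pdf` as a duplicate of `a/report.pdf`, so only two blobs would need to be kept.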

Lead Engineer

Target
Jan, 2018 - Dec, 2021 3 yr 11 months
    Focused on data analytics and performance optimization.

    • Led the engineering team that built and deployed a real-time analytics dashboard, enhancing user accessibility and data insights.
    • Developed and optimized core features of a no-code data analysis platform, focusing on constructing complex analytical queries routed to a query federation system.
    • Designed and developed a high-performance real-time streaming pipeline using Apache Flink and Kafka, processing 4 billion events, with an emphasis on back-end performance optimization for data-intensive workloads.
    • Used Apache Druid for efficient data storage and real-time analytics, contributing to the scalability and extensibility of the platform.

Software Engineer

Kahuna Inc
Jan, 2016 - Dec, 2017 1 yr 11 months
    Worked on customer engagement solutions.

    • Built a multi-channel customer journey visualization platform using Google App Engine.
    • Developed a high-throughput API for ad placements on mobile devices.

Big Data Developer

Rackspace
Jan, 2013 - Dec, 2016 3 yr 11 months
    Focused on big data solutions and cloud orchestration.

    • Deployed and administered a Hadoop cluster and Kafka with 20+ nodes.
    • Accelerated customer growth by 2x through efficient provisioning of cloud clusters.

Software Developer

HCL Technologies
Jan, 2009 - Dec, 2011 2 yr 11 months
    Involved in software development and system performance enhancement.

    • Led design and development of high-throughput microservices exposing REST APIs in Python.
    • Achieved a 30% improvement in overall system efficiency.

Testimonial

Target

Samrakshini

LinkedIn Recommendation

Major Projects

3 Projects

Text extraction on all types of files

Egnyte
Jan, 2022 - May, 2022 4 months

    Strengths of Apache Tika:

    • Content Extraction: Apache Tika excels at extracting content from a diverse range of file formats, providing a unified interface for content analysis.
    • Metadata Retrieval: Efficiently retrieves metadata, offering valuable information about documents, including author, creation date, and more.
    • Language Detection: Provides language detection capabilities, aiding in understanding the linguistic context of documents.
    • Extensibility: Highly extensible, allowing users to add custom parsers for specific file formats or customize existing ones.

    Drawbacks and Challenges:

    • Extraction Data Size: Performance may degrade when extracting from very large documents, reducing processing speed.
    • Missing MIME Type Detection: Tika may have difficulty accurately detecting MIME types for certain file formats, leading to potential misclassification.
    • Language Detection Accuracy: Language detection accuracy varies with the complexity of the document, potentially leading to misidentifications.
    • Resource Intensiveness: Processing resource-intensive files can strain system resources, affecting overall performance and responsiveness.

Realtime Analytics Platform

Target
Jan, 2020 - Aug, 2020 7 months

    Strengths of the Flink Project for Real-Time Analytics:

    • Agility: Flink's high-level API facilitates maintaining a single codebase for the entire search infrastructure process and provides a framework for expressing complex business logic efficiently.
    • Consistency: Offers at-least-once semantics, crucial for reflecting changes in databases, and is adaptable to exactly-once requirements for various use cases within the company.
    • Low Latency: Enables rapid updates in search results, ensuring timely reflection of changes such as inventory availability; suitable for dynamic scenarios where low latency is essential.
    • Cost Efficiency: Handles high throughput efficiently, resulting in significant cost savings for Alibaba's data processing needs.

    Challenges Faced and Optimization Strategies:

    • External Storage Bottleneck: Identified access to external storage such as HBase as a production bottleneck, and introduced asynchronous I/O to address it, with plans to contribute to the community.
    • State Backends and Latency Optimization: Highlighted differences in latency between state backends (filesystem/hashmap vs. RocksDB) and provided insights into choosing a state backend based on state size and memory capacity.
    • Resource Allocation for Low Latency: Emphasized allocating enough resources to reduce latency, and recommended monitoring Flink metrics and scaling up or out based on job requirements.
    • Experimental Results: Shared experimental results for the WindowingJob, showing latency reductions with increased parallelism, and illustrated the impact of resource allocation on reducing the 99th percentile latency.
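
The asynchronous I/O idea behind the external storage fix can be sketched in plain Python. This is not Flink code and not the project's implementation: `lookup` is a hypothetical stand-in for a remote HBase read, and the `capacity` parameter is an assumed knob that caps in-flight requests. The point is only that overlapping many slow lookups, instead of waiting on each in turn, hides per-request latency.

```python
import asyncio
from typing import Dict, List


async def lookup(key: str, store: Dict[str, str], delay: float = 0.01) -> str:
    """Hypothetical stand-in for a remote read (for example, an HBase get)."""
    await asyncio.sleep(delay)  # simulate network latency
    return store.get(key, "")


async def enrich(keys: List[str], store: Dict[str, str], capacity: int = 10) -> List[str]:
    """Resolve all keys concurrently, with at most `capacity` requests in flight."""
    semaphore = asyncio.Semaphore(capacity)

    async def bounded(key: str) -> str:
        async with semaphore:
            return await lookup(key, store)

    # gather() returns results in the same order as the input keys.
    return await asyncio.gather(*(bounded(key) for key in keys))


# Hypothetical data: 50 keys resolved with up to 10 lookups overlapping.
store = {f"sku-{i}": f"item-{i}" for i in range(50)}
results = asyncio.run(enrich(list(store), store))
```

Bounding concurrency matters in practice: an unbounded fan-out can overload the external store and turn a latency problem into an availability problem.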

Omni-Channel marketing platform

Kahuna
Jan, 2017 - May, 2017 4 months

    Strengths of the Omni-Channel Marketing Platform:

    • Multi-Channel Integration: Integrates seamlessly with various channels, including Yelp, offering a unified platform for marketing efforts.
    • Enhanced Visibility: Leverages Yelp's extensive user base to enhance visibility and reach a diverse audience across multiple channels.
    • Customer Engagement: Facilitates effective customer engagement by using Yelp's features, such as reviews and ratings, to build trust and credibility.
    • Data Analytics: Incorporates robust data analytics capabilities, allowing businesses to gain insights into customer behavior and preferences.
    • Personalized Marketing: Enables personalized marketing strategies by leveraging Yelp data to tailor messages to specific customer segments.

    Challenges and Considerations:

    • Rate Limiting for Push Notifications: Sending push notifications is subject to rate limits, requiring careful management to avoid exceeding service limits while still communicating effectively.
    • Timezone Differences in Messages: Timezone differences require strategies to ensure messages are delivered at optimal times across diverse geographical locations.
    • Coordination Across Channels: Orchestrating marketing efforts across multiple channels requires coordination to keep the brand message cohesive and consistent.
    • User Privacy and Permissions: User privacy concerns and permissions must be handled carefully to comply with regulations and build trust among customers.
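
One common way to handle the push notification rate limits noted above is a token bucket. The sketch below is an illustration under assumptions, not the platform's actual mechanism: the `TokenBucket` class, its parameters, and the manual clock used in the example are all hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` sends in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the send should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example with a manual clock so the behavior is deterministic.
current_time = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: current_time[0])
burst = [bucket.try_acquire() for _ in range(4)]  # fourth send exceeds the burst size
current_time[0] = 1.0                             # one second later, 2 tokens refill
after_wait = [bucket.try_acquire() for _ in range(3)]
```

Sends that are refused can be queued and retried later, which keeps the sender under the provider's limit without dropping messages.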

Education

  • Master of Science

    The University of Texas, Dallas (2013)
  • Bachelor of Engineering

    Visvesvaraya Technological University (2009)

Certifications

  • Google Analytics