Vetted Talent

Chakri Muthyala

Passionate Machine Learning engineer with 5+ years' experience in computer vision and machine learning. Developed ML algorithms for production-scale business solutions. Expertise in the development of the entire computer vision life-cycle for edge and cloud. Deep Learning and Computer Vision expertise applied from academic research to industry. Strong problem-solving skills, thrive in collaborative and independent environments, excelling in high-velocity teams.
  • Role

    Data Scientist & FastAPI Engineer

  • Years of Experience

    11.2 years

Skillsets

  • NVIDIA DeepStream
  • Prometheus
  • Plotly
  • Pinecone
  • Pet
  • pandas
  • OpenMMLab
  • OpenCV
  • OpenAI APIs
  • PyTorch3D
  • NumPy
  • MMOCR
  • MLlib
  • MLflow
  • Milvus
  • MATLAB
  • llama.cpp
  • Linux
  • spaCy
  • LayoutLMv2
  • ChemDataExtractor
  • ZeroMQ
  • timm
  • TensorRT
  • Streamlit
  • SQLAlchemy
  • SQL
  • Kubeflow
  • sktime
  • Shell Scripting
  • Sentence-transformers
  • Selenium
  • Scrapy
  • Scikit-learn
  • Scikit-image
  • Ray
  • Python
  • Apache Spark
  • Apache Pinot
  • Apache Kafka
  • AWS
  • TensorFlow
  • Apache TVM
  • LlamaIndex - 2 Years
  • LangChain - 3 Years
  • AWS - 5 Years
  • Docker - 4 Years
  • TensorFlow - 5 Years
  • PyTorch - 5 Years
  • Python - 7 Years
  • Domino
  • Kornia
  • Keras
  • HuggingFace
  • Google Cloud
  • Git
  • Flask
  • FastAPI
  • Facebook Prophet
  • Python - 5 Years
  • Detectron2
  • Delta Lake
  • Dask
  • ChatGPT
  • Celery
  • Bitbucket
  • AWS SageMaker

Vetted For

13 Skills
  • Machine Learning Engineer (AI Screening)
  • Skills assessed: PySpark, PyTorch, TensorFlow, CI/CD, Data Pipeline, Data Streaming Tool, NoSQL, OOP programming language, Python, data-science, GCP, Java, SQL
  • Score: 64/100

Professional Summary

11.2 Years
  • Nov, 2024 - Present (1 yr 7 months)

    Senior Data Scientist

    Turing
  • Sep, 2023 - Apr, 2025 (1 yr 7 months)

    Senior Machine Learning Engineer

    Johnson & Johnson Innovative Medicine
  • Jun, 2021 - Apr, 2025 (3 yr 10 months)

    Senior Machine Learning Engineer

    Johnson & Johnson Innovative Medicine
  • Aug, 2019 - Dec, 2020 (1 yr 4 months)

    Software Development Engineer II - Computer Vision and IoT

    NearBuzz
  • Jan, 2021 - Mar, 2021 (2 months)

    Research Assistant

    Oregon Health & Science University
  • Apr, 2021 - May, 2021 (1 month)

    Machine Learning Engineer

    DeepGears
  • Sep, 2017 - Jul, 2019 (1 yr 10 months)

    Software Development Engineer I - Computer Vision and IoT

    NearBuzz
  • May, 2016 - Apr, 2017 (11 months)

    Research Fellow

    L V Prasad Eye Innovations

Applications & Tools Known

  • Python
  • PyTorch
  • TensorFlow
  • AWS (Amazon Web Services)
  • Azure
  • GCP
  • MySQL
  • MongoDB
  • FastAPI
  • Flask
  • NoSQL
  • Docker
  • Kubernetes
  • Bash

Work History

11.2 Years

Senior Data Scientist

Turing
Nov, 2024 - Present (1 yr 7 months)

Senior Machine Learning Engineer

Johnson & Johnson Innovative Medicine
Sep, 2023 - Apr, 2025 (1 yr 7 months)

Senior Machine Learning Engineer

Johnson & Johnson Innovative Medicine
Jun, 2021 - Apr, 2025 (3 yr 10 months)

Machine Learning Engineer

DeepGears
Apr, 2021 - May, 2021 (1 month)

Research Assistant

Oregon Health & Science University
Jan, 2021 - Mar, 2021 (2 months)

Software Development Engineer II - Computer Vision and IoT

NearBuzz
Aug, 2019 - Dec, 2020 (1 yr 4 months)
    Developed efficient unconstrained face analysis models for low-end hardware using Keras. Experimented with inference optimization techniques in Keras. Implemented academic papers and customized them using PyTorch. Deployed a multi-model deep learning pipeline on NVIDIA Jetson Nano. Converted and approximated layers for ONNX compatibility using Keras. Evaluated models on NVIDIA DeepStream, the pyMNN library, and Apache TVM. Successfully ported models and decreased latency by 3x. Developed low-latency data streaming on AWS using Kinesis, Lambda, RDS, and S3. Customized Docker images and deployed them on Jetson Nano and cloud services. Deployed on 500+ NVIDIA Jetson Nano devices for real-time analytics. Generated 3D SMPL body models from 2D silhouettes by optimizing projections with PyTorch3D. Converted SMPL models to PyTorch3D tensors. Implemented portrait segmentation models. Deployed as a containerized pipeline on Amazon ECS with Kinesis integration.

Software Development Engineer I - Computer Vision and IoT

NearBuzz
Sep, 2017 - Jul, 2019 (1 yr 10 months)
    Created a real-time detection and tracking pipeline for crowds. Built a gaze estimation model based on facial keypoints. Employed TensorFlow for matching algorithms. Utilized ZeroMQ and multi-threading for metadata extraction and preprocessing. Streamed tracked data through GCP Pub/Sub, Dataflow, and BigQuery, and visualized it in Data Studio. Generated a dataset of 10M+ samples for Indian facial metrics using web crawling with Selenium and the GCP Vision API. Utilized AWS S3 and SQL for data storage, with analysis via SQLAlchemy, pandas, and Plotly. Created a synthetic dataset for Indian-specific attribute classification.

Research Fellow

L V Prasad Eye Innovations
May, 2016 - Apr, 2017 (11 months)
    Developed a machine vision algorithm for correction and removal of image artifacts produced by optical glare. Featured in MIT News Letter and RaspberryPi Magazine. Used Python, OpenCV, and MATLAB. Tested on an Odroid-XU4 embedded board. Developed a hardware-accelerated pupil detection and area estimation solution using OpenCL. Integrated bio-sensors into an embedded device using Python, C, and Arduino. Deployed the device with low-cost hardware.

Achievements

  • Advanced Data Analysis: Proficiency in utilizing advanced data analysis techniques to extract insights and inform business decisions.
  • Software Development: Skilled in software development, including coding, debugging, and deploying software solutions.
  • Database Management: Expertise in managing and optimizing databases to ensure efficient data storage and retrieval.
  • IT Project Management: Strong capability in managing IT projects, ensuring they meet technical requirements and are delivered on time.
  • Cloud Computing: Experience in utilizing cloud computing platforms for scalable and efficient computing solutions.
  • Networking: Understanding of network architecture and the ability to manage network resources effectively.
  • AI and Machine Learning: Familiarity with AI and machine learning concepts, and the ability to apply these in practical scenarios.
  • Technical Troubleshooting: Adept at identifying and resolving technical issues, ensuring minimal disruption to operations.
  • System Integration: Competent in integrating various systems and technologies to create cohesive and efficient IT environments.

Major Projects

3 Projects

Crystallization Analysis

Johnson and Johnson
Jun, 2021 - Present (5 yr)

    Developed instance segmentation models for monochrome images.

    Applied SSL and N-shot learning for domain adaptation and annotation reduction.

    Utilized AWS SageMaker and S3FS for model training/testing.

    Created semi-auto annotation tool with LabelMe, Python, and MMDetection.

    Implemented image-processing and time series ML algorithms using OpenCV, scikit-learn, sktime and pandas.

    Built user-defined data analysis pipeline with Shell, Docker, RDS, S3, and Lambda.

    Deployed forecasting algorithms via FastAPI, Ray, and sktime.

    Developed custom plots with Plotly, Dash, and SQL on AWS RDS.

    Led early-stage SaaS platform development and deployment on Kubernetes with stakeholder communication.

    Co-authored and published in a Data Science Consortium.

    This solution would save several man hours and reduce the lead time in generating the results to understand the crystallization.

Chatbot for Medical Research Insights

Johnson and Johnson
Oct, 2023 - Present (2 yr 8 months)

    Handled and processed large text datasets efficiently.

    Implemented pre-processing techniques for text data, including tokenization, stemming, and lemmatization.

    Performed basic customization of language models and adapted existing LLMs to specific tasks.

    Established task-specific workflows using LLMs, utilizing orchestration frameworks like LangChain and LlamaIndex, prompt tuning methods, and tools such as prompt-flow.

    Utilized LLMs for text generation, summarization, Retrieval-Augmented Generation, and other NLP tasks.

    Optimized and fine-tuned LLMs for enhanced efficiency and performance.

    Incorporated the Milvus vector database for semantic search across vast repositories of medical research papers.

    Developed a web application using Streamlit, allowing users to easily query, upload documents, and receive insights through a user-friendly interface.

    Utilized the Azure platform for all AI engineering.

    An LLM pipeline to streamline the analysis and comprehension of research papers, facilitating easier access.

Segmentation of Individual Meibomian Glands

Johnson and Johnson
Oct, 2021 - Apr, 2022 (6 months)

    Pre-processed the dataset to remove noisy annotations and bias with pandas, NumPy, and OpenCV.

    Customized segmentation models with PyTorch, timm, MMSegmentation, and TensorFlow.

    Utilized GCP VertexAI for data management and training.

    Co-authored and published it in Data Science Consortium.

    Auto assessment of meibomian glands to quantify dry-eye syndrome.

Education

  • Bachelor of Technology in Electronics and Communication Engineering

    Rajiv Gandhi University Of Knowledge Technologies (2017)

AI-interview Questions & Answers

My name is Chakri. I have around 6 years of experience in machine learning and MLOps, specifically in text technology, where I have worked on multiple machine learning projects. Currently, at Johnson and Johnson, I work particularly on image segmentation and on deploying those machine learning models to AWS and Azure cloud platforms using various services, such as S3 or Azure storage buckets, and SQL or MongoDB. I am familiar with developing machine learning models and with machine learning model management. Apart from that, I can also deploy those models using ASGI server frameworks like FastAPI. I also have experience using Python to develop and deploy convolutional neural network models. Previously, I worked for multiple companies on image reconstruction and on deploying machine learning models to embedded edge hardware. You can assume that I have a complete understanding of both edge and cloud, as well as different cloud stacks and hardware platforms, like the NVIDIA Jetson Nano, and optimized inference libraries like TensorRT. My hobbies include Sakan.

A context manager in Python is a construct that provides a way to allocate and release resources precisely when they are needed. In general, it helps manage resources like database connections. A context manager is used through the with statement, which ensures that the resource is opened and closed properly. For example, in a database transaction, it will open a connection, handle the operations, and then close the connection safely. So, with the with statement, we can safely open and close a particular resource, so that the work is completed and the resource is released properly.
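As an illustration of the database case described above, here is a minimal sketch. The helper name open_connection and the table t are hypothetical, and an in-memory SQLite database stands in for whatever database would be used in practice.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def open_connection(path=":memory:"):
    """Yield a SQLite connection and always close it afterwards.

    open_connection is a hypothetical helper written for this example.
    """
    conn = sqlite3.connect(path)
    try:
        yield conn
        conn.commit()      # commit only if the with-block finished cleanly
    except Exception:
        conn.rollback()    # undo partial work if the with-block raised
        raise
    finally:
        conn.close()       # runs on both the success and the error path


def count_rows():
    """Create a small table inside the managed connection and count its rows."""
    with open_connection() as conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1), (2)")
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

The finally clause runs whether or not the block raises, which is what guarantees that the connection is released.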

On using NoSQL: NoSQL is primarily meant for unstructured data, and its main features are flexibility, scalability, and speed. NoSQL can handle varied data without a predefined schema, unlike SQL. It scales very well to large amounts of data even without a defined schema. SQL databases offer easy access, while NoSQL offers more flexibility. NoSQL can also handle both textual and non-textual data types, such as images, so the user has flexibility in accessing the data. A NoSQL store generally keeps each record as a document, which can be accessed through a particular record ID. The cons of NoSQL are its memory use and ease of access. SQL, with defined data types, is highly optimized for those particular types; it is easier and faster to access compared to NoSQL.

In real-time data streaming, data modeling practices differ between NoSQL and SQL. The first difference is flexibility. As mentioned previously, NoSQL provides dynamic schemas, ideal for varied and evolving data formats in real-time streams, whereas SQL requires the schema to be defined ahead of time. The next part is scaling. NoSQL excels at the scaling that is crucial for high-volume data. SQL databases scale well when you provide more CPU cores, but NoSQL excels when you can provide more instances. On data structure, NoSQL can model data as key-value pairs, documents, or graphs, suitable for complex or nested real-time data. SQL is like a standard Excel sheet, where the data sits in tables. When it comes to data integrity, SQL focuses on ACID properties, ensuring strong data consistency. NoSQL prioritizes performance and flexibility using an eventual consistency model. For querying, SQL has a rich query language, whereas NoSQL queries can be simpler but less rich.

Polymorphism and inheritance in Java can streamline the development of an analytics platform on GCP in multiple ways. The first one is making use of inheritance. We can write reusable base classes with common functionality, like data connection and preprocessing, and extend them for more specific analytics tasks. This reduces code duplication and promotes code reuse. The flexibility with polymorphism is that it can define a common interface for different data analytics methods. For example, an interface like a data processing interface can be implemented differently for different data types, and the implementations can be used interchangeably across the platform. By using inheritance, the code becomes more organized, easier to debug, and more maintainable. Java classes can also be designed to interact with GCP services, like BigQuery and Dataflow. The same classes can work with multiple GCP services without impacting system scalability. This structure also helps the platform adapt to different data volumes and makes testing easier.
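The base-class and common-interface idea from this answer can be sketched as follows, written here in Python for brevity. All class and function names (DataProcessor, SumProcessor, MeanProcessor, run_all) are hypothetical and no GCP service is involved; the point is only that shared preprocessing lives in the base class while each subclass supplies its own process step.

```python
from abc import ABC, abstractmethod


class DataProcessor(ABC):
    """Hypothetical base class holding functionality shared by all analytics tasks."""

    def clean(self, rows):
        # Shared preprocessing inherited by every subclass: drop missing values.
        return [r for r in rows if r is not None]

    @abstractmethod
    def process(self, rows):
        """Task-specific step that each subclass must implement."""

    def run(self, rows):
        # Template method: common cleaning followed by the polymorphic step.
        return self.process(self.clean(rows))


class SumProcessor(DataProcessor):
    def process(self, rows):
        return sum(rows)


class MeanProcessor(DataProcessor):
    def process(self, rows):
        return sum(rows) / len(rows) if rows else 0.0


def run_all(processors, rows):
    # The caller only relies on the DataProcessor interface, so any
    # subclass can be passed in without changing this function.
    return {type(p).__name__: p.run(rows) for p in processors}
```

Adding a new analytics task then means adding one subclass, with no change to run_all or to the shared cleaning code.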

A Python decorator can enhance how a function interacts with a streaming data source. Decorators can add functionality such as logging, error handling, or performance measurement around the function's normal flow over the streaming data. A high-level example would be: suppose you have a function called stream data process that processes data from a streaming source. You can create a decorator on top of it that logs the errors occurring during data processing. By adding the decorator, errors get logged without modifying the streaming data source and without any impact on the analytics being performed on the streaming data.
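A minimal sketch of that error-logging decorator is shown below. The decorator name log_errors and the logger name are hypothetical, and the body of stream_data_process is only a placeholder standing in for real stream processing.

```python
import functools
import logging

logger = logging.getLogger("stream")


def log_errors(func):
    """Log any exception raised by func, then re-raise it unchanged."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the message together with the traceback.
            logger.exception("error in %s", func.__name__)
            raise

    return wrapper


@log_errors
def stream_data_process(records):
    """Placeholder processing step: parse each record as a float and double it."""
    return [float(r) * 2 for r in records]
```

Because the exception is re-raised after logging, callers see the same behavior as before; only the logging side effect is added.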

The given SQL query has a few mistakes that can be fixed. First, the SELECT clause is incomplete. It should specify the columns that must be retrieved from the customer data. The WHERE clause is incorrect. Instead of the operator it uses, it should use equals for the comparison. If the value is numeric, it should not be in quotes. The problems in this particular query are in the SELECT and WHERE clauses. So we need to specify the columns, like name and revenue, when selecting from the customer data where country equals US, ordered by revenue.
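The query under discussion is not reproduced in the transcript, so the following is only a guess at what a corrected version might look like. The table name customer_data, the columns name, country, and revenue, the revenue threshold, and the sort order are all assumptions made for illustration, and SQLite is used only so the example runs on its own.

```python
import sqlite3


def top_us_customers(conn, min_revenue=1000):
    """Return (name, revenue) for US customers above a revenue threshold."""
    # Explicit column list, '=' for the string comparison, and an unquoted
    # numeric comparison (both values are passed as bound parameters).
    query = """
        SELECT name, revenue
        FROM customer_data
        WHERE country = ? AND revenue > ?
        ORDER BY revenue DESC
    """
    return conn.execute(query, ("US", min_revenue)).fetchall()


def demo():
    """Build a tiny hypothetical table and run the query against it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customer_data (name TEXT, country TEXT, revenue REAL)"
        )
        conn.executemany(
            "INSERT INTO customer_data VALUES (?, ?, ?)",
            [
                ("Ann", "US", 5000.0),
                ("Bob", "US", 500.0),
                ("Cy", "IN", 9000.0),
                ("Di", "US", 7000.0),
            ],
        )
        return top_us_customers(conn)
    finally:
        conn.close()
```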

There is no error in the given code. I cannot identify any specific errors in it. It could be made more Pythonic, which would be ideal. It would be more Pythonic to use double quotes when referring to the particular CSV, such as model.csv. Add try and except error handling to catch and warn about failures. As improvements, col 1, col 2, and col 3 could be defined as variables, like the input file, the output file, and the columns of interest. Written that way, the code would be more Pythonic and easier to check and to modify. It would also be good to add loggers.

To use the open-closed principle for evolving a NoSQL database schema, you work with an abstract schema representation, like a base schema class that is extensible but not modifiable. You can also have schema versioning and feature toggles, with automated testing so that you can go back if needed.

To implement a real-time streaming service on GCP following the Liskov substitution principle, also called LSP, we first need to define base interfaces for the streaming components: what are the producers, what are the consumers, and what are the processors, with very general return types. Then we inherit from and extend them, developing specific components, like BigQuery as a producer or another service as a consumer, that inherit these interfaces and can replace the base type without altering the existing behavior. We also do the integration with GCP services, like Pub/Sub for messaging and queuing and Dataflow for data streaming, and we should ensure that all components adhere to LSP. The components can then be interchanged or upgraded without affecting the system's integrity, giving a robust, scalable system.
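The base-interface idea in this answer can be sketched without any cloud dependency. In the example below, Producer and Consumer are hypothetical base interfaces, and the in-memory implementations stand in for real services such as Pub/Sub or BigQuery; none of these names come from a GCP library.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List


class Producer(ABC):
    """Hypothetical base interface: anything that yields messages."""

    @abstractmethod
    def messages(self) -> Iterator[dict]:
        ...


class Consumer(ABC):
    """Hypothetical base interface: anything that accepts messages."""

    @abstractmethod
    def consume(self, message: dict) -> None:
        ...


class ListProducer(Producer):
    def __init__(self, items: List[dict]):
        self._items = list(items)

    def messages(self) -> Iterator[dict]:
        return iter(self._items)


class RangeProducer(Producer):
    def __init__(self, n: int):
        self._n = n

    def messages(self) -> Iterator[dict]:
        return ({"id": i} for i in range(self._n))


class CollectingConsumer(Consumer):
    def __init__(self):
        self.seen: List[dict] = []

    def consume(self, message: dict) -> None:
        self.seen.append(message)


def run_pipeline(producer: Producer, consumer: Consumer) -> int:
    """Move every message from producer to consumer and return the count.

    The function depends only on the base interfaces, so any conforming
    subclass can be substituted without changing it.
    """
    count = 0
    for message in producer.messages():
        consumer.consume(message)
        count += 1
    return count
```

Swapping ListProducer for RangeProducer leaves run_pipeline untouched, which is the substitutability that LSP asks for.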

To automate TensorFlow model training on GCP, we can use Cloud Storage to store all the data and leverage the platform services available on GCP. Using GCP's resources, we can use existing TensorFlow models or build on existing TensorFlow model architectures, with resource provisioning and scaling. We can also implement CI/CD to automate the model training and deployment pipeline. We can also plan for monitoring and logging of the model training process. Model training is going to involve tracking various metrics, such as model precision or accuracy. There is another thing called order in.

To apply the interface segregation principle in Java-based systems, we need to have simple, specific interfaces: small, basic interfaces for different functionalities, such as data ingestion, processing, and output. Next, we need service components that implement only the interfaces that are relevant to their particular functionality. The last point is flexibility and scalability: this approach offers flexibility, as components can be easily modified or extended without affecting others. The key is to ensure interfaces are focused and not overloaded with unrelated methods, in line with the ISP, to maintain a modular design.
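A minimal sketch of such small, focused interfaces follows, written in Python with typing.Protocol. The interface names (Ingestor, Transformer, Sink) and the concrete classes are hypothetical; each component implements only the one interface it needs.

```python
from typing import List, Protocol


class Ingestor(Protocol):
    def ingest(self) -> List[str]:
        ...


class Transformer(Protocol):
    def transform(self, rows: List[str]) -> List[str]:
        ...


class Sink(Protocol):
    def write(self, rows: List[str]) -> None:
        ...


class StaticIngestor:
    """Implements only Ingestor: returns a fixed list of rows."""

    def __init__(self, rows: List[str]):
        self._rows = list(rows)

    def ingest(self) -> List[str]:
        return list(self._rows)


class UppercaseTransformer:
    """Implements only Transformer."""

    def transform(self, rows: List[str]) -> List[str]:
        return [r.upper() for r in rows]


class MemorySink:
    """Implements only Sink: keeps written rows in memory."""

    def __init__(self):
        self.rows: List[str] = []

    def write(self, rows: List[str]) -> None:
        self.rows.extend(rows)


def run(ingestor: Ingestor, transformer: Transformer, sink: Sink) -> None:
    # Each argument is typed by the narrow interface it must satisfy,
    # so no component is forced to provide methods it does not use.
    sink.write(transformer.transform(ingestor.ingest()))
```

A single wide interface with ingest, transform, and write would force MemorySink to carry two methods it never uses, which is the situation the principle is meant to avoid.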