About Me

Hello, and welcome! I'm Dwith Chenna, an R&D professional focused on algorithm development and optimization in computer vision, deep learning, and Edge AI. With over a decade of international experience, I specialize in enhancing the performance and efficiency of deep learning models on constrained hardware.

In my career, I've had the privilege of holding impactful roles at prominent organizations such as the Center for Devices and Radiological Health (CDRH) at the FDA, Cadence Design Systems, Magic Leap, and currently AMD. Across these roles, I have worked on the complex challenges of developing and optimizing deep learning models on resource-constrained hardware, such as Digital Signal Processors (DSPs) and Neural Processing Units (NPUs).

I am actively involved in the tech community, participating in conferences, reviewing research, and speaking at industry events. Thanks for stopping by—let's explore the future of technology together!

What I Do

  • AI Inference Optimization

    Enabling end-to-end AI inferencing solution productization for AMD's CPU/NPU/Embedded devices, with expertise in usability analysis and performance data generation of Model Inferencing flow.

  • Computer Vision & Deep Learning

    Developing foundation models that perform well on target hardware through efficient architecture design. Expertise in quantization, optimization, and performance tuning of deep learning models on Vision DSPs.

  • Product Development

    Developing benchmarking plans to support robust product development. Engaging with software developers to work on product development, analyzing product specification/usability, and understanding customer pain points.

Resume

Work Experience

  1. MTS Product Engineer, AI Inference

    AMD • 2024 — Present

    • Enabling end-to-end AI inferencing solution productization for AMD's CPU/NPU/Embedded devices
    • Conducting usability analysis and performance data generation of Model Inferencing flow
    • Developing benchmarking plans to support robust product development
    • Engaging with software developers to work on product development, analyzing product specification/usability, and understanding customer pain points
    • Researching and developing guidelines for advanced quantization techniques to optimize AI models for deployment on AMD CPU/GPU/NPU devices, ensuring high performance and reduced resource consumption
    • Understanding product usage holistically to discover possible customer pain points across software components and ecosystems (Compiler, Quantizer, Optimizer, Runtime, Profiler, Visualizer, Debugger, etc.)
    • Interacting with internal and external customers to understand their issues and helping them enable their workflows
    • Collaborating with sales, marketing, and R&D teams on strategic business engagements

  2. Senior Software Engineer, Computer Vision

    Magic Leap • 2020 — 2024

    • Developed foundation models for performance on hardware through efficient architecture
    • Specialized in quantization, optimization, and tuning performance of Deep Learning models on Vision DSP
    • Solved complex challenges of developing and optimizing deep learning models on resource-constrained hardware
    • Worked with Digital Signal Processors (DSPs) and Neural Processing Units (NPUs)

  3. Lead Design Engineer

    Cadence Design Systems • 2017 — 2020

    • Individual contributor on the AI and deep learning team in Cadence's Tensilica business unit
    • Primarily involved in developing deep learning and computer vision applications on the Tensilica Vision line of DSPs
    • Developed DSP implementations for classification, super-resolution, object detection, and segmentation networks
    • Optimized implementations to drive highly visible performance metrics (e.g., latency, memory, bandwidth)
    • Acquired deep knowledge of the OpenVX and Neural Networks API standards for efficient implementation
    • Participated in the definition of next-generation Vision platforms and DSP architectures

  4. ORISE Research Fellow

    FDA - Center for Devices and Radiological Health (CDRH) • 2016 — 2017

    • Development of mass fever screening for detecting and preventing the spread of viral diseases
    • Non-rigid image registration algorithm for thermal and visible image registration
    • Feature detection in visible images using trained cascade classifier using neural networks
    • Development of advanced numerical methods predicting temperature using Data Analytic Toolbox
    • Extensive analysis of the effects of registration models and other variables on system performance

Education

  1. Master of Science in Electrical and Computer Engineering

    University of Maryland, College Park

  2. Bachelor of Technology in Electronics and Communication Engineering

    National Institute of Technology - Warangal

Awards and Honors

  1. Computing's Top 30 Early Career Professionals

    IEEE Computer Society • Dec. 2024

  2. AMD Spotlight Award

    Advanced Micro Devices, San Jose, CA • Nov. 2024

  3. Industry Rising Star Award

    IEEE Computer Society, Santa Clara Valley • Aug. 2024

  4. ORISE Research Fellowship

    Medical Devices FDA, Silver Spring, MD • May 2016-May 2017

Invited Talks / Speaking Engagements

  1. Quantization Techniques for Efficient Deployment of Large Language Models

    Embedded Vision Summit, Edge AI and Vision Alliance • May 2025

  2. Efficient LLM Deployment at the Edge Through Quantization

    ACM technical speaker series • July 2024

  3. DNN Quantization: Theory to Practice

    Embedded Vision Summit, Edge AI and Vision Alliance • May 2024

  4. Practical Approaches to DNN Quantization

    Embedded Vision Summit, Edge AI and Vision Alliance • May 2023

Certifications & Skills

  1. Self-Driving Car Specialization

    Coursera (deeplearning.ai) • 2021

  2. Generative Adversarial Networks (GANs) Specialization

    Coursera (deeplearning.ai) • 2021

  3. AI for Medical Diagnosis

    Coursera (deeplearning.ai) • 2021

  4. Technical Skills

    Programming: Python, C/C++, OpenCL, OpenVX, TensorFlow, PyTorch, OpenCV
    AI/ML: Computer Vision, Deep Learning, Model Optimization, Quantization
    Hardware: DSPs, NPUs, Edge Computing, Embedded Systems
    Tools: Git, Docker, Linux, Jupyter, Visual Studio

Blogs

  • AMD Quark Quantizer for Efficient AI Model Deployment

    AMD Developer

    This blog introduces the new AMD Quark Quantizer and highlights some of the features it offers, providing a foundation for exploring various other quantization techniques and mechanisms supported by AMD Quark.

  • AI inference in edge computing

    Edge Computing

    AI Inference in Edge Computing: Benefits and Use Cases

    Exploring the advantages of running AI inference at the edge and examining real-world applications across various industries.

  • Hardware Accelerators for AI

    Hardware Acceleration

    Improving AI Inference Performance with Hardware Accelerators

    An in-depth look at how specialized hardware accelerators are transforming AI inference performance and efficiency.

  • Quantization Analysis

    Edge AI

    Quantization of Convolutional Neural Networks: Quantization Analysis

    A technical exploration of quantization analysis techniques for convolutional neural networks and their impact on model performance.

  • Model Quantization

    Edge AI

    Quantization of Convolutional Neural Networks: Model Quantization

    Examining the principles and techniques of model quantization for convolutional neural networks to optimize deployment.

  • Practical CNN Quantization: Theory to Practice

    Edge AI

    From Theory to Practice: Quantizing Convolutional Neural Networks for Practical Deployment

    Bridging the gap between theoretical quantization techniques and practical deployment considerations for convolutional neural networks.


Publications

  • EdgeAI in Self-Sustaining Systems With AI and IoT

    Dwith Chenna, The Convergence of Self-Sustaining Systems With AI and IoT, pp. 174-205. IGI Global, 2024.

  • Quantization of Convolutional Neural Networks: A Practical Approach

    Dwith Chenna, International Journal for Research Trends and Innovation, Volume 8, Issue 12, Dec 2023.

  • Evolution of Convolutional Neural Network (CNN): Compute vs Memory bandwidth for Edge AI

    Dwith Chenna, Feed Forward Magazine, Volume 2 Issue 3, July-Sept 2023.

  • Facial and Oral Temperature Data from a Large Set of Human Subject Volunteers

    Wang, Q.; Zhou, Y.; Ghassemi, P.; Chenna, D.; Chen, M.; Casamento, J.; Pfefer, J.; McBride, D., PhysioNet, Version 1.0.0, 2023.

  • A Novel Wavelet Based Image Fusion for Brain Tumor Detection

    A. S. Vivek Angoth, CYN Dwith, International Journal of Computer Vision & Signal Processing, 14(6), 2023, pp. 18-28.

  • Efficient Architecture for Variable Block Size Motion Estimation in H.264/AVC

    P. Muralidhar, C. B. Rama Rao, and C. Y. N. Dwith, ACEEE Journal on Signal and Image Processing, Vol. 5, pp. 77-84, Jan. 2014.

  • Wireless Home Automation using social networking websites

    A. Gupta, C. Y. N. Dwith and B. A. V. Ramakanth, 20th Annual International Conference on Advanced Computing and Communications (ADCOM), Bangalore, 2014, pp. 12-15.

  • Fixed-point implementation of Convolutional Neural Networks on Digital Signal Processor (DSP)

    Dwith Chenna, IJAIML, Volume 2, Issue 01, Jan-Dec 2023.

  • Free-Form Deformation for Registration of Visible and Infrared Facial Images in Fever Screening

    Chenna, Y.N.D.; Ghassemi, P.; Pfefer, T.J.; Casamento, J.; Wang, Q., Sensors, 18, 125, Jan 2018.

  • Multi-Modality image registration for effective thermographic fever screening

    Dwith, C.Y.N.; Ghassemi, P.; Pfefer, J.; Casamento, J.; Wang, Q., Proceedings of the SPIE, Multimodal Biomedical Imaging XII, Volume 10057, Jan 2017.

  • Parallel Implementation of LBP Based Face Recognition on GPU Using OpenCL

    Dwith CYN, Rathna GN, 13th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT), pp. 755-760, May 2012.

  • Parallel Texture Classification with Local Binary Pattern Descriptors using OpenCL

    C. Y. N. Dwith and G. N. Rathna, International Journal of Computer Applications (IJCA), pp.755-760, 2012.

  • Wavelet based image fusion algorithm for malignant brain tumor detection

    Dwith CYN, Vivek A, Amarjot S., International Journal of Image, Graphics and Signal Processing (IJIGSP), 2013.
