

Dedicated Artificial Intelligence (AI) IP: ‘GPT’ AI Accelerator

The configurable ‘GPT’ AI accelerator is designed for Convolutional Neural Network (CNN) inference. The highly optimized design supports convolution, pooling, dropout, padding, and programmable activation functions. Running at up to 1 GHz in 28 nm technology, the design is built from blocks of 288 MAC units, scalable to beyond 2,000 MACs. Performance, power, and accuracy can be traded off using built-in integer or 16-bit floating-point operations. GPT provides a library of popular CNN networks; support for frameworks such as TensorFlow and Caffe is in development.
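To make the MAC-array idea concrete, the sketch below (illustrative only, not GPT's actual API) shows the core primitive such an accelerator computes: a 2-D convolution where each output element is a chain of multiply-accumulate (MAC) operations, followed by a programmable activation function (ReLU here).

```python
def conv2d(image, kernel, activation=lambda x: max(0.0, x)):
    """Naive 2-D convolution: each output element is one chain of
    multiply-accumulate (MAC) operations -- the unit the accelerator
    replicates in hardware (288 MACs per block, scalable beyond 2,000).
    The activation argument models a programmable activation function."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0  # accumulator of the MAC chain
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]  # one MAC
            row.append(activation(acc))  # programmable activation
        out.append(row)
    return out
```

In hardware, the inner accumulation loop is what the parallel MAC array collapses into a few cycles; the software loop nest is only a functional model.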



  • Independent of external controller

  • Accelerates operations on high-dimensional tensors

  • Highly parallel execution across multiple tasks or data sources

  • Optimized for performance / power / area

  • IDE for programming / debugging / profiling

  • Easy to develop application extensions

  • Software ecosystem

  • Support for C/C++/Assembly

  • Cycle-accurate simulator

  • Compatible with open platforms (e.g., CaffeNET, TensorFlow)

  • Embedded CNN API

  • Image Processing API

  • Developer Support API

  • Extensive optimization framework

  • Automatic inference bit conversion

  • Scalable tensor computations

  • Configurable memory

  • Supports HSA Runtime API for easy heterogeneous computing SoC design
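"Automatic inference bit conversion" typically means mapping trained floating-point weights down to the accelerator's fixed-point formats. A minimal sketch of one common approach, symmetric linear quantization (the function names and algorithm are our illustration, not GPT's documented conversion pass):

```python
def to_fixed_point(values, bits=8):
    """Symmetric linear quantization of float weights to signed
    `bits`-bit fixed point: one illustration of what an automatic
    inference bit-conversion pass might do (hypothetical sketch,
    not GPT's documented algorithm)."""
    qmax = 2 ** (bits - 1) - 1           # e.g. 127 for 8-bit
    scale = max(abs(v) for v in values) / qmax or 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale

def from_fixed_point(quantized, scale):
    """Dequantize back to floats, e.g. to check accuracy loss."""
    return [q * scale for q in quantized]
```

The trade-off the datasheet alludes to is visible here: 8-bit fixed point halves memory traffic versus 16-bit, at the cost of quantization error bounded by `scale / 2` per weight.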



  • Clock speed: 500 MHz–1.2 GHz

  • Core memory: 500 KB–2 MB per core

  • 8/16-bit fixed point

  • 12/16-bit floating point

  • 1–8 cores

  • Up to 2.8 TFLOPS (floating point) or TOPS (integer) at 16-bit precision

  • Power: <1 W
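The peak-throughput figure follows from the standard counting convention: each MAC contributes 2 operations (one multiply, one add) per cycle, so peak = MACs × 2 × clock. The operating points below are our illustrative numbers chosen within the datasheet's stated ranges, not documented configurations:

```python
def peak_tops(mac_units, clock_hz, ops_per_mac=2):
    """Peak throughput in TOPS, counting each MAC as a multiply plus
    an add (2 ops) per cycle -- the usual convention."""
    return mac_units * ops_per_mac * clock_hz / 1e12

# A single 288-MAC building block at 1 GHz:
#   peak_tops(288, 1e9)     -> 0.576 TOPS
# An illustrative 2,000-MAC configuration at 700 MHz (within the
# stated 500 MHz-1.2 GHz range) reaches the quoted figure:
#   peak_tops(2000, 700e6)  -> 2.8 TOPS
```

Note these are peak numbers; sustained throughput depends on how well the convolution tiling keeps the MAC array fed from core memory.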
