AI ACCELERATOR
Dedicated Artificial Intelligence (AI) IP: ‘GPT’ AI Accelerator
The configurable ‘GPT’ AI accelerator is designed for Convolutional Neural Network (CNN) inference. The highly optimized design supports convolution, pooling, dropout, padding, and programmable activation functions. Running at up to 1 GHz in 28 nm technology, the design is built from 288-MAC building blocks and scales to beyond 2,000 MACs. Performance, power, and accuracy can be balanced using the built-in integer or 16-bit floating-point operations. GPT provides a library of popular CNN networks; support for frameworks such as TensorFlow and Caffe is in development.
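As a rough guide to how the MAC count and clock relate to the quoted throughput figures, the short C sketch below estimates peak multiply-accumulate throughput. It assumes the common convention that one MAC counts as two operations per cycle; the 2,304-MAC / 610 MHz data point is only an illustrative configuration consistent with the 2.8 TOPS figure in the specifications, not a documented product configuration.

```c
/* Back-of-envelope peak-throughput estimate.
 * Assumes each MAC contributes one multiply and one add per cycle
 * (2 ops/cycle), the usual convention behind TOPS figures; the
 * vendor's exact counting method is not stated in this brief. */
#include <stdio.h>

static double peak_tops(unsigned macs, double clock_ghz) {
    return macs * 2.0 * clock_ghz / 1000.0;  /* GOPS -> TOPS */
}

int main(void) {
    /* One 288-MAC building block at 1 GHz */
    printf("288 MACs  @ 1.00 GHz: %.3f TOPS\n", peak_tops(288, 1.0));
    /* A hypothetical 8-core, 2304-MAC array at ~610 MHz lands near
     * the 2.8 TOPS quoted in the specifications. */
    printf("2304 MACs @ 0.61 GHz: %.3f TOPS\n", peak_tops(2304, 0.61));
    return 0;
}
```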
Features
- Independent of external controller
- Accelerates high-dimensional tensors
- Highly parallel, with multi-tasking or multiple data sources
- Optimized for performance / power / area
- IDE for programming / debugging / profiling
- Easy to develop application extensions
- Software eco-system
- Support for C/C++/Assembly
- Cycle-accurate simulator
- Compatible with open platforms (e.g. Caffe, TensorFlow, etc.)
- Embedded CNN API
- Image Processing API
- Developer Support API
- Extensive optimization framework
- Automatic inference bit conversion
- Scalable tensor computations
- Configurable memory
- Supports the HSA Runtime API for easy heterogeneous-computing SoC design (see the sketch below)
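Because the brief lists HSA Runtime API support, host-side integration would follow the standard HSA runtime flow. The minimal C sketch below uses only the generic, publicly documented HSA calls (hsa_init, hsa_iterate_agents, hsa_agent_get_info, hsa_shut_down) to enumerate agents; any GPT-specific agent name or vendor extension is not described in this brief and is therefore not assumed here.

```c
/* Minimal sketch: discover HSA agents on a platform whose runtime
 * exposes the accelerator. Only standard HSA Runtime API calls are
 * used; nothing GPT-specific is assumed. */
#include <stdio.h>
#include <hsa/hsa.h>

static hsa_status_t print_agent(hsa_agent_t agent, void *data) {
    (void)data;
    char name[64] = {0};
    hsa_agent_get_info(agent, HSA_AGENT_INFO_NAME, name);

    hsa_device_type_t type;
    hsa_agent_get_info(agent, HSA_AGENT_INFO_DEVICE, &type);

    printf("agent: %-16s type: %d\n", name, (int)type);
    return HSA_STATUS_SUCCESS;  /* keep iterating over all agents */
}

int main(void) {
    if (hsa_init() != HSA_STATUS_SUCCESS) {
        fprintf(stderr, "HSA runtime init failed\n");
        return 1;
    }
    hsa_iterate_agents(print_agent, NULL);
    hsa_shut_down();
    return 0;
}
```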
Specifications
- Clock speed: 500 MHz to 1.2 GHz
- Core memory: 500 KB to 2 MB per core
- 8/16-bit fixed point (see the quantization sketch below)
- 12/16-bit floating point
- 1 to 8 cores
- Up to 2.8 TFLOPS or TOPS (16-bit)
- Power: < 1 W
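The fixed-point modes, together with the automatic inference bit conversion listed under Features, imply a quantization step before integer inference. As a generic illustration only (the scale-selection policy GPT actually uses is not documented in this brief), the C sketch below quantizes a float tensor to int8 using a simple symmetric per-tensor scale.

```c
/* Generic int8 quantization illustration; not the vendor's method.
 * A symmetric per-tensor scale maps the float range to [-128, 127]. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <math.h>

static void quantize_int8(const float *in, int8_t *out, size_t n, float *scale) {
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; ++i)
        if (fabsf(in[i]) > max_abs) max_abs = fabsf(in[i]);
    *scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        long q = lroundf(in[i] / *scale);   /* round to nearest integer */
        if (q > 127) q = 127;               /* clamp to int8 range */
        if (q < -128) q = -128;
        out[i] = (int8_t)q;
    }
}

int main(void) {
    float w[4] = {0.50f, -1.25f, 0.031f, 0.90f};
    int8_t q[4];
    float scale;
    quantize_int8(w, q, 4, &scale);
    for (int i = 0; i < 4; ++i)
        printf("%+.3f -> %4d (dequantized %+.3f)\n", w[i], q[i], q[i] * scale);
    return 0;
}
```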