Post date: Sep 22, 2020 11:43:9 PM
Original paper:
CMSIS-NN Neural Network Kernels Boost Efficiency in Microcontrollers by ~5x
Results using CIFAR-10 below.
CIFAR-10 CNN as implemented by CMSIS-NN.
Performance of CMSIS-NN over baseline CNN written using arm_conv in CMSIS-DSP.
Platform: NUCLEO-F746ZG mbed board with an Arm Cortex-M7 core running at 216 MHz.
Note: using state-of-the-art DNNs, researchers have achieved ≥93% accuracy on CIFAR-10.
Performance on STM32 compared with other SoCs, as reported by AiOTA Labs.
The players:
STM32F7 from STMicroelectronics. Up to 216 MHz.
GAP8 from GreenWaves Technologies, with CNN benchmarks. Max frequency: 175 MHz for the "Cluster", 250 MHz for the "Fabric Controller". RISC-V cores.
i.MX 6ULL from NXP. This guy goes up to 900 MHz.
It looks like the GAP8 is many times more power-efficient than the Cortex-M7-based STM32: at 10 FPS, the GAP8 needs 3.7 mW versus 60 mW on the STM32, roughly a 16x difference. But it comes at a price. A very heavy price. An Arduino-compatible GAP8 board costs 100,00€. And it's not running CMSIS-NN either: CMSIS-NN is for Arm processors only.
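To put the power figures in per-inference terms, here is a quick back-of-the-envelope sketch in Python. It only uses the two numbers quoted above (3.7 mW and 60 mW at 10 FPS) and assumes those are average power draws while sustaining that frame rate; the function and variable names are mine, not from the benchmark.

```python
def energy_per_frame_mj(power_mw: float, fps: float) -> float:
    """Energy per inference in millijoules.

    mW is mJ per second, so dividing by frames per second gives mJ per frame.
    Assumes power_mw is the average draw while sustaining `fps`.
    """
    return power_mw / fps


FPS = 10.0  # frame rate quoted in the comparison above

gap8_mj = energy_per_frame_mj(3.7, FPS)    # GAP8: 3.7 mW at 10 FPS
stm32_mj = energy_per_frame_mj(60.0, FPS)  # STM32: 60 mW at 10 FPS
ratio = stm32_mj / gap8_mj                 # how many times more energy the STM32 uses

print(f"GAP8:  {gap8_mj:.2f} mJ/frame")
print(f"STM32: {stm32_mj:.2f} mJ/frame")
print(f"STM32 / GAP8: {ratio:.1f}x")
```

Since both numbers are taken at the same frame rate, the energy ratio is the same as the power ratio; the per-frame figure is just handier when sizing a battery.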
Back to CMSIS-NN.
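Models are trained in TensorFlow, then converted and run on the device with TensorFlow Lite for Microcontrollers, which can use CMSIS-NN as its optimized kernel backend.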