CMSIS-NN, INTRO
新竹碼農 (Hsinchu Coders)
Anthony Liu, 2018/03/08

RESOURCES
- Source: https://github.com/ARM-software/CMSIS_5
- Web 1: https://developer.arm.com/embedded/cmsis
- Web 2: http://www2.keil.com/mdk5/cmsis/
- Paper: https://arxiv.org/abs/1801.06601
- Manual: http://arm-software.github.io/CMSIS_5/NN/html/index.html

CMSIS 5.3.0
- http://www2.keil.com/mdk5/cmsis/
- https://developer.arm.com/embedded/cmsis
- CMSIS: Cortex Microcontroller Software Interface Standard
- CMSIS-NN first appeared in 5.2.1 dev 3
- Components: CMSIS-CORE, CMSIS-RTOS, CMSIS-DSP, CMSIS-Driver, CMSIS-SVD, CMSIS-DAP, CMSIS-Pack, CMSIS-NN, CMSIS-Zone (planned)

CMSIS
https://developer.arm.com/embedded/cmsis

CMSIS-NN
- DSP extension: Cortex-M0 (N) / Cortex-M3 (N) / Cortex-M4 (Y) / Cortex-M7 (Y) / Cortex-M33 (optional)
- For inference only, on devices with limited computation power
- CPU: tens of MHz up to 192 MHz (Cortex-M4) or 400 MHz (Cortex-M7)
- Memory: tens of KB to a few MB
- Kernels support the q7_t and q15_t fractional data types, range [-1.0, 1.0)
- Functions
  - Neural Network Convolution Functions
  - Neural Network Activation Functions
  - Fully-connected Layer Functions
  - Neural Network Pooling Functions
  - Softmax Functions

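The q7_t and q15_t types store fractions as signed integers scaled by 2^7 and 2^15. The following is a minimal sketch of that mapping, assuming round-to-nearest and saturation at the range limits. The helper names are hypothetical and not part of CMSIS-NN; the typedefs mirror the int8_t/int16_t definitions in the CMSIS headers.

```c
#include <stdint.h>

typedef int8_t  q7_t;   /* 1 sign bit + 7 fractional bits  */
typedef int16_t q15_t;  /* 1 sign bit + 15 fractional bits */

/* Scale x, round to nearest, and saturate into [lo, hi]. */
static int32_t scale_and_saturate(float x, float scale, int32_t lo, int32_t hi)
{
    float scaled = x * scale + (x >= 0.0f ? 0.5f : -0.5f);
    if (scaled >= (float)hi) return hi;
    if (scaled <= (float)lo) return lo;
    return (int32_t)scaled;  /* truncation after the +/-0.5 offset rounds to nearest */
}

/* Hypothetical helper: real value in [-1.0, 1.0) to q7_t. */
q7_t float_to_q7(float x)
{
    return (q7_t)scale_and_saturate(x, 128.0f, INT8_MIN, INT8_MAX);
}

/* Hypothetical helper: real value in [-1.0, 1.0) to q15_t. */
q15_t float_to_q15(float x)
{
    return (q15_t)scale_and_saturate(x, 32768.0f, INT16_MIN, INT16_MAX);
}

/* Hypothetical helper: q7_t back to a real value. */
float q7_to_float(q7_t q)
{
    return (float)q / 128.0f;
}
```

For example, 0.5 maps to 64 in q7_t and to 16384 in q15_t, while 1.0 is outside the representable range and saturates to 127 in q7_t.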
SUPPORT
- Data conversion
  - arm_q7_to_q15_no_shift
  - arm_q7_to_q15_reordered_no_shift

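As the "no_shift" suffix suggests, these helpers widen q7_t values to q15_t without rescaling them. The sketch below is a plain-C reference of that idea, not the library source: q7_to_q15_no_shift_ref is a hypothetical name, and the output ordering of the "reordered" variant is not modeled here.

```c
#include <stddef.h>
#include <stdint.h>

typedef int8_t  q7_t;
typedef int16_t q15_t;

/* Widen each element by sign extension only: the integer value is kept,
 * so -1 stays -1 instead of becoming -256 as it would with a shift by 8. */
void q7_to_q15_no_shift_ref(const q7_t *src, q15_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (q15_t)src[i];
    }
}
```

The widened values let 8-bit data feed 16-bit multiply-accumulate code without changing their numeric scale.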
CONVOLUTION
- arm_convolve_HWC_q7_basic
- arm_convolve_HWC_q15_basic
- arm_convolve_HWC_q7_fast
- arm_convolve_HWC_q7_fast_nonsquare
- arm_convolve_HWC_q7_RGB
- arm_convolve_HWC_q15_fast
- arm_convolve_1x1_HWC_q7_fast_nonsquare
- arm_depthwise_separable_conv_HWC_q7
- arm_depthwise_separable_conv_HWC_q7_nonsquare

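HWC in these function names refers to the tensor layout: height, then width, then channel, with the channel index varying fastest in memory. A small sketch of the corresponding flat-array index follows; hwc_index is a hypothetical helper, not a CMSIS-NN function.

```c
#include <stddef.h>

/* Flat offset of element (y, x, c) in an HWC tensor that has
 * `width` columns and `channels` channels per pixel. */
size_t hwc_index(size_t y, size_t x, size_t c, size_t width, size_t channels)
{
    return (y * width + x) * channels + c;
}
```

For a 32x32 RGB image (width 32, 3 channels), the three channel values of one pixel sit next to each other, and moving one pixel to the right advances the offset by 3.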
ACTIVATION
- ReLU
  - arm_relu_q7
  - arm_relu_q15
- Sigmoid / Tanh
  - arm_nn_activations_direct_q7
  - arm_nn_activations_direct_q15

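ReLU on fixed-point data needs no rescaling, since it only clamps negative values to zero. A minimal in-place reference sketch, assuming the q7_t typedef from the CMSIS headers; relu_q7_ref is a hypothetical name and this is not the optimized library code.

```c
#include <stddef.h>
#include <stdint.h>

typedef int8_t q7_t;

/* In-place ReLU: negative entries become 0, others are left unchanged. */
void relu_q7_ref(q7_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (data[i] < 0) {
            data[i] = 0;
        }
    }
}
```

Working in place avoids a second activation buffer, which matters when RAM is limited to tens of KB.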
POOLING
- Supports max-pooling and average-pooling in 1.7 format (q7_t)
  - arm_maxpool_q7_HWC
  - arm_avepool_q7_HWC

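Max-pooling over an HWC tensor can be sketched as below. This is a simplified reference under stated assumptions: a square input, no padding, and a separate output buffer. The function name and parameter list are hypothetical and do not reproduce the arm_maxpool_q7_HWC signature.

```c
#include <stddef.h>
#include <stdint.h>

typedef int8_t q7_t;

/* Max-pool a dim_in x dim_in x ch HWC tensor with a k x k window.
 * Output is dim_out x dim_out x ch, where dim_out = (dim_in - k) / stride + 1. */
void maxpool_q7_hwc_ref(const q7_t *in, size_t dim_in, size_t ch,
                        size_t k, size_t stride, q7_t *out)
{
    size_t dim_out = (dim_in - k) / stride + 1;

    for (size_t oy = 0; oy < dim_out; oy++) {
        for (size_t ox = 0; ox < dim_out; ox++) {
            for (size_t c = 0; c < ch; c++) {
                q7_t best = INT8_MIN;
                for (size_t ky = 0; ky < k; ky++) {
                    for (size_t kx = 0; kx < k; kx++) {
                        size_t y = oy * stride + ky;
                        size_t x = ox * stride + kx;
                        q7_t v = in[(y * dim_in + x) * ch + c];
                        if (v > best) {
                            best = v;
                        }
                    }
                }
                out[(oy * dim_out + ox) * ch + c] = best;
            }
        }
    }
}
```

Pooling a 4x4 single-channel input holding 0..15 with a 2x2 window and stride 2 gives the four values 5, 7, 13, 15.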
SOFTMAX
- Softmax based on powers of 2 (2^x) instead of the usual e^x
  - arm_softmax_q7
  - arm_softmax_q15

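Using base 2 turns each exponential into a bit shift, which is cheap on a microcontroller. The sketch below illustrates the idea only and is not the arm_softmax_q7 source: the function name, the 2^16 internal scaling, and the 0..127 output range are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef int8_t q7_t;

/* Weight of one input relative to the maximum: 2^(16 - d), or 0 once the
 * input is more than 16 below the maximum. */
static uint32_t pow2_weight(int d)
{
    return (d <= 16) ? (1u << (16 - d)) : 0u;
}

/* Base-2 softmax over integer inputs: out[i] ~ 127 * 2^in[i] / sum_j 2^in[j].
 * n must be at least 1. */
void softmax_base2_q7_ref(const q7_t *in, size_t n, q7_t *out)
{
    int max = in[0];
    for (size_t i = 1; i < n; i++) {
        if (in[i] > max) {
            max = in[i];
        }
    }

    /* The maximum element always contributes 2^16, so sum is never zero. */
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += pow2_weight(max - in[i]);
    }

    for (size_t i = 0; i < n; i++) {
        out[i] = (q7_t)((127u * pow2_weight(max - in[i])) / sum);
    }
}
```

For the inputs {10, 9, 9} the weights are in the ratio 2:1:1, so the outputs are 63, 31, 31 after integer truncation.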
FULLY-CONNECTED LAYER
- arm_fully_connected_q7
- arm_fully_connected_q7_opt
- arm_fully_connected_q15
- arm_fully_connected_q15_opt
- arm_fully_connected_mat_q7_vec_q15
- arm_fully_connected_mat_q7_vec_q15_opt

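A fully-connected layer in fixed point is a matrix-vector product accumulated in a wider integer, then shifted back and saturated to the output type. The following reference sketch assumes row-major weights and truncating shifts (no rounding). The function name is hypothetical; bias_shift and out_shift are named after the shift arguments described in the CMSIS-NN manual, but this is not the library implementation.

```c
#include <stddef.h>
#include <stdint.h>

typedef int8_t q7_t;

/* Clamp a 32-bit accumulator into the q7_t range. */
static q7_t saturate_q7(int32_t v)
{
    if (v > INT8_MAX) return INT8_MAX;
    if (v < INT8_MIN) return INT8_MIN;
    return (q7_t)v;
}

/* out[i] = sat((bias[i] * 2^bias_shift + sum_j vec[j] * mat[i][j]) >> out_shift) */
void fully_connected_q7_ref(const q7_t *vec, const q7_t *mat,
                            size_t dim_vec, size_t num_rows,
                            unsigned bias_shift, unsigned out_shift,
                            const q7_t *bias, q7_t *out)
{
    for (size_t i = 0; i < num_rows; i++) {
        /* Multiply instead of shifting so negative biases stay well defined. */
        int32_t acc = (int32_t)bias[i] * (int32_t)(1 << bias_shift);
        for (size_t j = 0; j < dim_vec; j++) {
            acc += (int32_t)vec[j] * (int32_t)mat[i * dim_vec + j];
        }
        /* Assumes >> on a negative value is an arithmetic shift,
         * which holds for GCC and Clang. */
        out[i] = saturate_q7(acc >> out_shift);
    }
}
```

With q7 inputs and q7 weights, an out_shift of 7 keeps the result in q7: 0.5 * 0.5 + 0.5 * 0.5 = 0.5, that is 64 * 64 + 64 * 64 = 8192, and 8192 >> 7 = 64.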
FOOTPRINT - 9,306 bytes total
  text  data  bss    dec   hex  filename
   132     0    0    132    84  ./SoftmaxFunctions/arm_softmax_q15.o
   154     0    0    154    9a  ./SoftmaxFunctions/arm_softmax_q7.o
   544     0    0    544   220  ./PoolingFunctions/arm_pool_q7_HWC.o
  2816     0    0   2816   b00  ./NNSupportFunctions/arm_nntables.o
    84     0    0     84    54  ./NNSupportFunctions/arm_q7_to_q15_no_shift.o
    72     0    0     72    48  ./NNSupportFunctions/arm_q7_to_q15_reordered_no_shift.o
   102     0    0    102    66  ./FullyConnectedFunctions/arm_fully_connected_q15.o
    88     0    0     88    58  ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15.o
   476     0    0    476   1dc  ./FullyConnectedFunctions/arm_fully_connected_mat_q7_vec_q15_opt.o
   486     0    0    486   1e6  ./FullyConnectedFunctions/arm_fully_connected_q15_opt.o
    86     0    0     86    56  ./FullyConnectedFunctions/arm_fully_connected_q7.o
   532     0    0    532   214  ./FullyConnectedFunctions/arm_fully_connected_q7_opt.o
   266     0    0    266   10a  ./ConvolutionFunctions/arm_convolve_1x1_HWC_q7_fast_nonsquare.o
   404     0    0    404   194  ./ConvolutionFunctions/arm_convolve_HWC_q15_basic.o
   450     0    0    450   1c2  ./ConvolutionFunctions/arm_convolve_HWC_q15_fast.o
   426     0    0    426   1aa  ./ConvolutionFunctions/arm_convolve_HWC_q7_basic.o
   434     0    0    434   1b2  ./ConvolutionFunctions/arm_convolve_HWC_q7_fast.o
   434     0    0    434   1b2  ./ConvolutionFunctions/arm_convolve_HWC_q7_fast_nonsquare.o
   428     0    0    428   1ac  ./ConvolutionFunctions/arm_convolve_HWC_q7_RGB.o
   298     0    0    298   12a  ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7.o
   378     0    0    378   17a  ./ConvolutionFunctions/arm_depthwise_separable_conv_HWC_q7_nonsquare.o
     4     0    0      4     4  ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15.o
     4     0    0      4     4  ./ConvolutionFunctions/arm_nn_mat_mult_kernel_q7_q15_reordered.o
   104     0    0    104    68  ./ActivationFunctions/arm_nn_activations_q15.o
    48     0    0     48    30  ./ActivationFunctions/arm_nn_activations_q7.o
    28     0    0     28    1c  ./ActivationFunctions/arm_relu_q15.o
    28     0    0     28    1c  ./ActivationFunctions/arm_relu_q7.o

EXAMPLE - CIFAR-10
- Layers (in order)
  - arm_convolve_HWC_q7_RGB()
  - arm_relu_q7()
  - arm_maxpool_q7_HWC()
  - arm_convolve_HWC_q7_fast()
  - arm_relu_q7()
  - arm_avepool_q7_HWC()
  - arm_convolve_HWC_q7_fast()
  - arm_relu_q7()
  - arm_avepool_q7_HWC()
  - arm_fully_connected_q7()
  - arm_softmax_q7()
- Array sizes
  - conv1_wt: 2,400
  - conv1_bias: 32
  - conv2_wt: 12,800
  - conv2_bias: 16
  - conv3_wt: 12,800
  - conv3_bias: 32
  - ip1_wt: 10
  - ip1_bias: 10
  - input_data: 3K
  - output_data: 10
  - col_buffer: 3,200
  - scratch_buffer: 40K

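The sizes listed for the CIFAR-10 example can be added up to get a rough memory budget. The sketch below only sums the figures exactly as they appear on the slide, assuming "K" means 1024; the array and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Weights and biases as listed: conv1..conv3 and ip1. */
static const uint32_t WEIGHT_SIZES[] = {2400, 32, 12800, 16, 12800, 32, 10, 10};

/* input_data, output_data, col_buffer, scratch_buffer (K taken as 1024). */
static const uint32_t BUFFER_SIZES[] = {3 * 1024, 10, 3200, 40 * 1024};

/* Sum n entries of a size table. */
uint32_t sum_sizes(const uint32_t *sizes, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += sizes[i];
    }
    return total;
}
```

With these figures the weights and biases total 28,100 and the data buffers total 47,242, which fits the "tens of KB" memory range the deck targets.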
PERFORMANCE
- CIFAR-10: speed showcase
- GRU: power-saving showcase