M. S. Ramaiah School of Advanced Studies 
M. Sc. (Engg.) in Electronics System Design Engineering 
GREESHMA S 
CWB0913004, FT-2013
6th Module Presentation
Module code : ESE2511 
Module name : Microcontrollers and Interfacing 
Module leader: Mr. Nagananda S.N. 
Presentation on : 07/05/2014 
ARM Boards for DSP Applications

Overview
•Introduction
•ARM9E-S
•DM3730
•Functional Block Diagram
•Block Diagram
•Software Architecture
•Characteristics of DSP Processors
•Features of DM3730
•Representing a Digital Signal
•Addition and Subtraction of Fixed-Point Signals
•Multiplication and Division of Fixed-Point Signals
•Square Root of a Fixed-Point Signal
•DSP on the ARM9E
•DSP on the ARM10E
•FIR Filters
•IIR Filters
•The Discrete and Fast Fourier Transform
•Applications
•Conclusion
•References

Introduction
Emerging standards for algorithms in many application areas have put further demands on the ability of processing platforms to deliver efficient control capability
ARM’s approach has been to design RISC core architectures with instruction sets that provide efficient support for particular applications, with an optimal balance between hardware and software implementation
To accelerate signal-processing algorithms, ARM adds new DSP instructions to the ARM instruction set
The ARM DSP extensions broaden the suitability of the ARM CPU family to applications that require intensive signal processing, while retaining the power and efficiency of a high-performance RISC microcontroller
The ARM DSP extensions have already been implemented in the ARM926EJ-S, ARM946E-S, ARM966E-S, and ARM9E-S

Introduction (contd.)
Processing digitized signals requires high memory bandwidth and fast multiply-accumulate operations
In a traditional two-core design, a microcontroller handles the user interface and a separate DSP processor manipulates digitized signals such as audio
A single-core design can reduce cost and power consumption compared with a two-core solution
The ARMv5TE extensions available in the ARM9E and later cores provide efficient multiply-accumulate operations
DSP applications are typically multiply and load-store intensive
Filtering is the most commonly used signal processing operation
Another very common algorithm is the Discrete Fourier Transform

ARM9E-S
The ARM9E-S core implements the ARMv5TE architecture
This includes an enhanced multiplier design for improved DSP performance
It is a 32-bit processor core
It offers high performance for very low power consumption and gate count 
The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles 
The reduced instruction set and related decode mechanism are much simpler than those of Complex Instruction Set Computer (CISC) designs 
This simplicity gives 
•a high instruction throughput 
•an excellent real-time interrupt response 
•a small, cost-effective processor macrocell

DM3730
Based on enhanced device architecture 
Integrated on TI’s advanced 45-nm technology 
The device supports high-level operating systems (HLOS) and real-time operating systems (RTOS)
Fully backward compatible

Functional Block Diagram
Figure 1: DM3730 Functional Block Diagram

Block Diagram
Benefits 
•2000 DMIPS for operating systems such as Linux, Windows CE, and RTOSs
•3-D graphics up to 20M polygons per second for robust GUIs 
•Backward compatible with OMAP3530 
Figure 2: DM3730 Block Diagram
Applications
•Smart connected devices 
•Patient monitoring 
•Media Player

Software Architecture
Figure 3: Software Architecture of DM3730
Figure legend:
•Industry-standard OS components
•TI-provided components
•Open-source components

Characteristics of DSP Processors
Harvard Architecture 
High performance MAC 
Saturating math 
SIMD instructions for parallel computation
Barrel shifters 
Floating point hardware

Features of DM3730
ARM microprocessor subsystem 
Enhanced direct memory access controller 
Video hardware accelerators 
Tile-based graphics architecture delivering up to 20 MPoly/sec
DSP instructions and data are little-endian
NEON multimedia architecture
Load-store architecture with non-aligned support
64 32-bit general-purpose registers
Six ALUs, each supporting single 32-bit, dual 16-bit, or quad 8-bit arithmetic per clock cycle

Representing a Digital Signal
Figure 4: Digitizing an Analogue Signal
x is the signal and t is time
In an analogue signal x[t], the index t and the value x are both continuous real variables
On ARM cores without floating-point hardware, DSP code typically uses a fixed-point representation
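As a minimal sketch of what a fixed-point representation means in practice, the code below converts between real values and Q15 integers. It assumes the usual Qn convention, in which an integer X stands for the real value x = X / 2^n; the choice of Q15 and the helper names `to_q15` and `from_q15` are illustrative assumptions, not taken from the slides.

```c
#include <stdint.h>

#define Q 15  /* assumed format: 15 fractional bits in a 16-bit container */

/* Convert a real value in [-1, 1) to Q15, rounding to nearest and clamping. */
static int16_t to_q15(double x)
{
    double scaled = x * (double)(1 << Q);
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round half away from zero */
    if (scaled > 32767.0)  return INT16_MAX;  /* +1.0 is not representable */
    if (scaled < -32768.0) return INT16_MIN;
    return (int16_t)scaled;
}

/* Convert a Q15 integer back to the real value it represents. */
static double from_q15(int16_t X)
{
    return (double)X / (double)(1 << Q);
}
```

The clamp at the top of the range reflects that Q15 covers [-1, 1) only, so 1.0 maps to the largest representable value.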
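
Addition and Subtraction of Fixed-Point Signals
The general case is to convert the signal equation $y[t] = x[t] + c[t]$ into fixed-point format, where X[t], C[t], and Y[t] are the Qn, Qm, and Qd integer representations of x[t], c[t], and y[t]:
$Y[t] = 2^{d} y[t] = 2^{d-n} X[t] + 2^{d-m} C[t]$
or in integer C, reading a negative left shift as a right shift:
Y[t] = (X[t] << (d-n)) + (C[t] << (d-m));
Usually n = m = d is arranged, so normal integer addition gives a fixed-point addition
Provided d = m or d = n, only one operand needs a shift, which the ARM barrel shifter can fold into the add instruction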
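The mixed-format addition can be sketched in C as follows. This is an illustration under stated assumptions: the helper names `shl_signed` and `q_add` are hypothetical, a negative shift count is treated as a rounded right shift, and the code relies on `>>` being an arithmetic shift for negative values, which is implementation-defined in C but is the behaviour of common ARM compilers.

```c
#include <stdint.h>

/* Shift left by s bits; a negative s means a rounded arithmetic right shift. */
static int32_t shl_signed(int32_t v, int s)
{
    if (s >= 0)
        return (int32_t)((uint32_t)v << s);
    return (v + (1 << (-s - 1))) >> (-s);
}

/* Y (Qd) = X (Qn) + C (Qm); the caller guarantees that nothing overflows. */
static int32_t q_add(int32_t X, int n, int32_t C, int m, int d)
{
    return shl_signed(X, d - n) + shl_signed(C, d - m);
}
```

For example, adding 0.5 held in Q15 to 0.25 held in Q12 with a Q15 result shifts only the second operand, which matches the single-shift case d = n.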

Contd…
There are four common ways to prevent overflow
•Ensure that the X[t] and C[t] representations have one bit of spare headroom each
•Use a larger container type for Y than for X and C
•Use a smaller Q representation for y[t]. For example, if d = n − 1 = m − 1, then the operation becomes Y[t] = (X[t] + C[t]) >> 1
•Use saturation
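Two of these overflow strategies can be sketched in portable C. The function names are hypothetical, and the 64-bit intermediate is a simplification chosen for clarity; on ARMv5TE cores a saturating add is available directly as the QADD instruction, so hand-written assembly or compiler intrinsics would normally avoid the wider intermediate.

```c
#include <stdint.h>

/* Saturating 32-bit add: clamps to the container limits instead of wrapping. */
static int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;   /* larger container for the sum */
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Smaller Q representation (d = n - 1 = m - 1): the halved sum always fits. */
static int32_t half_add32(int32_t X, int32_t C)
{
    return (int32_t)(((int64_t)X + (int64_t)C) >> 1);
}
```

Saturation keeps the Q format unchanged at the cost of clipping, whereas the halved sum never clips but returns the result with one fewer fractional bit.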
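
Multiplication of Fixed-Point Signals
The general case is to convert the signal equation $y[t] = x[t]\,c[t]$ into fixed-point format:
$Y[t] = 2^{d} y[t] = 2^{d-n-m} X[t]\, C[t]$
or in integer C:
Y[t] = (X[t] * C[t]) >> (n + m - d);
Division of Fixed-Point Signals
The general case is to convert the signal equation $y[t] = x[t] / c[t]$ into fixed-point format:
$Y[t] = 2^{d} y[t] = 2^{d-n+m} X[t] / C[t]$
or in integer C:
Y[t] = (X[t] << (d - n + m)) / C[t];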
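A minimal C sketch of fixed-point multiplication and division is given below, assuming X, C, and Y are held in Qn, Qm, and Qd formats respectively. The names `q_mul` and `q_div` are hypothetical, and a 64-bit intermediate is used so that the full product or the pre-shifted numerator cannot overflow; the right shift of a negative product is assumed to be arithmetic.

```c
#include <stdint.h>

/* Y (Qd) = X (Qn) * C (Qm); assumes n + m >= d. */
static int32_t q_mul(int32_t X, int n, int32_t C, int m, int d)
{
    int64_t prod = (int64_t)X * (int64_t)C;   /* product is in Q(n+m) */
    return (int32_t)(prod >> (n + m - d));    /* rescale to Qd (truncating) */
}

/* Y (Qd) = X (Qn) / C (Qm); assumes d - n + m >= 0 and C != 0. */
static int32_t q_div(int32_t X, int n, int32_t C, int m, int d)
{
    int64_t num = (int64_t)X * ((int64_t)1 << (d - n + m));
    return (int32_t)(num / C);
}
```

With all three formats set to Q15, 0.5 times 0.5 gives 0.25 and 0.25 divided by 0.5 gives 0.5, which is a quick sanity check of the shift amounts.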
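
Square Root of a Fixed-Point Signal
The general case is to convert the signal equation $y[t] = \sqrt{x[t]}$ into fixed-point format:
$Y[t] = 2^{d} \sqrt{2^{-n} X[t]} = \sqrt{2^{2d-n} X[t]}$
or in integer C, where isqrt denotes an integer square root routine:
Y[t] = isqrt(X[t] << (2*d - n));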
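One possible integer square root routine is the classic bit-by-bit method sketched below; it is a generic technique chosen for illustration and is not necessarily the routine the slides had in mind. The names `isqrt32` and `q_sqrt` are hypothetical, and `q_sqrt` assumes a non-negative input, 2d − n ≥ 0, and a shifted value that still fits in 32 bits.

```c
#include <stdint.h>

/* Integer square root: largest r with r*r <= v (bit-by-bit method). */
static uint32_t isqrt32(uint32_t v)
{
    uint32_t r = 0;
    uint32_t bit = 1u << 30;          /* highest power of four in 32 bits */
    while (bit > v)
        bit >>= 2;
    while (bit != 0) {
        if (v >= r + bit) {
            v -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return r;
}

/* Y (Qd) = sqrt(X (Qn)); see the assumptions in the paragraph above. */
static uint32_t q_sqrt(uint32_t X, int n, int d)
{
    return isqrt32(X << (2 * d - n));
}
```

For a Q15 input and a Q15 output the pre-shift is 15 bits, so 0.25 (8192) becomes 2^28 and its integer root 2^14 is 0.5 in Q15.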

DSP on the ARM9E
The ARM9E core has a very fast pipelined multiplier array that performs a 32-bit by 16-bit multiply in a single issue cycle
Writing DSP Code for the ARM9E
The ARMv5TE multiply operations can unpack 16-bit halves from 32-bit words and multiply them
The multiply operations do not terminate early. Therefore use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
Multiply is the same speed as multiply-accumulate. Use the SMLAxy instruction rather than a separate multiply and add
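The multiply-accumulate pattern that these guidelines target looks like the following in C. This is only a sketch: whether each loop step is compiled to a single SMLAxy instruction depends on the compiler and its target options, which is an assumption here rather than something the slides state, and the function name `dot_q15` is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Dot product of two Q15 arrays using a 32-bit accumulator. */
static int32_t dot_q15(const int16_t *x, const int16_t *c, size_t len)
{
    int32_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc += (int32_t)x[i] * (int32_t)c[i];  /* 16x16 multiply-accumulate */
    return acc;  /* result is in Q30; the caller must rule out overflow */
}
```

Keeping the operands as 16-bit values and the accumulator as a 32-bit value is what lets a 16-by-16 multiply-accumulate instruction be used instead of a separate multiply and add.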

DSP on the ARM10E
The ARM10E implements a background loading mechanism to accelerate load and store multiples
It uses a 64-bit-wide data path that can transfer two registers on every background cycle
Writing DSP Code for the ARM10E
Load and store multiples run in the background to give a high memory bandwidth
Ensure data arrays are 64-bit aligned so that load and store multiple operations can transfer two words per cycle
The multiply operations do not terminate early. Therefore use MUL and MLA only for multiplying 32-bit integers. For 16-bit values use SMULxy and SMLAxy
The SMLAxy instruction takes one cycle more than SMULxy

FIR Filters
The finite impulse response (FIR) filter is a basic building block of many DSP applications
FIR filters are used to remove unwanted frequency ranges, boost certain frequencies, or implement special effects
The FIR filter is the simplest type of digital filter
The filtered sample y[t] depends linearly on a fixed, finite number of unfiltered samples x[t]: $y[t] = \sum_{i=0}^{M-1} c_i\, x[t-i]$
Calculating accumulated values A[t]: $A[t] = \sum_{i=0}^{M-1} C_i\, X[t-i]$
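A direct-form FIR filter can be sketched in fixed-point C as shown below. The Q15 formats for samples and coefficients, the rounding constant, and the name `fir_q15` are assumptions made for illustration; the code computes only outputs for which all M input samples exist, and it assumes the 32-bit accumulator does not overflow.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * y[t] = sum over i = 0..M-1 of c[i] * x[t-i], for t = M-1 .. N-1.
 * Writes N-M+1 outputs to y; assumes N >= M >= 1.
 */
static void fir_q15(const int16_t *x, size_t N,
                    const int16_t *c, size_t M, int16_t *y)
{
    for (size_t t = M - 1; t < N; t++) {
        int32_t acc = 1 << 14;                  /* rounding constant for >> 15 */
        for (size_t i = 0; i < M; i++)
            acc += (int32_t)c[i] * (int32_t)x[t - i];
        y[t - (M - 1)] = (int16_t)(acc >> 15);  /* Q30 accumulator -> Q15 */
    }
}
```

With two taps of 0.5 each the filter is a two-sample moving average, which makes the output easy to check by hand.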

IIR Filters
An infinite impulse response (IIR) filter is a digital filter that depends linearly on a finite number of input samples and a finite number of previous filter outputs
Mathematically: $y[t] = \sum_{i=0}^{M} b_i\, x[t-i] - \sum_{j=1}^{L} a_j\, y[t-j]$
Factorize the filter into a series of biquads, each an IIR filter with M = L = 2
Z-Transform: $H(z) = \dfrac{\sum_{i=0}^{M} b_i z^{-i}}{1 + \sum_{j=1}^{L} a_j z^{-j}}$
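A single biquad section (M = L = 2) can be sketched in C as below, using the sign convention in which the feedback terms are subtracted. The Q14 coefficient format, the direct form I structure, the output saturation, and the names `biquad_q14` and `biquad_step` are all illustrative assumptions; a 64-bit accumulator is used so that the five products cannot overflow.

```c
#include <stdint.h>

/* One biquad section; coefficients in Q14, samples in any 16-bit Q format. */
typedef struct {
    int16_t b0, b1, b2;   /* feed-forward coefficients */
    int16_t a1, a2;       /* feedback coefficients (a0 is taken as 1) */
    int16_t x1, x2;       /* previous two inputs */
    int16_t y1, y2;       /* previous two outputs */
} biquad_q14;

/* Process one sample: y = b0*x + b1*x1 + b2*x2 - a1*y1 - a2*y2. */
static int16_t biquad_step(biquad_q14 *s, int16_t x)
{
    int64_t acc = 1 << 13;                     /* rounding constant for >> 14 */
    acc += (int64_t)s->b0 * x + (int64_t)s->b1 * s->x1 + (int64_t)s->b2 * s->x2;
    acc -= (int64_t)s->a1 * s->y1 + (int64_t)s->a2 * s->y2;
    acc >>= 14;
    if (acc > INT16_MAX) acc = INT16_MAX;      /* saturate the output */
    if (acc < INT16_MIN) acc = INT16_MIN;
    s->x2 = s->x1;  s->x1 = x;                 /* shift the delay lines */
    s->y2 = s->y1;  s->y1 = (int16_t)acc;
    return (int16_t)acc;
}
```

Setting b0 = 0.5 and a1 = −0.5 with the other coefficients at zero gives the one-pole smoother y[t] = 0.5 x[t] + 0.5 y[t−1], whose step response (500, 750, 875 for an input of 1000) is simple to verify. Higher-order filters would chain several such sections.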

The Discrete Fourier Transform and the Fast Fourier Transform
The Discrete Fourier Transform (DFT) converts a time-domain signal to a frequency-domain signal
An FFT is an algorithm that efficiently computes the discrete Fourier transform and its inverse
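As a small worked case, the 4-point DFT needs no multiplications at all, because its twiddle factors are only 1, −j, −1, and j. The sketch below computes it in place on integer complex samples; the type `cplx32` and the function `dft4` are hypothetical names, and the same butterfly is the building block of a radix-4 FFT before twiddle factors are applied.

```c
#include <stdint.h>

typedef struct { int32_t re, im; } cplx32;

/* In-place 4-point DFT: X[k] = sum over t of x[t] * (-j)^(k*t). */
static void dft4(cplx32 x[4])
{
    cplx32 s02 = { x[0].re + x[2].re, x[0].im + x[2].im };  /* x0 + x2 */
    cplx32 d02 = { x[0].re - x[2].re, x[0].im - x[2].im };  /* x0 - x2 */
    cplx32 s13 = { x[1].re + x[3].re, x[1].im + x[3].im };  /* x1 + x3 */
    cplx32 d13 = { x[1].re - x[3].re, x[1].im - x[3].im };  /* x1 - x3 */

    x[0].re = s02.re + s13.re;  x[0].im = s02.im + s13.im;
    x[1].re = d02.re + d13.im;  x[1].im = d02.im - d13.re;  /* d02 - j*d13 */
    x[2].re = s02.re - s13.re;  x[2].im = s02.im - s13.im;
    x[3].re = d02.re - d13.im;  x[3].im = d02.im + d13.re;  /* d02 + j*d13 */
}
```

For the real input 1, 2, 3, 4 this yields 10, −2 + 2j, −2, and −2 − 2j, which matches the DFT definition evaluated by hand.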

Applications
Portable data terminals 
Navigation 
Auto Infotainment 
Gaming 
Medical Imaging 
Home automation 
Single-board computers

Conclusion
The DM3730 is cost-effective
It is low power and has high performance
The DM3730 delivers a nearly 40% increase in ARM performance, an over 50% increase in DSP performance, and twice the graphics capability, while reducing power consumption
Use a fixed-point representation for DSP applications where speed is critical and the dynamic range is moderate

References
1. DM3730, http://www.ti.com/lit/ds/symlink/dm3730.pdf
2. DM3730, http://www.ti.com/lit/ml/sprt571/sprt571.pdf
3. DM3730, http://media.digikey.com/pdf/ DM3730_AM3703TorpedoSOMBrief.pdf
