ARM vs Intel:
 the mobile era
   ミ歳笑 仍
Outline
   Introduction
   Terminology
   ARM
   Intel
   The standoff
ARM
   Founded: 1990
   Location: England
   Employees: 1,500 (2008)
   Products:
     IP blocks (licenses)
Intel
   Founded: 1968
   Location: USA
   Employees: 100,000 (2012)
   Products:
       x86 processors,
       Chipsets,
       SSD drives,
       Server equipment,
       ...
Mobile era
Devices
 Micro-servers
 TVs
 Laptops
 Ultrabooks
 Tablets
 Phones
Trends
 Lower power consumption
 Smaller size and weight
 Heavy use of networks (Internet)
TERMINOLOGY
Architecture and microarchitecture
Execution pipeline




   IF (Instruction Fetch): fetch the instruction,
   ID (Instruction Decode): decode the instruction,
   EX (Execute): execute it,
   MEM (Memory access): access memory,
   WB (Register write back): write the result to a register.
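
The payoff of splitting execution into the five stages above can be sketched with a small calculation. This is an idealized model, not something stated on the slide: it assumes one cycle per stage and no stalls, hazards, or flushes, and the function names are hypothetical.

```c
#include <assert.h>

/* Cycles needed without pipelining: every instruction passes through
   all stages before the next instruction starts. */
unsigned long unpipelined_cycles(unsigned stages, unsigned long n)
{
    assert(stages > 0);
    return (unsigned long)stages * n;
}

/* Cycles needed with an ideal pipeline (assumption: no stalls).
   The first instruction takes `stages` cycles to complete; after that,
   one instruction completes every cycle. */
unsigned long pipelined_cycles(unsigned stages, unsigned long n)
{
    assert(stages > 0);
    return n == 0 ? 0 : stages + n - 1;
}
```

For the five-stage pipeline and 100 instructions, this model gives 500 cycles without pipelining and 104 cycles with it.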
ARM
ARM architecture
 RISC
 32bit, 64bit*
 Cortex family:
     A: application
     M: microcontroller
     R: realtime
   ISA: ARMv5, ARMv7..
   Extensions:
   Thumb1-2, Jazelle, NEON, VFP
   Conditional execution
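Conditional execution
An operation either executes or is skipped, depending on the current
processor flags.

C code                             ARM assembler
while (i != j) {                   loop: CMP   Ri, Rj        ; compare i and j, set flags
    if (i > j)                           SUBGT Ri, Ri, Rj    ; if i > j: i -= j
        i -= j;                          SUBLT Rj, Rj, Ri    ; if i < j: j -= i
    else                                 BNE   loop          ; repeat while i != j
        j -= i;
}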
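
The loop on this slide is the subtraction form of Euclid's algorithm. A minimal, compilable C version is sketched below; the function name is hypothetical, and the positive-input check is an added assumption, since the loop never terminates if exactly one argument is zero. Whether a compiler actually emits `SUBGT`/`SUBLT` for it depends on the target and optimization settings.

```c
#include <assert.h>

/* Greatest common divisor by repeated subtraction.
   On ARM the if/else body can map to two conditionally executed
   subtractions, so the loop needs only one branch. */
unsigned gcd_by_subtraction(unsigned i, unsigned j)
{
    assert(i > 0 && j > 0);   /* added precondition, not on the slide */
    while (i != j) {
        if (i > j)
            i -= j;
        else
            j -= i;
    }
    return i;
}
```

For example, `gcd_by_subtraction(12, 18)` steps through (12, 6) and (6, 6) and returns 6.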
Thumb1,2
 Features
   36 instructions
   16-bit encoding
   Only half of the registers are used
   Smaller code size
   Thumb 2 adds 32-bit instructions
Jazelle
 Jazelle: a technology for executing Java bytecode
  without translation
 Jazelle DBX (Dynamic Bytecode eXecution):
  ships as a coprocessor
 Jazelle RCT (Runtime Compiler Target,
  support for dynamic compilers):
  translates 1 bytecode into 1 machine
  instruction
Cortex A15
   32bit
   ARMv7-A ISA
   28nm* process
   1.2 - 2.5 GHz
   Improved branch predictor
   More OOO (out-of-order) instructions
   NEON instructions in 1 cycle*
   Virtualization support
   Security Extensions
Cortex A15
ARM Cortex A15 vs A9
big.LITTLE
 LITTLE: A53
 Energy-efficient
 Simple, in-order, 8 stages



 Big: A57
 High-performance
 Complex, OOO, many* stages
big.LITTLE
INTEL
x86 architecture
   1978
   CISC*
   Backward compatibility
   Extensions:
     MMX, SSE - SSE4.2, AVX, AVX2,
     AES
     x64
     Intel VT
     NX
Tick-Tock
Atom
Features:
 32bit
 x86 ISA
 32nm - 14nm* process, 25 mm2, ~50 million transistors
 0.6 - 2.13 GHz
 32 KB L1 I-cache and D-cache
 1-2 cores (2-4 threads with HyperThreading)
 0.65W - 13W Max TDP

Applications and requirements:
 Mobile devices, netbooks
      Power consumption matters more than performance
      Performance is sufficient for browsing the Internet
   x86 compatibility
      A huge number of programs and operating systems
      "x86 everywhere"
Atom microarchitecture
        BigCore rule: 1% performance ~ 2% power consumption
        Atom rule: 1% performance ~ 1% power consumption
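
One way to read the two rules is as a budget for admitting a new feature into the design: the feature is acceptable if its power cost per 1% of performance gained stays within the rule's ratio. The sketch below encodes that reading. Treating the ratio as an upper bound is an interpretation of the slide, and the names `feature_fits`, `BIGCORE_RULE`, and `ATOM_RULE` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Allowed percent of extra power per 1% of extra performance
   (assumed reading of the two rules on the slide). */
#define BIGCORE_RULE 2.0
#define ATOM_RULE    1.0

/* Returns true if a feature that gains perf_gain_pct percent performance
   at a cost of power_cost_pct percent power fits within the given rule. */
bool feature_fits(double perf_gain_pct, double power_cost_pct, double rule)
{
    assert(perf_gain_pct > 0.0);
    return power_cost_pct <= perf_gain_pct * rule;
}
```

Under this reading, a feature that gains 3% performance for 5% more power passes the BigCore rule (5 <= 6) but fails the Atom rule (5 > 3).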

 Specialized architecture
 In-order
 x86 compatibility
     Variable-length instructions
       (CISC)
     2 decoders
 Functional units
     A minimal set of units, to reduce
       power consumption
     2 integer ALUs (jmp, shift)
     No integer multiply or
       divide units
     2 floating-point units
Decoder
  ADD -> uOP
  SIN -> uOP, uOP, uOP, uOP
SSE
 SSE: Streaming SIMD Extensions
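
To make "SIMD" concrete, here is a minimal sketch of adding two float arrays four elements at a time with SSE intrinsics from `<xmmintrin.h>`. The function name `add_floats` and the multiple-of-4 length requirement are assumptions made for brevity; on compilers that do not target SSE the code falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#define USE_SSE 1
#else
#define USE_SSE 0
#endif

/* Element-wise out[i] = a[i] + b[i]; n must be a multiple of 4. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
#if USE_SSE
        __m128 va = _mm_loadu_ps(a + i);             /* load 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  /* 4 additions in one instruction */
#else
        for (size_t k = 0; k < 4; k++)               /* scalar fallback */
            out[i + k] = a[i + k] + b[i + k];
#endif
    }
}
```

The unaligned load and store variants are used so that the caller does not have to guarantee 16-byte alignment.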
Intel vs ARM
Intel                          ARM
 x86 code is the standard       Power consumption
 Process technology             Market presence
 Performance                    Cost



                       ?
THANK YOU FOR YOUR ATTENTION!

ARM vs Intel microarchitecture


Editor's Notes

  1. Fetch the instruction and decode it. Locate all related data needed to process the instruction. Process the instruction. Access memory (a 2-cycle instruction). Write back the results.
  2. 32-bit instructions: branching and conditional execution
  3. The processor state is switched by a special bit (the 24th) of the ARM CPSR (Current Program Status Register). The 'T'-bit must be cleared and the 'J'-bit set.
  4. The first ARM core with virtualization support
  5. The introduction of Large Physical Address Extensions (LPAE) enables the processor to access up to 1TB of memory. Performance and power optimized L1 caches combine minimal access latency techniques to maximize performance and minimize power consumption. Caches are 32KB for instruction and 32KB for data. Also providing the option for cache coherence for enhanced inter-processor communication or support of rich SMP capable OS for simplified multicore software development. CoreLink CCN-504 extends the capabilities of your SoC. Up to 16 cores on the same silicon die are possible with this fully-coherent, high-performance many-core solution. With up to 1TB/s of system bandwidth, and support for large L3 caches, SoC designers can address the needs of networking, server, and other enterprise-class devices.
  6. Pipeline depth: A15: 15; A9: 8
  7. A53: 64bit A7; A57: 64bit A15
  8. In-order processor
  9. A strategy for keeping up with Moore's law
  10. Front-end: 32KB, 8-way set associative, first-level instruction cache; branch prediction units and ITLB; two instruction decoders, each of which can decode up to one instruction per cycle. JEU: Jump Execution Unit. AGU: Address Generation Unit. TLB: Translation Lookaside Buffer (translates virtual addresses into physical ones; the physical addresses are then used to access the cache and data). PMH: Page Miss Handler (Virtual->Physical Translation). BIU: Bus Interface Unit, the controller for the bus and L2. The memory execution sub-system (MEU) can support 48-bit linear address for Intel64 Architecture, either 32-bit or 36-bit physical addressing modes. The MEU provides: 24KB first level data cache; hardware prefetching for L1 data cache; two levels of DTLB for 4KByte and larger paging structure; a hardware page walker to service DTLB and ITLB misses; two address generation units (port 0 supports loads and stores, port 1 supports LEA and stack operations); store-forwarding support for integer operations; 8 write combining buffers. The bus logic sub-system provides a 512KB, 8-way set associative, unified L2 cache, hardware prefetching for L2, and interface logic to the front side bus.
  11. Most instructions translate into 1 uop. 5% of instructions need to be split into uops. Splitting into uops gives no particular advantage on an in-order core.