This document discusses the implementation of a basic MIPS processor including building the datapath, control implementation, pipelining, and handling hazards. It describes the MIPS instruction set and 5-stage pipeline. The datapath is built from components like registers, ALUs, and adders. Control signals are designed for different instructions. Pipelining is implemented using techniques like forwarding and branch prediction to handle data and control hazards between stages. Exceptions are handled using status registers or vectored interrupts.
The document provides an overview of pipelining in computer processors. It discusses how pipelining can increase processor performance by overlapping the execution of multiple instructions. It describes the five stages of instruction execution in a MIPS processor: fetch, decode, execute, memory, and writeback. It also discusses three types of hazards that can occur in pipelining - structural hazards, data hazards, and control/branch hazards. For each hazard, it provides an example and discusses possible solutions like forwarding, stalling, and branch prediction.
The document discusses parallel processing and pipelining techniques in computer organization. It covers parallel processing concepts and classifications, pipelining concepts and how pipelining increases computational speed, arithmetic and instruction pipelining, and the handling of pipeline hazards such as data dependencies and branches. The key advantage of pipelining is that it decomposes a task into sequential sub-operations that execute concurrently in separate stages, improving throughput and achieving a speedup close to the number of pipeline stages when the number of tasks is large.
This document discusses instruction pipelining in computer processors. It begins by defining pipelining and explaining how it works like an assembly line to increase throughput. It then discusses different types of pipelines and introduces the MIPS instruction pipeline as an example. The document goes on to explain different types of pipeline hazards like structural hazards, control hazards, and data hazards. It provides examples of how to detect and resolve these hazards through techniques like forwarding, stalling, predicting, and delayed branching. Key concepts covered include pipeline registers, control signals, forwarding units, and branch prediction buffers.
Pipelining of Processors – Computer Architecture (Haris456)
Pipelining is a technique used in microprocessors to overlap the execution of multiple instructions to increase throughput. It works by dividing the instruction execution process into discrete stages, such as fetch, decode, execute, memory, and write-back. When an instruction enters one stage, the previous instruction can enter the next stage, allowing the processor to complete more than one instruction per clock cycle. Pipelining reduces the time needed to complete a series of instructions by allowing the stages to process separate instructions simultaneously rather than sequentially.
The document provides an overview of pipelining in computer processors. It discusses how pipelining works by dividing processor operations like fetch, decode, execute, memory, and write-back into discrete stages that can overlap, improving throughput. Key points made include:
- Pipelining allows multiple instructions to be in different stages of completion at the same time, improving instruction throughput.
- The document uses an example of a sequential laundry process versus a pipelined laundry process to illustrate how pipelining improves efficiency.
- It describes the five main stages of a RISC instruction-set pipeline – fetch, decode, execute, memory, and write-back – along with the work done in, and the data passed between, each stage.
This document summarizes key aspects of CPU processor design, including:
1) It examines two implementations of a MIPS processor - a simple single-cycle version and a more realistic pipelined version. The pipelined version breaks instruction execution into five stages to improve performance.
2) It discusses hazards that can occur in a pipeline like data hazards and branch hazards. Techniques like forwarding, stalling, and branch prediction are used to resolve hazards.
3) The control logic for the pipelined MIPS processor is explained, including how it detects hazards and forwards data between stages when needed. Stalls are also inserted when necessary to ensure correctness.
This document discusses parallel processing and pipelining techniques used to improve computer performance. It covers parallel processing classifications including SISD, SIMD, MISD, and MIMD models. Pipelining is defined as decomposing tasks into sequential suboperations that execute concurrently. Arithmetic and instruction pipelines are described as having multiple stages to overlap processing of different instructions. Vector processing and array processors are mentioned as techniques to perform simultaneous operations on multiple data items.
Computer arithmetic in computer architecture (ishapadhy)
The document discusses Flynn's Taxonomy, which classifies computer architectures based on the number of instruction and data streams. It proposes four categories: SISD, SIMD, MISD, and MIMD. SISD refers to a single instruction single data stream architecture, like the classical von Neumann model. SIMD uses a single instruction on multiple data streams, for applications like image processing. MIMD uses multiple instruction and data streams and is most common, allowing distributed computing across independent computers. The document also discusses parallel processing, pipeline processing in computers, and hazards that can occur in instruction pipelines.
Pipelining is a technique used in computer processors to overlap the execution of instructions to enhance performance. It works by dividing instruction execution into discrete stages, such as fetch, decode, execute, memory, and write-back, so that multiple instructions can be in different stages at the same time. In a pipelined processor, the average time to complete an instruction is reduced compared to a non-pipelined processor, leading to higher throughput. However, special techniques are needed to handle data and structural hazards that can occur when instructions interact in unexpected ways within the pipeline.
Parallel processing involves performing multiple tasks simultaneously to increase computational speed. It can be achieved through pipelining, where instructions are overlapped in execution, or vector/array processors where the same operation is performed on multiple data elements at once. The main types are SIMD (single instruction multiple data) and MIMD (multiple instruction multiple data). Pipelining provides higher throughput by keeping the pipeline full but requires handling dependencies between instructions to avoid hazards slowing things down.
This document provides an overview of implementing a simplified MIPS processor with memory-reference instructions, arithmetic-logical instructions, and control flow instructions. It discusses:
1. Using a program counter to fetch instructions from memory and reading register operands.
2. Executing most instructions via fetching, operand fetching, execution, and storing in a single cycle.
3. Building a datapath with functional units for instruction fetching, ALU operations, memory references, and branches/jumps.
4. Implementing control using a finite state machine that sets multiplexers and control lines based on the instruction.
The document discusses computer organization and architecture. It covers topics related to basic processing units including fundamental concepts, register transfer, execution of instructions, and multiple bus organization. It also describes hardwired control, microprogrammed control, and microinstruction sequencing. Specifically, it explains how basic processing units work, the components of a CPU, register transfer, executing instructions via fetch-decode-execute cycles, and address generation techniques for microprogrammed control units.
This document summarizes the key components and organization of superscalar processor pipelines. It discusses how superscalar processors can execute multiple instructions per cycle by exploiting instruction-level parallelism. The document outlines the major stages in a superscalar pipeline including instruction fetch, decode, dispatch, execution, completion, and retirement. It also discusses limiting factors like structural hazards from resource conflicts, data hazards from dependencies between instructions, and control hazards from branches.
This document discusses instruction pipelining and main memory. It begins by explaining how an instruction pipeline works, overlapping the fetch, decode, and execute phases of instruction processing. It notes some difficulties in pipelining including resource conflicts, data dependencies, and branch instructions. It then discusses pipeline control and performance, noting that pipelining provides faster processing by decomposing tasks into sequential sub-operations that can overlap. It concludes by answering questions about pipelining hazards and calculating pipeline metrics for example processors.
A digital signal processor (DSP) is a specialized microprocessor optimized for digital signal processing. DSPs have architectures designed for fast arithmetic operations and efficient data handling needed for tasks like audio processing. Common DSP features include instruction sets for single-instruction multiple-data operations, hardware looping for repetitive tasks, and pipelined execution to improve performance. DSPs are commonly used for applications involving digital audio, images, and communication signals.
This slide deck describes various techniques related to parallel processing (vector processing and array processors), arithmetic pipelines, instruction pipelines, SIMD processors, and attached array processors.
The document discusses pipelining in processors. It begins by explaining that pipelining overlaps the execution of instructions to improve performance. It then describes the basic concept of dividing instruction execution into stages connected in a pipeline. It provides details on the stages in a five-stage pipeline model and how instructions can be fetched and executed in an overlapped, pipelined manner. However, it notes pipelining issues can occur if there are dependencies between instructions. It discusses different types of hazards and techniques like forwarding, stalling, and branch prediction that are used to handle hazards in pipelined processors.
Here are the answers to the questions:
1. Pipeline cycle time = Maximum delay of any stage + Latch delay
= 90 ns + 10 ns = 100 ns
2. Non-pipeline execution time for one task = Total delay of all stages
= 60 + 50 + 90 + 80 = 280 ns
3. Speed up ratio = Non-pipeline time/Pipeline time
= 280/100 = 2.8
4. Pipeline time for 1000 tasks = Pipeline cycle time x Number of tasks
= 100 ns x 1000 = 100,000 ns = 100 μs
5. Sequential time for 1000 tasks = Non-pipeline time per task x Number of tasks
= 280 ns x 1000 = 280,000 ns = 280 μs
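These figures can be checked with a short script. One caveat: the 100,000 ns figure above uses the steady-state approximation (cycle time × number of tasks), while the exact total also counts the k − 1 extra cycles needed to fill the pipeline.

```python
# Pipeline metrics for a 4-stage pipeline, using the stage delays
# implied by the worked answers above (60, 50, 90, 80 ns; latch delay 10 ns).
stage_delays_ns = [60, 50, 90, 80]
latch_delay_ns = 10
n_tasks = 1000

# 1. Pipeline cycle time = slowest stage delay + latch delay
cycle_ns = max(stage_delays_ns) + latch_delay_ns      # 100 ns

# 2. Non-pipelined time for one task = sum of all stage delays
nonpipe_ns = sum(stage_delays_ns)                     # 280 ns

# 3. Speed-up ratio for one task's worth of work per cycle
speedup = nonpipe_ns / cycle_ns                       # 2.8

# 4. Exact pipelined total: k cycles for the first task, then 1 per task
k = len(stage_delays_ns)
pipe_total_ns = (k + n_tasks - 1) * cycle_ns          # 100,300 ns exact
approx_pipe_ns = cycle_ns * n_tasks                   # 100,000 ns (approximation)

# 5. Sequential total for 1000 tasks
seq_total_ns = nonpipe_ns * n_tasks                   # 280,000 ns

print(cycle_ns, nonpipe_ns, speedup)   # 100 280 2.8
print(pipe_total_ns, seq_total_ns)     # 100300 280000
```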
This document discusses instruction level parallelism and techniques for exploiting it. It covers topics like pipelining, instruction dependencies, hazards, and approaches to overcoming limitations on parallelism both through dynamic scheduling in hardware and through static transformations by compilers. Key limitations to parallelism discussed are branches, dependencies between instructions, and pipeline stalls caused by dependencies. The document provides an overview of these core computer architecture concepts.
Registers are temporary storage areas within the CPU that can hold instructions and data during processing. They allow for faster access and transfer of information compared to main memory. There are different types of registers that serve specific purposes, such as the program counter, accumulator, and memory address register. Buses are communication pathways that connect the CPU to other computer components like memory and expansion cards. The internal bus connects the CPU to RAM while the expansion bus allows additional devices to connect. Factors that influence data transfer speeds include RAM size, CPU speed and generation, register size, bus width and speed, and cache memory amount.
Advanced Pipelining in ARM Processors.pptx (JoyChowdhury30)
The document discusses advanced pipelining techniques in ARM processors. It begins with an overview of pipelining and its benefits of improving throughput by executing multiple instructions simultaneously. ARM processors implement different numbers of pipeline stages - 3 stages in ARM7, 5 stages in ARM9, 6 stages in ARM10, and 7 stages in ARM11. Issues like control hazards, data hazards, and interrupts are addressed through techniques like data forwarding, branch prediction, and out-of-order execution. The 6-stage pipeline in ARM10 achieves double the throughput of ARM7 while compromising on latency. Branch target buffers are used to reduce delays from branch instructions. The 7-stage pipeline in ARM11 and above further improves performance using advanced data
The document discusses parallel processing and provides classifications of parallel computer architectures. It describes Flynn's classification of computer architectures as single instruction stream single data stream (SISD), single instruction stream multiple data stream (SIMD), multiple instruction stream single data stream (MISD), and multiple instruction stream multiple data stream (MIMD). It also discusses pipeline computers, array processors, and multiprocessor systems as different architectural configurations for parallel computers. Pipelining is described as a technique to decompose a process into sub-operations that execute concurrently in dedicated segments to achieve overlapping computation.
The document discusses memory hierarchy and technologies. It describes the different levels of memory from fastest to slowest as processor registers, cache memory (levels 1 and 2), main memory, and secondary storage. The main memory technologies discussed are SRAM, DRAM, ROM, flash memory, and magnetic disks. Cache memory aims to speed up access time by exploiting locality of reference and uses mapping functions like direct mapping to determine cache locations.
Unit IV discusses parallelism and parallel processing architectures. It introduces Flynn's classifications of parallel systems as SISD, MIMD, SIMD, and SPMD. Hardware approaches to parallelism include multicore processors, shared memory multiprocessors, and message-passing systems like clusters, GPUs, and warehouse-scale computers. The goals of parallelism are to increase computational speed and throughput by processing data concurrently across multiple processors.
1) The ALU performs arithmetic operations like addition, subtraction, multiplication and division on fixed point and floating point numbers. Fixed point uses integers while floating point uses a sign, mantissa, and exponent.
2) Binary numbers are added using half adders and full adders which are logic circuits that implement addition using truth tables and K-maps. Subtraction is done using 1's or 2's complement representations.
3) Multiplication is done using sequential or Booth's algorithm approaches while division uses restoring or non-restoring algorithms. Floating point uses similar addition and subtraction steps but first normalizes the exponents.
The document discusses several key concepts in computer architecture:
- It describes functional units, instruction representation, logical operations, decision making, and MIPS addressing.
- It discusses techniques for improving performance like parallelism, pipelining, and prediction.
- It explains the hierarchy of computer memory and how redundancy improves dependability.
CA UNIT III.pptx
2. UNIT III
A Basic MIPS implementation – Building a
Datapath – Control Implementation Scheme –
Pipelining – Pipelined datapath and control –
Handling Data Hazards & Control Hazards –
Exceptions.
3. Basic MIPS Implementation
• MIPS - originally an acronym for Microprocessor without Interlocked Pipelined Stages (distinct from the performance metric "million instructions per second")
• Instruction set is divided into three classes:
1. Memory-reference – load word (lw) and store
word (sw)
2. Arithmetic-logical – add, sub, AND, OR
3. Branch instruction – branch equal (beq) and
jump (j)
4. MIPS Instruction Execution :
MIPS instructions classically take five steps:
• Fetch the instruction from memory
• Read registers while decoding the instruction. The format of MIPS instructions allows reading and decoding to occur simultaneously
• Execute the operation or calculate an address
• Access an operand in data memory
• Write the result into a register
7. Building a Datapath
• Elements that process data and addresses in the CPU - memories, registers, the Program Counter (PC), ALUs, adders.
8. Datapath for different instructions:
• Datapath for arithmetic-logic instructions
• Datapath for load and store word instructions
• Datapath for branch instructions
• Creating a single datapath
9. 1. Datapath for arithmetic-logic
instructions
• R-type (register) instructions. E.g. sub $s1, $s3, $s2
10. 2. Datapath for Load and Store
word instructions
• Load or store instructions. E.g. sw $s1, -50($s2)
12. 4. Creating a Single Datapath
• Executing any instruction requires at least one clock
cycle.
13. Control implementation Scheme
• Designing the main control unit
• The ALU control
• Operation of the Datapath
• Datapath for an R-type instruction
• Datapath for Load word instruction
• Datapath for Branch-on-equal instruction
20. Pipeline
• Pipelining is an implementation technique in
which multiple instructions are overlapped in
execution. It is used to increase the speed and
performance of the processor.
• Two common multi-stage organizations:
1. Four-stage pipeline
2. Six-stage pipeline
24. Pipeline Stages in the MIPS
1. Fetch instruction
2. Decode and read register simultaneously
3. Execute the operation
4. Access an operand in data memory
5. Write the result back into a register
25. Pipeline Hazards
• A hazard is a condition that prevents an
instruction in the pipeline from executing during its
next scheduled pipeline stage.
• Any event that causes the pipeline to stall is called
a hazard.
Types of hazards:
1. Structural hazards
Two instructions require the use of the same
hardware resource at the same time.
2. Data hazards
An instruction depends on the result of a prior
computation which is not ready (computed or stored) yet.
26. 3. Control hazards
Arise from the pipelining of branches and other
instructions that change the PC.
Unconditional branching:
27. Branch Penalty:
The time lost due to a branch instruction is
known as the branch penalty.
Factors affecting the branch penalty:
It is larger for complex instructions.
The longer the pipeline, the larger the
branch penalty.
28. Pipeline Performance
• Clock cycle – when fixing the clock period, it is
necessary to consider both the stage delay and
the interstage (register) delay.
[Figure: space-time diagram of the pipeline; horizontal axis is time in clock cycles]
29. Pipeline Performance
Total time required is Tm = [m + (n - 1)]t
Speedup factor over a non-pipelined processor is
defined as
Sm = nm / (m + (n - 1))
where m = number of stages (segments), n = number of
instructions (tasks), and t = clock period of one stage.
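As a quick sanity check on these timing formulas, the sketch below (Python, purely illustrative) evaluates Tm and Sm; for a large number of instructions n, the speedup of an m-stage pipeline approaches m.

```python
# Illustrative evaluation of the pipeline timing formulas above.
# m = number of pipeline stages, n = number of instructions, t = cycle time.

def pipeline_time(m, n, t):
    """Total time Tm = [m + (n - 1)] * t for n instructions in an m-stage pipeline."""
    return (m + (n - 1)) * t

def speedup(m, n):
    """Speedup Sm = n*m / (m + (n - 1)) over a non-pipelined processor."""
    return (n * m) / (m + (n - 1))

# A 5-stage pipeline running 1000 instructions:
print(pipeline_time(5, 1000, 1))   # 1004 cycles
print(round(speedup(5, 1000), 2))  # 4.98, close to the stage count of 5
```

Note how the first instruction pays the full m-cycle latency and each later one adds only a single cycle, which is where [m + (n - 1)] comes from.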
30. Pipeline Performance
Efficiency: the ratio of the speedup factor to the
number of stages in the pipeline:
Em = Sm / m = n / (m + (n - 1))
Performance/Cost Ratio (PCR): the total cost of the
pipeline is estimated as c + mh, where c is the cost of
the stage logic and h is the cost of one interstage
register, giving
PCR = 1 / ((t/m + d)(c + mh))
with t the delay of the equivalent non-pipelined circuit
and d the register (latch) delay.
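The efficiency and PCR expressions can be explored the same way; the parameter values below are made up for illustration, and the search simply picks the stage count m that maximizes PCR.

```python
def efficiency(m, n):
    """Em = Sm / m = n / (m + (n - 1))."""
    return n / (m + (n - 1))

def pcr(t, m, d, c, h):
    """PCR = 1 / ((t/m + d) * (c + m*h)): t = non-pipelined delay,
    d = latch delay, c = cost of the stage logic, h = cost of one latch."""
    return 1 / ((t / m + d) * (c + m * h))

print(round(efficiency(5, 1000), 3))                     # 0.996
# Find the stage count that maximizes PCR for these sample costs/delays.
best_m = max(range(1, 21), key=lambda m: pcr(100, m, 2, 20, 10))
print(best_m)                                            # 10
```

The search illustrates the trade-off PCR captures: more stages shorten the cycle time (t/m term) but add latch delay and latch cost, so PCR peaks at a finite stage count rather than growing forever.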
31. Pipeline Datapath and control
A pipelined processor consists of a sequence of data-processing
circuits, called elements, stages or segments.
Each stage consists of two major blocks:
• Multiword input register
• Datapath circuit
[Figure: stages R1-C1, R2-C2, ..., Rm-Cm connected in series under a common control unit, with data in at R1 and data out at Cm]
32. Implementation of two-stage instruction
pipelining
Breaks a single instruction into two parts:
1. A fetch stage (S1)
2. An execute stage (S2)
33. 4 stage Instruction pipelining with
CPU
• The CPU is directly connected to the cache memory,
represented as the I-cache and the D-cache.
• This permits an instruction and memory data
to be accessed in the same clock cycle.
1. IF: instruction fetch and decode using the
I-cache (S1)
2. OL: operand loading from the D-cache to the
register file (S2)
3. EX: data processing using the ALU and
register file (S3)
4. OS: operand storing to the D-cache from the
register file (S4)
35. Implementation of MIPS
instruction pipeline
• Five stage pipeline
1. IF : Instruction Fetch S1
2. ID : Instruction Decode S2
3. EX : Execution S3
4. MEM : Data memory access S4
5. WB : Write Back S5
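The five stages above can be visualized with a tiny occupancy chart; this Python sketch (illustrative only) assumes one instruction enters IF per cycle with no stalls.

```python
# Occupancy chart for the 5-stage MIPS pipeline: with no stalls,
# instruction i (0-based) occupies stage s (0-based) in cycle i + s + 1.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def timeline(n_instructions):
    """Map each instruction to the cycle in which each stage runs."""
    return [{stage: i + s + 1 for s, stage in enumerate(STAGES)}
            for i in range(n_instructions)]

for i, row in enumerate(timeline(3)):
    print(f"I{i + 1}:", row)
# I1 completes WB in cycle 5; I3 completes in cycle 7 = m + (n - 1) with m=5, n=3
```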
37. Pipeline Control
• Builds on the single-cycle datapath, which uses
control logic for the PC source, the register destination
number and the ALU control.
• The control lines fall into 5 groups:
1. Instruction fetch: control signals to read
instruction memory and to write the PC.
2. Instruction decode/register file read: nothing
special to control in this stage.
3. Execution/address calculation: RegDst, ALUOp
and ALUSrc.
4. Memory access: Branch, MemRead
and MemWrite.
5. Write-back: MemtoReg and RegWrite.
39. Handling data hazards
Operand forwarding
• A simple hardware technique that can handle
data hazards is called operand forwarding or
register bypassing.
• ALU results are fed back as ALU inputs.
• When the forwarding logic detects that the previous
ALU operation produced an operand of the current
instruction, it forwards the ALU output instead of
the value read from the register file.
• Ex: add $s1,$s2,$s3
mul $s4,$s1,$s5
→ mul $s4, (forwarded result of $s2 + $s3), $s5
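A forwarding unit essentially compares destination registers held in the later pipeline registers against the source registers of the instruction now in EX. The sketch below is a simplified Python model, not the actual hardware; the EX/MEM and MEM/WB names follow the usual pipeline-register convention.

```python
def forward_select(src_reg, ex_mem_dst, mem_wb_dst):
    """Pick the source for one ALU input: the EX/MEM register if the
    previous instruction writes src_reg, the MEM/WB register if the one
    before that writes it, otherwise the register file. Register 0
    ($zero) is never forwarded."""
    if src_reg != 0 and ex_mem_dst == src_reg:
        return "EX/MEM"
    if src_reg != 0 and mem_wb_dst == src_reg:
        return "MEM/WB"
    return "REGFILE"

# add $s1,$s2,$s3 followed immediately by mul $s4,$s1,$s5:
# while mul is in EX, add's result sits in the EX/MEM register
# ($s1 is register number 17 in MIPS).
print(forward_select(17, 17, 0))  # EX/MEM
```

The EX/MEM check comes first because the younger of two pending writers must win, which is exactly the priority the real forwarding unit encodes.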
42. Handling data hazards in software
• Software (the compiler) detects the data dependencies.
• It introduces the necessary delay between the two
instructions by inserting NOP (no-operation) instructions
as follows:
I1 : MUL R2, R3, R4
NOP
NOP
I2 : ADD R4, R5, R6
Disadvantages of adding NOP instructions:
• It leads to larger code size.
• The number of NOPs is tied to one particular hardware
implementation; a processor family may have several
implementations with different pipeline delays.
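A compiler pass that inserts NOPs can be sketched as below (Python; the tuple format, with the destination register as the second element, and the 2-slot delay are illustrative assumptions).

```python
def insert_nops(program, delay=2):
    """Insert NOPs so that `delay` slots separate an instruction from any
    earlier instruction that writes one of its source registers.
    Instructions are (opcode, dst, src1, src2) tuples."""
    out = []
    for op, dst, s1, s2 in program:
        for back in range(1, delay + 1):        # inspect the last `delay` slots
            if len(out) >= back:
                prev = out[-back]
                if prev[0] != "NOP" and prev[1] in (s1, s2):
                    # A producer `back` slots ago: pad out the remaining slots.
                    for _ in range(delay - back + 1):
                        out.append(("NOP", None, None, None))
                    break
        out.append((op, dst, s1, s2))
    return out

prog = [("MUL", "R4", "R2", "R3"),   # R4 <- R2 * R3
        ("ADD", "R6", "R4", "R5")]   # reads R4 right away
print([ins[0] for ins in insert_nops(prog)])  # ['MUL', 'NOP', 'NOP', 'ADD']
```

An unrelated instruction already sitting between producer and consumer counts toward the delay, so the pass inserts fewer NOPs in that case.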
43. Side Effects
• When the destination register of the current instruction
is a source register of the next instruction, there is
a data dependency. Such a dependency is explicit: it is
identified by a register name.
• A dependency can also arise implicitly, through a side
effect such as a condition-code flag. For example, the
addition of two double-word numbers may be
accomplished as follows:
ADD R1, R3
ADD-with-carry R2, R4 (R2 ← [R2] + [R4] + carry)
The second instruction implicitly depends on the carry
flag set by the first.
44. Handling Control Hazards
Instruction queue and prefetching
• The fetch unit fetches instructions ahead of time and
stores them in an instruction queue.
• A separate unit, called the dispatch unit, takes
instructions from the front of the queue, decodes them
and issues them for execution.
• If the fetch unit stalls (for example on a cache miss),
the dispatch unit can keep issuing instructions for
decoding as long as the queue is not empty.
• The fetch unit always tries to keep the queue full.
47. Instruction queue and prefetching
Branch Folding
• When the fetch unit executes a branch instruction
concurrently with the execution of another
instruction, this is known as branch folding.
• Branch folding occurs only if there is at least
one instruction in the queue other than the branch
instruction.
48. Approaches to Deal
• The conditional branching is a major factor that
affects the performance of instruction pipelining.
1. Multiple streams
2. Prefetch branch target
3. Loop buffer
4. Branch prediction
49. 1. Multiple streams
- Two instruction streams store the fetched instructions,
one for each possible branch outcome.
2. Prefetch branch target
- The target instruction is prefetched as soon as the
branch is recognized.
3. Loop buffer
- A small, fast memory that stores the most recently
prefetched instructions.
50. 4. Branch prediction
- Predicts whether a branch will be taken or not taken.
These techniques reduce the branch penalty.
Techniques:
•predict never taken,
•predict always taken,
•predict by opcode,
•taken and not taken switch,
•branch history table
51. Branch prediction strategies
Static branch strategy: the branch is predicted
based on the branch code (opcode) type.
Dynamic branch strategy: uses recent branch
history collected during program execution to predict
whether the branch should be taken the next time it
occurs.
52. Branch prediction strategies
The recent branch information includes branch
prediction statistics such as:
T – taken
N – not taken
NN – last two branches not taken
NT – last branch not taken, the one before it taken
TT – last two branches taken
TN – last branch taken, the one before it not taken
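The 2-bit saturating counter commonly used for dynamic prediction keeps a closely related two-outcome history; the Python sketch below is illustrative: states 0/1 predict not-taken, states 2/3 predict taken, so a single misprediction does not flip the prediction.

```python
class TwoBitPredictor:
    """2-bit dynamic branch predictor: four states roughly matching the
    NN / NT / TN / TT histories; the prediction changes only after two
    consecutive mispredictions."""
    def __init__(self):
        self.state = 1                # start weakly not-taken

    def predict(self):
        return self.state >= 2        # True = predict taken

    def update(self, taken):
        # Saturate at the ends of the 0..3 range.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

p = TwoBitPredictor()
for outcome in [True, True, True]:    # e.g. a loop branch taken repeatedly
    p.update(outcome)
print(p.predict())                    # True: now strongly predicts taken
```

This is why a loop branch mispredicts only once per loop exit: the single not-taken outcome moves the counter from strong to weak taken without flipping the prediction.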
53. Branch target buffer
• The recent branch information is stored in a buffer
called the branch target buffer (BTB).
• It stores branch information used for prediction.
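Functionally, a BTB behaves like a small cache indexed by the branch's PC. The Python sketch below models it as a bounded table with first-in-first-out eviction; the size, addresses and eviction policy are illustrative assumptions.

```python
class BranchTargetBuffer:
    """Tiny model of a branch target buffer: maps a branch's PC to its
    last observed target so the fetch unit can redirect immediately."""
    def __init__(self, size=16):
        self.size = size
        self.table = {}                      # pc -> predicted target

    def lookup(self, pc):
        """Predicted target address, or None if this branch is not cached."""
        return self.table.get(pc)

    def record(self, pc, target):
        """Remember where a taken branch went, evicting the oldest entry
        when the table is full (dicts preserve insertion order)."""
        if pc not in self.table and len(self.table) >= self.size:
            self.table.pop(next(iter(self.table)))
        self.table[pc] = target

btb = BranchTargetBuffer()
btb.record(0x400, 0x480)                     # beq at 0x400 jumped to 0x480
print(hex(btb.lookup(0x400)))                # 0x480
```

A hit in the BTB lets the fetch unit supply the target address in the same cycle the branch is fetched, before it is even decoded.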
55. Exception
• One of the most difficult parts of control is
implementing exceptions and interrupts: events,
other than branches or jumps, that change the
normal flow of instruction execution.
Handling exceptions in the MIPS
architecture
Types of exception:
1. Execution of an undefined instruction
2. Arithmetic overflow, as in the instruction add
$1,$2,$2.
56. Methods to communicate
exceptions
1. Status register method
The MIPS architecture uses a status
register, which holds a field that indicates
the reason for the exception.
2. Vectored interrupts method
The address to which control is transferred
is determined by the cause of the exception.
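The difference between the two methods can be sketched in a few lines of Python; the handler addresses below are made-up placeholders, not real MIPS vector addresses.

```python
# Vectored interrupts: hardware indexes a table by cause and jumps there.
VECTOR_TABLE = {                     # cause -> handler address (illustrative)
    "undefined_instruction": 0x8000_0000,
    "arithmetic_overflow":   0x8000_0100,
}

def vectored_dispatch(cause):
    """Control transfers to an address determined by the exception cause."""
    return VECTOR_TABLE[cause]

def status_register_dispatch(cause):
    """Status-register (MIPS Cause register) style: one common entry point;
    software reads the cause field to choose the right handler."""
    COMMON_ENTRY = 0x8000_0180       # illustrative single entry point
    return COMMON_ENTRY, cause

print(hex(vectored_dispatch("arithmetic_overflow")))   # 0x80000100
```

The trade-off: vectored interrupts reach the right handler without any software decoding, while the status-register method keeps the hardware simpler at the cost of a software dispatch step.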
57. Exception in a pipelined
Implementation
• Multiple exceptions can occur in a single
clock cycle.
1. Imprecise interrupts (imprecise exceptions)
2. Precise interrupts (precise exceptions)
Editor's Notes
#31: Efficiency - The ratio of the output to the input of any system