ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
20IT011-COMPILER DESIGN
Text Book :
Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, ¡°Compilers - Principles,
Techniques and Tools¡±, Pearson Education, 2nd Edition, 2018.
COURSE OUTCOME
CO1: Ability to understand the functionalities of the different phases of compiler.
CO2: Ability to understand syntax-directed translation and run-time environment.
CO3: Ability to implement code optimization techniques.
CO4: Ability to apply different parsing algorithms to develop the parsers for a
given grammar.
CO5: Ability to design a scanner and a parser using LEX and YACC tools.
UNIT I
INTRODUCTION TO COMPILERS
UNIT-I SYLLABUS
Structure of a compiler ¨C Lexical Analysis ¨C Role
of Lexical Analyzer -Input Buffering ¨C
Specification of Tokens ¨C Recognition of Tokens-
Lex ¨C Finite Automata-Regular Expressions to
Automata ¨C Minimizing DFA.
STRUCTURE OF A COMPILER
? Definition: A program that accept as input a program text in a certain language
and produces as output a program text in another language (Grune et all)
? Compiler is a translator program that reads a program written in one language -
the source language- and translates it into an equivalent program in another
language-the target language.
COMPILATION PROCESS
PHASES OF A COMPILER
Lexical Analysis
? In a compiler, linear analysis is called lexical analysis or scanning.
? It takes source code as input
? It reads one character at a time and convert into lexemes( represent in the
form of tokens)
Example: position = initial + rate * 60
? The identifier position.
? The assignment symbol =.
? The identifier initial.
? The plus sign.
? The identifier rate.
? The multiplication sign.
? The number 60.
<Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
Contd..
? Needs to record each id attribute: keep a symbol table
? Lexical Analysis eliminates white spaces, etc..
Syntax Analysis
? It is called as parsing or Syntax Analysis
? It takes token as input and generate parse tree as output.
? The parser checks that the expression made by tokens is
syntactically correct or not.
? Context free grammars formalize the rules and guide syntax analysis.
Semantic Analysis
? It checks the parse tree follows the rules of language
? It keeps tracks of identifiers, types and expressions
? Output is annotated as tree syntax
Intermediate Code Generator
? After semantic analysis, some compilers generate an explicit
intermediate representation of the source program.
? This representation should be easy to produce and easy to translate
into the target program.
Code Optimization
? It is used to improve the intermediate code
? So the output can run fast and take less space
? It removes unnecessary lines of code and arrange the sequence in
order to speed up the execution.
Code Generation
? The final phase of the compiler is the generation of target code,
consisting of relocatable machine code or assembly code.
? The intermediate instructions are each translated into sequence of
machine instructions that perform the same task.
Symbol table Manager
? An essential function of a compiler is to record the identifiers used in
the source program and collect its information.
? A symbol table is a data structure containing a record for each
identifier with fields for attributes.(such as, its type, its scope, if
procedure names then the number and type of arguments etc.,)
? The data structure allows to find the record for each identifier and
store or retrieve data from that record quickly.
Example: sum=initial + value * 10 1 sum
2 initial
3 value
4 ¡­.
Error Handling and Reporting
? Each phase can encounter errors. After detecting an error, a phase must deal that
error, so that compilation can proceed ,allowing further errors to be detected.
? Lexical phase can detect error where the characters remaining in the input do not
form any token.
? The syntax and semantic phases handle large fraction of errors. The stream of
tokens violates the syntax rules are determined by syntax phase.
? During semantic, the compiler tries to detect constructs that have the right
syntactic structure but no meaning to the operation involved.
Role of Lexical Analyser
Role of lexical Analyzer
? As the first phase of a compiler, the main task of the lexical analyzer is
to read the input characters of the source program, group them into
lexemes, and produce as output a sequence of tokens for each lexeme
in the source program.
? The stream of tokens is sent to the parser for syntax analysis.
? It is common for the lexical analyzer to interact with the symbol table
as well.
? When the lexical analyzer discovers a lexeme constituting an
identifier, it needs to enter that lexeme into the symbol table.
Contd..
? In some cases, information regarding the kind of identifier may be
read from the symbol table by the lexical analyzer to assist it in
determining the proper token it must pass to the parser.
Interaction between the lexical analyzer and the parser
? The interaction is implemented by having the parser call the lexical
analyzer.
? The call, suggested by the getNextToken command, causes the
lexical analyzer to read characters from its input until it can identify
the next lexeme and produce for it the next token, which it returns to
the parser.
? Since the lexical analyzer is the part of the compiler that reads the
source text, it may perform certain other tasks besides identification of
lexemes
Contd..
? One such task is stripping out comments and whitespace (blank,
newline, tab, and perhaps other characters that are used to separate
tokens in the input).
? Another task is correlating error messages generated by the compiler
with the source program.
? lexical analyzers are divided into a cascade of two processes:
? Scanning consists of the simple processes that do not require tokenization of the
input, such as deletion of comments and compaction of consecutive whitespace
characters into one.
? Lexical analysis proper is the more complex portion, where the scanner produces the
sequence of tokens as output.
Lexical Analysis and Parsing
? Simplicity of design ¨C avoid complexity to work together .
? Improving Compiler Efficiency - A separate lexical analyzer allows us to apply
specialized techniques that serve only the lexical task, not the job of parsing. In
addition, specialized buffering techniques for reading input characters can speed
up the compiler significantly.
? Enhancing Compiler Portability -Input-device-specific peculiarities can be
restricted to the lexical analyzer.
Tokens, Patterns, Lexemes
? A token is a pair consisting of a token name and an optional
attribute value.
? A pattern is a description of the form that the lexemes of a token
may take.
? A lexeme is a sequence of characters in the source program that
matches the pattern for a token and is identified by the lexical analyzer
as an instance of that token.
Example
Attributes for tokens
Lexical Errors
? Some errors are out of power of lexical analyzer to recognize:
?fi(a==f(x))..
? However it may be able to recognize errors like:
?d=2r
? Such errors are recognized when no pattern for tokens matches a character
sequence
Example
Error Recovery
? Panic Mode: Successive characters are ignored until we reach to a well
formed token
? Delete one character from the remaining input
? Insert a missing character into the remaining input
? Replace a character by another character
? Transpose two adjacent characters
INPUT BUFFERING
Input Buffering
? Input buffer helps to find the correct lexeme.
? Sometimes lexical analyzer needs to look ahead some symbols to decide about
the token to return
- in C language: we need to look after -,= or < to decide what token to return
- in FORTRAN: DO 5 I=1.25
? We need to introduce a two buffer scheme to handle large look-aheads safely
Buffer Pairs
Two pointers to the input are maintained:
I. Pointer lexemeBegin, marks the beginning of the current lexeme,
whose extent we are attempting to determine.
II. Pointer forward scans ahead until a pattern match is found
Sentinels
? A sentinel is a special character or token used to mark the end of
a sequence.
? In the context of a lexical analyzer, a sentinel can be utilized to
indicate the end of the input source code or the end of a specific
token.
An Introduction to the Compiler Designss
QUIZ
1. ______________ is the heart of the compiler.
2.what will be the next phase of Intermediate code generator?
3. What is the output of Lexical analyzer?
4. Token consist of pairs, what are they?
5. What is the command ,the parser generate to lexical analyzer?
6. What is the drawback in 1-Buffer?
7. What the two pointers used in Buffering technique?
8. What are the Back-end in Phases of compiler?
SPECIFICATION OF TOKENS
SPECIFICATION OF TOKENS
? In theory of compilation regular expressions are used to formalize the
specification of tokens.
? The specification of tokens depends on the pattern of the lexeme.
Example:
letter(letter+digit)*
? Each regular expression is a pattern specifying the form of strings.
Contd..
I) Alphabet, Strings, Languages
II) Operations on Language
III) Regular Expression
IV) Regular Definition
Strings and Languages
Strings and Languages
Symbol:
? A symbol is an abstract entity that we shall not define formally.
Examples:
? Letters: A|B|C¡­..|Z |a|b¡­.|z
? Digit:0,1,2¡­9
? Special Characters
Contd..
Alphabet:
? Alphabet is a non-empty finite set of symbols. it is denoted by the
symbol ?(Sigma).
Example:
? ?={0,1} is the binary alphabet which consisting of digits
? ?={a,b,c} is an alphabet which consisting of letters.
? ASCII is an alphabet ;it is used in many softwares.
? Unicode is an alphabet , which includes 1,00,000 characters.
Contd..
String:
? A String is defined as finite sequence of symbols over an alphabet ? .
It is denoted as ¡®w¡¯.
? In language theory, the terms ¡°sentence¡± and ¡°word¡± are often used as
synonyms for ¡°string¡±
Examples:
1.Alphabet ?={ a, b }
Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,¡­¡­..}
2. ?={ 0, 1 }
Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111,
0000,....}
Different Notation used in a string
? Length
? Prefix:- any number of leading symbols
? Proper Prefix :- any number of leading symbols without € and
original string
? Suffix:- tailing symbols
? Proper Suffix:- any number of trailing symbols without € and
original string
? Substring
OPERATIONS ON LANGUAGES
Operations on Languages
REGULAR EXPRESSION
Definition
- A regular expression over ? can be defined as
? ? is a regular expression for empty set
? ? is a regular expression for null string {?}
? If ¡®a¡¯ is a symbol in ? then ¡®a¡¯ is RE for the set {a}.
? If R and S are two RE then
? UNION of two R+E is a RE
? CONCATENATION of two RS is a RE
? KLEEN CLOSURE of any R* S* is a RE
Identity Rule or Algebric rule
EXAMPLE
Example
EXAMPLES
1) Design a RE to accept all possible combinations of a¡¯s and b¡¯s over
?={ a, b}
2) Design a RE which Starts with 1 and end with 0.
Ad

Recommended

Compiler Design
Compiler Design
Anujashejwal
?
Chapter 2.pptx compiler design lecture note
Chapter 2.pptx compiler design lecture note
adugnanegero
?
compiler.pdfljdvgepitju4io3elkhldhyreyio4uw
compiler.pdfljdvgepitju4io3elkhldhyreyio4uw
abhinandpk2405
?
Cd ch2 - lexical analysis
Cd ch2 - lexical analysis
mengistu23
?
Data design and analysis of computing tools
Data design and analysis of computing tools
KamranAli649587
?
Compiler Design.pptx
Compiler Design.pptx
SouvikRoy149
?
Role-of-lexical-analysis
Role-of-lexical-analysis
Dattatray Gandhmal
?
Unit1.ppt
Unit1.ppt
BerlinShaheema2
?
11700220036.pdf
11700220036.pdf
SouvikRoy149
?
SS & CD Module 3
SS & CD Module 3
ShwetaNirmanik
?
Module 2
Module 2
ShwetaNirmanik
?
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
TANZINTANZINA
?
Lexical Analysis
Lexical Analysis
Munni28
?
Structure of the compiler
Structure of the compiler
Sudhaa Ravi
?
COMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptx
Rossy719186
?
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
GunjalSanjay
?
1._Introduction_.pptx
1._Introduction_.pptx
Anbarasan Radhakrishnan R
?
COMPILER DESIGN.pdf
COMPILER DESIGN.pdf
ManishBej3
?
1.Role lexical Analyzer
1.Role lexical Analyzer
Radhakrishnan Chinnusamy
?
atc 3rd module compiler and automata.ppt
atc 3rd module compiler and automata.ppt
ranjan317165
?
Plc part 2
Plc part 2
Taymoor Nazmy
?
role of lexical parser compiler design1-181124035217.pdf
role of lexical parser compiler design1-181124035217.pdf
ranjan317165
?
Lexical analysis - Compiler Design
Lexical analysis - Compiler Design
Kuppusamy P
?
automata theroy and compiler designc.pptx
automata theroy and compiler designc.pptx
YashaswiniYashu9555
?
Introduction to Compilers
Introduction to Compilers
vijaya603274
?
A Role of Lexical Analyzer
A Role of Lexical Analyzer
Archana Gopinath
?
CD U1-5.pptx
CD U1-5.pptx
Himajanaidu2
?
Assignment4
Assignment4
Sunita Milind Dol
?
ElysiumPro Company Profile 2025-2026.pdf
ElysiumPro Company Profile 2025-2026.pdf
info751436
?
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Yannis
?

More Related Content

Similar to An Introduction to the Compiler Designss (20)

11700220036.pdf
11700220036.pdf
SouvikRoy149
?
SS & CD Module 3
SS & CD Module 3
ShwetaNirmanik
?
Module 2
Module 2
ShwetaNirmanik
?
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
TANZINTANZINA
?
Lexical Analysis
Lexical Analysis
Munni28
?
Structure of the compiler
Structure of the compiler
Sudhaa Ravi
?
COMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptx
Rossy719186
?
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
GunjalSanjay
?
1._Introduction_.pptx
1._Introduction_.pptx
Anbarasan Radhakrishnan R
?
COMPILER DESIGN.pdf
COMPILER DESIGN.pdf
ManishBej3
?
1.Role lexical Analyzer
1.Role lexical Analyzer
Radhakrishnan Chinnusamy
?
atc 3rd module compiler and automata.ppt
atc 3rd module compiler and automata.ppt
ranjan317165
?
Plc part 2
Plc part 2
Taymoor Nazmy
?
role of lexical parser compiler design1-181124035217.pdf
role of lexical parser compiler design1-181124035217.pdf
ranjan317165
?
Lexical analysis - Compiler Design
Lexical analysis - Compiler Design
Kuppusamy P
?
automata theroy and compiler designc.pptx
automata theroy and compiler designc.pptx
YashaswiniYashu9555
?
Introduction to Compilers
Introduction to Compilers
vijaya603274
?
A Role of Lexical Analyzer
A Role of Lexical Analyzer
Archana Gopinath
?
CD U1-5.pptx
CD U1-5.pptx
Himajanaidu2
?
Assignment4
Assignment4
Sunita Milind Dol
?
3a. Context Free Grammar.pdf
3a. Context Free Grammar.pdf
TANZINTANZINA
?
Lexical Analysis
Lexical Analysis
Munni28
?
Structure of the compiler
Structure of the compiler
Sudhaa Ravi
?
COMPILER CONSTRUCTION KU 1.pptx
COMPILER CONSTRUCTION KU 1.pptx
Rossy719186
?
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
Structure of a Compiler, Compiler and Interpreter, Lexical Analysis: Role of ...
GunjalSanjay
?
COMPILER DESIGN.pdf
COMPILER DESIGN.pdf
ManishBej3
?
atc 3rd module compiler and automata.ppt
atc 3rd module compiler and automata.ppt
ranjan317165
?
role of lexical parser compiler design1-181124035217.pdf
role of lexical parser compiler design1-181124035217.pdf
ranjan317165
?
Lexical analysis - Compiler Design
Lexical analysis - Compiler Design
Kuppusamy P
?
automata theroy and compiler designc.pptx
automata theroy and compiler designc.pptx
YashaswiniYashu9555
?
Introduction to Compilers
Introduction to Compilers
vijaya603274
?

Recently uploaded (20)

ElysiumPro Company Profile 2025-2026.pdf
ElysiumPro Company Profile 2025-2026.pdf
info751436
?
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Yannis
?
Machine Learning - Classification Algorithms
Machine Learning - Classification Algorithms
resming1
?
3. What is the principles of Teamwork_Module_V1.0.ppt
3. What is the principles of Teamwork_Module_V1.0.ppt
engaash9
?
chemistry investigatory project for class 12
chemistry investigatory project for class 12
Susis10
?
Engineering Mechanics Introduction and its Application
Engineering Mechanics Introduction and its Application
Sakthivel M
?
David Boutry - Mentors Junior Developers
David Boutry - Mentors Junior Developers
David Boutry
?
20CE601- DESIGN OF STEEL STRUCTURES ,INTRODUCTION AND ALLOWABLE STRESS DESIGN
20CE601- DESIGN OF STEEL STRUCTURES ,INTRODUCTION AND ALLOWABLE STRESS DESIGN
gowthamvicky1
?
4th International Conference on Computer Science and Information Technology (...
4th International Conference on Computer Science and Information Technology (...
ijait
?
Fundamentals of Digital Design_Class_12th April.pptx
Fundamentals of Digital Design_Class_12th April.pptx
drdebarshi1993
?
60 Years and Beyond eBook 1234567891.pdf
60 Years and Beyond eBook 1234567891.pdf
waseemalazzeh
?
Impurities of Water and their Significance.pptx
Impurities of Water and their Significance.pptx
dhanashree78
?
×îаæÃÀ¹úʥĪÄῨѧԺ±ÏÒµÖ¤£¨³§²Ñ°ä±ÏÒµÖ¤Ê飩ԭ°æ¶¨ÖÆ
×îаæÃÀ¹úʥĪÄῨѧԺ±ÏÒµÖ¤£¨³§²Ñ°ä±ÏÒµÖ¤Ê飩ԭ°æ¶¨ÖÆ
Taqyea
?
Fundamentals of Digital Design_Class_21st May - Copy.pptx
Fundamentals of Digital Design_Class_21st May - Copy.pptx
drdebarshi1993
?
NALCO Green Anode Plant,Compositions of CPC,Pitch
NALCO Green Anode Plant,Compositions of CPC,Pitch
arpitprachi123
?
Pavement and its types, Application of rigid and Flexible Pavements
Pavement and its types, Application of rigid and Flexible Pavements
Sakthivel M
?
machine learning is a advance technology
machine learning is a advance technology
ynancy893
?
Great power lithium iron phosphate cells
Great power lithium iron phosphate cells
salmankhan835951
?
Modern multi-proposer consensus implementations
Modern multi-proposer consensus implementations
Fran?ois Garillot
?
Montreal Dreamin' 25 - Introduction to the MuleSoft AI Chain (MAC) Project
Montreal Dreamin' 25 - Introduction to the MuleSoft AI Chain (MAC) Project
Alexandra N. Martinez
?
ElysiumPro Company Profile 2025-2026.pdf
ElysiumPro Company Profile 2025-2026.pdf
info751436
?
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Rigor, ethics, wellbeing and resilience in the ICT doctoral journey
Yannis
?
Machine Learning - Classification Algorithms
Machine Learning - Classification Algorithms
resming1
?
3. What is the principles of Teamwork_Module_V1.0.ppt
3. What is the principles of Teamwork_Module_V1.0.ppt
engaash9
?
chemistry investigatory project for class 12
chemistry investigatory project for class 12
Susis10
?
Engineering Mechanics Introduction and its Application
Engineering Mechanics Introduction and its Application
Sakthivel M
?
David Boutry - Mentors Junior Developers
David Boutry - Mentors Junior Developers
David Boutry
?
20CE601- DESIGN OF STEEL STRUCTURES ,INTRODUCTION AND ALLOWABLE STRESS DESIGN
20CE601- DESIGN OF STEEL STRUCTURES ,INTRODUCTION AND ALLOWABLE STRESS DESIGN
gowthamvicky1
?
4th International Conference on Computer Science and Information Technology (...
4th International Conference on Computer Science and Information Technology (...
ijait
?
Fundamentals of Digital Design_Class_12th April.pptx
Fundamentals of Digital Design_Class_12th April.pptx
drdebarshi1993
?
60 Years and Beyond eBook 1234567891.pdf
60 Years and Beyond eBook 1234567891.pdf
waseemalazzeh
?
Impurities of Water and their Significance.pptx
Impurities of Water and their Significance.pptx
dhanashree78
?
×îаæÃÀ¹úʥĪÄῨѧԺ±ÏÒµÖ¤£¨³§²Ñ°ä±ÏÒµÖ¤Ê飩ԭ°æ¶¨ÖÆ
×îаæÃÀ¹úʥĪÄῨѧԺ±ÏÒµÖ¤£¨³§²Ñ°ä±ÏÒµÖ¤Ê飩ԭ°æ¶¨ÖÆ
Taqyea
?
Fundamentals of Digital Design_Class_21st May - Copy.pptx
Fundamentals of Digital Design_Class_21st May - Copy.pptx
drdebarshi1993
?
NALCO Green Anode Plant,Compositions of CPC,Pitch
NALCO Green Anode Plant,Compositions of CPC,Pitch
arpitprachi123
?
Pavement and its types, Application of rigid and Flexible Pavements
Pavement and its types, Application of rigid and Flexible Pavements
Sakthivel M
?
machine learning is a advance technology
machine learning is a advance technology
ynancy893
?
Great power lithium iron phosphate cells
Great power lithium iron phosphate cells
salmankhan835951
?
Modern multi-proposer consensus implementations
Modern multi-proposer consensus implementations
Fran?ois Garillot
?
Montreal Dreamin' 25 - Introduction to the MuleSoft AI Chain (MAC) Project
Montreal Dreamin' 25 - Introduction to the MuleSoft AI Chain (MAC) Project
Alexandra N. Martinez
?
Ad

An Introduction to the Compiler Designss

  • 1. 20IT011-COMPILER DESIGN Text Book : Alfred V Aho, Monica S Lam, Ravi Sethi and Jeffrey D Ullman, ¡°Compilers - Principles, Techniques and Tools¡±, Pearson Education, 2nd Edition, 2018.
  • 2. COURSE OUTCOME CO1: Ability to understand the functionalities of the different phases of compiler. CO2: Ability to understand syntax-directed translation and run-time environment. CO3: Ability to implement code optimization techniques. CO4: Ability to apply different parsing algorithms to develop the parsers for a given grammar. CO5: Ability to design a scanner and a parser using LEX and YACC tools.
  • 4. UNIT-I SYLLABUS Structure of a compiler ¨C Lexical Analysis ¨C Role of Lexical Analyzer -Input Buffering ¨C Specification of Tokens ¨C Recognition of Tokens- Lex ¨C Finite Automata-Regular Expressions to Automata ¨C Minimizing DFA.
  • 5. STRUCTURE OF A COMPILER ? Definition: A program that accept as input a program text in a certain language and produces as output a program text in another language (Grune et all) ? Compiler is a translator program that reads a program written in one language - the source language- and translates it into an equivalent program in another language-the target language.
  • 7. PHASES OF A COMPILER
  • 8. Lexical Analysis ? In a compiler, linear analysis is called lexical analysis or scanning. ? It takes source code as input ? It reads one character at a time and convert into lexemes( represent in the form of tokens) Example: position = initial + rate * 60 ? The identifier position. ? The assignment symbol =. ? The identifier initial. ? The plus sign. ? The identifier rate. ? The multiplication sign. ? The number 60. <Id,Position> <=,> <Id,initial> <+,> <Id,rate><*,> <num,60>
  • 9. Contd.. ? Needs to record each id attribute: keep a symbol table ? Lexical Analysis eliminates white spaces, etc..
  • 10. Syntax Analysis ? It is called as parsing or Syntax Analysis ? It takes token as input and generate parse tree as output. ? The parser checks that the expression made by tokens is syntactically correct or not. ? Context free grammars formalize the rules and guide syntax analysis.
  • 11. Semantic Analysis ? It checks the parse tree follows the rules of language ? It keeps tracks of identifiers, types and expressions ? Output is annotated as tree syntax
  • 12. Intermediate Code Generator ? After semantic analysis, some compilers generate an explicit intermediate representation of the source program. ? This representation should be easy to produce and easy to translate into the target program.
  • 13. Code Optimization ? It is used to improve the intermediate code ? So the output can run fast and take less space ? It removes unnecessary lines of code and arrange the sequence in order to speed up the execution.
  • 14. Code Generation ? The final phase of the compiler is the generation of target code, consisting of relocatable machine code or assembly code. ? The intermediate instructions are each translated into sequence of machine instructions that perform the same task.
  • 15. Symbol table Manager ? An essential function of a compiler is to record the identifiers used in the source program and collect its information. ? A symbol table is a data structure containing a record for each identifier with fields for attributes.(such as, its type, its scope, if procedure names then the number and type of arguments etc.,) ? The data structure allows to find the record for each identifier and store or retrieve data from that record quickly. Example: sum=initial + value * 10 1 sum 2 initial 3 value 4 ¡­.
  • 16. Error Handling and Reporting ? Each phase can encounter errors. After detecting an error, a phase must deal that error, so that compilation can proceed ,allowing further errors to be detected. ? Lexical phase can detect error where the characters remaining in the input do not form any token. ? The syntax and semantic phases handle large fraction of errors. The stream of tokens violates the syntax rules are determined by syntax phase. ? During semantic, the compiler tries to detect constructs that have the right syntactic structure but no meaning to the operation involved.
  • 17. Role of Lexical Analyser
  • 18. Role of lexical Analyzer ? As the first phase of a compiler, the main task of the lexical analyzer is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens for each lexeme in the source program. ? The stream of tokens is sent to the parser for syntax analysis. ? It is common for the lexical analyzer to interact with the symbol table as well. ? When the lexical analyzer discovers a lexeme constituting an identifier, it needs to enter that lexeme into the symbol table.
  • 19. Contd.. ? In some cases, information regarding the kind of identifier may be read from the symbol table by the lexical analyzer to assist it in determining the proper token it must pass to the parser.
  • 20. Interaction between the lexical analyzer and the parser ? The interaction is implemented by having the parser call the lexical analyzer. ? The call, suggested by the getNextToken command, causes the lexical analyzer to read characters from its input until it can identify the next lexeme and produce for it the next token, which it returns to the parser. ? Since the lexical analyzer is the part of the compiler that reads the source text, it may perform certain other tasks besides identification of lexemes
  • 21. Contd.. ? One such task is stripping out comments and whitespace (blank, newline, tab, and perhaps other characters that are used to separate tokens in the input). ? Another task is correlating error messages generated by the compiler with the source program. ? lexical analyzers are divided into a cascade of two processes: ? Scanning consists of the simple processes that do not require tokenization of the input, such as deletion of comments and compaction of consecutive whitespace characters into one. ? Lexical analysis proper is the more complex portion, where the scanner produces the sequence of tokens as output.
  • 22. Lexical Analysis and Parsing ? Simplicity of design ¨C avoid complexity to work together . ? Improving Compiler Efficiency - A separate lexical analyzer allows us to apply specialized techniques that serve only the lexical task, not the job of parsing. In addition, specialized buffering techniques for reading input characters can speed up the compiler significantly. ? Enhancing Compiler Portability -Input-device-specific peculiarities can be restricted to the lexical analyzer.
  • 23. Tokens, Patterns, Lexemes ? A token is a pair consisting of a token name and an optional attribute value. ? A pattern is a description of the form that the lexemes of a token may take. ? A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token.
  • 26. Lexical Errors ? Some errors are out of power of lexical analyzer to recognize: ?fi(a==f(x)).. ? However it may be able to recognize errors like: ?d=2r ? Such errors are recognized when no pattern for tokens matches a character sequence
  • 28. Error Recovery ? Panic Mode: Successive characters are ignored until we reach to a well formed token ? Delete one character from the remaining input ? Insert a missing character into the remaining input ? Replace a character by another character ? Transpose two adjacent characters
  • 30. Input Buffering ? Input buffer helps to find the correct lexeme. ? Sometimes lexical analyzer needs to look ahead some symbols to decide about the token to return - in C language: we need to look after -,= or < to decide what token to return - in FORTRAN: DO 5 I=1.25 ? We need to introduce a two buffer scheme to handle large look-aheads safely
  • 31. Buffer Pairs Two pointers to the input are maintained: I. Pointer lexemeBegin, marks the beginning of the current lexeme, whose extent we are attempting to determine. II. Pointer forward scans ahead until a pattern match is found
  • 32. Sentinels ? A sentinel is a special character or token used to mark the end of a sequence. ? In the context of a lexical analyzer, a sentinel can be utilized to indicate the end of the input source code or the end of a specific token.
  • 34. QUIZ 1. ______________ is the heart of the compiler. 2.what will be the next phase of Intermediate code generator? 3. What is the output of Lexical analyzer? 4. Token consist of pairs, what are they? 5. What is the command ,the parser generate to lexical analyzer? 6. What is the drawback in 1-Buffer? 7. What the two pointers used in Buffering technique? 8. What are the Back-end in Phases of compiler?
  • 36. SPECIFICATION OF TOKENS ? In theory of compilation regular expressions are used to formalize the specification of tokens. ? The specification of tokens depends on the pattern of the lexeme. Example: letter(letter+digit)* ? Each regular expression is a pattern specifying the form of strings.
  • 37. Contd.. I) Alphabet, Strings, Languages II) Operations on Language III) Regular Expression IV) Regular Definition
  • 39. Strings and Languages Symbol: ? A symbol is an abstract entity that we shall not define formally. Examples: ? Letters: A|B|C¡­..|Z |a|b¡­.|z ? Digit:0,1,2¡­9 ? Special Characters
  • 40. Contd.. Alphabet: ? Alphabet is a non-empty finite set of symbols. it is denoted by the symbol ?(Sigma). Example: ? ?={0,1} is the binary alphabet which consisting of digits ? ?={a,b,c} is an alphabet which consisting of letters. ? ASCII is an alphabet ;it is used in many softwares. ? Unicode is an alphabet , which includes 1,00,000 characters.
  • 41. Contd.. String: ? A String is defined as finite sequence of symbols over an alphabet ? . It is denoted as ¡®w¡¯. ? In language theory, the terms ¡°sentence¡± and ¡°word¡± are often used as synonyms for ¡°string¡± Examples: 1.Alphabet ?={ a, b } Strings: w = {a, b, aa, ab, ba, bb, aaa, aab,¡­¡­..} 2. ?={ 0, 1 } Strings: w = { 0, 1, 00, 01, 10, 11, 000, 001, 010, 011,100,101,110,111, 0000,....}
  • 42. Different Notation used in a string ? Length ? Prefix:- any number of leading symbols ? Proper Prefix :- any number of leading symbols without € and original string ? Suffix:- tailing symbols ? Proper Suffix:- any number of trailing symbols without € and original string ? Substring
  • 46. Definition - A regular expression over ? can be defined as ? ? is a regular expression for empty set ? ? is a regular expression for null string {?} ? If ¡®a¡¯ is a symbol in ? then ¡®a¡¯ is RE for the set {a}. ? If R and S are two RE then ? UNION of two R+E is a RE ? CONCATENATION of two RS is a RE ? KLEEN CLOSURE of any R* S* is a RE
  • 47. Identity Rule or Algebric rule
  • 50. EXAMPLES 1) Design a RE to accept all possible combinations of a¡¯s and b¡¯s over ?={ a, b} 2) Design a RE which Starts with 1 and end with 0.