When we first set up the regular expressions for the Lexer, the C code attached to each one (the code within braces) consisted only of print statements:
"return" { printf("RETURN\n"); }
For the Lexer to return tokens to the Parser, we need to modify this C code. For example, when the pattern "return" is matched, a token representing the return keyword should be returned to the Parser:
"return" { return RETURN; }
In the above code, RETURN is a token that is returned to the Parser. Simply returning a token is sufficient for keywords; however, sometimes we want to return more. Say a number is encountered. We want to return not only a number token but also the value of the number that was matched. For this, a variable called yylval, whose type is a union, is used.
yylval is a global variable used to pass values from the Lexer to the Parser. Because its type is a union, it can carry a value of any one of several types at a time (note that the union is defined in the Parser file):
%union {
    int intval;
    double doubleval;
    char charval;
    char* str;
    struct AST_Node* node;
};
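To make the mechanism concrete, here is a plain-C sketch of roughly what Bison generates from the %union declaration: a union type named YYSTYPE and a global yylval of that type. The token codes and the fake_yylex function are hypothetical stand-ins written for illustration; in a real build the codes come from the header Bison generates and yylex comes from the Lexer.

```c
#include <assert.h>
#include <stddef.h>

struct AST_Node;  /* opaque here; assumed to be defined elsewhere */

/* Roughly the type Bison derives from the %union declaration. */
typedef union YYSTYPE {
    int intval;
    double doubleval;
    char charval;
    char *str;
    struct AST_Node *node;
} YYSTYPE;

/* The global the Lexer writes and the Parser reads. */
YYSTYPE yylval;

/* Hypothetical token codes; the values are arbitrary for this sketch. */
enum { INT = 258, RETURN, ID, ICONST };

/* Hypothetical stand-in for yylex(): step 0 yields a keyword token,
   any other step yields a number token plus its value in yylval. */
int fake_yylex(int step)
{
    assert(step >= 0);
    if (step == 0) {
        return RETURN;       /* keyword: the token alone is enough */
    }
    yylval.intval = 42;      /* number: store the value first */
    return ICONST;           /* then return the token */
}
```

Because a union stores only one member at a time, the Parser should read the member that corresponds to the token it just received (intval after ICONST in this sketch).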
Now that we have yylval, numbers and other desired values (e.g., variable names in the form of strings) can be passed to the Parser:
[0-9]+ {
    int value = atoi(yytext);
    yylval.intval = value;
    return ICONST;
}
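One caveat with atoi is that it cannot report errors, and its behavior is undefined when the number does not fit in an int. A possible alternative, sketched below, uses strtol with a range check. The helper name parse_int_literal is hypothetical and not part of Flex or Bison; it is one way the conversion could be made safer, not the method this tutorial relies on.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts a decimal string to an int.
   Returns 1 and stores the result in *out on success.
   Returns 0 if no digits were read, if trailing characters remain,
   or if the value does not fit in an int. */
int parse_int_literal(const char *text, int *out)
{
    char *end = NULL;
    long value;

    assert(text != NULL && out != NULL);

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0') {
        return 0;  /* nothing converted, or leftover characters */
    }
    if (errno == ERANGE || value > INT_MAX || value < INT_MIN) {
        return 0;  /* out of range for long or for int */
    }
    *out = (int)value;
    return 1;
}
```

Inside the [0-9]+ action, such a helper could be called with yytext and &yylval.intval, with the failure case reported as a lexical error.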
The following tokens can now be passed to the Parser from the Lexer (many more will be needed):
%%
"int" { return INT; }
"return" { return RETURN; }
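[a-zA-Z_][a-zA-Z0-9_]* {
    /* _strdup is the MSVC spelling; on POSIX systems use strdup.
       The copy is heap-allocated and must eventually be freed. */
    yylval.str = _strdup(yytext);
    return ID;
}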
[0-9]+ {
    int value = atoi(yytext);
    yylval.intval = value;
    return ICONST;
}
"+" { return PLUS; }
"-" { return MINUS; }
"/" { return DIVIDE; }
"*" { return MULT; }
"=" { return ASSIGN; }
";" { return SEMI; }
";" { return SEMI; }
"(" { return OP_PAR; }
")" { return CL_PAR; }
"{" { return OP_BRACE; }
"}" { return CL_BRACE; }
%%
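The order of the rules above matters. Flex picks the rule that matches the longest piece of input, and when two rules match the same length it picks the one listed first. That is why the "int" and "return" rules sit above the identifier rule: the text int matches both, and the earlier rule wins, while integer is longer than int and so becomes an identifier. The plain-C sketch below mimics that outcome for a lexeme that has already been matched in full. The function classify_word and the token codes are hypothetical, for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; the values are arbitrary for this sketch. */
enum { INT = 258, RETURN, ID, ICONST };

/* Mimics how a fully matched word is resolved by the rules:
   an exact keyword match wins over the identifier rule because
   the keyword rules are listed first; anything else is an ID. */
int classify_word(const char *lexeme)
{
    assert(lexeme != NULL);
    if (strcmp(lexeme, "int") == 0) {
        return INT;
    }
    if (strcmp(lexeme, "return") == 0) {
        return RETURN;
    }
    return ID;
}
```

If the identifier rule were listed first instead, the keyword rules would never fire, since every keyword also matches the identifier pattern at the same length.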