splimter
11/1/2019 - 6:52 PM

compilation quick note

RegEx

  • . any character except newline
  • \n newline
  • * zero or more copies of the preceding expression
  • + one or more copies of the preceding expression
  • ? zero or one copy of the preceding expression
  • ^ beginning of line
  • $ end of line
  • a|b a or b
  • (ab)+ one or more copies of ab (grouping)
  • "a+b" literal "a+b" (C escapes still work)
  • [] character class

Example

  • abc abc
  • abc* ab abc abcc abccc ...
  • abc+ abc abcc abccc ...
  • a(bc)+ abc abcbc abcbcbc ...
  • a(bc)? a abc
  • [abc] one of: a, b, c
  • [a-z] any letter, a-z
  • [a\-z] one of: a, -, z
  • [-az] one of: -, a, z
  • [A-Za-z0-9]+ one or more alphanumeric characters
  • [ \t\n]+ whitespace
  • [^ab] anything except: a, b
  • [a^b] one of: a, ^, b
  • [a|b] one of: a, |, b
  • a|b one of: a, b
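
The table above can be spot-checked outside lex. The following is a minimal sketch that assumes the POSIX <regex.h> extended syntax is close enough to lex's for these patterns; it is not lex's own matcher, and full_match is a hypothetical helper name. Lex-only forms such as the quoted "a+b" and the escaped [a\-z] are treated differently by POSIX and are not covered.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if `text` matches `pattern` in full, 0 if it does not,
 * and -1 if the pattern cannot be compiled.
 * The pattern is wrapped as ^(pattern)$ so partial matches do not count. */
int full_match(const char *pattern, const char *text) {
    size_t len = strlen(pattern) + 5;   /* "^(" + pattern + ")$" + NUL */
    char *anchored = malloc(len);
    if (anchored == NULL) return -1;
    snprintf(anchored, len, "^(%s)$", pattern);

    regex_t re;
    int rc = regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB);
    free(anchored);
    if (rc != 0) return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For instance, full_match("a(bc)+", "abcbc") gives 1, while full_match("abc+", "ab") gives 0, in line with the rows above.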

Struct

  • A lex input file has three sections separated by %%. The first %% is always required, as there must always be a rules section.
... definitions ...
%%
... rules ...
%%
... subroutines ...
  • The example below copies its input to output one character at a time. It has two patterns, “.” and “\n”, with an ECHO action associated with each.
  • Several macros and variables are predefined by lex. ECHO is a macro that writes the text matched by the pattern. It is also the default action for any unmatched input.
  • The function yywrap is called by lex when input is exhausted. Return 1 if you are done or 0 if more processing is required.
  • The function yylex is the main entry point for lex.
%%
. ECHO;
\n ECHO;
%%

int yywrap(void) {
    return 1;
}

int main(void) {
    yylex();
    return 0;
}
  • The following example prepends line numbers to each line in a file. Some implementations of lex predefine and calculate yylineno. The input file for lex is yyin and defaults to stdin.
%{
int yylineno;
%}

%%

^(.*)\n printf("%4d\t%s", ++yylineno, yytext);

%%

int yywrap(void) {
    return 1;
}

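int main(int argc, char *argv[]) {
    /* Fail cleanly if the file argument is missing or cannot be opened. */
    if (argc < 2 || (yyin = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    yylex();
    fclose(yyin);
    return 0;
}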
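  • For comparison, here is a rough plain-C sketch of what the line-numbering rule does, with no lex involved. The name number_lines is hypothetical, and the sketch assumes the counter starts at 0 as in the listing above. A final line with no trailing newline does not match ^(.*)\n, so it is copied unnumbered, as the default rule would do.

```c
#include <stdio.h>
#include <string.h>

/* Copy `in` to `out`, prefixing each newline-terminated line with its
 * number in the same "%4d\t" format as the rule.
 * Returns the number of lines that were numbered. */
int number_lines(const char *in, FILE *out) {
    int lineno = 0;
    const char *nl;
    while ((nl = strchr(in, '\n')) != NULL) {
        fprintf(out, "%4d\t%.*s\n", ++lineno, (int)(nl - in), in);
        in = nl + 1;
    }
    fputs(in, out);   /* unterminated tail: copied unchanged */
    return lineno;
}
```

Calling number_lines("a\nbb\nc", stdout) numbers the first two lines and then writes the trailing "c" with no prefix.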
  • Here is a scanner that counts the number of characters, words, and lines in a file:
%{
int nchar, nword, nline;
%}

%%

\n { nline++; nchar++; }
[^ \t\n]+ { nword++; nchar += yyleng; }
. { nchar++; }

%%

int yywrap(void) {
    return 1;
}

int main(void) {
    yylex();
    printf("%d\t%d\t%d\n", nchar, nword, nline);
    return 0;
}
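  • To cross-check the scanner's totals, the same three rules can be written by hand. This is only a sketch: struct counts and count_text are hypothetical names, and a word is assumed to be a maximal run of characters other than space, tab, and newline, as in the [^ \t\n]+ pattern.

```c
struct counts { int nchar, nword, nline; };

/* Count characters, words, and newlines in a NUL-terminated string,
 * following the three scanner rules. */
struct counts count_text(const char *s) {
    struct counts c = {0, 0, 0};
    int in_word = 0;                 /* 1 while inside a run of non-blanks */
    for (; *s != '\0'; s++) {
        c.nchar++;
        if (*s == '\n') c.nline++;
        if (*s == ' ' || *s == '\t' || *s == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;             /* first character of a new word */
            c.nword++;
        }
    }
    return c;
}
```

For the input "hello world\nfoo\n" this gives 16 characters, 3 words, and 2 lines. The scanner prints its totals in that order (characters, words, lines), which is the reverse of wc's lines, words, bytes.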