Introduction to Compiler Theory
Lect. dr. Gianina Georgescu
3rd Year, 1st Semester
From Regular Expressions to Deterministic Finite Automata
In order to see if a word is in the language generated by a RE (Regular Expression), one should convert this expression to a DFA (Deterministic Finite Automaton) that will recognize the same language as the corresponding RE.
There are at least two methods to convert a RE to a DFA:
- by using Thompson's construction to convert it to a NFA (Non-deterministic Finite Automata) with ε-transitions, then converting this NFA to a NFA without ε-transitions and then to a DFA
- directly, using an algorithm that builds the tree corresponding to the RE, then calculates the nullable, firstpos, lastpos and followpos functions for each node of the tree and then builds the DFA using this functions
For more information on the second approach, check out a sketch of the algorithm (in PDF format).
The following source code implements this algorithm and is able to tell if a word is in the language generated by a RE: RegExpToDFA.cpp.
Here is a sample output produced by the source code:
abbbcc -> true
ba -> false
ab -> true
abc -> true
abb -> true
abcbccbccccc -> true
ac -> false
Context Free Grammar Recognizer
Source code and comments coming soon.
Here is a sample output produced by the source code:
A -> aS | a | bAA
B -> bS | b | aBB
S -> aB | bA
Number of words: 2
abaabbab
S -> aB
B -> bS
S -> aB
B -> aBB
B -> bS
S -> bA
A -> a
B -> b
true
abaaaba
false