Waiting for answer This question has not been answered yet. You can hire a professional tutor to get the answer.

QUESTION

REMINDER: This is an INDIVIDUAL assignment. You may not share / look at anybody else’s code. Code similarity will be checked and the honor pol- icy will be invoked if necessary. In this assignment

REMINDER: This is an INDIVIDUAL assignment.

You may not share / look at anybody else’s code.

Code similarity will be checked and the honor pol-

icy will be invoked if necessary.

In this assignment, you will implement a simple regu-

lar expression parser (with a very simplified input format).

The goals of this assignment are to:

• Gain an appreciation for how simply our theoretical

understanding of regular expressions can be applied

to implement a regular expression recognizer.

• To see the connection between our proof that regular

expressions are equivalent to NFAs by implementing some code that converts between them.

• To see the connection between our proof that regular

expressions are closed under common operations by

implementing some code that applies those ideas.

Regular expression matchers are a commonly used tool in computer science, and they directly relate to

the concept of a DFA/NFA from class. You have probably used a website before that forced you to type

in an email address in a proper format, and complained if your entry was ”not a valid email address”.

For this assignment we will give you a simple regular expression and a series of strings that may or may

not match that expression. For each string, you will output YES if the string matches the expression, and

NO if it does not.

There are many different ways to implement such features, but I would like you to use the implemen-

tation described below so that your work best reflects our discussion from class. This is despite the fact

that there are more memory efficient implementations. In addition, we give you a good amount of code to

start with that will make the assignment much less cumbersome for you. The rest of this document will

describe the input you can expect, the output you should provide, along with some details on how you

should implement this.

Input

All input will be provided to standard input (System.in). The first line of input is the regular expression

that needs to be parsed. It has a very simplified format. Each character in the string will be one of the

following:

• d: The lower-case character d matches to any single numeric digit 0-9.

• a: The lower-case character a matches to any lower-case alphabetical letter a-z.

• *: The star symbol means the previous regular expression can be repeated 0 or more times. For

example, the expression (dda)* would match with the empty string, 32z, 34z76b, etc.

1

• U: The capital letter U denotes union. The string matches the expression if it matches the left side

of this operator OR it matches the right side of this operator. For example, aaUdd would match for

string ba or 97, but not for a7.

We will adhere to a couple of rules when formatting the input in order to make the implementation a

little bit easier for you. The input is gauranteed to follow the following rules:

• Parentheses will only appear when grouping an expression for use with the * operator (and thus the

* operator will always appear after every right paren). Parentheses will never appear in any other

context. So, you might see a(dda)* but you will never see (ad)d*.

• Notice that the star operator will always appear after a right paren, but it can also appear after a

single character a or d, such as aad* or aa*d.

• In addition, no other operator (* or U) will be inside those parentheses if present.

• The union operator U will always apply to the entire expression to the left and right of the operator

(up to but not including the nearest U to the left or right). For example, aaddU(da)*a is the

entire expression aadd unioned with the entire expression (da)*a. You will never see something like

aa(ddUdaa)* because the parentheses mean that the union does not apply to everything to its left

and this violates one of the rules above. In other words, the union operator will never be inside of

parentheses.

• There will never be nested parentheses. You might see something like (dd)*(aa)* but you will never

see (dd(aa)*)*.

The next line will contain a single number n, which is the number of strings that will be given to match

with. The next n lines after that will each give a single string that is a potential match to the regular

expression.

Output

For each of the n example strings. Output YES on a single line if the string matches the regular expression,

and NO if it does not.

Getting Started

Even with such a simple format, this is not a trivial implementation. I am providing starter code that will

help get your started and you will be asked to implement a specific few of the empty methods provided.

• Step 1: Understand the provided NFA object: The first thing you should do is read the NFA

class that is provided annd understand its methods (yes, simulating non-determinism will make this

much easier). This object contains fields for each part of an NFA (list of states in the machine,

the start state, list of final states, list of transitions). You should understand the provided methods

that allow you to add states, add transitions between states based on characters (numeric digits or

alphabetic characters), change states to be final or start or back to normal, etc.

2

• Step 2: Implement the empty NFA methods that apply the *, U, and concatenation

operators to a given machine: In the NFA class, you will see three methods that given a current

NFA object (which represents some regular expression), applies one of the three respective operators

to that machine. Your next task is to implement these methods.

• Step 3: Implement the build NFA method: In Main.java, there is a method that takes the

regular expression and turns it into an NFA. We have provided comments that summarize how we

approached this method if you would like to use that as a guide.

• Step 4: Implement the acceptsString method in NFA.java: This method should take a string

and simulate the NFA moving from state(s) to state(s) given each input character. If the machine

ends in a final state, you should return true (false otherwise).

Running the Code

The sample code provided contains a simple Makefile. You can compile the code by simply opening a

terminal, navigating to the project directory, and type make. To run the code, simply type make run.

Submission

You should submit four files (Main.java, NFA.java, QSig.java, Makefile). The second two of those four do

not need to be altered to complete the assignment.

Sample Input

aadaUaaadaUaaadaa

5

mrf8t

mk5sc

lab2d

bea3ch

lbh1

(a)*dd(a)*

4

23

a56

bas98edd

as98f7f

Sample Output

YES

NO

YES

YES

NO

YES

YES

YES

NO

Show more
LEARN MORE EFFECTIVELY AND GET BETTER GRADES!
Ask a Question