Technion - Israel Institute of Technology

02360360 - Theory Of Compilation

Winter 2019-2020

Frequently Asked Questions - HW1

Regarding printing the STRING lexeme, can the ASCII escape sequence use both uppercase and lowercase letters in its hexadecimal digits?
Yes. For both "\x63oo\x6C" and "\x63oo\x6c" you should print "cool".

Regarding printing the STRING lexeme, can the ASCII escape sequence (\xdd) represent a non-printable character?
No. The printable character range is 00-7F. Therefore, if the string contains an ASCII escape sequence that is out of range, you should print "Error undefined escape sequence xdd\n" (as in section 3 of the PDF). For example, for the input "\xff" the output is: "Error undefined escape sequence xff\n"
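The two answers above (both letter cases are accepted, and values above 7F are rejected) can be sketched as a small helper. This is only an illustration: the function name decodeHexEscape and the convention of returning -1 for an undefined escape are assumptions, not part of the assignment.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: decodes the two hex digits that follow "\x".
// Returns the character value (0x00..0x7F), or -1 when the digits are
// malformed or the value is out of the range accepted by the FAQ.
int decodeHexEscape(const std::string& twoDigits) {
    if (twoDigits.size() != 2) return -1;
    int value = 0;
    for (unsigned char c : twoDigits) {
        if (!std::isxdigit(c)) return -1;
        // Accept both 'A'-'F' and 'a'-'f' by lowering the letter first.
        const int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + digit;
    }
    return value <= 0x7F ? value : -1;
}
```

With this sketch, "6C" and "6c" both decode to 'l', while "ff" yields -1, which is the case where the scanner would report the undefined escape sequence.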

Can a comment appear on the last line of the input file? e.g.
---
int main() {
return 0;
}
// Comment at the end
---
Yes. The correct wording for the COMMENT token is: the lexeme starts with // and is followed by any characters except line-ending characters. Thus, a comment can also appear at the end of the file, where the last line has no line-ending character.
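As a rough illustration of that wording, the following sketch measures a COMMENT lexeme by hand. The function name commentLength is hypothetical, and treating both '\n' and '\r' as line-ending characters is an assumption; check the PDF for the exact definition.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length of the COMMENT lexeme that starts
// at position pos, or 0 if the text at pos does not start with "//".
// ASSUMPTION: '\n' and '\r' are the line-ending characters.
std::size_t commentLength(const std::string& input, std::size_t pos) {
    if (input.compare(pos, 2, "//") != 0) return 0;
    std::size_t end = pos + 2;
    // Stop at a line-ending character OR at the end of the input, so a
    // comment on the last line (with no trailing newline) still matches.
    while (end < input.size() && input[end] != '\n' && input[end] != '\r') {
        ++end;
    }
    return end - pos;
}
```

The loop condition is the important part: reaching the end of the input ends the comment just as a line-ending character does.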

If a string contains an error but is also unclosed, what should we print?
In that case, for example:
---
"Hello \q
---
you should print "Error unclosed string\n"
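One way to picture this precedence is to decide whether the string is closed before reporting anything about its escapes. The sketch below is not the course's reference solution: the names StringStatus, isKnownEscape and classifyString are hypothetical, the set of accepted escape letters is an assumption (use the set from the PDF), and the hex digits after \x are not validated here.

```cpp
#include <cstddef>
#include <string>

enum class StringStatus { Ok, Unclosed, UndefinedEscape };

// ASSUMPTION: illustrative set of escape letters only; the real set is
// defined in the assignment PDF.
bool isKnownEscape(char c) {
    return std::string("\\\"nrt0x").find(c) != std::string::npos;
}

// Classifies the text that follows an opening quote.
// An unclosed string wins over any escape error found inside it.
StringStatus classifyString(const std::string& rest) {
    bool sawUndefinedEscape = false;
    for (std::size_t i = 0; i < rest.size(); ++i) {
        const char c = rest[i];
        if (c == '"') {
            // Closed: only now does an escape error get reported.
            return sawUndefinedEscape ? StringStatus::UndefinedEscape
                                      : StringStatus::Ok;
        }
        if (c == '\n' || c == '\r') break;  // line ended before a closing quote
        if (c == '\\') {
            if (i + 1 >= rest.size() || rest[i + 1] == '\n' || rest[i + 1] == '\r') break;
            if (!isKnownEscape(rest[i + 1])) sawUndefinedEscape = true;
            ++i;  // skip the escaped character
        }
    }
    return StringStatus::Unclosed;
}
```

For the FAQ example, the text after the opening quote is `Hello \q` with no closing quote, so the sketch reports Unclosed even though \q is not a known escape.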

What should the precedence of the tokens be? i.e. in what order should we write the tokens in the scanner.lex file?
The token order should be as in the PDF, or as in the tokens.hpp file.

Regarding part B, is a single number a valid input? i.e. the input file is:
---
5
---
Yes. A single number is a valid input, and you should print it as the result.

What should be the output of the following input?
---
"aa\rb"
---

The output will be:
b STRING aa

Explanation:
We need to replace the two characters "\r" with the single character '\r' in the output.
The '\r' character moves the stdout cursor to the start of the line. Therefore, if we examine the printing process step by step, the following is printed first:
1 STRING aa

Then the '\r' is printed (which only moves the cursor).
Finally, 'b' is printed and overwrites the '1' at the beginning of the line.

Please note:
In escape sequence handling, you only need to replace the escape sequence with the appropriate character.
Don't post-process the string before you print it. E.g., in the example above, don't simulate the effect of the "\r" by building the string "ba" before printing.
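The note above can be illustrated with a minimal sketch that handles only the "\r" escape. The function names are hypothetical, and the "<line> STRING <text>" layout is taken from the example output above; other escape sequences are copied through unchanged here.

```cpp
#include <cstddef>
#include <string>

// Replaces each two-character sequence (backslash, 'r') with the single
// character '\r'. Any other backslash pair is copied as-is in this sketch.
std::string replaceCarriageReturnEscapes(const std::string& body) {
    std::string out;
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (body[i] == '\\' && i + 1 < body.size()) {
            if (body[i + 1] == 'r') {
                out += '\r';
            } else {
                out += body[i];
                out += body[i + 1];
            }
            ++i;  // the pair has been consumed
        } else {
            out += body[i];
        }
    }
    return out;
}

// Builds the text to print. Note that the '\r' stays inside the text;
// the terminal, not the program, moves the cursor when it is printed.
std::string formatStringToken(int line, const std::string& body) {
    return std::to_string(line) + " STRING " + replaceCarriageReturnEscapes(body);
}
```

For line 1 and the body `aa\rb`, the returned text is "1 STRING aa", then a carriage return character, then "b". The program never builds "b STRING aa" itself.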

Will the token B necessarily appear after the token NUM?

No.
The input for part_a could be any sequence of lexemes, regardless of whether it forms a valid program.
e.g. both of the following inputs are acceptable:
---
int main () {}
---
and
---
b56"adsf"
---

The second one isn't a valid program, and yet it produces output with no error.
The constraint that B has to come after NUM will be enforced by the grammar of the language. You will handle it in the next homework assignments, and you don't need to consider it in your solution for HW1.

Regarding part B, what should be printed if we get a STRING with an error, e.g. a string with an unrecognized escape sequence?
In part B we don't check string contents. If a token is found that is not NUM or BINOP, you should print, for example, "Error: STRING\n"

Regarding part B, what should the precedence of the errors be? i.e. if the input contains multiple errors, which one should we print?

You should read the tokens one by one.
The current token can be legal (NUM or BINOP), an illegal token (STRING, for example), or an illegal character (@, for example).
When you read an invalid token or character, you should print the proper error message and exit the program.
If the input is valid with respect to the tokens but forms an illegal expression, you should print the error message for a bad expression.
Some examples:
(1)
---
1 2 int @
---
The first error found is the INT token.

(2)
---
1 2 @
---
The first error found is the @ character (even though the rest of the input is a bad expression).

(3)
---
1 2
---
There is no bad input with respect to tokens or characters, but it's a bad expression.
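The ordering in the three examples can be sketched as a single left-to-right pass. Everything named here (Kind, Outcome, firstError) is hypothetical, and the well-formedness check of the expression itself is deliberately left as a parameter, since this FAQ does not define it.

```cpp
#include <vector>

// Hypothetical categories for what the scanner returns in part B.
enum class Kind { Num, Binop, OtherToken, IllegalChar };

enum class Outcome { Ok, IllegalToken, IllegalChar, BadExpression };

// Reads the input items one by one and reports the FIRST problem found.
// expressionIsWellFormed stands in for the real expression check, which is
// only consulted when every item is NUM or BINOP.
Outcome firstError(const std::vector<Kind>& input, bool expressionIsWellFormed) {
    for (Kind k : input) {
        if (k == Kind::OtherToken) return Outcome::IllegalToken;  // e.g. INT, STRING
        if (k == Kind::IllegalChar) return Outcome::IllegalChar;  // e.g. @
    }
    return expressionIsWellFormed ? Outcome::Ok : Outcome::BadExpression;
}
```

Under these assumptions, example (1) stops at the INT token before the @ is ever read, example (2) stops at @, and example (3) reaches the end of the input and only then fails as a bad expression.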
