Tokenization

The analysis phase of the compiler aims to split the source code into small units called tokens. A token may be a word or a single symbol, such as = (the equals sign). It is the smallest unit of source code that carries meaning for the compiler. For example, the statement int a = 42; is divided into the tokens int, a, =, 42, and ;. The statement isn't simply split on spaces, because the following statement is split into the same tokens (though it is advisable to keep the spaces around operators):

int a=42;
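
To make this concrete, here is a minimal sketch of a tokenizer built on std::regex. It is only an illustration, not how a production compiler implements its lexer: the function name tokenize and the pattern are assumptions made for this example, and the pattern recognizes only identifiers or keywords, integer literals, and single-character symbols.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of any compiler): splits a source string
// into tokens by repeatedly searching for the next match of one pattern.
std::vector<std::string> tokenize(const std::string& source) {
    // Alternatives are tried left to right:
    //   [A-Za-z_]\w*  identifiers and keywords (int, a)
    //   \d+           integer literals (42)
    //   \S            any other single non-space character (=, ;)
    static const std::regex token_pattern(R"([A-Za-z_]\w*|\d+|\S)");

    std::vector<std::string> tokens;
    // Whitespace never matches the pattern, so the search skips over it.
    for (std::sregex_iterator it(source.begin(), source.end(), token_pattern), end;
         it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

With this sketch, tokenize("int a = 42;") and tokenize("int a=42;") return the same five tokens: int, a, =, 42, and ;. Whitespace only separates tokens where it is needed (between int and a); it is never a token itself.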

Splitting the source code into tokens is typically done with pattern-matching techniques based on regular expressions. This process is known as lexical analysis, or tokenization (dividing into tokens). For a compiler, tokenized input is a more convenient basis for constructing the internal data structures used to analyze the syntax of the code. Let's see how.