
Tokenization

The analysis phase of the compiler aims to split the source code into small units called tokens. A token may be a word or just a single symbol, such as = (the equals sign). It is the smallest unit of the source code that carries meaning for the compiler. For example, the declaration int a = 42; is divided into the tokens int, a, =, 42, and ;. The code isn't simply split on spaces, because the following declaration is split into the same tokens (though it is advisable to keep the spaces around operators for readability):

int a=42;
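To make this concrete, here is a minimal sketch of a tokenizer in C++. The function name tokenize and the toy pattern are assumptions made for illustration only; a real compiler's lexer is considerably more involved (comments, string literals, multi-character operators, and so on).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits a source string into tokens using a toy pattern.
// This is an illustrative sketch, not how a production lexer is implemented.
std::vector<std::string> tokenize(const std::string& source) {
    // Alternatives, tried in order at each position:
    //   [A-Za-z_]\w*  identifiers and keywords
    //   \d+           integer literals
    //   \S            any other single non-whitespace character
    static const std::regex pattern(R"([A-Za-z_]\w*|\d+|\S)");

    std::vector<std::string> tokens;
    // sregex_iterator walks over every non-overlapping match in the input;
    // whitespace between matches is skipped because no alternative matches it.
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

With this sketch, tokenize("int a = 42;") and tokenize("int a=42;") both produce the same five tokens: int, a, =, 42, and ;.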

Splitting the source code into tokens relies on pattern matching, with the token patterns commonly specified as regular expressions. This process is known as lexical analysis, or tokenization (dividing into tokens). For compilers, a tokenized input is a more convenient basis for constructing the internal data structures used to analyze the syntax of the code. Let's see how.
