  • Expert C++
  • Vardan Grigoryan, Shunguang Wu
  • 2021-06-24 16:33:54

Tokenization

The analysis phase of the compiler starts by splitting the source code into small units called tokens. A token may be a word or a single symbol, such as = (the equals sign). It is the smallest unit of the source code that carries meaning for the compiler. For example, the declaration int a = 42; is divided into the tokens int, a, =, 42, and ;. The code isn't simply split at spaces, because the following declaration is split into the same tokens (though it is still advisable to keep the spaces around operators):

int a=42;

The source code is split into tokens by matching it against patterns, which are typically described with regular expressions. This process is known as lexical analysis, or tokenization (dividing into tokens). For the compiler, tokenized input is a more convenient basis for constructing the internal data structures used to analyze the syntax of the code. Let's see how.
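Before moving on, the idea of regex-based tokenization can be made concrete with a minimal sketch. It is not how a real C++ compiler's lexer is written; the function name tokenize and the three token patterns (identifier or keyword, integer literal, any other single non-space symbol) are assumptions chosen only to handle the example above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the book): splits a piece of source text
// into lexemes using a deliberately simplified pattern.
std::vector<std::string> tokenize(const std::string& source) {
    // Alternatives are tried in order:
    //   [A-Za-z_]\w*  an identifier or keyword, such as int or a
    //   \d+           an integer literal, such as 42
    //   \S            any other single non-space character, such as = or ;
    static const std::regex token_pattern(R"([A-Za-z_]\w*|\d+|\S)");

    std::vector<std::string> tokens;
    // std::sregex_iterator walks over successive non-overlapping matches;
    // whitespace between matches is skipped because no alternative matches it.
    for (std::sregex_iterator it(source.begin(), source.end(), token_pattern), end;
         it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

Under these assumptions, calling tokenize("int a = 42;") and tokenize("int a=42;") yields the same five tokens, int, a, =, 42, and ;, which illustrates why whitespace alone does not determine token boundaries. A production lexer would also classify each token (keyword, identifier, literal, operator) and handle multi-character operators, comments, and string literals, all of which this sketch ignores.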
