官术网_书友最值得收藏!

Machine code generation

Compiler optimizations are done in both intermediate code and generated machine code. So what is it like when we compile the project? Earlier in the chapter, when we discussed the preprocessing of the source code, we looked at a simple structure containing several source files, including two headers, rect.h and square.h, each with its .cpp files, and main.cppwhich contained the program entry point (the main() function). After the preprocessing, the following units are left as input for the compiler: main.cpprect.cpp, and square.cppas depicted in the following diagram: 

The compiler will compile each separately. Compilation units, also known as source files, are independent of each other in some way. When the compiler compiles main.cpp, which has a call to the get_area() function in Rect, it does not include the get_area() implementation in main.cpp. Instead, it is just sure that the function is implemented somewhere in the project. When the compiler gets to rect.cpp, it does not know that the get_area() function is used somewhere. 

Here's what the compiler gets after main.cpp passes the preprocessing phase: 

// contents of the iostream 
struct Rect {
private:
double side1_;
double side2_;
public:
Rect(double s1, double s2);
const double get_area() const;
};

struct Square : Rect {
Square(double s);
};

int main() {
Rect r(3.1, 4.05);
std::cout << r.get_area() << std::endl;
return 0;
}

After analyzing main.cpp, the compiler generates the following intermediate code (many details are omitted to simply express the idea behind compilation):

struct Rect { 
double side1_;
double side2_;
};
void _Rect_init_(Rect* this, double s1, double s2);
double _Rect_get_area_(Rect* this);

struct Square {
Rect _subobject_;
};
void _Square_init_(Square* this, double s);

int main() {
Rect r;
_Rect_init_(&r, 3.1, 4.05);
printf("%d\n", _Rect_get_area(&r));
// we've intentionally replace cout with printf for brevity and
// supposing the compiler generates a C intermediate code
return 0;
}

The compiler will remove the Square struct with its constructor function (we named it _Square_init_) while optimizing the code because it was never used in the source code.  

At this point, the compiler operates with main.cpp only, so it sees that we called the _Rect_init_ and _Rect_get_area_ functions but did not provide their implementation in the same file. However, as we did provide their declarations beforehand, the compiler trusts us and believes that those functions are implemented in other compilation units. Based on this trust and the minimum information regarding the function signature (its return type, name, and the number and types of its parameters), the compiler generates an object file that contains the working code in main.cpp and somehow marks the functions that have no implementation but are trusted to be resolved later. The resolving is done by the linker.

In the following example, we have the simplified variant of the generated object file, which contains two sections—code and information. The code section has addresses for each instruction (the hexadecimal values):

code: 
0x00 main
0x01 Rect r;
0x02 _Rect_init_(&r, 3.1, 4.05);
0x03 printf("%d\n", _Rect_get_area(&r));
information:
main: 0x00
_Rect_init_: ????
printf: ????
_Rect_get_area_: ????

Take a look at the information section. The compiler marks all the functions used in the code section that were not found in the same compilation unit with ????. These question marks will be replaced by the actual addresses of the functions found in other units by the linker. Finishing with main.cpp, the compiler starts to compile the rect.cpp file: 

// file: rect.cpp 
struct Rect {
// #include "rect.h" replaced with the contents
// of the rect.h file in the preprocessing phase
// code omitted for brevity
};
Rect::Rect(double s1, double s2)
: side1_(s1), side2_(s2)
{}
const double Rect::get_area() const {
return side1_ * side2_;
}

Following the same logic here, the compilation of this unit produces the following output (don't forget, we're still providing abstract examples): 

code:  
0x00 _Rect_init_
0x01 side1_ = s1
0x02 side2_ = s2
0x03 return
0x04 _Rect_get_area_
0x05 register = side1_
0x06 reg_multiply side2_
0x07 return
information:
_Rect_init_: 0x00
_Rect_get_area_: 0x04

This output has all the addresses of the functions in it, so there is no need to wait for some functions to be resolved later. 

主站蜘蛛池模板: 五莲县| 微山县| 大埔县| 科技| 兴安盟| 大渡口区| 太保市| 遂川县| 岑巩县| 鄂托克前旗| 金堂县| 富蕴县| 建湖县| 汶上县| 蕲春县| 桃江县| 绥阳县| 青铜峡市| 新乐市| 故城县| 黎城县| 安康市| 都匀市| 阳东县| 文登市| 雅安市| 修水县| 伊金霍洛旗| 大化| 阿拉尔市| 贡觉县| 柯坪县| 武定县| 亚东县| 蒲江县| 普兰店市| 高台县| 吴川市| 金乡县| 静海县| 无极县|