
Record field alignment

The third compiler option I'd like to discuss regulates the alignment of fields in Delphi record and class types. It can be set to the following values: Off, Byte, Word, Double Word, and Quad Word. The names are a bit misleading, as the first two values (Off and Byte) actually result in the same behavior.

You can use the compiler directives {$ALIGN 1}, {$ALIGN 2}, {$ALIGN 4}, and {$ALIGN 8} to change record field alignment in code, or the equivalent short forms {$A1}, {$A2}, {$A4}, and {$A8}. There are also two directives which exist only for backward compatibility: {$A+} means the same as {$A8} (which is also the default for new programs) and {$A-} is the same as {$A1}.

Field alignment controls exactly how fields in records and classes are laid out in memory.

Let's say that we have the following record. And let's say that the address of the first field in the record is simply 0:

type
  TRecord = record
    Field1: byte;
    Field2: int64;
    Field3: word;
    Field4: double;
  end;

With the {$A1} alignment, each field simply follows the previous one. In other words, Field2 will start at address 1, Field3 at 9, and Field4 at 11. As the size of double is 8 (as we'll see later in this chapter), the total size of the record is 19 bytes.

The Pascal language has a syntax that enforces this behavior without the use of compiler directives. You can declare a record as a packed record and its fields will be packed together as with the {$A1} alignment, regardless of the current setting of this directive. This is very useful when you have to interface with libraries written in other languages.
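As a minimal sketch of the packed syntax, the following console program declares the record from above as packed while {$A8} is in effect and prints its size. The program name PackedDemo and the type name TPackedRecord are made up for this illustration.

```pascal
program PackedDemo;

{$APPTYPE CONSOLE}

{$A8} // quad word alignment is requested, but 'packed' overrides it

type
  // Hypothetical name; same fields as TRecord above.
  TPackedRecord = packed record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

begin
  // 1 + 8 + 2 + 8 bytes with no padding between the fields
  Writeln(SizeOf(TPackedRecord)); // prints 19
end.
```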

With the {$A2} alignment, each field will start on a word boundary. In layman's terms, the address of the field (offset from the start of the record) will be divisible by 2. Field2 will start at address 2, Field3 at 10, and Field4 at 12. The total size of the record will be 20 bytes.

With the {$A4} alignment, each field will start on a double word boundary so its address will be divisible by 4. (You can probably see where this is going.) Field2 will start at address 4, Field3 at 12, and Field4 at 16. The total size of the record will be 24 bytes.

Finally, with the {$A8} alignment, each field will start on a quad word boundary so its address will be divisible by 8. Field2 will start at address 8, Field3 at 16, and Field4 at 24. The total size of the record will be 32 bytes.
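The four layouts described above can be checked with a small console program. This is only a sketch: the program name AlignDemo, the type names TRecordA1 to TRecordA8, and the variable names are invented for the example. It declares the same record under each alignment setting and prints the field offsets and the total size.

```pascal
program AlignDemo;

{$APPTYPE CONSOLE}

type
  {$A1}
  TRecordA1 = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

  {$A2}
  TRecordA2 = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

  {$A4}
  TRecordA4 = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

  {$A8}
  TRecordA8 = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

var
  // Only the addresses of these variables are used, never their contents.
  r1: TRecordA1;
  r2: TRecordA2;
  r4: TRecordA4;
  r8: TRecordA8;

begin
  // A field offset is the field address minus the record address.
  Writeln('A1: ', NativeUInt(@r1.Field2) - NativeUInt(@r1), ' ',
    NativeUInt(@r1.Field3) - NativeUInt(@r1), ' ',
    NativeUInt(@r1.Field4) - NativeUInt(@r1), ' size ', SizeOf(r1));
  Writeln('A2: ', NativeUInt(@r2.Field2) - NativeUInt(@r2), ' ',
    NativeUInt(@r2.Field3) - NativeUInt(@r2), ' ',
    NativeUInt(@r2.Field4) - NativeUInt(@r2), ' size ', SizeOf(r2));
  Writeln('A4: ', NativeUInt(@r4.Field2) - NativeUInt(@r4), ' ',
    NativeUInt(@r4.Field3) - NativeUInt(@r4), ' ',
    NativeUInt(@r4.Field4) - NativeUInt(@r4), ' size ', SizeOf(r4));
  Writeln('A8: ', NativeUInt(@r8.Field2) - NativeUInt(@r8), ' ',
    NativeUInt(@r8.Field3) - NativeUInt(@r8), ' ',
    NativeUInt(@r8.Field4) - NativeUInt(@r8), ' size ', SizeOf(r8));
end.
```

If the compiler lays the records out as described in the text, each line should report the offsets of Field2, Field3, and Field4 followed by the sizes 19, 20, 24, and 32.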

Having said all that, I have to add that the $A directive doesn't function exactly as I described it. Delphi knows how simple data types should be aligned (for example, it knows that an integer should be aligned on a double word boundary) and will not move them to a higher alignment, even if it is explicitly specified by a directive. For example, the following record will use only 8 bytes even though we explicitly stated that fields should be quad word aligned:

{$A8}
type
  TIntegerPair = record
    a: integer;
    b: integer;
  end;
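A quick way to confirm this is to print the offset of the second field and the size of the record. The following sketch wraps the declaration in a console program; the program name IntegerPairDemo and the variable name pair are assumptions made for the example.

```pascal
program IntegerPairDemo;

{$APPTYPE CONSOLE}

{$A8} // quad word alignment requested

type
  TIntegerPair = record
    a: Integer;
    b: Integer;
  end;

var
  pair: TIntegerPair;

begin
  // An Integer needs only double word alignment, so b directly follows a.
  Writeln('Offset of b: ', NativeUInt(@pair.b) - NativeUInt(@pair)); // 4
  Writeln('SizeOf: ', SizeOf(TIntegerPair));                         // 8
end.
```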

If you need to specify the exact size and alignment of all fields (for example, if you pass records to some API call), it is best to declare a packed record and insert unused padding fields into the definition. The next example specifies a record containing two integers placed at quad word aligned offsets (0 and 8):

type
  TIntegerPair = packed record
    a: integer;
    filler: integer;
    b: integer;
  end;
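The effect of the filler field can be verified in the same way. In this sketch (the program name PaddedPairDemo and the variable name pair are invented), the offset of b and the record size follow directly from the packed layout: three four-byte fields with no gaps.

```pascal
program PaddedPairDemo;

{$APPTYPE CONSOLE}

type
  TIntegerPair = packed record
    a: Integer;      // offset 0
    filler: Integer; // offset 4, unused padding
    b: Integer;      // offset 8
  end;

var
  pair: TIntegerPair;

begin
  Writeln('Offset of b: ', NativeUInt(@pair.b) - NativeUInt(@pair)); // 8
  Writeln('SizeOf: ', SizeOf(TIntegerPair));                         // 12
end.
```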

The following image shows how the TRecord type from the beginning of this section is laid out in memory with different record field alignment settings. Fields are renamed F1 to F4 so that their names fit in the available space. X marks unused memory:

[Image: memory layout of TRecord for the A1, A2, A4, and A8 settings]

Why is all this useful? Why don't we always just pack fields together so that the total size of a record or class is as small as possible? Well, that is an excellent question!

As traditional wisdom says, CPUs work faster when the data is correctly aligned. Accessing four-byte data (an integer, for example) is faster if its address is double word aligned (divisible by four). Similarly, two-byte data (word) should be word aligned (address divisible by two) and eight-byte data (int64) should be quad word aligned (address divisible by eight). Doing so should significantly improve the performance of your program.
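"Aligned" here means nothing more than "the address is divisible by the size of the boundary". A minimal sketch of such a check follows; the helper IsAligned and the program name AlignCheck are hypothetical and not part of any Delphi library.

```pascal
program AlignCheck;

{$APPTYPE CONSOLE}

// Hypothetical helper: True when the address p is a multiple of boundary.
function IsAligned(p: Pointer; boundary: NativeUInt): Boolean;
begin
  Result := (NativeUInt(p) mod boundary) = 0;
end;

var
  value: Int64;

begin
  // Fixed addresses, used only to show the arithmetic.
  Writeln(IsAligned(Pointer(NativeUInt(16)), 8)); // TRUE: 16 is divisible by 8
  Writeln(IsAligned(Pointer(NativeUInt(12)), 8)); // FALSE: 12 is not
  Writeln(IsAligned(Pointer(NativeUInt(12)), 4)); // TRUE: 12 is divisible by 4

  // The result for a real variable depends on where it was placed in memory.
  Writeln(IsAligned(@value, 8));
end.
```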

Will it really? Does this traditional wisdom make any sense in the modern world?

The CompilerOptions demo contains sets of measurements done on differently aligned records. It is triggered with the Record field align button.

Running the test shows something surprising: all four tests (for A1, A2, A4, and A8) run at almost the same speed. Actually, the code operating on the best-aligned record (A8) is the slowest! I must admit that I didn't expect this while preparing the test.

A little detective work has shown that somewhere around the year 2010, Intel did a great job optimizing the data access on its CPUs. If you manage to find an older machine, it will show a big difference between unaligned and aligned data. However, Intel CPUs produced after that time process unaligned and aligned data at practically the same speed. Working on unaligned (packed) data may actually be faster, as more data will fit into the processor cache.

What is the moral lesson of all that? Guessing is nothing, hard numbers are all! Always measure. There is no guarantee that your changes will actually speed up the program.
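To run such a measurement yourself, something along the following lines could serve as a starting point. This is not the code from the CompilerOptions demo, only a rough sketch: the program name, the type names, the constants, and the loop body are all assumptions, and it relies on TStopwatch from the System.Diagnostics unit. The timings depend entirely on your machine and compiler settings.

```pascal
program AlignBench;

{$APPTYPE CONSOLE}

uses
  System.Diagnostics;

const
  CNumRecords = 1000000; // assumed array size
  CNumPasses = 100;      // assumed number of passes over the array

type
  {$A1}
  TUnalignedRec = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

  {$A8}
  TAlignedRec = record
    Field1: Byte;
    Field2: Int64;
    Field3: Word;
    Field4: Double;
  end;

procedure RunBenchmark;
var
  unalignedData: array of TUnalignedRec;
  alignedData: array of TAlignedRec;
  sw: TStopwatch;
  pass, i: Integer;
begin
  // SetLength zero-fills the new dynamic arrays.
  SetLength(unalignedData, CNumRecords);
  SetLength(alignedData, CNumRecords);

  // Time repeated updates of the Int64 field in the A1 layout.
  sw := TStopwatch.StartNew;
  for pass := 1 to CNumPasses do
    for i := 0 to High(unalignedData) do
      unalignedData[i].Field2 := unalignedData[i].Field2 + i;
  sw.Stop;
  Writeln('A1: ', sw.ElapsedMilliseconds, ' ms');

  // Same work on the A8 layout.
  sw := TStopwatch.StartNew;
  for pass := 1 to CNumPasses do
    for i := 0 to High(alignedData) do
      alignedData[i].Field2 := alignedData[i].Field2 + i;
  sw.Stop;
  Writeln('A8: ', sw.ElapsedMilliseconds, ' ms');
end;

begin
  RunBenchmark;
end.
```

Run the program several times and in a release build before drawing any conclusions; a single run can easily be skewed by whatever else the machine is doing.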
