Memory-related properties

Memory on the GPU has a hierarchical architecture. It can be divided into L1 cache, L2 cache, global memory, texture memory, and shared memory. The cudaDeviceProp structure provides many properties that help identify the memory available on the device. memoryClockRate and memoryBusWidth give the memory clock rate (in kilohertz) and the memory bus width (in bits), respectively. Memory speed matters because it affects the overall speed of your program. totalGlobalMem returns the size of the global memory available on the device, in bytes. totalConstMem returns the total constant memory available on the device. sharedMemPerBlock returns the total shared memory available to a single thread block. The number of 32-bit registers available per block is given by regsPerBlock. The size of the L2 cache, in bytes, is given by the l2CacheSize property. The following code snippet shows how to use the memory-related properties in a CUDA program:

// totalGlobalMem is a size_t byte count; 1048576 bytes = 1 MByte
printf(" Total amount of global memory: %.0f MBytes (%llu bytes)\n",
       (float)device_Property.totalGlobalMem / 1048576.0f,
       (unsigned long long)device_Property.totalGlobalMem);
// memoryClockRate is reported in kHz; scale by 1e-3 to print MHz
printf(" Memory Clock rate: %.0f MHz\n", device_Property.memoryClockRate * 1e-3f);
printf(" Memory Bus Width: %d-bit\n", device_Property.memoryBusWidth);
// Print the L2 cache size only if the device reports one
if (device_Property.l2CacheSize)
{
    printf(" L2 Cache Size: %d bytes\n", device_Property.l2CacheSize);
}
// totalConstMem and sharedMemPerBlock are size_t, so use %zu
printf(" Total amount of constant memory: %zu bytes\n", device_Property.totalConstMem);
printf(" Total amount of shared memory per block: %zu bytes\n", device_Property.sharedMemPerBlock);
printf(" Total number of registers available per block: %d\n", device_Property.regsPerBlock);
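
Because memory speed affects overall program speed, it can be useful to combine memoryClockRate and memoryBusWidth into a rough theoretical peak bandwidth figure. The following is a minimal sketch of that arithmetic, not a definitive measurement. The function names peakMemoryBandwidthGBs and printPeakMemoryBandwidth are hypothetical helpers introduced here for illustration, and the factor of 2 assumes double-data-rate memory, which may not hold for every device.

```cpp
#include <cstdio>

// Hypothetical helper: theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).
// memoryClockRateKHz: the value of cudaDeviceProp::memoryClockRate (in kHz)
// memoryBusWidthBits: the value of cudaDeviceProp::memoryBusWidth (in bits)
// Assumption: double-data-rate memory, hence the factor of 2.
double peakMemoryBandwidthGBs(int memoryClockRateKHz, int memoryBusWidthBits)
{
    const double clockHz = memoryClockRateKHz * 1e3;          // kHz -> Hz
    const double bytesPerTransfer = memoryBusWidthBits / 8.0; // bits -> bytes
    return 2.0 * clockHz * bytesPerTransfer / 1e9;            // bytes/s -> GB/s
}

// Hypothetical helper: prints the result in the same style as the snippet above.
void printPeakMemoryBandwidth(int memoryClockRateKHz, int memoryBusWidthBits)
{
    printf(" Theoretical peak memory bandwidth: %.2f GB/s\n",
           peakMemoryBandwidthGBs(memoryClockRateKHz, memoryBusWidthBits));
}
```

In a CUDA program, the call would be printPeakMemoryBandwidth(device_Property.memoryClockRate, device_Property.memoryBusWidth). As an illustration with made-up inputs, a memory clock of 5005000 kHz and a 256-bit bus give 2 x 5.005e9 x 32 / 1e9 = 320.32 GB/s. Real kernels usually achieve less than this theoretical figure.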