- Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA
- Bhaumik Vaidya
- 2021-08-13 15:48:23
Comparing latency between the CPU and the GPU code
The programs for CPU and GPU addition are written in a modular way, so you can experiment with the value of N. If N is small, you will not notice any significant time difference between the CPU and the GPU code. If N is sufficiently large, you will see a significant difference between the CPU execution time and the GPU execution time for the same vector addition. The time taken to execute a particular block of code can be measured by adding the following lines to the existing code:
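// Time the GPU vector addition with the host clock.
clock_t start_d = clock();
printf("Doing GPU Vector add\n");
gpuAdd<<<N, 1>>>(d_a, d_b, d_c);
// Kernel launches are asynchronous, so wait for the kernel to finish before stopping the clock.
// cudaDeviceSynchronize() replaces the deprecated cudaThreadSynchronize().
cudaDeviceSynchronize();
clock_t end_d = clock();
double time_d = (double)(end_d - start_d) / CLOCKS_PER_SEC;
// time_h is the host (CPU) time, measured the same way around the CPU addition.
printf("No of Elements in Array:%d \n Device time %f seconds \n host time %f Seconds\n", N, time_d, time_h);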
Time is measured by counting the clock ticks consumed by the operation: clock() is called before and after the operation, and the difference between the two tick counts is divided by CLOCKS_PER_SEC, the number of clock ticks per second, to get the execution time in seconds. When N is set to 10,000,000 in the previous CPU and GPU vector addition programs and both are run, the output is as follows:

As can be seen from the output, the execution time drops from 25 milliseconds to almost 1 millisecond when the same function is implemented on the GPU. This is consistent with what we saw in theory earlier: executing code in parallel on the GPU improves throughput. CUDA also provides a more efficient and accurate way to measure the performance of CUDA programs, CUDA events, which will be explained in later chapters.
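The timing snippet in this section prints time_h but does not show how it is obtained. The following is a minimal host-side sketch of the same clock() pattern, not the book's actual code: the names cpuAdd, elapsed_seconds, and time_host_add, the input values, and the heap allocation are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Assumed host-side counterpart of gpuAdd: h_c[i] = h_a[i] + h_b[i]. */
void cpuAdd(const int *h_a, const int *h_b, int *h_c, int n)
{
    for (int i = 0; i < n; i++) {
        h_c[i] = h_a[i] + h_b[i];
    }
}

/* Convert two clock() readings into elapsed seconds. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical helper: time cpuAdd on n elements.
   Returns the elapsed time in seconds, or -1.0 if allocation fails. */
double time_host_add(int n)
{
    int *h_a = malloc((size_t)n * sizeof *h_a);
    int *h_b = malloc((size_t)n * sizeof *h_b);
    int *h_c = malloc((size_t)n * sizeof *h_c);
    if (h_a == NULL || h_b == NULL || h_c == NULL) {
        free(h_a);
        free(h_b);
        free(h_c);
        return -1.0;
    }

    /* Illustrative input values (an assumption, not from the source). */
    for (int i = 0; i < n; i++) {
        h_a[i] = i;
        h_b[i] = 2 * i;
    }

    clock_t start_h = clock();
    printf("Doing CPU Vector add\n");
    cpuAdd(h_a, h_b, h_c, n);
    clock_t end_h = clock();
    double time_h = elapsed_seconds(start_h, end_h);

    /* Read one result so the compiler cannot discard the addition. */
    if (n > 0) {
        printf("Last element: %d\n", h_c[n - 1]);
    }

    free(h_a);
    free(h_b);
    free(h_c);
    return time_h;
}
```

Calling time_host_add(N) with the same N used for the GPU run would give a value comparable to time_h in the printf above. Note that clock() reports processor time used by the host program, so for small N the result may simply be zero.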