- Hands-On GPU Programming with Python and CUDA
- Dr. Brian Tuomanen
- 2021-06-10 19:25:34
Profiling your code
We saw in the previous example that we can time individual functions and components with the standard time function in Python. While this approach works fine for our small example program, it won't always be feasible for larger programs that call many different functions, only some of which may be worth our effort to parallelize, or even to optimize on the CPU. Our goal here is to find the bottlenecks and hotspots of a program. Even if we were feeling energetic and wrapped time around every function call we make, we might miss something, or there might be system or library calls that we don't even consider that happen to be slowing things down. We should find candidate portions of the code to offload onto the GPU before we even think about rewriting the code to run on the GPU; we must always follow the wise words of the famous American computer scientist Donald Knuth: "Premature optimization is the root of all evil."
We use what is known as a profiler to find these hotspots and bottlenecks in our code. A profiler shows us where our program spends the most time, so that we can optimize accordingly.
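As a rough illustration of what a profiler reports, here is a minimal sketch that assumes Python's standard cProfile and pstats modules. The functions slow_step, fast_step, main, and profile_main are hypothetical stand-ins invented for this example, not code from the previous section.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    # Hypothetical hotspot: the sleep stands in for expensive work.
    time.sleep(0.2)


def fast_step():
    # Hypothetical cheap function, included for contrast.
    return sum(range(1000))


def main():
    fast_step()
    slow_step()


def profile_main():
    # Collect timing data for every call made while main() runs.
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()

    # Format the report into a string, sorted by cumulative time,
    # so the most expensive call paths appear first.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_main())
```

In the printed report, slow_step and the underlying sleep call should dominate the cumulative time column, which marks them as the candidates worth examining first. A whole script can also be profiled without modifying it by running it as python -m cProfile -s cumtime followed by the script name.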