- Mastering C# Concurrency
- Eugene Agafonov Andrew Koryavchenko
- 2021-07-09 21:26:06
Optimization strategy
Creating parallel algorithms is not a simple task, and there is no universal solution. Each case calls for its own approach to writing effective code. However, there are several simple rules that work for most parallel programs.
Lock localization
The first thing to take into account when writing parallel code is to lock as little code as possible and to ensure that the code inside the lock runs as fast as possible. This makes the code less deadlock-prone and helps it scale better with the number of CPU cores. To sum up, acquire the lock as late as possible and release it as soon as possible.
Consider the following situation: we have a calculation performed by the Calc method, which has no side effects. We would like to call it with several different arguments and store the results in a list. A first attempt might look like this:
for (var i = from; i < from + count; i++)
    lock (_result)
        _result.Add(Calc(i));
This code works, but it calls the Calc method and performs the calculation inside the lock. The calculation has no side effects and thus requires no locking, so it is much more efficient to rewrite the code as follows:
for (var i = from; i < from + count; i++)
{
    var calc = Calc(i);
    lock (_result)
        _result.Add(calc);
}
If the calculation takes a significant amount of time, then this improvement could make the code run several times faster.
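The same rule can be applied when reading shared data. The following is a minimal sketch, not taken from the original text: the `LockLocalizationSketch` class, the `_shared` list, and the `SumOfSquares` method are hypothetical names chosen for illustration. Only the copy of the shared list happens inside the lock, while the calculation runs on the private copy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LockLocalizationSketch
{
    // Hypothetical shared list that other threads may also modify.
    private static readonly List<int> _shared = new List<int> { 1, 2, 3, 4 };

    // Hypothetical stand-in for an expensive calculation without side effects.
    private static long SumOfSquares(IEnumerable<int> items)
    {
        return items.Sum(x => (long)x * x);
    }

    public static long Compute()
    {
        int[] snapshot;
        lock (_shared)
        {
            // Only taking the copy requires the lock.
            snapshot = _shared.ToArray();
        }

        // The calculation works on the private copy, outside the lock.
        return SumOfSquares(snapshot);
    }

    static void Main()
    {
        Console.WriteLine(Compute());
    }
}
```

The trade-off is an extra copy of the data, which is usually worth it only when the calculation is noticeably more expensive than the copy.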
Shared data minimization
Another way to improve parallel code performance is to minimize the shared data that is written in parallel. A common mistake is to lock the whole collection on every write instead of thinking about how to reduce the number of locks and the amount of data being locked. Organizing concurrent access and data storage so that fewer locks are needed can lead to a significant performance increase.
In the previous example, we locked the entire collection on every write. However, we do not really care which worker thread processes which piece of data, so each thread can collect its results in a private list and add them to the shared collection under a single lock:
var tempRes = new List<string>(count);
for (var i = from; i < from + count; i++)
{
    var calc = Calc(i);
    tempRes.Add(calc);
}
lock (_result)
    _result.AddRange(tempRes);
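The private-list idea can also be expressed with the thread-local overload of `Parallel.For` from the Task Parallel Library. This is a swapped-in variation rather than the code used in the text: the `ThreadLocalSketch` class and its `Run` method are hypothetical names, and `Calc` here is only a trivial stand-in for the real calculation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ThreadLocalSketch
{
    // Trivial stand-in for the side-effect-free Calc method from the text.
    private static string Calc(int prm)
    {
        return prm.ToString();
    }

    public static List<string> Run(int from, int count)
    {
        var result = new List<string>(count);

        Parallel.For<List<string>>(
            from, from + count,
            // localInit: every worker task gets its own private list.
            () => new List<string>(),
            // body: no lock is needed because the list is not shared.
            (i, state, local) =>
            {
                local.Add(Calc(i));
                return local;
            },
            // localFinally: merge once per worker task, under the lock.
            local =>
            {
                lock (result)
                    result.AddRange(local);
            });

        return result;
    }

    static void Main()
    {
        var items = Run(0, 1000);
        Console.WriteLine(items.Count);
    }
}
```

As with the manual version, the order of items in the resulting list depends on how the work is scheduled, so this approach fits only when the order does not matter.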
The following is the complete comparison:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

static class Program
{
    private const int _count = 1000000;
    private const int _threadCount = 8;

    // Shared collection that all worker threads write to.
    private static readonly List<string> _result = new List<string>();

    // Simulates a calculation without side effects.
    private static string Calc(int prm)
    {
        Thread.SpinWait(100);
        return prm.ToString();
    }

    // The calculation runs inside the lock.
    private static void SimpleLock(int from, int count)
    {
        for (var i = from; i < from + count; i++)
            lock (_result)
                _result.Add(Calc(i));
    }

    // The calculation runs outside the lock; only Add is locked.
    private static void MinimizedLock(int from, int count)
    {
        for (var i = from; i < from + count; i++)
        {
            var calc = Calc(i);
            lock (_result)
                _result.Add(calc);
        }
    }

    // Results are collected in a private list and merged under one lock.
    private static void MinimizedSharedData(int from, int count)
    {
        var tempRes = new List<string>(count);
        for (var i = from; i < from + count; i++)
        {
            var calc = Calc(i);
            tempRes.Add(calc);
        }
        lock (_result)
            _result.AddRange(tempRes);
    }

    // Runs the action on _threadCount threads and returns the elapsed time.
    private static long Measure(Func<int, ThreadStart> actionCreator)
    {
        _result.Clear();
        var threads = Enumerable
            .Range(0, _threadCount)
            .Select(n => new Thread(actionCreator(n)))
            .ToArray();
        var sw = Stopwatch.StartNew();
        foreach (var thread in threads)
            thread.Start();
        foreach (var thread in threads)
            thread.Join();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Warm up
        SimpleLock(1, 1);
        MinimizedLock(1, 1);
        MinimizedSharedData(1, 1);

        const int part = _count / _threadCount;

        var time = Measure(n => () => SimpleLock(n * part, part));
        Console.WriteLine("Simple lock: {0}ms", time);

        time = Measure(n => () => MinimizedLock(n * part, part));
        Console.WriteLine("Minimized lock: {0}ms", time);

        time = Measure(n => () => MinimizedSharedData(n * part, part));
        Console.WriteLine("Minimized shared data: {0}ms", time);
    }
}
Executing this code on a Core i7 2600K with an x64 OS in the Release configuration gives the following results:
Simple lock: 806ms
Minimized lock: 321ms
Minimized shared data: 165ms
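To put these timings in perspective, the speedup relative to the simple lock can be computed directly. The following is a small sketch that uses only the three numbers reported above; the `SpeedupSketch` class and its `Speedup` method are hypothetical helper names. On these figures, minimizing the lock is roughly 2.5 times faster than the simple lock, and minimizing the shared data is roughly 4.9 times faster.

```csharp
using System;

static class SpeedupSketch
{
    // Ratio of the baseline time to the improved time.
    public static double Speedup(long baselineMs, long improvedMs)
    {
        return (double)baselineMs / improvedMs;
    }

    static void Main()
    {
        // Timings reported in the text, in milliseconds.
        const long simpleLock = 806;
        const long minimizedLock = 321;
        const long minimizedSharedData = 165;

        Console.WriteLine("Minimized lock: {0:F2}x",
            Speedup(simpleLock, minimizedLock));
        Console.WriteLine("Minimized shared data: {0:F2}x",
            Speedup(simpleLock, minimizedSharedData));
    }
}
```

These ratios describe one machine and one workload only, so other hardware or a different cost for Calc will give different numbers.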