I will soon need to train my teammates on this pattern and have been working on a series of short articles to get them started.
The purpose of this first article is to introduce the basic concepts of asynchronous programming in C# with lots of visual content, which is probably what is missing the most in the articles found online. My intention here is to simplify the theory and focus only on the essence of asynchronous programming.
The theory
Asynchronous programming is meant to capture parts of your code into schedulable blocks called Task objects. These blocks are executed in the background by a set of shared threads called the ThreadPool.
Let's take a look at how a classic single-threaded program would execute on a system.
A program (or code) is made of functions calling other functions. In a single-threaded application, this code will be executed sequentially by the same Thread.
Now if we want to leverage TAP with the same code, here is what the code would look like.
The functions of my code are now captured in Task objects that are scheduled for execution. This time, it is not necessarily the same Thread that will execute each block; it will depend on thread availability in the ThreadPool. However, the code will be executed in the exact same order.
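As a minimal sketch of the idea above, the following console program chains three awaited steps and records which thread ran each one. The names `SequentialAsyncDemo` and `StepAsync` are hypothetical, and `Task.Yield()` is only used here to force each step to continue as a scheduled block. It assumes a console application, where continuations are picked up by ThreadPool threads.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAsyncDemo
{
    // Hypothetical step standing in for one function of the program.
    static async Task<int> StepAsync(int input, List<string> log)
    {
        // Yield so that the rest of the method is scheduled as a continuation.
        await Task.Yield();
        log.Add($"step {input} on thread {Environment.CurrentManagedThreadId}");
        return input + 1;
    }

    // Each step is awaited before the next one starts, so the order is preserved.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        int a = await StepAsync(1, log);
        int b = await StepAsync(a, log);
        await StepAsync(b, log);
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync())
        {
            Console.WriteLine(line);
        }
    }
}
```

The thread ids printed may or may not differ from one step to the next, but the steps always appear in the order 1, 2, 3.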
So why is it different?
If f(x) executes blocking operations (e.g. I/O read), the Thread will remain blocked until the operation completes. That thread will be unavailable for other operations during that time.
If T executes blocking operations, the execution of the Task will be suspended until the blocking operation completes. During that time, the Thread will be released and free to execute other Tasks if necessary.
So functionally, both versions are equivalent, but in terms of system resource consumption, they do not work identically.
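To make the comparison concrete, here is a small sketch of the same file read written both ways. The class and method names are hypothetical, and it assumes .NET Core 2.0 or later, where `File.ReadAllTextAsync` is available. Both methods return the same value; only the way the thread is used while waiting differs.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class BlockingVsAsyncDemo
{
    // Synchronous version: the calling thread waits inside ReadAllText
    // until the read has finished.
    public static int CountCharsBlocking(string path)
    {
        string text = File.ReadAllText(path);
        return text.Length;
    }

    // Asynchronous version: if the read has not completed at the await,
    // the method is suspended and the thread goes back to its caller.
    public static async Task<int> CountCharsAsync(string path)
    {
        string text = await File.ReadAllTextAsync(path);
        return text.Length;
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            await File.WriteAllTextAsync(path, "hello async");
            Console.WriteLine(CountCharsBlocking(path));    // 11
            Console.WriteLine(await CountCharsAsync(path)); // 11
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```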
How is this possible?
A Task is a stateful object (see TaskStatus). The compiler turns every async method into a state machine; whenever the code hits an await on an operation that has not completed yet, the state machine records where it stopped so that the method can be resumed later.
When awaiting an asynchronous sub-function that has not completed yet, my Task will stay in the WaitingForActivation state and will be put aside. The system can detect when an I/O operation completes and will resume the execution where it left off by restoring the execution context.
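The state of a Task can be observed directly through its `Status` property. The sketch below uses a hypothetical `SimulatedIoAsync` method, with `Task.Delay` standing in for a real I/O operation, and reads the status once while the operation is pending and once after it has been awaited.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskStatusDemo
{
    // Hypothetical I/O-like operation; Task.Delay stands in for the real I/O.
    static async Task<string> SimulatedIoAsync()
    {
        await Task.Delay(200);
        return "done";
    }

    public static async Task<(TaskStatus before, TaskStatus after)> ObserveAsync()
    {
        // The call returns as soon as the method reaches its first pending await.
        Task<string> task = SimulatedIoAsync();
        TaskStatus before = task.Status; // WaitingForActivation

        await task;
        TaskStatus after = task.Status;  // RanToCompletion

        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await ObserveAsync();
        Console.WriteLine($"{before} -> {after}");
    }
}
```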
Pros and Cons
- For a single call, code that is executed synchronously will usually perform better than its asynchronous version. As previously explained, the execution of asynchronous code requires a state machine and is dependent on thread availability.
- Using TAP makes my system more scalable than its synchronous version. The resources of my system are only used when necessary, which allows me to support a higher workload.
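The scalability argument can be illustrated with a rough sketch in which many simulated requests wait at the same time. `SimulatedRequestAsync` is a hypothetical stand-in for an I/O-bound call, and the 500 ms delay and the request count are arbitrary. Since no thread is held while a request is waiting, the total time should stay close to the duration of one request rather than growing with the number of requests, although the exact figure depends on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ScalabilityDemo
{
    // Hypothetical I/O-bound request; no thread is held during the delay.
    static async Task<int> SimulatedRequestAsync(int id)
    {
        await Task.Delay(500);
        return id;
    }

    public static async Task<(int completed, TimeSpan elapsed)> RunAsync(int count)
    {
        var stopwatch = Stopwatch.StartNew();
        var tasks = new List<Task<int>>();
        for (int i = 0; i < count; i++)
        {
            // Start every request without waiting for the previous one.
            tasks.Add(SimulatedRequestAsync(i));
        }

        int[] results = await Task.WhenAll(tasks);
        return (results.Length, stopwatch.Elapsed);
    }

    public static async Task Main()
    {
        var (completed, elapsed) = await RunAsync(1000);
        Console.WriteLine($"{completed} requests in {elapsed.TotalMilliseconds:F0} ms");
    }
}
```

A synchronous version using `Thread.Sleep(500)` in a loop would need roughly 500 ms per request on a single thread.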
Asynchronous vs Parallel
A common mistake is to confuse these two concepts. The purpose of asynchronous programming is not to offer a simplified framework for parallel processing. Most of the time, you should not even use Task.Run or Task.Factory.StartNew, and I believe that is what creates the confusion. TAP is not a multitasking framework, it is a "promise for execution" framework.
With that said, TAP provides a few interesting methods if you want to run several Task objects concurrently and wait for them: Task.WhenAll and Task.WhenAny.
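Here is a short sketch of both methods, using a hypothetical `FetchAsync` helper whose delays are arbitrary. `Task.WhenAny` completes as soon as one of the tasks finishes and hands back that task, while `Task.WhenAll` completes once every task has finished and returns the results in the order the tasks were passed in.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllWhenAnyDemo
{
    // Hypothetical operation that completes after the given delay.
    static async Task<string> FetchAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);
        return name;
    }

    public static async Task<(string first, string[] all)> RunAsync()
    {
        // Both operations are started before either one is awaited.
        Task<string> slow = FetchAsync("slow", 300);
        Task<string> fast = FetchAsync("fast", 50);

        // WhenAny returns the task that finished first.
        Task<string> firstDone = await Task.WhenAny(slow, fast);
        string first = await firstDone;

        // WhenAll waits for both; results follow the argument order.
        string[] all = await Task.WhenAll(slow, fast);
        return (first, all);
    }

    public static async Task Main()
    {
        var (first, all) = await RunAsync();
        Console.WriteLine($"first: {first}, all: {string.Join(", ", all)}");
    }
}
```

With these delays the first task to finish is expected to be the "fast" one, but that depends on timing rather than on any ordering guarantee.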