As developers, we all know that concurrency and parallelism matter. Without concurrency, expensive operations would block the user interface and hurt the overall user experience. But often, questions are raised:
What's the difference between asynchronous execution and multithreading? What exactly are threads? Is the number of threads limited by the number of cores? Concurrency vs. parallelism?
This article elaborates on these points, and seeks to clarify some common misunderstandings.
We start with the word "concurrent":
Concurrent: having multiple things occur at the same time.
Note that by this definition, we don't actually care how things occur simultaneously. The mere fact that there are multiple things happening at the same time makes a process qualify as being "concurrent".
We will soon come back to "how things are done at the same time". But for now, just keep in mind that as long as there are multiple things happening at the same time, we call the situation "concurrent".
In contrast, the word "consecutive" is one we are already familiar with, both in terms of definition and implementation:
Consecutive: having things occur one after another.
In terms of implementation, this form of execution is called "synchronous execution".
The hallmark of synchronous execution is that different operations have absolutely no overlap between their execution timespans. No operation starts before the previous operation has ended.
No matter how long operation 1 takes, operation 2 will always start only after operation 1 finishes.
The Story of a Server on Mars

Let's say we have a server on Mars. As a client on Earth, I want to update 2 records on the server, record a and record b, with:
    a = a + 1          // takes about 8 minutes
    makeSentSound()    // takes about 10 seconds
    b = b - 1          // takes about 8 minutes
    makeSentSound()    // takes about 10 seconds
With synchronous operation, line 2 would execute only after line 1 has finished executing, which is 8 minutes later; then line 4 would also have to wait 8 minutes for line 3 to finish executing. It's simple... but how ridiculous!
Obviously, the simplest optimization one could think of is firing b = b - 1 right after a = a + 1. However, this means starting the execution of line 3 before line 1 has finished. This is the core idea behind asynchronous execution: one operation can start before the completion of another.
How can a computer do this, one may ask?
If we look more closely at the synchronous case of the "Server on Mars" story, the computer is not doing anything while it's waiting for the completion of a = a + 1. It isn't responsible for transmitting the signals through space, nor is it responsible for updating the record. (That's the server's job.)
So asynchronous execution takes advantage of the fact that the computer can still do other work while it would otherwise sit idle, waiting for work that is completed elsewhere.
Therefore, as an obvious optimization to the above case: line 2 can be executed while line 1 is waiting for completion. Line 3 can start executing right after line 2 completes, and line 4 can execute while both line 1 and line 3 are waiting for completion.
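The optimization described above can be sketched in Python with asyncio. This is only an illustration under stated assumptions: `update_record` and `make_sent_sound` are hypothetical stand-ins for the four lines of the Mars example, and the 8-minute round trip is scaled down to a fraction of a second.

```python
import asyncio
import time

LATENCY = 0.2  # assumption: scaled-down stand-in for the ~8-minute round trip

records = {"a": 0, "b": 0}  # plays the role of the records on the server
log = []                    # order in which things finish on the client


async def update_record(name, delta):
    """Hypothetical remote update: the client only waits, the server does the work."""
    await asyncio.sleep(LATENCY)  # the client is idle during this wait
    records[name] += delta
    log.append(f"{name} updated")


def make_sent_sound():
    """Hypothetical local operation that finishes right away."""
    log.append("sound")


async def main():
    # Line 1: schedule the update, but do not wait for it to complete.
    task_a = asyncio.create_task(update_record("a", +1))
    make_sent_sound()  # line 2 does not wait for line 1
    # Line 3: schedule the second update before the first has completed.
    task_b = asyncio.create_task(update_record("b", -1))
    make_sent_sound()  # line 4 does not wait for lines 1 and 3
    # Only now wait for both updates; their waiting periods overlap.
    await asyncio.gather(task_a, task_b)


start = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - start

print(log)
print(f"elapsed: {elapsed:.2f}s for two updates of {LATENCY}s each")
```

Because the two waits overlap, the total time is roughly one round trip rather than two. Awaiting each update before moving to the next line would reproduce the synchronous behaviour.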
Note that because different operations may occur simultaneously, asynchronous execution is considered a form of concurrency.
The above-mentioned optimization is made possible by the fact that the computer is otherwise idle during the 8 minutes of waiting. What if it's not? What if it is indeed required to perform a lengthy operation itself? Would the optimization still be effective?
Cooking and serving a beef stew involves seasoning the beef for 1 minute (hands on), letting the beef rest for 15 minutes (hands off), preparing vegetables for 10 minutes (hands on), boiling the water for 5 minutes (hands off), throwing everything inside (hands on), letting everything cook for 20 minutes (hands off), and setting up the table for 10 minutes (hands on).
A novice would do it synchronously, so the total time is the sum of every step, hands-on and hands-off alike. A seasoned cook would do it asynchronously, tucking the hands-on time into the hands-off time. This is a great optimization that cuts the original timespan almost in half.
However, suppose the hands-off time now becomes hands-on time: e.g. he needs to massage the meat while it rests, and he needs to stir the water and the soup all the time. Then the asynchronous approach takes just as long as the synchronous approach. Since the cook must fully devote himself to every step, there is simply no room for optimization.
What he could do instead is hire a helper. After all, massaging the meat or setting up the table does not require much expertise. In this way, while the helper is massaging the meat, the cook can go on to prepare the vegetables, and the completion time drops again.
This is the idea of multithreading: having different workers (threads) work on different tasks simultaneously.
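The four scenarios of the stew story can be worked through as plain arithmetic. The sketch below uses the step durations from the recipe. The schedule itself is an assumption made for illustration: the toss-in step is counted as 0 minutes since the text gives no duration for it, and the helper is assumed to massage the meat and set the table.

```python
# Step durations in minutes, taken from the recipe above.
SEASON, REST, VEG, BOIL, STEW, TABLE = 1, 15, 10, 5, 20, 10
TOSS = 0  # assumption: no duration is given for tossing everything in

# Synchronous: every step waits for the previous one to finish.
synchronous = SEASON + REST + VEG + BOIL + TOSS + STEW + TABLE

# Asynchronous, one cook: unattended steps overlap with hands-on work.
# While the beef rests, the cook preps vegetables and the water boils;
# while the stew cooks, the cook sets the table.
asynchronous = SEASON + max(REST, VEG, BOIL) + TOSS + max(STEW, TABLE)

# Every step hands-on, one cook: nothing can overlap any more.
all_hands_on = SEASON + REST + VEG + BOIL + TOSS + STEW + TABLE

# Every step hands-on, cook plus helper: two workers run side by side.
# The helper massages the meat while the cook preps vegetables and then
# stirs the water; the helper sets the table while the cook stirs the stew.
with_helper = SEASON + max(REST, VEG + BOIL) + TOSS + max(STEW, TABLE)

print("synchronous:           ", synchronous)
print("asynchronous, one cook:", asynchronous)
print("all hands-on, one cook:", all_hands_on)
print("all hands-on, + helper:", with_helper)
```

Under these assumptions, overlapping the waiting time shortens the job only while some steps are hands-off. Once every step needs a worker, the time drops again only when a second worker is added.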
Hold on, one may say, but doesn't "sync" mean "together" and "async" mean "separate"? Shouldn't the nomenclature be the other way round?
No, we are not mistaken. In this case, "sync" means "one by one", and "async" means "at the same time". Why? Because we are not syncing operations with other operations; we are syncing different operations with the same clock.
If two operations share the same clock, each is aware of the time the other occupies, so one has to wait for the other to finish.
If two operations are not in sync with the same clock (hence async), they can execute on their own time, without regard to the execution time of the other.
From a user's point of view, we first have to understand processes. Processes are the software and services running on an operating system. What we call "apps" are, in computer terms, processes. Services like the Location Service and the App Push Notification Service are also processes. Desktop operating systems like Windows and macOS support multiple processes for the same application (like multiple windows of Chrome), while iOS only supports one process per app, and processes are further divided into foreground processes and background processes.
Now, threads belong to processes. Each process has at least 1 thread (the main thread). "Thread" is in fact a visual metaphor; the real way to understand it is that it's just a sequence of executable operations. That is, each process may have a number of sequences of operations.
The real significance of having different sequences of operations (threads) is in context switching. Modern operating systems are able to switch between threads. The switching happens so fast and frequently, that to the eye of users, it appears that the computer can do 1000 things at the same time, even though deep down it is only doing 3 or 4 at the same time.
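Nowadays, CPUs have more than 1 core. With hyperthreading technology (simultaneous multithreading), each physical core keeps two or more sets of execution state, so it presents itself to the operating system as 2 or more cores. It does not do this by raising its clock speed; the "extra" cores share the execution units of the same physical core. These "fake cores" that the operating system is aware of are called hardware threads or logical cores. They are the computing resources that the operating system distributes to software threads. Note that software threads are therefore very different from hardware threads: a process can have far more software threads than the CPU has cores, and the operating system takes turns scheduling them.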
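To make the difference between software threads and cores concrete, here is a minimal Python sketch that starts many more software threads than the machine has logical cores. The factor of 8 and the `worker` function are arbitrary choices for illustration.

```python
import os
import threading

cores = os.cpu_count() or 1  # logical cores visible to the operating system
n_threads = cores * 8        # deliberately more software threads than cores

finished = []
lock = threading.Lock()


def worker(index):
    """A short sequence of operations, which is all a thread really is."""
    sum(range(10_000))       # a little work to keep the thread busy
    with lock:               # protect the shared list from concurrent appends
        finished.append(index)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()                 # wait until every thread has run to completion

print(f"{cores} logical cores ran {len(finished)} software threads")
```

All threads complete even though there are far fewer cores, because the operating system switches between them. Note that in standard CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so this demonstrates scheduling rather than a parallel speed-up.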
From a frontend/mobile engineer's perspective, the thing we care about most is the "60 fps frame rate" of the UI. Since all UI updates must be performed on the main thread, any blocking or waiting on it is unforgivable for the user experience.
Understanding concurrency allows us to move expensive operations from the main thread to background threads as much as possible. This is why operations such as networking calls, file uploads and downloads, and image decoding are typically done on background threads.
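The idea of keeping the main thread free can be sketched in Python as follows. This is not how a UI framework is actually driven: the loop merely stands in for a main thread drawing frames, and `download_image` is a hypothetical expensive operation whose network latency is simulated with a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor

FRAME = 1 / 60  # seconds available per frame at 60 fps


def download_image():
    """Hypothetical expensive operation; the sleep stands in for network latency."""
    time.sleep(0.2)
    return "image-bytes"


frames_rendered = 0
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(download_image)  # runs on a background thread
    while not future.done():              # the "main thread" keeps drawing frames
        frames_rendered += 1
        time.sleep(FRAME)
    result = future.result()              # collect the result once it is ready

print(f"rendered {frames_rendered} frames while waiting for {result!r}")
```

Calling `download_image()` directly in the loop would instead freeze frame rendering for the full duration of the download.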
An even more interesting topic is having UI elements rendered on background threads. Frameworks like Texture (AsyncDisplayKit) have successfully done so and have thus gained much traction among scrolling/rendering-intensive apps.
In future posts, we will surely discuss more concurrency-related topics. I hope this post sets up a solid conceptual ground for further implementation details.