## Better CPU Utilization
Imagine an application that reads and processes files from the local file system. Reading a file takes 5 seconds and processing it takes 2 seconds. Reading and processing two files then takes:

    5 seconds reading file A
    2 seconds processing file A
    5 seconds reading file B
    2 seconds processing file B
    -----------------------
    14 seconds total
When reading a file from disk, most of the CPU time is spent waiting for the disk to deliver the data. During that time the CPU is pretty much idle instead of doing something else. By changing the order of the operations, the CPU can be utilized better:
    5 seconds reading file A
    5 seconds reading file B + 2 seconds processing file A
    2 seconds processing file B
    -----------------------
    12 seconds total
The CPU waits for the first file to be read. Then, while the second file is being read by the computer's IO subsystem, the CPU processes the first file.
In general, the CPU can be doing other things while waiting for IO. It does not have to be disk IO; it can be network IO as well, or input from a user at the machine. Network and disk IO are often a lot slower than CPU and memory access.
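In code, this reordering amounts to starting the read of file B on a second thread while the current thread processes file A. Below is a minimal Java sketch of the idea, assuming hypothetical file names a.txt and b.txt and a placeholder process() step:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OverlappedRead {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Read file A; the CPU mostly waits for the disk here.
        byte[] fileA = Files.readAllBytes(Path.of("a.txt"));

        // Start reading file B on a second thread ...
        Future<byte[]> fileB = executor.submit(
            () -> Files.readAllBytes(Path.of("b.txt")));

        // ... and process file A on this thread while that read is in progress.
        process(fileA);

        // Wait for the read of file B to complete, then process it too.
        process(fileB.get());

        executor.shutdown();
    }

    // Placeholder for the 2-second processing step.
    private static void process(byte[] data) { /* ... */ }
}
```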
## Simpler Program Design
Programming the above ordering of reads and processing by hand in a single-threaded application would require keeping track of both the reading and processing state of each file. Instead, multithreading can be used so that each thread only has to worry about reading and processing a single file, as sketched below. Each thread blocks while waiting for the disk to read its file. While it waits, other threads can use the CPU to process the parts of their files that have already been read. As a result, the disk is kept busy at all times, reading from various files into memory, which utilizes both the disk and the CPU better. It is also simpler to program, since each thread only has to keep track of a single file.
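Here is a minimal Java sketch of that design, again with hypothetical file names and a placeholder process() step. Each thread runs the same straightforward read-then-process sequence and keeps no state beyond its own file:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class OneThreadPerFile {
    public static void main(String[] args) throws InterruptedException {
        // One thread per file; each thread only tracks its own file.
        Thread a = new Thread(() -> readAndProcess("a.txt"));
        Thread b = new Thread(() -> readAndProcess("b.txt"));
        a.start();
        b.start();
        a.join();
        b.join();
    }

    // All per-file state lives in this one method: read, then process.
    private static void readAndProcess(String name) {
        try {
            byte[] data = Files.readAllBytes(Path.of(name));
            process(data);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Placeholder for the processing step.
    private static void process(byte[] data) { /* ... */ }
}
```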
## More Responsive Programs
Multithreaded applications can achieve better responsiveness. Imagine a server application that listens on some port for incoming requests. When a request is received, it handles the request and then goes back to listening. If the request takes a long time to process, no new clients can send requests to the server for that duration. Requests can only be received while the server is listening:
    while(server is active){
        listen for request
        process request
    }
An alternative design is for the listening thread to pass the request to a worker thread and return to listening immediately, as shown below. This way the server thread is back listening sooner, so more clients can send requests to the server, making the server more responsive.
    while(server is active){
        listen for request
        hand request to worker thread
    }
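Here is a minimal Java sketch of this design using a ServerSocket and a thread pool as the workers. The port number, the pool size and the handle() method are assumptions made for illustration:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadedServer {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(10);
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();      // listen for request
                workers.submit(() -> handle(client)); // hand to worker thread
            }
        }
    }

    // Runs on a worker thread; the listening thread is already back in accept().
    private static void handle(Socket client) {
        try (client) {
            // Read the request and write a response here.
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```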
The same is true for desktop applications. If clicking a button starts a long-running task, and a single thread handles both the GUI updates and the task processing, the GUI can become unresponsive while the task completes. Instead, the task can be handed off to a worker thread, as in the sketch below.
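Here is a minimal Swing sketch of that hand-off, with a placeholder sleep standing in for the long task. The button click starts the task on its own thread, so the event dispatch thread stays free to keep the GUI responsive:

```java
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.SwingUtilities;

public class ResponsiveGui {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Demo");
            JButton button = new JButton("Start long task");
            button.addActionListener(event -> {
                // Run the long task on a worker thread so the event
                // dispatch thread stays free to repaint the GUI.
                new Thread(() -> {
                    doLongTask();
                    // Report back to the GUI on the event dispatch thread.
                    SwingUtilities.invokeLater(() -> button.setText("Done"));
                }).start();
            });
            frame.add(button);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.pack();
            frame.setVisible(true);
        });
    }

    // Placeholder for the long-running task.
    private static void doLongTask() {
        try { Thread.sleep(5000); } catch (InterruptedException ignored) {}
    }
}
```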
## More Fair Distribution Of CPU Resources
Imagine a single-threaded server application that is receiving requests. If one request takes far longer to process than the others, the short-running requests have to wait for the long-running request to complete. By dividing the CPU time between multiple threads and switching between the threads, the CPU can share its execution time more fairly between several requests. As a result, even if one of the requests is slow, other requests that are faster to process can be executed concurrently with the slower request.
However, this means that the long-running request will take even longer, since it no longer has all of the CPU execution time to itself.
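Here is a minimal Java sketch of the effect; the iteration counts are arbitrary stand-ins for a long and a short request. On a single CPU core the operating system time-slices between the two threads; on multiple cores they simply run in parallel. Either way, the short request finishes without waiting behind the long one:

```java
public class FairSharing {
    public static void main(String[] args) {
        // The long request is started first, but the short request does not
        // have to wait for it to complete before it can run.
        new Thread(() -> spin(5_000_000_000L, "long request")).start();
        new Thread(() -> spin(100_000_000L, "short request")).start();
    }

    // Simulates a CPU-bound request by spinning for roughly `iterations` steps.
    private static void spin(long iterations, String name) {
        long sum = 0;
        for (long i = 0; i < iterations; i++) {
            sum += i;
        }
        System.out.println(name + " done (checksum " + sum + ")");
    }
}
```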