cs161 2004 Lecture 2

Typical ways to get concurrency -- this is about s/w structure.
There are any number of potential structures:
  0. One process, serial requests
  1. Multiple processes
  2. One process, many threads
  3. Event-driven
The choice depends on O/S facilities, the type of application,
and the degree of interaction among the different sub-tasks.

One process, serial requests
  Simplest, and not *so* bad for relatively small files:
    read-ahead into cache
    write-behind from buffer

Concurrency with multiple processes
  Start a new process for each client connection / request.
  Master process hands out connections.
  Now plenty of work is available to keep the system busy.
  Still simple: look at server_2() in the handout -- fork() after accept().
  Preserves the original s/w structure.
  Isolated: a bug for one client does not crash the whole server.
  Most interaction is hidden by the O/S, e.g. locking the disk queue.
  Gets CPU concurrency as a side effect!

We may also want *CPU concurrency*
  Make use of multiple CPUs on a shared-memory machine.
  Often the I/O concurrency tools can be used to get CPU concurrency.
    (Of course the O/S designer had to work a lot harder...)
  CPU concurrency is much less important than I/O concurrency: 2x, not 100x.
  In general it is very hard to program for good scaling.
  Usually easier to buy two separate computers, which we *will* talk about.

Multiple-process problems
  Cost of starting a new process (fork()) may be high: ~50us.
    New address space &c.
  Processes are fairly isolated by default:
    e.g. they do not share memory.
    What if you want a web cache? It must be shared among processes.
    Or even just keep statistics?

Concurrency with threads
  Looks a bit like multiple processes,
  but thread_fork() leaves the address space alone,
  so all threads share memory.
  One stack per thread, inside the process.
  [picture: thread boxes inside process boxes]
  Seems simple -- still preserves the single-process structure.
  Potentially easier to have e.g. a shared web cache.
  But the programmer needs to know about some kind of locking.
  Also easier for one thread to corrupt another.
  There are some low-level but very important details
  that are hard to get right:
    What happens when a thread calls read(), or some other
    blocking system call?
    Does the whole process block until the disk I/O has finished?
    If you don't get this right, you don't get I/O concurrency.

Kernel-supported threads
  The O/S kernel knows about each thread.
  It knows a thread was just blocked, e.g. in a disk read wait,
  and can schedule another thread from that same process.
  [picture: thread boxes dip down into the kernel]
  What does the kernel need for this?
    A per-thread kernel stack.
    Per-thread tables (e.g. saved registers).
  Semantics:
    per-process resources: addr space, file descriptors
    per-thread resources: user stack, kernel stack, kernel state
  The kernel could schedule threads on different CPUs.
  This sounds like just what we want for our server,
  BUT kernel threads are usually expensive, just like processes:
    The kernel has to help create each thread.
    The kernel has to help with each context switch?
      (So it knows which thread took a fault...)
    Getting into/out of the kernel is slow, and getting slower.
  Many O/S's do not provide kernel-supported threads.

User-level threads
  Implemented purely inside the program; the kernel does not know.
  A user scheduler for the threads inside the program,
  in addition to the kernel's process scheduler.
  [picture]
  The user-level scheduler must:
    Know when a thread is making a blocking system call.
      Not actually block, but switch to another thread.
    Know when I/O has completed, so it can wake up the original thread.
  Answer: the thread library has fake read(), write(), accept(), &c
  system calls:
    the library knows how to *start* syscall operations without waiting
    the library marks threads as waiting and switches to a runnable thread
    the kernel notifies the library of I/O completion and other events
    the library marks the waiting thread runnable

    read(){
      tell kernel to start read;  (syscall)
      mark thread as waiting for read;
      sched();
    }
    sched(){
      find a runnable thread;
      restore registers and return;
    }
    io_done(){
      mark relevant thread as runnable;
    }
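  Here is one way those three routines might be fleshed out in C --
  a minimal sketch, not the handout's code.  It assumes fds have been
  put in non-blocking mode, uses the (old) POSIX ucontext API for the
  register save/restore, and invents thread_read(), the thread table,
  and the other names for illustration:

    #include <errno.h>
    #include <ucontext.h>
    #include <unistd.h>

    enum state { FREE, RUNNABLE, WAITING };

    struct thread {
      ucontext_t ctx;    /* saved registers + stack pointer */
      enum state state;
      int waitfd;        /* fd this thread is blocked on, if WAITING */
    };

    #define NTHREADS 16
    struct thread threads[NTHREADS];
    struct thread *current;  /* set up by thread-creation code (omitted) */

    /* sched(): find a runnable thread, save our registers, restore its. */
    void sched(void) {
      for (int i = 0; i < NTHREADS; i++) {
        struct thread *t = &threads[i];
        if (t != current && t->state == RUNNABLE) {
          struct thread *prev = current;
          current = t;
          swapcontext(&prev->ctx, &t->ctx);  /* resumes here when rescheduled */
          return;
        }
      }
      /* No runnable thread: a real library would block in select() here. */
    }

    /* Fake read(): start the I/O, but never block the whole process.
       Assumes fd is non-blocking; for disk files O_NONBLOCK does not
       really help (see "maybe: disk read()" below). */
    ssize_t thread_read(int fd, void *buf, size_t n) {
      for (;;) {
        ssize_t r = read(fd, buf, n);   /* returns immediately */
        if (r >= 0 || errno != EAGAIN)
          return r;                     /* data, EOF, or a real error */
        current->state = WAITING;       /* mark thread as waiting for read */
        current->waitfd = fd;
        sched();                        /* run somebody else */
        /* io_done() marked us RUNNABLE and sched() resumed us: retry. */
      }
    }

    /* io_done(): called by the library's event loop when the kernel
       says fd is ready (e.g. select() reported it readable). */
    void io_done(int fd) {
      for (int i = 0; i < NTHREADS; i++)
        if (threads[i].state == WAITING && threads[i].waitfd == fd)
          threads[i].state = RUNNABLE;  /* mark relevant thread runnable */
    }

  This still needs thread-creation code (makecontext() and a stack per
  thread) and an event loop that calls io_done(); the point is only the
  waiting/runnable bookkeeping around the non-blocking syscall.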
  Events:
    new network connection
    data arrived on a socket
    disk read completed
    client/socket ready to receive new data
  The scheduler checks for new events at each context switch,
  and resumes a thread when its event arrives.
  Like a miniature O/S inside the process.

  Problem: a typical O/S provides only partial support
  for event notification:
    yes: new TCP connections, arriving TCP/pipe/tty data
    no:  file-system operation completion
  Similarly, not all system call operations can be started w/o waiting:
    yes:   connect(), socket read(), write()
    maybe: disk read()
    no:    open(), stat()

Threads are hard to program?
  The point is to share data structures in one address space.
  The thread *model* involves CPU concurrency even on a single CPU,
  so the programmer may need to use locks
  even if the only goal was to overlap I/O waits.
  But *events* usually occur one at a time:
  we could do the CPU processing sequentially
  and overlap only the I/O waiting.

Event-driven programming
  Suggested by the user threads implementation.
  Organize the s/w around the arrival of events.
  Write the s/w in state-machine style:
    when this event occurs, execute this function.
  Library support to register interest in events.
  The point: this preserves the serial nature of the events.
  The programmer sees events/functions occurring one at a time.
  This is the style in which you'll implement your assignments.
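  To make the state-machine style concrete, here is a minimal sketch of
  a select()-based event loop for a toy echo server.  It is one
  plausible shape for such a loop, not the assignment framework;
  handle_conn(), handle_data(), and the handlers[] table are
  hypothetical names:

    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    /* One callback per file descriptor: "when this event occurs,
       execute this function". */
    typedef void (*handler_t)(int fd);
    static handler_t handlers[FD_SETSIZE];

    static void handle_data(int fd) {       /* event: data arrived */
      char buf[512];
      ssize_t n = read(fd, buf, sizeof(buf));
      if (n <= 0) {                         /* EOF or error: forget this fd */
        close(fd);
        handlers[fd] = 0;
        return;
      }
      write(fd, buf, n);                    /* trivial echo "service" */
    }

    static void handle_conn(int fd) {       /* event: new connection */
      int c = accept(fd, 0, 0);
      if (c >= 0 && c < FD_SETSIZE)
        handlers[c] = handle_data;          /* register interest in its data */
      else if (c >= 0)
        close(c);
    }

    int main(void) {
      int lfd = socket(AF_INET, SOCK_STREAM, 0);
      struct sockaddr_in addr;
      memset(&addr, 0, sizeof(addr));
      addr.sin_family = AF_INET;
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      addr.sin_port = htons(8080);
      bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
      listen(lfd, 128);
      handlers[lfd] = handle_conn;

      for (;;) {                            /* the event loop */
        fd_set rfds;
        FD_ZERO(&rfds);
        int maxfd = 0;
        for (int fd = 0; fd < FD_SETSIZE; fd++)
          if (handlers[fd]) { FD_SET(fd, &rfds); maxfd = fd; }
        if (select(maxfd + 1, &rfds, 0, 0, 0) < 0)
          continue;                         /* e.g. EINTR: just retry */
        for (int fd = 0; fd <= maxfd; fd++)
          if (handlers[fd] && FD_ISSET(fd, &rfds))
            handlers[fd](fd);               /* one event at a time */
      }
    }

  Note that there are no locks: each handler runs to completion before
  the next event is dispatched, which is exactly the serial,
  one-at-a-time property of events described above.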