cs161 2004 Lecture 12: Extension mechanisms

Goals of an extension mechanism
 Allow for uses that original authors didn't implement, or even expect.
 Allow for expected evolution - new protocols, versions, porting
 Adding new functionality should be easier than starting over
  work with higher-level abstractions
  share code with other extensions
 base functionality should not break easily

Our examples:
 Process-based protocol (like lab 2) - very safe
  CGI, Flash, Squid, Mozilla (helpers)
 Dynamic (binary) loading - high performance
  X, Apache (mod_*), Mozilla (plugins)

Other examples:
 function based: syscalls, RPC (X)
 "Resources" - images, sounds, etc
 Config files
   key-value - ini files
   declarative (HTML, most window managers, BPF)
   imperative - simplistic scripting (config through cpp/m4, simple branching)
   True embedded languages - shells, mod_perl, make
   App is written in the "extension language"  - ns, emacs, expect

CGI
 Protection is complete
 High abstraction: variables exposed, stdin to stdout, per request
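
 A minimal sketch of that abstraction in C, assuming the standard CGI
 variable names REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH. The
 function name cgi_respond is hypothetical, chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Handle one CGI request: request variables come from the environment,
 * the request body from `in`, and the response is written to `out`.
 * Returns 0 on success, -1 if writing the response failed. */
int cgi_respond(FILE *in, FILE *out)
{
    const char *method = getenv("REQUEST_METHOD");
    const char *query  = getenv("QUERY_STRING");
    const char *clen   = getenv("CONTENT_LENGTH");
    long remaining = clen ? strtol(clen, NULL, 10) : 0;
    int c;

    /* Response headers, then a blank line, then the body. */
    fprintf(out, "Content-Type: text/plain\r\n\r\n");
    fprintf(out, "method=%s\n", method ? method : "(unset)");
    fprintf(out, "query=%s\n", query ? query : "(unset)");

    /* Echo at most CONTENT_LENGTH bytes of POST data. */
    fprintf(out, "body=");
    while (remaining-- > 0 && (c = fgetc(in)) != EOF)
        fputc(c, out);
    fputc('\n', out);

    return ferror(out) ? -1 : 0;
}
```

 A CGI program's main() would just return cgi_respond(stdin, stdout).
 The server starts a fresh process per request, which is where the
 complete protection comes from.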

FastCGI (paper/page written in 2002) - protocol, implementation
 Persistent
 Multiplexed


 Protocol Abstraction - Not all that much
  "packet based"
  multiplexed stdin, stdout
  version, type, request ID, data
   BEGIN_REQUEST
   ABORT_REQUEST (browser "stop")
   END_REQUEST (return code)
   PARAMS (CGI variables, empty means end) STREAM
   STDIN (POST data) STREAM
   STDOUT (response) STREAM
  If you care about "fast & flexible", keep the abstraction's form across domains (still CGI variables, stdin, stdout)
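
 The record framing above can be sketched in C. The type numbers and
 the 8-byte big-endian header layout follow the FastCGI 1.0 spec as I
 understand it. The struct and function names (fcgi_header,
 fcgi_header_encode, ...) are hypothetical, not the spec's own.

```c
#include <stddef.h>
#include <stdint.h>

/* Record types listed in the notes, numbered as in the FastCGI 1.0 spec. */
enum fcgi_type {
    FCGI_BEGIN_REQUEST = 1,
    FCGI_ABORT_REQUEST = 2,
    FCGI_END_REQUEST   = 3,
    FCGI_PARAMS        = 4,
    FCGI_STDIN         = 5,
    FCGI_STDOUT        = 6
};

enum { FCGI_VERSION_1 = 1, FCGI_HEADER_LEN = 8 };

/* Decoded form of the fixed-size record header (hypothetical struct). */
struct fcgi_header {
    uint8_t  version;
    uint8_t  type;
    uint16_t request_id;      /* lets several requests share one connection */
    uint16_t content_length;  /* bytes of data following the header */
    uint8_t  padding_length;  /* bytes of padding following the data */
};

/* Serialize a header into 8 bytes; multi-byte fields are big-endian. */
void fcgi_header_encode(const struct fcgi_header *h, uint8_t buf[FCGI_HEADER_LEN])
{
    buf[0] = h->version;
    buf[1] = h->type;
    buf[2] = (uint8_t)(h->request_id >> 8);
    buf[3] = (uint8_t)(h->request_id & 0xff);
    buf[4] = (uint8_t)(h->content_length >> 8);
    buf[5] = (uint8_t)(h->content_length & 0xff);
    buf[6] = h->padding_length;
    buf[7] = 0;  /* reserved */
}

/* Parse a header; returns 0 on success, -1 on short input or unknown version. */
int fcgi_header_decode(const uint8_t *buf, size_t len, struct fcgi_header *h)
{
    if (len < FCGI_HEADER_LEN || buf[0] != FCGI_VERSION_1)
        return -1;
    h->version        = buf[0];
    h->type           = buf[1];
    h->request_id     = (uint16_t)((buf[2] << 8) | buf[3]);
    h->content_length = (uint16_t)((buf[4] << 8) | buf[5]);
    h->padding_length = buf[6];
    return 0;
}

/* A stream record (PARAMS, STDIN, STDOUT) with no data marks end of stream. */
int fcgi_is_end_of_stream(const struct fcgi_header *h)
{
    return (h->type == FCGI_PARAMS || h->type == FCGI_STDIN ||
            h->type == FCGI_STDOUT) && h->content_length == 0;
}
```

 The request ID in every header is what makes multiplexing possible:
 records for different requests can interleave on one connection and
 the receiver sorts them out by ID.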

 Implementation Abstraction
  Library to handle FastCGI details and call higher level interfaces
  Look familiar?
   scheduler -> io_handler
    fd_is_readable -> ravail
    fd_is_writable -> wavail
   Listener<T> -> tcp_acceptor<T>
   event_loop is implicit

 Protection
  Completely separate from main code
  But one bug can kill multiple requests (they share a process)

 Advantages
  Application lifetime
  Context-switches eliminated
  Network oriented - clusters, load balancing, etc
  Caching - normal, precaching, db in memory (advantages all due to lifetime)
  C vs Perl/PHP/TCL - "real" language, library code, independence, faster

 Disadvantages
  Memory leaks matter more
  A crash kills the other requests in the same process
  complexity
  Protocol spec is not enough!  Must know how Apache is going to call you.
   Is your app really persistent? (Apache didn't bother at time of writing)
   Is there exactly one app? (Do all requests come through one process?)
   If more than one, is there a "session"?

 Where is the world of webapps going?
  Extensions in the server runtime, but in safe languages
  mod_perl
  PHP
  Java servlets
  Ruby on Rails
  PLT-Server?

Linux Modules (page written in 1995)
 Pretty bare bones.  Writing a module differs from ordinary kernel dev only in that
   only "published" (exported) symbols are usable (and now even the license is tracked)
   and the module supplies two entry points:
   init_module()
   cleanup_module()

 Steps
  Prepare module (num = get_kernel_syms(NULL), x = malloc(num * sizeof(struct kernel_sym)), get_kernel_syms(x))
  Allocate kernel memory (start_addr = create_module())
  Copy in the module (init_module SYSCALL)
   Begin execution (init_module FUNCTION IN MODULE)

 Preparation is in userspace
  Why? (keep this big, complicated mess out of kernel)
  Kernel provides: get_kernel_syms, create_module, init_module, delete_module

 Protections?  Not much
   only exposes select symbols
   reference counting of usage
   versioning
    uname info
    md5 hashing of structures

 Abstraction? Not much
  Relies on the language (C - Ha!)
  Independent development is disallowed by versioning
  Contrast with SPIN

 Then if it's just like kernel dev, what's the point?
  Saves kernel memory: kernel code is not paged, so unused compiled-in drivers cost RAM
  No other program has this requirement (b/c user programs get demand loading)

 Safer binary plugins?
   More structured interface
   Sandboxing

 unloading modules is hard
   Cleanup is very tricky
   Need back pointers everywhere - historically they have been HARD
    You try to keep track of where your module memory is in kernel structures
    But what about code?  Suppose a kernel thread is in your module when unloaded
   Might need time (TIME_WAIT)
   Maybe you shouldn't allow unloading?
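
 A toy userspace model of why the usage reference count matters for
 unloading. This is a sketch, not kernel code: every name here
 (toy_module, toy_module_get, ...) is hypothetical, and a real kernel
 would also need atomic counters and locking.

```c
#include <stddef.h>

/* Toy stand-in for a loaded module. */
struct toy_module {
    const char *name;
    int loaded;             /* nonzero while the module's code is present */
    int use_count;          /* callers currently relying on the module */
    void (*cleanup)(void);  /* plays the role of cleanup_module() */
};

/* Take a reference before calling into the module; -1 if it is gone. */
int toy_module_get(struct toy_module *m)
{
    if (!m->loaded)
        return -1;
    m->use_count++;
    return 0;
}

/* Drop a reference once the caller has left the module. */
void toy_module_put(struct toy_module *m)
{
    if (m->use_count > 0)
        m->use_count--;
}

/* Unload only when nobody is inside; -1 means busy or already unloaded. */
int toy_module_unload(struct toy_module *m)
{
    if (!m->loaded || m->use_count > 0)
        return -1;
    if (m->cleanup != NULL)
        m->cleanup();
    m->loaded = 0;
    return 0;
}
```

 The count only answers "is anyone using it right now?". It does not
 find the back pointers the module left in kernel structures, which is
 why cleanup stays tricky even with reference counting.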

Want to do this yourself?
  loading plugins - dlopen(), dlsym(), dlclose()
  replacing system behavior (LD_PRELOAD)
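
  A minimal sketch of the dlopen() route in C. The helper name
  call_plugin and the unary_fn signature are assumptions for
  illustration; a real plugin interface would define its own entry
  points. Older glibc versions need -ldl at link time.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumed plugin entry-point signature: double -> double. */
typedef double (*unary_fn)(double);

/* Load shared object `path`, look up `symbol`, call it on x and store
 * the answer in *result. Returns 0 on success, -1 on any failure. */
int call_plugin(const char *path, const char *symbol, double x, double *result)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();  /* clear any stale error before dlsym */
    void *sym = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return -1;
    }

    /* POSIX allows converting dlsym's void * to a function pointer. */
    unary_fn fn = (unary_fn)sym;
    *result = fn(x);

    /* After dlclose the function pointer must not be used again,
     * the same hazard as unloading a kernel module. */
    dlclose(handle);
    return 0;
}
```

  For example, on a glibc Linux system
  call_plugin("libm.so.6", "cos", 0.0, &r) should load the math library
  and leave cos(0.0) in r.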