Timothy Wood - Potential MS/PhD Students

I am always looking for good students to work with on my research projects. However, there are often more students interested in working with me than I can handle. I have also found that students are often broadly interested in cloud computing and distributed systems, but do not have the specific technical skills that are crucial for the type of research that my group does.

If you are interested in working with me, you must possess the following traits:

  • Self-motivated, curious, and hard-working
  • Comfortable with using Linux, installing applications from source, and writing bash scripts
  • Proficient in writing C programs, including topics such as multi-threading or networking
  • Knowledge of operating systems concepts in general, as well as practical experience modifying the Linux kernel

While you may be able to learn many of these topics "on the job", it is still important that you have this kind of background so that you know that you will enjoy the type of low-level programming required to work in my lab. In particular, if you have never built the Linux kernel from source or made some modifications to a kernel module, then you are extremely unlikely to qualify for my lab. Fortunately, if you are truly interested in working in this area, it is quite possible to teach yourself how to do this.

The challenge programming assignments listed below will help test (and grow) your skills in the above areas. They are very similar to assignments we have given in past undergraduate operating systems courses, so they should be doable by anyone with an undergraduate education in computer science. You may not be familiar with specifics such as the pthreads library or condition variables, but you should be able to learn enough on your own to complete these mini projects. An important part of being a successful researcher is being able to teach yourself new areas quickly.

Challenge #1: Multi-Threading

Understanding how multi-threaded programs work is crucial for modern multi-core systems. The POSIX Threads library provides the basic functionality needed to spawn threads and negotiate access to shared data using locks and condition variables. Your task is to use pthreads to write a simple web server that can handle multiple client requests simultaneously. There are of course many different ways to write such a program, but you should attempt to build it using the following two threading models:

  1. A single master accepts requests and spawns a new thread to handle each one. These worker threads exit after finishing one request.
  2. A master thread creates a pool of NUM_CORES threads. The master accepts new client requests and puts them on a list so that they can then be serviced in order by one of the worker threads. The worker threads and the master use mutexes and condition variables to control access to the shared list of requests that must be processed. The master signals the other threads when a new request has been added to the list. A thread wakes up, services the request, then goes back to sleep on the condition variable.
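As a rough illustration of the second model, the sketch below shows only the mutex and condition variable handshake around a shared list. It is not a web server: requests are plain integers, and the names `run_pool`, `enqueue`, `worker`, `NUM_WORKERS`, and the `shutting_down` flag are assumptions made for this example rather than part of the sample code.

```c
#include <pthread.h>
#include <stdlib.h>

#define NUM_WORKERS 4   /* assumption: stands in for NUM_CORES */

/* One queued request; a real server would store the client socket here. */
struct request {
    int id;
    struct request *next;
};

static struct request *head, *tail;  /* shared FIFO list of pending requests */
static int shutting_down;            /* set once no more requests will arrive */
static int handled;                  /* number of requests serviced so far */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Master side: append a request and wake one sleeping worker. */
static void enqueue(int id)
{
    struct request *r = malloc(sizeof(*r));
    if (!r)
        abort();
    r->id = id;
    r->next = NULL;

    pthread_mutex_lock(&lock);
    if (tail)
        tail->next = r;
    else
        head = r;
    tail = r;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Worker side: sleep until a request arrives, service it, repeat. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        /* Loop, not "if": wakeups can be spurious or lost to another worker. */
        while (!head && !shutting_down)
            pthread_cond_wait(&nonempty, &lock);
        if (!head) {                 /* shutting down and the list is drained */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        struct request *r = head;
        head = r->next;
        if (!head)
            tail = NULL;
        handled++;                   /* updated while still holding the lock */
        pthread_mutex_unlock(&lock);

        /* A real server would read and answer the HTTP request here. */
        free(r);
    }
}

/* Runs the pattern once and returns how many requests were serviced. */
int run_pool(int num_requests)
{
    pthread_t tids[NUM_WORKERS];
    int i;

    head = tail = NULL;
    shutting_down = 0;
    handled = 0;

    for (i = 0; i < NUM_WORKERS; i++)
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            abort();
    for (i = 0; i < num_requests; i++)
        enqueue(i);

    /* Tell every worker to exit once the list is empty. */
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&nonempty);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < NUM_WORKERS; i++)
        pthread_join(tids[i], NULL);
    return handled;
}
```

Calling `run_pool(100)` from a `main` function (compiled with `-pthread`) should return 100, because each worker checks the list before it checks the shutdown flag and so drains any remaining requests before exiting.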

Your challenge is to expand the source code below so that it can run in either of these two modes. You can then test your server with httperf, which can be downloaded here. This program will stress test the server at a level impossible to recreate with a browser or wget. An example of using httperf follows:

httperf --port=8080 --server=localhost --num-conns=10000 --burst-len=100 --uri=index.html

Note: you will need to run httperf on the same machine where the server is running, or change the --server=x parameter as appropriate.

To get you started, use this sample code. Many thanks to Gabe Parmer for writing this code in the first place.

Challenge #2: Kernel Programming

I haven't gotten around to carefully defining this project yet, but here's a rough description.

Phase 1: Write a kernel module that creates a new /proc/ entry. When you try to read from this file, the kernel module should print out the current time.

Phase 2: Enhance your kernel module so that it reports more interesting system level data such as the number of processes currently running, memory usage statistics, etc. Allow the user to write a value to the module; depending on what is written, print out a different piece of data.
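One possible way to exercise Phase 2 from userspace is a small helper that writes a selector string to the entry and then reads the reply back. The sketch below assumes your module remembers the selection between the write and the following read. The function name `query_entry`, the entry path, and the selector strings are all hypothetical; use whatever your module actually defines.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `selector` to the file at `path`, then read the reply into `out`.
 * Returns the number of bytes read, or -1 on error.
 * Both the path and the selector are hypothetical and depend on your module.
 */
ssize_t query_entry(const char *path, const char *selector,
                    char *out, size_t outlen)
{
    ssize_t n;
    int fd;

    if (outlen == 0)
        return -1;

    /* Step 1: tell the module which piece of data we want. */
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    n = write(fd, selector, strlen(selector));
    close(fd);
    if (n < 0)
        return -1;

    /* Step 2: reopen the entry and read whatever the module reports. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, out, outlen - 1);
    close(fd);
    if (n < 0)
        return -1;

    out[n] = '\0';   /* make the reply safe to print as a string */
    return n;
}
```

For example, with a hypothetical entry at /proc/mystats that accepts the word "mem", you would call `query_entry("/proc/mystats", "mem", buf, sizeof buf)` and print `buf`. A single `read` is enough for short replies; longer output would need a loop.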

Was that easy or hard?

If you have a lot of trouble solving any of these problems, keep searching the internet and, better yet, read through some operating systems textbooks. They will help a lot, particularly with the theory of multi-threading. Unfortunately, I do not have time to respond to help requests from students attempting these challenge problems.

If you were able to solve them correctly, then that's great. Congrats! You are now well prepared for working in my research lab, although of course there are always still limits on the number of students I can accept and fund. Write me an email and let me know what you learned in this process and what you are interested in learning next.

research/apply.txt · Last modified: 2013/02/04 12:09 by twood