CS 336: Homework

Implementing the Game of Life with Pthreads (due Sat Oct. 4)

The Game of Life has garnered much attention since its development in 1970 by John Conway (see its entry in Wikipedia). In this assignment you will be developing a parallel version of the class game of life and running it on the Natural Sciences Computer Cluster (NSCC).

Logging onto NSCC

You all have accounts on the Natural Sciences Compute Cluster (NSCC) in order to run code for CS 336. The node we will be directed to has 8 processors, with each pair of processors sharing a 6MB L2 cache. There is no queueing software installed, so it is up to you to be careful not to hog the machine nor to clobber other people's jobs.

My advice is to develop and test your code on your machine first. You can even use 8 threads to simulate the use of 8 processors - there just won't be a speedup. Once you are reasonably happy with your code, use an SFTP or SCP server (I use SFTP on Fugu) to move it (and its makefile) to the network server (colby0.colby.edu). That puts the code into a directory that NSCC can access.

To run your code on NSCC, you must log onto it using SSH. In a Terminal window, type ssh nscc.colby.edu. The system is set up to automatically place you in your home directory associated with the colby0 file server. Tho see who else is logged in to the machine, type who on the command line. To see who is running jobs on the machine, type top (then, to quit it, type q). If none of your classmates are actively using it, then it should list FahCore_a2.exe as the process taking up most of the CPU time for each processor. This is code Randy is running in the background. It has a low priority and will yield to any processes you start.

If you have any problems using this computer, let me know AS SOON AS POSSIBLE!

What to Do

  1. Download gol.c, which has in it the sequential code to play the game. There is also Makefile. Do a sanity check - compile and run it on NSCC. It should print out:
    Running the parallel version
    It took 0.000000 seconds to run the parallel algorithm with 1 threads
    Running the sequential version

    Generation 0:
    oooooooo
    oooooooo
    oooooooo
    oooxxxoo
    ooxxxooo
    oooooooo
    oooooooo
    oooooooo

    Generation 1:
    oooooooo
    oooooooo
    ooooxooo
    ooxooxoo
    ooxooxoo
    oooxoooo
    oooooooo
    oooooooo

    Generation 2:
    oooooooo
    oooooooo
    oooooooo
    oooxxxoo
    ooxxxooo
    oooooooo
    oooooooo
    oooooooo

    Generation 3:
    oooooooo
    oooooooo
    ooooxooo
    ooxooxoo
    ooxooxoo
    oooxoooo
    oooooooo
    oooooooo
    It took 0.000000 seconds to run the sequential algorithm
    ERROR -- The boards differ!

    Let me know if there is a problem.

  2. Familiarize yourself with the sequential version of the code. In particular, note
    • This code is designed to run the game of life with an initial board configuration that is tiled with toads (the 2-phase oscillator we saw in class).
    • There are several macros that act as the "tuning knobs" for the code. The two that control the problem size are NUM_GENERATIONS and TILES.
    • The PRINTING macro determines whether or not the current board is printed to the screen after every generation. For small problems, set it to 1 to see the evolution of the cells. Otherwise, set it to 0.
    • There are several helper routines. Read their header comments to determine what their role is. The code is heavily commented, so I will not repeat the information here.
  3. Add support for a parallel version of the Game of Life by augmenting the code in gol.c to use Pthreads.
    • There is a function named parallel_life that you will need to fill in. This is the function that should call the Pthread create and join routines. Model it after the main functions from previous homeworks.
    • You will need to write the main thread function. Model it after the sequential_life function in this homework, but each thread should be responsible for simulating the lives of a subset of the cells. The data decomposition you use is up to you -- by rows, by columns, or by blocks.
    • Note that after each generation, each thread should wait at a barrier. Once all threads have reached the barrier, it will allow them through. This ensures that all processors are working on the same generation at the same time. The code for the barrier function is already included in gol.c.
  • Thoroughly test your code.
  • Compile and run the working code on NSACC and then analyze the speed-up for at least three different problem sizes. There is code in the main function to report the time it takes to run both the sequential and parallel versions of the code. For each given initial board configuration, run the code with 1, 2, 4, and 8 threads. Record the times and graph them against the number of threads. Also, record the runtime for the sequential code. Choose your problem sizes carefully - be sure the problem size is large enough to cause the 8-threaded version to run at least 4 seconds. One problem size should have a large number of generations, another should have a large number of tiles, and one should have a largish number for both. Does changing the problem size affect the speed-up (i.e. does the ratio between the running time for the 1-threaded version to the 8-threaded version change from problem size to problem size? If so, can you make a conjecture as to why?)
  • Make a short write-up of your results. The report should include your graphs and text describing the data in the graphs and answers to the questions above. In additiona, discuss whether or not the speed-up is what you would expect theoretically. Is the parallel version with 1 thread faster, slower, or the same as the sequential version? Why?

    What to Turn In

    Turn in your code and a short report by emailing me. Put "CS336" in your subject line so that my mail filters will catch it and I will notice it right away.

    Expectations

    Tips

    Back to Homework Page