Project #4

### Project 4: Confirming Benford's Law with pthreads (due Friday Mar 9 at midnight)

On NSCC, create a proj04 directory.

In this project, you will be confirming that data from a (relatively wide) log-normal distribution follows Benford's Law. Benford's Law states that data meeting certain qualifications will have way more leading digits that are 1's than are 9's. This property can be used to detect fraud, so it has gotten lots of attention. Read about Benford's Law on Wikipedia.

Not only do we want to confirm the data follow Benford's Law, we want to confirm it efficiently. In this project, you will write P-threads code to count the frequency of 1's, 2's, etc. in each element of a data files. The question is: where do you store your counters? Do you use a local variable? Do you use a global variables? Should you use mutexes?

1. #### Sequential

• On NSCC, cURL the code tarball to it. Then expand the tarball:
1. curl -O http://cs.colby.edu/courses/S18/cs336/projects/proj04/proj04.tar
2. tar -xf proj04.tar

Note that when I did this, a bunch of warnings were printed, but it didn't prevent anything from working. Please ignore them.

• Read the code, compile it, run it, and embrace it. I supply several data files, two of which should have data that follows Benford's Law. Read the comments in benford_sequential.c for more information. The output should indicate that approximately 30% of the leading digits are ones.
2. #### Parallel

You will be implementing several parallel versions. All should use 8 threads. Each thread analyzes a section of the data array (the first thread examines indices 0 through N/8-1, the second threads analyzes numbers N/8 through 2*N/8-1, etc.). Note that N is not necessarily divisible by 8, so you will need to account for that.

To test each Pthreads version, just compare its output to that of the sequential version.

1. #### Global Counter Array Protected by Single Mutex

Use a global variable (array of 10 ints) to keep track of the total count of each digit (0 throug 9). Since all threads will be reading and writing to it like crazy, we need to protect it with a mutex lock. So, make a global variable mutex lock. For each number the thread analyzes, it locks the lock, updates the counter for the appropriate digit, and then unlocks the lock.

2. #### Global Counter Array Protected by Array of Mutexes

Use a global variable (array of 10 ints) to keep track of the total count of each digit (0 throug 9). Since the count for each digit is independent of the count for every other digit, we can protect each digit-counter with its own mutex. So, make a globa variable that contains an array of 10 mutex locks. For each number the thread analyzes, it locks the appropriate lock, updates the corresponding counter, and then unlocks the lock.

3. #### Local Counter Array, with Final Update Protected by Single Mutex

Each thread should use a local variable (array of 10 ints) to keep track of the number of times it sees each digit as a leading digit. After it has looped through its section of the data, it should use one mutex to protect the entire global array of counts and add its local counts to it.
4. #### Local Counter Array, with Final Update Protected by Array of Mutexes

Each thread should use a local variable (array of 10 ints) to keep track of the number of times it sees each digit as a leading digit. After it has looped through its section of the data, it should use one mutex per digit to protect the appropriate entry in the global array of counts and add its local count to it.
5. #### Global Counter Array of Arrays, Grouped by Thread, no Mutex

Use a global variable that is an array of ints with 10*NUMTHREADS entries. The digit counts for each thread should be stored in that array. For this version store the counts for all digits for a single thread together (e.g. thread 0 uses entries 0 through 9, thread 1 uses entries 10 through 19, etc.). Because each thread has its own section of the array, there is no need for a mutex! After all the threads have joined, loop through this new array of arrays, and sum the counts for each digit.

6. #### Global Counter Array of Arrays, Grouped by Digit, no Mutex

Use a global variable that is an array of ints with 10*NUMTHREADS entries. The digit counts for each thread should be stored in that array. For this version store the counts from all the threads for a single digit together (e.g. entries 0 through NUMTHREADS-1 are used to keep track of the number of 0's, entries NUMTHREADS through 2*NUMTHREADS-1 are used to keep track of the number of 1's, etc.). Again, because each thread has its own section of the array, there is no need for a mutex! After all the threads have joined, loop through this new array of arrays, and sum the counts for each digit.

3. Learn about false sharing and prevent it. Chapter 7 in Introduction to Parallel Computing talks about Pthreads. They have a nice description of false sharing. Read this and then try to prevent false sharing in the version e of this problem (Global Counter Array of Arrays, Grouped by Thread, no Mutex). To do so, use an array of padded ints instead of an array of ints. Here is my definition of a padded int.
```typedef struct {
int value;
```

### Ideas for Extensions

• The code to determine the first digit of a number is not itself particularly efficient. Re-implement it more efficiently. In your write-up include data showing that the new implementation decreases the runtime.
• Revisit the first parallel version (with one mutex that is frequently locked). Does it affect performance if we lock the mutex before we determine the leading digit?
• When we run the code on the given data sets, we expect the entry counts 1's to be the most frequently used. This could affect the performance for some of the implementations. How does the performance change when we use data that does not follow Benford's Law (and instead has a uniform distribution of leading digits)? Can you tell if that performance change is due to updating the counts array or to increased computation time in leadingDigit?
• Vary the number of threads. Determine how the runtime depends on the number of threads.

### Writeup and Handin

To hand in your project, you will gather all of the necessary files into a proj04 directory:

1. Create a file named README.txt and use it to describe your code organization (e.g. info about how to make and run it).
2. Create a second file for you project report. This file should be in a format that allows you to include charts or images (such as .docx or .pdf). This week, I want you to collect timing information from all of your runs and then analyze the results. The analysis is a significant part of the project, so don't leave it until the last minute, when you are too tired to think clearly. Here's the data I want you to collect: the number of seconds it takes to run each version of the program. You should record this at least 5 times for each program. Drop the min and max values from these 5 and then report the mean of the middle 3. Present the data so that it is as easy to read as possible -- if there are lots of numbers then please use some sort of graphic, if not a table would work just fine. You may format your analysis however you would like to, but be sure to answer the following questions: What conclusions can you draw? What can you say about the roles of mutex locks for this problem? Does false sharing slow down the program, if you don't prevent it? Which version of the code is the one you would most like to write? Which would you most like to run?
3. You should hand in all code necessary to run your solutions. Place all necessary .h, .c, and Makefile files in the proj04 directory. Stephanie will probably want to compile and run the code. It should be possible to do so without looking for any more files.

Zip up the directory and put it in your Private folder on Courses/CS336.