CS 152: Project 3

Fall 2016

Project 3: Calculating Thermoclines

For this project you will write a library of useful functions, as well as a set of more general programs for computing simple statistical functions of a data stream. The final task involves computing the depth of the thermocline on Great Pond. The thermocline is the depth at which there is the largest difference in water density, with a layer of denser water below and a layer of less dense water above.


Tasks

  1. If you haven't already set yourself up for working on the project, then do so now.
    1. Mount your directory on the Personal server.
    2. Open the Terminal and navigate to your Project3 directory on the Personal server.
    3. Open TextWrangler. If you want to look at any of the files you have already created, then open those files.
  2. If you did not do so in lab, complete your stats functions in stats.py so that you have functions for computing each of max, min, sum, mean, variance, and standard deviation for a list of single values. Make sure each function, except max and min, computes and returns a floating point value. You can assume the input is a list of integers or floats.

    Add to your stats functions a function that converts Celsius to Fahrenheit. The function should take in a single parameter--temperature in Celsius--and return a single value--temperature in Fahrenheit. Call the function celsius2fahrenheit.
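
As a rough guide to the shape these functions might take, here is a minimal sketch. The names mean, variance, and stdev are assumptions (only celsius2fahrenheit is named by the assignment), and the sketch assumes the sample variance (dividing by n - 1); check which definition you used in lab.

```python
import math

def mean(values):
    """Return the arithmetic mean of a list of numbers as a float."""
    return float(sum(values)) / len(values)

def variance(values):
    """Return the sample variance (divides by n - 1; an assumption)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / float(len(values) - 1)

def stdev(values):
    """Return the standard deviation, the square root of the variance."""
    return math.sqrt(variance(values))

def celsius2fahrenheit(celsius):
    """Convert one temperature from Celsius to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0
```

Using 9.0 and 5.0 rather than 9 and 5 keeps the result a float even when the input is an integer.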

  3. Write a Python program, computeStats.py, that reads a single stream of numbers, stores them in a list, then uses the library of functions you wrote in the lab to calculate the min, max, mean, and standard deviation and prints the values to the terminal.

    Use your program, plus appropriate grep and cut commands, to compute these values from the LEA buoy data for temperature at 1m, temperature at 5m, and one other variable of your choice for the month of June, 2016. Repeat the exercise for the months of July and August 2016. Report these values in your report. Be sure to put the whole command for running your program in comments at the top of your Python program.

    When computing the max, min, and mean of temperature variables, use the Celsius to Fahrenheit function to display the temperatures in both systems. You can choose how to implement this.
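
One possible outline for the reading and summarizing steps is sketched below. The helper names read_numbers and summarize are hypothetical, the statistics are computed inline only to keep the sketch self-contained (your program should call your own stats library instead), and the command in the comment assumes that temperature at 1m is the ninth comma-separated field, matching index 8 in the fields list given later in this assignment.

```python
# Example command (field number is an assumption; check it against the data):
# curl http://schupflab.labs.keyes.colby.edu/buoy/3100_iSIC.csv | grep '^6/' | cut -d ',' -f 9 | python computeStats.py
import math

def read_numbers(lines):
    """Convert an iterable of text lines (such as sys.stdin) to a list of floats."""
    values = []
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines
            values.append(float(text))
    return values

def summarize(values):
    """Return (min, max, mean, standard deviation) for a list of numbers."""
    n = len(values)
    avg = sum(values) / float(n)
    # sample variance (n - 1 in the denominator) is an assumption
    var = sum((v - avg) ** 2 for v in values) / float(n - 1)
    return min(values), max(values), avg, math.sqrt(var)
```

In a complete program, calling summarize(read_numbers(sys.stdin)) from a main function and printing the four results would finish the job.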

  4. The next task is to write a program that computes the depth of the thermocline on Great Pond. The thermocline is the depth at which the water density changes most quickly, creating a layer of colder, denser water below a layer of warmer water; the two layers tend not to mix. Overall, you will write a function that computes the density of water given a list of temperatures, a function that computes the depth of the maximum change in density, and then a top-level function that reads in a stream of data and sets up the computations.

    1. Create a new file, thermocline.py. Put your name, date, and class at the top, along with a comment indicating what the program will do (compute the thermocline).
    2. Write a function density that takes one parameter, temps, which is a list of temperatures. The function should first create a new empty list, rhos, to hold density values. Then it should loop over the temps list and, for each temperature value, compute the density using the following equation.

      rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2*(t + 68.12963)))

      It should then append the computed density to the rhos list. Finally, it should return the list of densities (rhos).

      Test your function using this test file. It should print out the following if your density function is working correctly.

      24.47 -> 997.21
      23.95 -> 997.34
      24.41 -> 997.22
      23.81 -> 997.37
      19.92 -> 998.25
      16.88 -> 998.82
      14.06 -> 999.26
      11.56 -> 999.57
      9.82 -> 999.74
      9.13 -> 999.80
      8.82 -> 999.82
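
A minimal sketch of the density function as described above, using the equation exactly as given. The short loop at the end is only an illustration: it prints the first and last temperatures from the expected output in the same format so the results can be compared by eye.

```python
def density(temps):
    """Return a list of water densities, one per temperature in temps."""
    rhos = []
    for t in temps:
        rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2*(t + 68.12963)))
        rhos.append(rho)
    return rhos

# Spot check against two lines of the expected output listed above.
sample = [24.47, 8.82]
for t, rho in zip(sample, density(sample)):
    print("%.2f -> %.2f" % (t, rho))
```

Note that the equation gives exactly 1000 at t = 3.9863, the temperature at which the squared term vanishes.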
      
    3. The next step is to add a function to thermocline.py that computes the derivative of density with respect to depth, that is, how fast the density changes as you get deeper. The function will take in two lists: one is the set of temperatures, the other is the set of corresponding depths. The function will return one value: the depth of the maximum change in density. The algorithm below outlines the function.

      def thermocline_depth( temps, depths ):
      
          # assign to rhos the result of calling the density function with temps as the argument
          # assign to drho_dz the empty list
      
          # calculate the first derivative of density
          # loop for one less than the length of rhos
              # append to drho_dz  the quantity rhos[i+1] minus rhos[i] divided by the quantity depths[i+1] minus depths[i]
              # optional step: print out temps[i], rhos[i], and drho_dz[i]
      
          # assign to max_drho_dz the value -1.0
          # assign to maxindex the value -1
          # loop for the length of drho_dz (loop variable i)
              # if drho_dz[i] is greater than max_drho_dz
                  # assign to max_drho_dz the value drho_dz[i]
                  # assign to maxindex the value i
      
          # assign to thermoDepth the average of depths[maxindex] and depths[maxindex+1]
      
          return thermoDepth
      

      Test your thermocline_depth function using this test file. It should return a depth of 6.0m. (The maximum change at that depth is 0.44. You do not need to report this, but if you run into problems, knowing that the maximum change is supposed to be 0.44 may help you debug.)
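
The algorithm above could be translated into Python roughly as follows. This is a sketch, not a reference solution: the density helper is repeated in compact form only so the example runs on its own, and the temperatures and depths in the usage note are made-up values, not buoy data.

```python
def density(temps):
    """Compact version of the density function, repeated for self-containment."""
    return [1000 * (1 - (t + 288.9414) * (t - 3.9863)**2
                    / (508929.2 * (t + 68.12963))) for t in temps]

def thermocline_depth(temps, depths):
    """Return the depth midway between the two measurements with the largest density change."""
    rhos = density(temps)
    drho_dz = []

    # first derivative of density with respect to depth
    for i in range(len(rhos) - 1):
        drho_dz.append((rhos[i+1] - rhos[i]) / float(depths[i+1] - depths[i]))

    # find the index of the largest derivative
    max_drho_dz = -1.0
    maxindex = -1
    for i in range(len(drho_dz)):
        if drho_dz[i] > max_drho_dz:
            max_drho_dz = drho_dz[i]
            maxindex = i

    thermoDepth = (depths[maxindex] + depths[maxindex+1]) / 2.0
    return thermoDepth
```

With the made-up inputs thermocline_depth([24.0, 23.0, 15.0, 10.0], [1, 3, 5, 7]), the largest density change falls between the 3m and 5m readings, so the result is 4.0. If no derivative exceeds the initial value of -1.0, maxindex stays at -1 and the function averages the last and first depths, which is worth keeping in mind when debugging.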

    4. The final step is to write the main function that reads in data from the buoy file through stdin, extracts all of the temperature fields in order, computes the thermocline_depth and prints it out. The program should also keep track of the minimum and maximum depth of the thermocline over the range of measurements provided and print out those values at the end along with when those minimum and maximum events occurred.

      The algorithm given below is not line by line. Each comment will correspond to one or more lines of Python.

      def main(stdin):
      
          # these are the fields corresponding to the temperatures in order by depth
          # note the 0-indexing 
          fields = [8, 11, 14, 17, 20, 23, 26]
      
          # these are the depth values for each temperature measurement
          depths = [ 1, 3, 5, 7, 9, 11, 13 ]
      
          # create variables to hold the max depth, min depth, the datetime
          # of the max depth and the datetime of the min depth.  Give them
          # reasonable initial values (a small value for max depth, a large
          # value for min depth, and empty strings for the datetime variables).
      
          # assign buf the first line of stdin and then start the standard while loop until buf is empty
              # split buf on commas and assign it to words
      
              # assign to datetime the value in words[0]
              # assign to temps the empty list
      
              # loop over the number of items in depths (loop variable i)
                  # append to temps the result of casting words[ fields[i] ] to a float
      
              # assign to depth the result of calling thermocline_depth with temps and depths as arguments
      
              # test if depth is greater than maxdepth and update maxdepth and maxtime if it is
      
              # test if depth is less than mindepth and update mindepth and mintime if it is
      
              # print out the datetime value and the depth value, separated by commas
      
              # update buf with the next readline from stdin
      
          # print out the minimum and maximum thermocline depth and the corresponding date/time
      
          return
      
      # note: sys.stdin requires "import sys" at the top of the file
      if __name__ == "__main__":
          main(sys.stdin)
      

      Test your program using the following command.

      curl http://schupflab.labs.keyes.colby.edu/buoy/3100_iSIC.csv | grep '7/2/2016' | grep -e ' 6:00' -e '18:00' | python thermocline.py

      The thermocline depth at 6am should be 4m, and at 6pm (18:00) it should be 6m.
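
Two pieces of the main function lend themselves to small helpers, sketched here under stated assumptions. The names parse_line and track_extremes are hypothetical, the initial values -1.0 and 1e6 are just one reasonable choice of "small" and "large", and the sketch assumes each input line is comma-separated with the datetime in the first field, as the algorithm above describes.

```python
FIELDS = [8, 11, 14, 17, 20, 23, 26]  # temperature columns, 0-indexed

def parse_line(buf, fields=FIELDS):
    """Split one CSV line and return (datetime, list of temperatures as floats)."""
    words = buf.split(",")
    datetime = words[0]
    temps = []
    for f in fields:
        temps.append(float(words[f]))  # float() ignores a trailing newline
    return datetime, temps

def track_extremes(records):
    """Given (datetime, depth) pairs, return (mindepth, mintime, maxdepth, maxtime)."""
    maxdepth, maxtime = -1.0, ""   # small starting value for the maximum
    mindepth, mintime = 1e6, ""    # large starting value for the minimum
    for datetime, depth in records:
        if depth > maxdepth:
            maxdepth, maxtime = depth, datetime
        if depth < mindepth:
            mindepth, mintime = depth, datetime
    return mindepth, mintime, maxdepth, maxtime
```

In the main loop itself you would update the four tracking variables as each line is read, rather than collecting all the records first; the helper above only illustrates the comparison logic.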

  5. For the final task, compute the thermocline depth at hourly intervals for the month of July and save the results to a file. Use grep to select the lines you need and pipe them to thermocline.py to compute the thermocline values. Redirect the output to a new file, 2016-07-thermo.csv. This should be a file with two columns: datetime and thermocline depth.

    Then use grep on that file to pick out lines that begin with 7 (you can use the pattern "^7" or do something that looks for a seven, slash, digits, slash, 2016), pipe that to cut to get the second field, and then pipe that to your computeStats.py program to get the min, max, mean, and stdev of the thermocline depth for the month of July.

    Include this final output in your writeup.
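
To make the "seven, slash, digits, slash, 2016" pattern concrete, here is the same idea sketched in Python rather than grep. The function names are hypothetical, and the sketch assumes each line has the form datetime,depth. In grep the digits are usually written as [0-9] rather than \d.

```python
import re

# a 7 at the start of the line, a slash, one or more digits, a slash, then 2016
JULY_2016 = re.compile(r"^7/\d+/2016")

def is_july_2016(line):
    """Return True if the line starts with a July 2016 date."""
    return JULY_2016.match(line) is not None

def july_depths(lines):
    """Return the second comma-separated field of each July 2016 line as a float."""
    depths = []
    for line in lines:
        if is_july_2016(line):
            depths.append(float(line.split(",")[1]))
    return depths
```

Anchoring the pattern at the start of the line matters: without it, a date such as 6/17/2016 could be picked up by a looser pattern, and the summary lines your program prints at the end would not be filtered out reliably.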


Extensions

Each assignment will have a set of suggested extensions. The required tasks constitute about 85% of the assignment, and if you do only the required tasks and do them well you will earn a B+. To earn a higher grade, you need to undertake one or more extensions. The difficulty and quality of the extension or extensions will determine your final grade for the assignment. One complex extension, done well, or 2-3 simple extensions are typical.


Write-up and Hand-in

Turn in your code by putting it into your private hand-in directory on the Courses server. All files should be organized in a folder titled "Project 3", and you should include only those files necessary to run the program. We will grade all files turned in, so please do not turn in old, non-working versions of files.

Make a new wiki page for your assignment. Put the label cs152f16project3 in the label field on the bottom of the page, and give the page a meaningful title (e.g. Milo's Project 3).

In general, your intended audience for your write-up is your peers not in the class. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it. Follow the outline below.