Data and a Simple GUI
Due midnight, Wednesday 15 February 2012
The purpose of this assignment is to get you thinking about data--its forms, formats, and meta-data--and to start the process of building an interactive visualization system.
See these install notes to set up the necessary software on your own computer.
The Pieces of a Python GUI
Download the skeleton Python file for creating a window and menus using the Tkinter package. Rename the file to display.py or another name you like. To run the program, open a Terminal and type the following command.
The code creates the menus and a single button in the window and connects them to empty functions.
For more details on Python GUI development in Tk, see the following resources.
Extending the GUI
Add a menu item for opening a file. The menus are built within the
buildMenus function. Two lists manage the user menus. One defines the
menu text, the other defines the function to call for each menu item.
There are some special escape characters needed to indicate the apple/command symbol. Use the sequence \xE2\x8C\x98 to generate the command symbol.
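As a rough illustration of the two parallel lists, the sketch below builds the label text for an Open item. The names menu_text, menu_cmds, handle_open, and handle_quit are hypothetical stand-ins, not the skeleton's actual variables. It assumes Python 3, where the byte sequence has to be decoded (or written as \u2318). Under Python 2 the escape sequence can be placed directly in the string, as described above.

```python
# The three bytes \xE2\x8C\x98 are the UTF-8 encoding of U+2318,
# the character used as the Mac command symbol.
COMMAND_SYMBOL = b'\xE2\x8C\x98'.decode('utf-8')


def handle_open():
    """Hypothetical placeholder for the function that opens a file."""
    print('handle open')


def handle_quit():
    """Hypothetical placeholder for the function that quits."""
    print('handle quit')


# Parallel lists: entry i of menu_text is paired with entry i of menu_cmds.
menu_text = ['Open  ' + COMMAND_SYMBOL + '-O',
             'Quit  ' + COMMAND_SYMBOL + '-Q']
menu_cmds = [handle_open, handle_quit]

# Each (label, function) pair is what would be handed to the menu,
# for example through Menu.add_command(label=..., command=...).
menu_items = list(zip(menu_text, menu_cmds))
```

Keeping the two lists the same length and in the same order is what ties each label to its function, so a new menu item means one new entry in each list.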
- Add a key handler for control-o, if it does not already exist. Event bindings are set up in the setBindings function. Follow the syntax used for setting up the control-q key equivalent.
- Add an event handler for button 3. You will need to make a function to handle button 3 and then bind the function to the event within the setBindings function. Have the function print out the x and y mouse position of the click.
- Add a second regular button in the window and an event handler for it. Set up the button in the buildControls function and bind it to a new function called handleCmd2Button. Have the function print out a message.
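The three items above might look roughly like the sketch below. It is written as standalone functions so it can be read on its own. In the skeleton these would be methods that take self, and the root and canvas arguments here are assumptions standing in for whatever window and canvas the skeleton creates.

```python
def handleOpen(event=None):
    """Stand-in for the open-file handler (event is None when called from a menu)."""
    print('handle open')


def handleButton3(event):
    """Print and return the position of a button-3 click."""
    print('button 3 clicked at (%d, %d)' % (event.x, event.y))
    return (event.x, event.y)


def handleCmd2Button():
    """Handler for the second regular button."""
    print('second button pressed')


def setBindings(root, canvas):
    """Attach the key and mouse handlers.

    root   -- the top-level Tk window (assumed)
    canvas -- the Tk Canvas that receives mouse events (assumed)
    """
    root.bind('<Control-o>', handleOpen)
    canvas.bind('<Button-3>', handleButton3)
```

The second button itself would be created in buildControls with something like Button(frame, text='Command 2', command=handleCmd2Button), where frame is whatever container the skeleton uses. Note that on some Mac builds of Tk the right mouse button is reported as Button-2 rather than Button-3, so it may be worth checking which event your mouse generates.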
In the function that handles your third mouse button, have it create a
circular object at the mouse click location and store the object in
the self.objects list (created and initialized in the init function).
# Requires "import random" at the top of the file.
dx = 3
rgb = "#%02x%02x%02x" % (random.randint(0, 255),
                         random.randint(0, 255),
                         random.randint(0, 255))
oval = self.canvas.create_oval(event.x - dx, event.y - dx,
                               event.x + dx, event.y + dx,
                               fill=rgb, outline='')
self.objects.append(oval)
Note that colors in Tk use integers in the range [0, 255] for each color channel.
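To make the color format concrete, here is a small sketch of a helper that turns three channel values into the "#rrggbb" string Tk expects. The function names rgb_to_tk_color and random_tk_color are made up for this example and are not part of the skeleton.

```python
import random


def rgb_to_tk_color(r, g, b):
    """Format three integers in [0, 255] as a Tk color string."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError('channel out of range: %r' % (channel,))
    # %02x writes each channel as two lowercase hex digits
    return '#%02x%02x%02x' % (r, g, b)


def random_tk_color():
    """Pick a random color, as the button-3 handler does."""
    return rgb_to_tk_color(random.randint(0, 255),
                           random.randint(0, 255),
                           random.randint(0, 255))


print(rgb_to_tk_color(255, 0, 128))  # prints #ff0080
```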
In the function that handles the motion of button 1, have the function
update the location of all objects in the self.objects list. You can
get the current coordinates of a graphical object using the coords function
loc = self.canvas.coords(obj)
and you can modify the coordinates of a graphic object using the same function with arguments.
self.canvas.coords( obj, loc[0] + diff[0], loc[1] + diff[1], loc[2] + diff[0], loc[3] + diff[1] )
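The arithmetic behind the move can be sketched without a window. The helper below assumes loc is the flat list [x0, y0, x1, y1] that coords returns for an oval and that diff is an (dx, dy) pair giving how far the mouse moved. The name translate_coords is hypothetical.

```python
def translate_coords(loc, diff):
    """Return a copy of loc shifted by diff.

    loc  -- flat list [x0, y0, x1, y1, ...] of canvas coordinates
    diff -- (dx, dy) offset in pixels (assumed layout)
    """
    dx, dy = diff
    # Even positions are x values, odd positions are y values
    return [value + (dx if i % 2 == 0 else dy)
            for i, value in enumerate(loc)]


print(translate_coords([10.0, 20.0, 16.0, 26.0], (5, -3)))
# prints [15.0, 17.0, 21.0, 23.0]
```

Inside the motion handler the result could be passed back with self.canvas.coords(obj, *translate_coords(loc, diff)). Tk's Canvas also has a move(obj, dx, dy) method that shifts an item directly, which may be simpler if you do not need the coordinates for anything else.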
Find a small (10) to medium (1000) size data set that has at least two
(three is better) variables per data point. For example, the data
might be high and low temperature and average wind speed for a weather
station over the course of a year. The data set should have at least
two numeric variables that make sense on a number line.
Check out several different options before choosing one. Bug your friends doing projects, find interesting projects on the web, or grab data from the World Bank. One interesting site for climate data is the Biogeoinformatics of Hexacorals.
Alternatively, you are welcome to collect your own data set. It should have at least 20 data points. Your music collection or the set of books on your bookshelf are fine examples.
- Once you have picked a data set, make sure you also get the meta-data. You will need it to answer the questions below.
- Write a short Python program to read the data into a list--usually a list of tuples--and then calculate and print the average for each variable for which it makes sense. You may want to modularize this so you can import this python file into your application and use the I/O functionality.
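One possible shape for such a program is sketched below. It assumes the data is comma-separated with one header line and that every column is numeric, which may not match your data set. The sample values are made up purely so the example runs without a file.

```python
import csv
import io


def read_data(source):
    """Read comma-separated text into (header, rows).

    source -- an open text file or any iterable of lines
    rows   -- list of tuples of floats, one tuple per data point
    """
    reader = csv.reader(source)
    header = [name.strip() for name in next(reader)]
    rows = []
    for line in reader:
        if not line:
            continue  # skip blank lines
        rows.append(tuple(float(field) for field in line))
    return header, rows


def column_means(rows):
    """Average each column of a non-empty list of equal-length tuples."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


# Made-up sample standing in for a real data file
sample = io.StringIO("high,low,wind\n30,20,5\n34,22,7\n26,18,3\n")
header, rows = read_data(sample)
for name, mean in zip(header, column_means(rows)):
    print('%s: %.2f' % (name, mean))
```

For a real file, replace the sample with something like open('mydata.csv', newline='') in Python 3, where mydata.csv is a placeholder name. Keeping read_data in its own file lets the GUI import it later instead of duplicating the parsing code.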
Modify the handleOpen routine of your GUI so it can read in your data
set and store it in the self.data field. The function does not have
to read in generic data sets, just your data set.
Make sure all existing graphics objects get deleted when you open a file so that it starts with a fresh window.
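Clearing the window could be done along the lines of the sketch below. The function name clear_objects is hypothetical, and RecordingCanvas is only a display-free stand-in so the example runs without opening a window. With the real GUI you would pass self.canvas and self.objects.

```python
def clear_objects(canvas, objects):
    """Delete every listed item from the canvas, then empty the list."""
    for obj in objects:
        canvas.delete(obj)
    # Empty the list in place so other references to it see the change
    del objects[:]


class RecordingCanvas(object):
    """Stand-in for a Tk Canvas that just records delete calls."""

    def __init__(self):
        self.deleted = []

    def delete(self, item):
        self.deleted.append(item)


canvas = RecordingCanvas()
objects = [1, 2, 3]
clear_objects(canvas, objects)
print(canvas.deleted, objects)  # prints [1, 2, 3] []
```

Calling this at the start of handleOpen, before reading the new file, keeps stale item ids out of the object list.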
- Create a method, plotData, that plots two axes of the data in the self.data field. Call plotData from the handleOpen function so that it generates a 2D plot when you read in your data. Your plotData function does not have to be general; it just needs to generate a visualization of two hard-coded axes of your data. They do not need to be numeric.
- Update your handleButton1Motion function so that it translates your data. You can disable the ability of the user to add points to the set of visual objects, if you wish.
- Make one of the menu commands build a histogram of one axis of your data and plot it with matplotlib/pylab. You can hard code which axes to use. There is an example of making a histogram in the skeleton code.
- Enable your program to make multiple types/styles of plots or use different axes with your data set.
- Calculate standard deviations or other features (mode/median) in addition to the means.
- Collecting your own data set, if well done, is a nice extension.
- Start the process of generalizing your data I/O. Pick a format, like CSV, and enable your system to read in any CSV file (perhaps with the restriction that it have at least two columns, for example). Note that there is a Python package for reading CSV files (the csv module in the standard library).
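For the plotData item in the list above, the main step is mapping data values to pixel positions on the canvas. The sketch below shows one way to do that for two numeric axes. The function name scale_points, the margin parameter, and the canvas size used in the example are all assumptions, and non-numeric axes would need a different mapping.

```python
def scale_points(points, width, height, margin=20):
    """Map (x, y) data pairs to canvas pixel coordinates.

    Each axis is stretched to fill the canvas minus a margin. Canvas y
    grows downward, so y is flipped to put larger values nearer the top.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    # Fall back to 1.0 when all values on an axis are equal,
    # which avoids dividing by zero
    xspan = float(max(xs) - xmin) or 1.0
    yspan = float(max(ys) - ymin) or 1.0
    usable_w = width - 2 * margin
    usable_h = height - 2 * margin
    scaled = []
    for x, y in points:
        px = margin + (x - xmin) / xspan * usable_w
        py = height - margin - (y - ymin) / yspan * usable_h
        scaled.append((px, py))
    return scaled


print(scale_points([(0, 0), (5, 10), (10, 20)], 400, 300))
# prints [(20.0, 280.0), (200.0, 150.0), (380.0, 20.0)]
```

Each scaled pair can then be drawn with create_oval, as in the button-3 handler, and appended to self.objects so that the button-1 motion handler translates the plotted points along with everything else.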
For this assignment, make a wiki page, link your data to the page and answer the questions below.
- Give a one sentence description of your data.
- What is the source of the data?
- What is the dimensionality of the data?
- What is the size of the data set?
- What is the format of the data?
- What is the precision of the data? Is this the native precision or the precision of the format? Can you figure this out?
- Does your data set contain missing values? If so, how are missing values represented?
- What is the range of each variable in the data set?
- What information could you find on error in your data set? Could you define error bars for each data value?
In addition, describe the functionality of your GUI, put in at least one screen shot, and identify any extensions you did.
Make your writeup for the project a wiki page in your personal space.
Once you have written up your assignment, give the page the label:
You can give any page a label when you're editing it using the label field at the bottom of the page. Once it is properly labeled, it should show up on the course wiki page.
Packages for Tk/Numpy/Scipy/Matplotlib
If you want to do this project on your own computer, you will need to install four packages: Python, Numpy, Scipy, and Matplotlib.
If you are running Snow Leopard (OSX 10.6) or Lion (OSX 10.7), then download the universal 2.7.2 Python package for OSX 10.6-10.7 from python.org. Install it first.
Then download the numpy and scipy packages from Sourceforge, which is linked from scipy.org. You want the 2.7.2 OSX 10.3 packages. Install those second.
Then download the matplotlib package from Sourceforge, which is linked from the matplotlib site. Note the link is the word 'download' on the right. You also want the 2.7.2 OSX 10.3 package. Install this last.
Since the numpy, scipy, and matplotlib packages are 32-bit code, you will need to run the 32-bit version of python. This is called python-32. In addition, you will need to tell matplotlib to use macosx as its windowing package by using the following when you import matplotlib and/or pylab.
import matplotlib as mpl
mpl.use('macosx')
import pylab
For Windows machines you can go to the same places and pick up the packages appropriate for your OS install. The key is to make sure all of the packages match.