Due midnight, Wednesday 20 February 2012
The purpose of this assignment is to get you thinking about data--its forms, formats, and meta-data--and to start the process of building an interactive visualization system.
See these install notes to set up the necessary software on your own computer.
The Pieces of a Python GUI
Download the skeleton Python file for creating a window and menus using the Tkinter package. Rename the file to display.py or another name you prefer. To run the program, open a Terminal and type the following command.
The code creates the menus and a single button in the window and connects them to empty functions.
For more details on Python GUI development in Tk, see the following resources.
Designing a Simple GUI
The skeleton program gives you a structure for making an application. The GUI pieces of this project give you practice creating other types of GUI elements like buttons, lists, labels, popup menus and modal dialog windows.
- Come up with an initial design for your application. Think about a data visualization and analysis application as having three pieces: the main view, status, and controls. These pieces are not dissimilar to those of a video game. Think about how you might want to lay out the different sections, or whether certain widgets can have two purposes. Make a rough sketch (using pen and paper or some drawing program), thinking about what kind of status information would be useful and what the set of fundamental controls will be. Consider whether controls should be in menus, part of the main window, or part of a modal dialog.
- Menus - Add a menu item for opening a file. The menus are built within the buildMenus function. Two lists manage the user menus. One defines the menu text, the other defines the function to call for each menu item.
There are some special escape characters needed to indicate the apple/command symbol. Use the sequence \xE2\x8C\x98 to generate the command symbol.
- Key Bindings - Add a key handler for control-o, if it does not already exist. Event bindings are set up in the setBindings function. Follow the syntax used for setting up the control-q key equivalent.
- Event Bindings - Add an event handler for button 3. You will need to make a function to handle button 3 and then bind the function to the event within the setBindings function, attaching it to the canvas. Have the function print out the x and y mouse position of the click.
- Labels - In the status area, create labels that will be used to indicate the current data set, and the variables selected for the x and y axes. Put in some temporary strings in the labels, for now.
- List box - In the control area, create a list box widget. Set it up so three items are visible and put six temporary items into the box. Enable multiple selections in the list box.
- Button - Modify the existing button so it says 'Plot' and place it in the control area with the list box. Have the button call a function, and have the function print out a message.
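The widget tasks above can be sketched in one small standalone program. This is only an illustration, not the skeleton's code: it is written for Python 3, where the module is named tkinter (under Python 2.7 it is Tkinter), and the names build_demo, click_message, and COMMAND_SYMBOL, the widget layout, and the placeholder strings are all assumptions. In the skeleton, the same calls would go inside buildMenus, setBindings, and your own handler methods.

```python
# A sketch only: names and layout are hypothetical, not taken from the skeleton.
COMMAND_SYMBOL = "\u2318"  # same character as the UTF-8 byte sequence \xE2\x8C\x98


def click_message(event):
    """Format the position of a mouse click for printing."""
    return "button 3 click at (%d, %d)" % (event.x, event.y)


def build_demo():
    """Build the demo window and return it; call mainloop() on it to run."""
    import tkinter as tk  # imported here so click_message works without a display

    root = tk.Tk()

    # Menus: one list holds the item text, a parallel list holds the callbacks.
    menubar = tk.Menu(root)
    root.config(menu=menubar)
    filemenu = tk.Menu(menubar, tearoff=0)
    menubar.add_cascade(label="File", menu=filemenu)
    menutext = ["Open...  " + COMMAND_SYMBOL + "O", "Quit  " + COMMAND_SYMBOL + "Q"]
    menucmd = [lambda: print("open requested"), root.destroy]
    for text, cmd in zip(menutext, menucmd):
        filemenu.add_command(label=text, command=cmd)

    # Status area: labels with temporary strings.
    status = tk.Frame(root)
    status.pack(side=tk.BOTTOM, fill=tk.X)
    for text in ("Data set: (none)", "X axis: (none)", "Y axis: (none)"):
        tk.Label(status, text=text).pack(side=tk.LEFT, padx=5)

    # Control area: a list box showing three of six items, multiple selection on.
    control = tk.Frame(root)
    control.pack(side=tk.RIGHT, fill=tk.Y)
    listbox = tk.Listbox(control, height=3, selectmode=tk.MULTIPLE,
                         exportselection=0)
    for i in range(6):
        listbox.insert(tk.END, "Item %d" % (i + 1))
    listbox.pack(side=tk.TOP, padx=5, pady=5)

    # The Plot button prints a message along with the selected indices.
    plot = tk.Button(control, text="Plot",
                     command=lambda: print("Plot pressed:", listbox.curselection()))
    plot.pack(side=tk.TOP, pady=5)

    # Main view.
    canvas = tk.Canvas(root, width=400, height=300)
    canvas.pack(side=tk.LEFT, expand=tk.YES, fill=tk.BOTH)

    # Bindings: button 3 on the canvas, control-o on the whole window.
    # On some macOS Tk builds the right mouse button arrives as <Button-2>.
    canvas.bind("<Button-3>", lambda event: print(click_message(event)))
    root.bind("<Control-o>", lambda event: print("open requested"))
    return root
```

To try it, call build_demo().mainloop() from a session that has a display. The status frame is packed first so that it spans the bottom of the window before the control area and canvas divide the remaining space.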
In the function that handles your third mouse button, have it create a
circular object at the mouse click location and store the object in
the self.objects list (created and initialized in the init function).
# Needs "import random" at the top of the file.
dx = 3
rgb = "#%02x%02x%02x" % (random.randint(0, 255),
                         random.randint(0, 255),
                         random.randint(0, 255))
oval = self.canvas.create_oval(event.x - dx, event.y - dx,
                               event.x + dx, event.y + dx,
                               fill=rgb, outline='')
self.objects.append(oval)
Note that colors in Tk use integers in the range [0, 255] for each color channel.
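The color string in the snippet above packs three channel values into the form "#rrggbb", two hexadecimal digits per channel. A minimal sketch of that conversion, with a range check added, might look like the following; the function names rgb_string and random_color are hypothetical helpers, not part of the skeleton.

```python
import random


def rgb_string(r, g, b):
    """Return a Tk color string "#rrggbb" for integer channels in [0, 255]."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range [0, 255]: %r" % (channel,))
    return "#%02x%02x%02x" % (r, g, b)


def random_color():
    """Return a random color string, as used for the dots above."""
    return rgb_string(random.randint(0, 255),
                      random.randint(0, 255),
                      random.randint(0, 255))
```

For example, rgb_string(255, 0, 16) gives "#ff0010", which can be passed as the fill argument of create_oval.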
In the function that handles the motion of button 1, have the function update the location of all objects in the self.objects list. You can get the current coordinates of a graphical object using the coords function:
loc = self.canvas.coords(obj)
and you can modify the coordinates of a graphic object using the same function with arguments.
self.canvas.coords( obj, loc[0] + diff[0], loc[1] + diff[1], loc[2] + diff[0], loc[3] + diff[1] )
When you are done with this set of tasks, you should be able to click around and see the effects of all of the various widgets. You should also be able to create some dots in the main window and click and drag the dots around using the first mouse button.
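One way to organize the dragging behavior is sketched below. It assumes the press handler for button 1 records the click position and the motion handler moves every object by the change since the last event. The class name Dragger, the attribute base_click, and the helper shifted are illustrative assumptions; in the skeleton these would be methods and fields of your application class.

```python
def shifted(loc, diff):
    """Return the bounding box loc = [x0, y0, x1, y1] moved by diff = (dx, dy)."""
    dx, dy = diff
    return [loc[0] + dx, loc[1] + dy, loc[2] + dx, loc[3] + dy]


class Dragger:
    """Moves a list of canvas objects as the mouse is dragged with button 1."""

    def __init__(self, canvas, objects):
        self.canvas = canvas
        self.objects = objects
        self.base_click = None  # position of the most recent button 1 event

    def handle_button1(self, event):
        # Remember where the drag started.
        self.base_click = (event.x, event.y)

    def handle_button1_motion(self, event):
        if self.base_click is None:
            return
        # Motion since the last event, applied to every stored object.
        diff = (event.x - self.base_click[0], event.y - self.base_click[1])
        for obj in self.objects:
            loc = self.canvas.coords(obj)
            self.canvas.coords(obj, *shifted(loc, diff))
        # Update the base position so the next event is relative to this one.
        self.base_click = (event.x, event.y)
```

With a real canvas, the handlers would be bound with canvas.bind("<Button-1>", dragger.handle_button1) and canvas.bind("<B1-Motion>", dragger.handle_button1_motion).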
Find a small (10) to medium (1000) size data set that has at least two
(three is better) variables per data point. For example, the data
might be high and low temperature and average wind speed for a weather
station over the course of a year. The data set should have at least
two numeric variables that make sense on a number line.
Check out several different options before choosing one. Bug your friends doing projects, find interesting projects on the web, or grab data from the World Bank. One interesting site for climate data is the Biogeoinformatics of Hexacorals.
Alternatively, you are welcome to collect your own data set. It should have at least 20 data points. Your music collection or the set of books on your bookshelf are fine examples.
- Once you have picked a data set, make sure you also get the meta-data. You will need it to answer the questions below.
- Write a short Python program to read the data into a list--usually a list of tuples--and then calculate and print the average for each variable for which it makes sense. You may want to modularize this so you can import this Python file into your application and use the I/O functionality.
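A minimal sketch of such a reader is shown below. It assumes the data is comma-separated with one header row and that every column is numeric. Your own data set may need different parsing. The function names read_rows and column_means and the three-column SAMPLE text are made-up placeholders, not real measurements.

```python
import csv

# Placeholder data in the assumed format: a header row, then numeric rows.
SAMPLE = """high,low,wind
50,30,10
60,40,20
"""


def read_rows(lines):
    """Read numeric CSV data from an iterable of text lines (an open file works).

    Returns the header names and a list of tuples of floats, one per data row.
    """
    reader = csv.reader(lines)
    header = next(reader)
    rows = [tuple(float(value) for value in row) for row in reader if row]
    return header, rows


def column_means(rows):
    """Return the average of each column in a list of equal-length tuples."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


header, rows = read_rows(SAMPLE.splitlines())
for name, mean in zip(header, column_means(rows)):
    print("%s: %.2f" % (name, mean))
```

Because read_rows accepts any iterable of lines, the same function can be imported into the GUI and called on an open file object.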
Modify the handleOpen routine of your GUI so it can read in your data
set and store it in the self.data field. The function does not have
to read in generic data sets, just your data set.
Make sure all existing graphics objects get deleted when you open a file so that it starts with a fresh window.
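Clearing the window can be done by deleting each stored canvas item and then emptying the list that tracks them. The helper below is a sketch under that assumption; the name clear_objects is hypothetical, and in the skeleton the same loop could sit at the top of handleOpen, operating on self.canvas and self.objects.

```python
def clear_objects(canvas, objects):
    """Delete every canvas item listed in objects, then empty the list in place."""
    for obj in objects:
        canvas.delete(obj)
    del objects[:]

# Inside handleOpen this might be called as:
#     clear_objects(self.canvas, self.objects)
# before reading the file and storing the result in self.data.
```

Emptying the list in place, rather than assigning a new list, keeps any other references to the same list valid.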
- Create a method, plotData, that plots two axes of the data in the self.data field. Call plotData from the handleOpen function so that it generates a 2D plot when you read in your data. Your plotData function does not have to be general; it just needs to generate a visualization of two hard-coded axes of your data. They do not need to be numeric.
- Update your handleButton1Motion function so that it translates your data. You can disable the ability of the user to add points to the set of visual objects, if you wish.
- Make a button or one of the menu commands build a histogram of one axis of your data and plot it with matplotlib/pylab. You can hard code which axes to use. There is an example of making a histogram in the skeleton code.
- Enable your program to make multiple types/styles of plots or use different axes with your data set.
- Calculate standard deviations or other features (mode/median) in addition to the means.
- Collecting your own data set, if well done, is a nice extension.
- Start the process of generalizing your data I/O. Pick a format, like CSV, and enable your system to read in any CSV file (perhaps with the restriction that it have at least two columns, for example). Note that Python's standard library includes the csv module for reading CSV files.
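For the plotData task in the list above, one possible approach is to rescale the two chosen variables so that their ranges fill the canvas, flipping y because canvas coordinates grow downward. The sketch below assumes both variables are numeric. The function name to_canvas and the margin parameter are illustrative choices, not part of the skeleton.

```python
def to_canvas(points, width, height, margin=20):
    """Map (x, y) data pairs to canvas pixel coordinates.

    The data ranges are stretched to fill the canvas inside the margin,
    and y is flipped so larger data values appear higher on the screen.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    # Fall back to a span of 1 when all values are equal, to avoid dividing by 0.
    xspan = (max(xs) - xmin) or 1.0
    yspan = (max(ys) - ymin) or 1.0
    usable_w = width - 2 * margin
    usable_h = height - 2 * margin
    pixels = []
    for x, y in points:
        px = margin + float(x - xmin) / xspan * usable_w
        py = height - margin - float(y - ymin) / yspan * usable_h
        pixels.append((px, py))
    return pixels
```

Each returned pair can then be drawn as a small oval with create_oval, in the same way as the dots created by the button 3 handler, and appended to self.objects so that dragging still works.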
For this assignment, make a wiki page, link your data to the page and answer the questions below.
- Give a one sentence description of your data.
- What is the source of the data?
- What is the dimensionality of the data?
- What is the size of the data set?
- What is the format of the data?
- What is the precision of the data? Is this the native precision or the precision of the format? Can you figure this out?
- Does your data set contain missing values? If so, how are missing values represented?
- What is the range of each variable in the data set?
- What information could you find on error in your data set? Could you define error bars for each data value?
In addition, describe the functionality of your GUI, put in at least one screen shot, and identify any extensions you did.
Make your writeup for the project a wiki page in your personal space.
Once you have written up your assignment, give the page the label:
Packages for Tk/Numpy/Scipy/Matplotlib
If you want to do this project on your own computer, you will need to install four packages: Python, Numpy, Scipy, and Matplotlib.
If you are running Snow Leopard (OSX 10.6) or Lion (OSX 10.7), then download the universal 2.7.2 Python package for OSX 10.6-10.7 from python.org. Install it first.
Then download the numpy and scipy packages from Sourceforge, which is linked from scipy.org. You want the 2.7.2 OSX 10.3 packages. Install those second.
Then download the matplotlib package from Sourceforge, which is linked from the matplotlib site. Note the link is the word 'download' on the right. You also want the 2.7.2 OSX 10.3 package. Install this last.
Since the numpy, scipy, and matplotlib packages are 32-bit code, you will need to run the 32-bit version of python. This is called python-32. In addition, you will need to tell matplotlib to use macosx as its windowing package by using the following when you import matplotlib and/or pylab.
import matplotlib as mpl
mpl.use('macosx')
import pylab
For Windows machines you can go to the same places and pick up the packages appropriate for your OS install. The key is to make sure all of the packages match.