
PSLC DataShop

Pittsburgh Science of Learning Center, Pittsburgh, PA
Roles: Programming Lead, Documentation Lead
Duration: 8 months


DataShop Prototype (SWF)
> enter any user ID / password combination
> click on "new" under "Diagrams"

The Pittsburgh Science of Learning Center is a joint initiative between Carnegie Mellon University and the University of Pittsburgh. It was founded to build a deep understanding of the learning process through the use of intelligent tutors, and at the time of this project it covered seven subjects: three languages, two sciences, algebra, and geometry.

The aim of the DataShop was to provide one consistent place where PSLC researchers could monitor their LearnLab studies and explore their data before conducting their own detailed research. A high-level exploration tool was critical since, as of Spring 2009, DataShop housed "100,000 hours of student instruction that comprised 22 mil. individual transactions between students and tutoring programs" (The Link).

The primary challenges my team faced in this project were twofold:

  • Establish a clear set of standards for the data model, and encourage researchers from both universities to adopt it.
  • Understand the visualizations that researchers use to make sense of their data and, more importantly, the paths they might take between reports as they explore.

See the Data Behind the Graph

As we watched researchers explore their data through visualizations, we quickly saw how tedious the process was. Not only was it time-intensive to clean the data and set up the visualization; the visualization itself didn't contain the actual information the researcher needed. One of our participants said of a visualization, "Wow, that's really interesting. I can't wait to find out what that means!"

Giving people the power to drill into visualizations immediately was an important facet of the DataShop. To that end, we consistently provided a "drill down" capability to the right of each visualization.

The interesting point on the Learning Curve is explained in a Point Composition report
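The drill-down idea can be sketched in a few lines: a point on an aggregate chart such as a learning curve is just a summary over raw transactions, so "seeing the data behind the graph" amounts to filtering those transactions back out. This is a minimal illustration, not DataShop's implementation; the field names (student, opportunity, outcome) are made-up stand-ins for its actual schema.

```python
from collections import defaultdict

# Illustrative transaction records; field names are hypothetical, not
# DataShop's real export columns.
transactions = [
    {"student": "s1", "opportunity": 1, "outcome": "INCORRECT"},
    {"student": "s2", "opportunity": 1, "outcome": "CORRECT"},
    {"student": "s1", "opportunity": 2, "outcome": "CORRECT"},
    {"student": "s2", "opportunity": 2, "outcome": "CORRECT"},
]

def learning_curve(rows):
    """Aggregate transactions into {opportunity: error rate} points."""
    by_opp = defaultdict(list)
    for row in rows:
        by_opp[row["opportunity"]].append(row)
    return {
        opp: sum(r["outcome"] != "CORRECT" for r in group) / len(group)
        for opp, group in sorted(by_opp.items())
    }

def drill_down(rows, opportunity):
    """Return the raw transactions behind one point on the curve."""
    return [r for r in rows if r["opportunity"] == opportunity]

curve = learning_curve(transactions)        # {1: 0.5, 2: 0.0}
behind_point = drill_down(transactions, 1)  # the two rows at opportunity 1
```

The key design point is that the aggregate and the drill-down read from the same rows, so the detail view is always one click away from any chart point.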

Live Data Filtering

Changing the variables on a visualization was also a time-consuming process. Since users wanted to ask only a limited number of questions of each visualization, we built those questions into the system so the answers were always a click away.

A report that details errors made on the step of a particular problem. Users can elect to show or hide hints and correct
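The show/hide toggles described above suggest a simple mechanism: because each report fields only a handful of questions, the filters can be one-click toggles over rows that are already loaded, rather than fresh queries. A minimal sketch, with made-up row types:

```python
# Illustrative report rows; the "kind" values (error/hint/correct) mirror
# the toggles in the caption but are hypothetical, not DataShop's schema.
rows = [
    {"step": "find-angle", "kind": "error"},
    {"step": "find-angle", "kind": "hint"},
    {"step": "find-angle", "kind": "correct"},
]

def visible_rows(rows, show_hints=True, show_correct=True):
    """Apply the report's one-click show/hide toggles to loaded rows."""
    hidden = set()
    if not show_hints:
        hidden.add("hint")
    if not show_correct:
        hidden.add("correct")
    return [r for r in rows if r["kind"] not in hidden]

# Hiding hints and correct entries leaves only the error rows.
errors_only = visible_rows(rows, show_hints=False, show_correct=False)
```

Because filtering happens over in-memory rows, toggling a question on or off updates the report immediately instead of forcing a new round of data cleaning and setup.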

The Flow of Data Exploration

After conducting numerous interviews, we had a solid understanding of how researchers approach their data. Once we had collected some common visualization techniques, we mapped the visualizations onto the analysis flow we had created.

The results were astounding — before and after our presentation, researchers crowded around the model, excited to see their process mapped onto the vision of the DataShop.

The flow between DataShop reports

Advocating for a Standard Data Model

The odds of getting professors at two universities to change their behavior and adhere to one data model were bleak. In fact, we were told not even to try. How we got these researchers buzzing about our standards exemplifies the power of good design communication.

Over the course of our interviews, we gathered everyone's favorite visualizations and compiled them into a presentation explaining the vision of the DataShop. We closed by stating that these visualizations came at a small cost, and showed the chart on the left. Because the researchers were so excited by the visualizations, they were much more amenable to adhering to our proposed data model.

The knowledge subjects studied in the PSLC crossed with whether or not their tutor has a given concept