What's the difference between a population and a sample? What is seasonal adjustment? People who generate and use statistics every day know these terms. For the rest of us, faculty and students from the School of Information and Library Science are finding ways to make interpreting statistics a little easier.
Cary C. Boshamer Professor Gary Marchionini and Associate Professor Stephanie Haas are leading a nationwide team using a three-year, $1.3 million grant from the National Science Foundation to link state and federal statistical resources and develop more user-friendly ways of presenting them online. After conducting user studies, the team manipulates scads of statistics from various agencies and incorporates them into prototype pages to show the agencies what's possible. "When the agencies create their customized applications, they can take some of these ideas," Marchionini said.
The first goal is helping people find what they need. For instance, the Census Bureau and the Bureau of Labor Statistics put a staggering amount of information on the web, but it's not always easy to sift through. "Agencies' internal organizational structures are often reflected on the web pages," Haas said. "But that's not what people care about."
To find out what users do care about, the team has conducted user studies with students at Carolina as well as with journalists, librarians, and teachers in Washington, D.C. In one set of tests at the University's Interaction Design Lab, a device tracked users' eye movements as they looked at tables and other data online. The team also has conducted interviews with ordinary users as well as with people who use statistics every day. "What do these experts know that lets them use the data successfully that is not apparent to somebody else?" Haas said.
Haas and other team members have even analyzed transaction logs from agency web pages to find out what kinds of information people search for and what problems they have. This was a tedious task, but Haas said it did provide some clues. "We're kind of guessing what they were after," Haas said. But she could see instances where, for example, a user was "desperately trying as many variations of a term as he or she could and wasn't getting anything that looked right."
Even after users find the statistics they are looking for, they still might not understand them. A big barrier is unfamiliar terms like the ones in the first paragraph of this article (see "What do those terms mean?"). The team's demos give definitions of such terms without making users search for them. For instance, one tool makes a definition pop up when a user runs the mouse over a word. Mouseovers can also be used in table and column headings. If a user runs the mouse over a column heading, the label will expand, or an explanation will pop up.
Another problem is long tables that require lots of scrolling. These can be made more user-friendly with "sticky headers" -- no matter how far down a table you scroll, the column labels always stay in view. This is a simple modification that can save time and frustration, but it gets overlooked when people simply transfer paper tables to the web, Haas said. Some of the demos will use animations and audio to make the statistics easier to grasp.
Applying these tools to large statistical databases requires a lot of computer time. In an hour-and-a-half meeting, Marchionini, Haas and their graduate students toss around enough computer acronyms to clog your mind. In their line of work, "Get the BLS data in HTML, PDF and ASCII," is an understandable sentence.
But to them it's worth it if these tools can help people find and understand data. "If someone in a small town in North Carolina is making a decision based on what the unemployment rate is likely to be or what it actually is, and they don't really understand how that number was generated, they can't make an informed decision," Haas said. "You'd like them to know what they're using."
The project will help statistical agencies develop a statistical knowledge network that seamlessly links data from local, state and national levels. Marchionini said, "This requires new data flow architectures that allow individual agencies to process and manage their data locally yet contribute it to a common system that all people can use."
The Carolina team collaborates with researchers from the University of Maryland at College Park and Syracuse University, and with representatives from federal and state statistical agencies including the Bureau of Labor Statistics, the Census Bureau, the Energy Information Administration, the Social Security Administration and the National Agricultural Statistics Service. The project's web site is at www.ils.unc.edu/govstat.
What do those terms mean?
A population is all the members of a certain group, such as all 6,060 SPA staff members at Carolina. A sample is a portion of the population -- for instance, 1,200 of those staff members. Seasonal adjustment smooths out seasonal fluctuations in data so that other changes become apparent. For example, stores regularly hire workers in December to handle holiday sales. Removing that seasonal increase reveals any change in employment levels caused by other factors.
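For readers who like to see the arithmetic, here is a minimal Python sketch of both ideas. Everything in it is an assumption made for illustration: the staff ID numbers and employment figures are made up, and the function name `seasonally_adjust` and its simple subtract-the-monthly-average method are hypothetical. Statistical agencies use more elaborate procedures than this.

```python
import random
from statistics import mean

# Population: every member of the group (here, 6,060 made-up staff ID numbers).
population = list(range(6060))

# Sample: a randomly chosen portion of the population (1,200 of those IDs).
random.seed(0)  # fixed seed so the illustration is repeatable
sample = random.sample(population, 1200)


def seasonally_adjust(values, period=12):
    """Simplified additive adjustment (illustrative only).

    For each position in the cycle (e.g. each calendar month), compute how far
    that position's average sits from the overall average, then subtract that
    amount from every value at that position.
    """
    overall = mean(values)
    seasonal = [mean(values[i::period]) - overall for i in range(period)]
    return [v - seasonal[i % period] for i, v in enumerate(values)]


# Made-up monthly employment: a December bump of 10 in both years,
# plus an underlying rise of 2 between year one and year two.
employment = [100] * 11 + [110] + [102] * 11 + [112]
adjusted = seasonally_adjust(employment)
```

In the adjusted series the December spike disappears and each year comes out flat, so the only change left to see is the rise of 2 between the first year and the second.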