Results tagged “infovis” from kwc blog

CHI Wednesday notes


Infovis: Bush vs. Clinton protocols


Gonzales' testimony before Congress yesterday was entertaining for me. It produced this sort of language in today's New York Times editorial:

Mr. Gonzales came across as a dull-witted apparatchik incapable of running one of the most important departments in the executive branch... He delegated responsibility for purging their ranks to an inexperienced and incompetent assistant who, if that's possible, was even more of a plodding apparatchik.

I do enjoy my Russian political references. War czar, anyone?

Gonzales' testimony also helped produce this infographic by Sen. Sheldon Whitehouse, D-R.I. (reported by Slate) that compares protocols in the Clinton and Bush White House for who is allowed to talk to whom in the DoJ about ongoing criminal investigations. See if you can spot the difference.

via Froomkin

The 4th Dimension in Google


Just last week at lunch, we were discussing Google Earth and MS's Virtual Earth 3D and how cool it would be once there is enough data to start adding a time slider to it all. Move the slider on Mountain View and you'd get to watch the town collapse all the way down to a stage coach stop. Move the slider over San Francisco and watch the skyline appear and the Golden Gate Bridge come into fruition.

Well, as it turns out, we were discussing a feature that is, in some ways, already there. The new Google Earth 4 comes with a time slider, which works with any timestamp data. It's not the all encompassing time machine, as it is a feature that still awaits massive amounts of data, but people have already put it to work with Hurricane Katrina, London buildings, and more.

There's also another feature they've announced that fits well with all of this: new historical map layers.

This, to me, is a critical tipping point for consumer mapping applications. Before, they could only show us the present. Now, they can show us our past, i.e. give us glimpses into our cultural memory, take a walk down Memory Lane in 3D. Now, we just need data.

Google Earth Blog: Google Earth 4th Dimension Redux

Voting aftermath


Rumsfeld Resigns as Defense Secretary After Big Election Gains for Democrats

I spent all night reloading Virginia's results, watching in panic as Allen and Webb changed places, and then calming down as Arlington and Fairfax started pushing Webb ahead. I'm not terribly happy with the California Propositions results, though I'm glad that the 'takings' eminent domain prop got voted down. I drove around in a semi-panic election morning trying to find my polling place due to a registration snafu on my part. I ended up voting in Los Altos on one of their new touchscreen machines. This time around their touchscreens have printed receipts, which was rather comforting, even if there is some bad UI design: if you check the wrong box, you can't change it by checking the correct box; you have to first press on the checked box, then check the correct box.

Old links to clear out 2005


Caltrain vis take 1


I believe a fair assessment of the new Caltrain schedule is that there are a lot more opportunities for shorter commutes, but those opportunities come at the cost of increased complexity. In addition to all the problems of what train stops where and which train is which, there's one more bit of complexity in my commute: the gaps between trains during rush hour have been increased to 50 minutes at my closest station.

The larger time gap presents a new choice: do I walk five minutes to my closest station, Menlo Park, or do I walk 15 minutes to the next closest station, Palo Alto?[1] It takes a bit of calculation to answer this question with the variety of schedules. I could slice and dice and annotate my paper schedule to answer all these questions, but that's no fun.

I decided instead to write a little Python program to visualize my options, borrowing extensively from my understanding of Visual Display of Quantitative Information. The end result reads chronologically from left to right with each red line representing a commute option:


I have grander visions for this little program, but for now I have something that I can glance over at the end of my workday. Some potential next directions:

  • nicer fonts, higher resolution for printing on paper
  • hooking this up to a Web server so others can get schedules
  • going one step further and trying a combined Caltrain, BART, N Judah visualization (Caltrain -> BART Millbrae -> Embarcadero vs. Caltrain -> 4th and King Muni -> Embarcadero)
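The Menlo Park versus Palo Alto question comes down to a small calculation: add the walking time to the moment I leave work, drop any train that has already departed by then, and sort what is left by arrival time. The following is a minimal sketch of that comparison only, not the program that drew the chart. The walking times come from this post, but the timetable entries and the names `SCHEDULE`, `WALK_MINUTES`, and `commute_options` are made up for illustration and are not the real Caltrain schedule.

```python
from datetime import datetime, timedelta

# Walking times in minutes, as described in the post.
WALK_MINUTES = {"Menlo Park": 5, "Palo Alto": 15}

# Hypothetical (departure, arrival) pairs, NOT the real Caltrain timetable.
SCHEDULE = {
    "Menlo Park": [("17:10", "17:55"), ("18:00", "18:45")],
    "Palo Alto": [("17:25", "17:58"), ("17:45", "18:20")],
}


def parse(hhmm):
    """Parse a zero-padded 24-hour 'HH:MM' string."""
    return datetime.strptime(hhmm, "%H:%M")


def commute_options(leave_work, schedule=SCHEDULE, walks=WALK_MINUTES):
    """Return catchable trains as (arrival, station, departure), earliest arrival first."""
    start = parse(leave_work)
    options = []
    for station, trains in schedule.items():
        at_platform = start + timedelta(minutes=walks[station])
        for depart, arrive in trains:
            if parse(depart) >= at_platform:
                options.append((arrive, station, depart))
    # Zero-padded 'HH:MM' strings sort chronologically within a single day.
    return sorted(options)


for arrive, station, depart in commute_options("17:08"):
    print(f"walk to {station}, depart {depart}, arrive {arrive}")
```

With this invented timetable, leaving at 17:08 means missing the 17:10 at Menlo Park, so the longer walk to Palo Alto gets me home first. That is exactly the kind of trade-off the chart is meant to show at a glance.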

[1] I could bike to Palo Alto, but the Baby Bullet that stops there has less room for bikes, which means I might not be able to board there. More complexity that I haven't modeled here.

Tag clouds are teh suck


Zeldman discusses several of the problems with tag clouds, but I thought I'd hit on a couple of more from a different viewpoint.

First, as a primer, a tag cloud (as seen on my Flickr account, but also seen on sites like (experimental) and 43things):

 700   animal   ape2005   architecture   armstrong   beach   bike   bird   blue   boulders   bridge   buddha   bunny   cacti   california   castro   cave   chaparral   child   christinethornburg   christmas   cliff   condor   contrail   cute   cycling   deyoung   ekimov   endangered   evil   flight   flower   football   gate   gehry   getty   goldengatepark   green   halloween   herzog   house   incredibles   iris   japanesemaple   japaneseteagarden   lamb   lancearmstrong   landscape   leaves   licenseplate   lights   lizard   losangeles   maple   metaldetector   meuron   momiji   moon   morganhill   mountain   nationalpark   nerd   orange   pagoda   paulmccartney   peligro   pinnacles   pipes   rabbit   race   railing   red   richardmeier   robonexus   rock   sanfrancisco   sanfransciscograndprix   santamonica   sfmoma   sidewalk   sign   silhouette   sonoma   spiderman   spire   spires   stonelantern   stones   sunset   tattoo   teagarden   tmobile   tonybennett   tree 

Tag clouds follow a very basic principle: the font size of the word scales linearly with the number of times the tag has been used.

At first glance, there appear to be several things right with this sort of display. You can see, for example, that I have a ton of photos tagged "Richard Meier", and that I have a lot more "architecture" photos than "ape2005" photos. IMHO, however, this is all fluff -- it has the appearance of being a statistical visualization but instead conveys information crudely and inaccurately. For example, for each of these pairs, answer the question, "Which do I have more photos tagged with?"

  • japaneseteagarden or goldengatepark?
  • richardmeier or architecture?
  • sanfranciscograndprix or house?

With close examination you will probably get these right, but my point is that it takes a bit of thought (and you have the chance of getting it wrong). One of the fundamental problems is that the "tag cloud" display is using the size of the word to convey how many tags are associated with it. However, the size of the word is related to (a) the number of characters in the word (sanfranciscograndprix vs. house) and (b) the font size of the word, which grows in two-dimensions. Instead of trying to convey:

size of word ~= (# of tagged items)

we instead have the relation

size of word ~= (# of tagged items * length of word)^2

So as a statistical display, it's bunk -- appearing to help you understand relative tag distribution, but not in an accurate manner.
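To put rough numbers on the distortion, here is a minimal Python sketch. Everything in it is an assumption for illustration: the 10 to 40 point font range, the 0.6 character width-to-height ratio, the function names, and the tag counts, which are invented and are not my real Flickr counts or Flickr's actual sizing rules. It models the footprint of a word as width times height, which is one simple way to capture both effects (word length and two-dimensional font growth).

```python
def font_size(count, max_count, min_pt=10.0, max_pt=40.0):
    """Linear scaling: font size grows in proportion to tag count."""
    return min_pt + (max_pt - min_pt) * count / max_count


def rendered_area(tag, count, max_count, char_aspect=0.6):
    """Approximate footprint of a tag on the page: width times height.

    Assumes each character is about char_aspect * font size wide,
    so area ~ len(tag) * font_size ** 2.
    """
    size = font_size(count, max_count)
    width = len(tag) * char_aspect * size
    return width * size


# Hypothetical counts, chosen only to illustrate the effect.
counts = {"house": 20, "sanfranciscograndprix": 20, "architecture": 40}
max_count = max(counts.values())
areas = {tag: rendered_area(tag, n, max_count) for tag, n in counts.items()}

for tag, area in sorted(areas.items(), key=lambda kv: -kv[1]):
    print(f"{tag:24s} count={counts[tag]:3d} area={area:8.1f}")
```

In this toy example "house" and "sanfranciscograndprix" have identical counts and identical font sizes, yet the longer tag covers roughly four times the area simply because it has 21 characters instead of 5.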

Aesthetically, in order to try and convey this pseudo-statistical information, it completely throws the list out-of-whack: lines grow to arbitrary heights, one's ability to scan quickly across the entire list is lost, large words are constantly drawing your attention from smaller words, etc..., and, to borrow from Zeldman, navigation skews towards popularity rather than findability.

The fact that "richardmeier" is one of my most prominent tags entirely relates to the fact that (a) I took a ton of photos of the Getty one day, and (b) I was testing out my new Flickr Pro upload limits. They are not my "best" category of photos, I don't frequently take "richardmeier" photos, and they are not the photos I most want people to see. But the tag cloud design dictates that visitors will forever feel "richardmeier"'s gravitational force (that is, until I go crazy with another photo upload).

My own tag/category display could use some work, but I offer it here as a comparison (feel free to critique in the comments):


Budget stat visualizations


John Maeda has posted his "Money Counter," which visually compares different Federal budget expenditures. With my own caveat that statistics can lie or be overly selective, feel free to check it out. The data is based on Parade magazine's "Where does your tax money go?" article. The JavaScript did weird things to my Firefox (don't follow these links until you've bookmarked all your tabs), including rendering the pull-down menu for selecting different comparisons inoperable, so I'll link to the comparisons available individually:

mac world, mac office, iLife


I think I've discovered my mutant superpower. I've long pondered this after we discovered honeyfield's ability, which is the power to speak to anyone, including extreme geeks (artisty/gamer/programmer), for extended periods of time; hers is a very useful power to have at Comic-Con.

My power, depending on your allegiances, either qualifies me as a superhero or supervillain. Without saying what my power is specifically, I will present evidence rendered in crude infovis.

MacWorld (data you have provided in comments, as well as macs at work not in my immediate vicinity):


Mac Office ('k' = me):


iLife (Macs that have had direct, frequent contact with me [metamanda, honeyfields, d, parakkum, ln m, pqbon]):


I think I'll make frequent trips to the Apple Store to see if I can focus my powers...

Maneesh Agrawala, Julie Heiser, and Barbara Tversky Tutorial session at AAAI

Two implemented systems explored for automated design of visualizations: map routes and assembly instructions. Map routes system (LineDrive) used by MapBlast (now

Three parts to talk: cog sci/CS background, map routes, assembly instructions.

Tufte's sparklines


Tufte has posted some of the material from his upcoming Beautiful Evidence book that covers his concept of 'sparkline.' Sparklines are essentially tiny graphics that can convey trends very quickly, and in some cases they can provide very specific data. They aren't earth-shattering -- most of them look like shrunken versions of familiar information graphics such as stock graphs -- but the idea that it is now very easy to embed such high-information-density graphics directly into our text is a good proposition.
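To show how little it takes to drop a word-sized graphic into running text, here is a minimal Python sketch. It is a crude approximation built from Unicode block characters, not Tufte's own rendering or anything from his book, and the `sparkline` function name is my own.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight block characters by its position in the data range."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat data: draw every point at the lowest bar.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


print("trend:", sparkline([3, 5, 4, 8, 9, 7, 12, 10]))
```

The result is an ordinary string, so it can sit inline in a sentence or a table cell, which is the whole appeal of the idea.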

Tufte briefly covered these when I saw him speak, but it's nice to see his written text, which I prefer to his speaking.

Edward Tufte: Sparklines

Bad Statistics III


Bad Statistics II


Here's one from March 25. It was really well publicized in the Slashdot crowd, so you may have already seen it.

Adobe Bad Statistical Graph

You don't need me to tell you that Edward Tufte's Visual Display of Quantitative Information (amazon) is a good book. I've posted my outline notes below (mostly for my own benefit, as this is useless without Tufte's pretty [and ugly] examples).

I hate powerpoint


I found this great parody of the drain of powerpoint: Gettysburg Address in Powerpoint. There's also Tufte's analysis of Boeing's report on Columbia Tile Damage. Powerpoint really takes the crown for robbing the corporate coffers of brain cells. I never thought I would say this, but I miss good ole overhead transparencies that cost $2/piece to make. At least once they were made no one would even dare suggest using a synonym for the second word in bullet four on slide three.

Credit: IDblog (entry) for the link to Dan Brown's Understanding Powerpoint: Special Deliverables #5, which provided the GA link.

BTW: Tufte uses ArsDigita Community System. Good stuff, even if you don't like Greenspun.

Bad Statistics


This graph has at least four things wrong with it. Can you spot them all? (Note: I previously said three, but I was lumping two together.)

03-13-03.US Airways On-time Graph.Statistics.jpg

Additional info: The graph is based on DC-NY-Boston routes by the respective companies. The ** next to the Delta Shuttle refers to the fact that they're only counting DC-NY routes for the Delta Shuttle "statistics."