Tuesday, December 21, 2010

First-Digit Law and Google Ngram


The first-digit law [Benford's Law] describes how the leading digit in count data tends to over-represent "1" and, to a decreasing extent, "2", "3", and so on, with each digit less common than the one before.  Google Ngram counts of how often numbers appear in its book corpus show a similar trend, which is interesting because these values arise from such heterogeneous sources.
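
For reference, the first-digit law puts the expected share of leading digit d at log10(1 + 1/d).  Here is a minimal sketch of that calculation; the function name is my own, and nothing here touches the Ngram data itself.

```python
import math

def benford_expected(digits=range(1, 10)):
    """Expected share of each leading digit under the first-digit law."""
    # P(d) = log10(1 + 1/d), for d = 1..9
    return {d: math.log10(1 + 1 / d) for d in digits}

expected = benford_expected()
for digit, share in expected.items():
    print(f"{digit}: {share:.3f}")
```

Under the law, "1" leads about 30% of the time and "9" under 5%, which is the shape to look for in the Ngram curves.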

I ran the same set for the hundreds, and the results are similar, although "800" behaves differently than expected.  One possible explanation is that the surplus "800"s come from 1-800 phone numbers.   Running the same query but substituting "101" for "100", and so on, eliminates the 800 bulge that starts in the 1980s.
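
One way to put a number on a bulge like this is to divide each leading digit's observed share by its first-digit-law share and look for ratios well above 1.  The sketch below does that on synthetic counts only: I build counts that match the law exactly and then double the "8" count by hand.  These are not Ngram figures, and the 1.5 cutoff and the function names are arbitrary choices of mine.

```python
import math

def benford_share(d):
    """Expected share of leading digit d under the first-digit law."""
    return math.log10(1 + 1 / d)

def excess_ratio(counts):
    """Observed share divided by expected share, per leading digit."""
    total = sum(counts.values())
    return {d: (c / total) / benford_share(d) for d, c in counts.items()}

# Synthetic counts (NOT real Ngram data): law-proportioned, then "8" doubled.
synthetic = {d: 10000 * benford_share(d) for d in range(1, 10)}
synthetic[8] *= 2

ratios = excess_ratio(synthetic)
# Flag digits whose share is more than 1.5x what the law predicts (arbitrary cutoff).
flagged = [d for d, r in ratios.items() if r > 1.5]
print(flagged)
```

With real counts for "100" through "900" in a given year, the same ratio would show whether "800" stands out from its neighbors, assuming the nine values really do split up the way the law predicts.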


It is exciting to think about the potential to ask more socially interesting questions of this data.   Note that I stopped the graph at the default (2000).  Although the data set extends to 2008, it seems that data must be missing after 2000, because many values that should not drop in concert do so.

Friday, December 3, 2010

It takes a digital metropolis to create a belly teddy bear. . . .


It may take a village to raise a child, but it takes a digital metropolis to create a belly teddy bear.  I posted briefly about this before, but there is much more to the story than it might seem at first.


My daughter knows the alien cartoon logo from reddit as the "Belly Teddy Bear."  The cute little image adorns the header, and occasionally appears in advertisement space for the site, which I read pretty regularly in the evenings.   Sydney was and remains struck by how cute the image is.   I hunted for a plush version on sale from reddit, but I quickly learned that no such toy existed.  Mass production was out.  I gave up on the gift idea.  But she kept remarking on the "belly teddy bear," so I tried to think creatively.  Lacking the necessary skills, and friends who had them, I had to think outside the constraints of my geographic village and my personal social network.


I learned about Etsy from a student in my group processes seminar and searched there until I found someone [Ning Ning Gong] who made knit stuffed animals of her own design (among other things).  I contacted her, we discussed the design, and she agreed to make the "belly teddy bear".   In fact, she worked on it right away and shipped it early, which made Sydney very happy on her birthday.


So, via email alerts and messages on a webpage, I contracted the production of a unique, handcrafted gift for $28 plus shipping with a person I had never met, who lived in another country, and with whom I was not likely to interact again.

How did the digital metropolis make this all possible?  In other words, what digital and social infrastructure did we rely on?  At the very least, we needed:


  • Secure monetary transactions at a distance:  PayPal
  • Efficient distributed web hosting for craft producers:   Etsy
  • Source for stylized but not fully commercialized image:  reddit

Yet these three examples are only the tips of branches in the infrastructure.  They, in turn, rely upon the lowered transaction costs, ease of search, and distribution of information that are generic to the internet.  We also needed the communities and commercial entities that drove the creation of the behind-the-scenes tools and data management systems that make the digital systems we use possible.   We needed monetization and we needed free information.  We needed property rights, contracts, and we needed a bit of trust.  We needed digital cameras, structured query language, and free and open source software.  We needed online community, reputation systems, and socially based information aggregation.



In "The Internet? Bah!"  Clifford Stoll (1995) famously missed the mark on the potential for the internet to change our lives and alter the way we do things.   Rather than consider the ways that he has been proven wrong by developments in the last 15 years, I would like to consider how we can study current capacities, events, and changes in light of the trends we already see developing.   What does the production of "the belly teddy bear" reveal?  How will that thread of the future contribute to the fabric of everyday life that we will take for granted in the next 15 years?