Monday, September 28, 2009

Network example and a Free Project Idea

First, if you are still thinking about networks from last week, take a look at this site. I’ve always thought that the interface and visualization are pretty clean. Additionally, even though it hasn’t been updated since 2004, if you are a conspiracy theorist you will find the content fairly compelling.

More importantly, though I’ve sort of gone my own way on my individual project, I wanted to make a suggestion that I think would be pretty neat for anyone having trouble thinking of something. If any other individual or group wants to take this project up, it is all theirs.

Most introductory stats classes start with a fun little exercise about the probability that someone in the room shares your birthday. The Ask Dr. Math forum has a pretty good description and analysis of the problem here. However, the Dr. Math forum looks like its clip art was designed by a second grader, so for a more compelling graph of how the probability of sharing a birthday changes with group size, I turn your attention here.
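For anyone who wants to reproduce that curve, here is a minimal Python sketch of the textbook calculation. It assumes 365 equally likely birthdays and ignores leap days; the function names are my own. It also separates two questions that tend to get blurred together: whether any two people in the group share a birthday, and whether someone shares yours in particular.

```python
def any_shared_birthday(group_size, days=365):
    """Probability that at least two people in the group share a birthday,
    assuming every day is equally likely."""
    if group_size > days:
        return 1.0  # pigeonhole: more people than days
    p_all_distinct = 1.0
    for k in range(group_size):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def someone_shares_mine(group_size, days=365):
    """Probability that at least one of the other people in the group
    has the same birthday as you (you are counted in group_size)."""
    others = max(group_size - 1, 0)
    return 1.0 - ((days - 1) / days) ** others


if __name__ == "__main__":
    for n in (10, 23, 50, 100):
        print(n, round(any_shared_birthday(n), 3), round(someone_shares_mine(n), 3))
```

The first probability passes 50% at a group of 23, while the second grows much more slowly, which is why the exercise surprises people.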

The Wikipedia article on the birthday problem, without an actual citation, accurately notes that one of the fundamental assumptions in most computations of the birthday problem, that birthdays are uniformly distributed, is false. An example of non-uniformly distributed birthday data is found for years 1978-1987 here.
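To get a feel for how much non-uniformity matters, a quick simulation is probably easier than the algebra. The sketch below is an illustration only: the "skewed" weights are invented (a 20% bump for roughly three months of the year) and are not taken from the 1978-1987 data. Real daily birth counts could be dropped in as the weights instead.

```python
import itertools
import random


def simulate_shared_birthday(group_size, weights, trials=20000, seed=0):
    """Estimate the chance that at least two people in a group share a
    birthday when day i is drawn with probability proportional to weights[i]."""
    rng = random.Random(seed)
    days = range(len(weights))
    cum_weights = list(itertools.accumulate(weights))
    hits = 0
    for _ in range(trials):
        birthdays = rng.choices(days, cum_weights=cum_weights, k=group_size)
        if len(set(birthdays)) < group_size:  # a repeated day means a match
            hits += 1
    return hits / trials


uniform = [1.0] * 365
# Hypothetical skew, made up for illustration: days 182-272 are 20% more likely.
skewed = [1.2 if 182 <= d < 273 else 1.0 for d in range(365)]

if __name__ == "__main__":
    print("uniform:", simulate_shared_birthday(23, uniform))
    print("skewed: ", simulate_shared_birthday(23, skewed))
```

In theory any departure from a uniform distribution can only raise the chance of a match, but with a mild skew like this one the difference is small and may be hard to see through the simulation noise.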

An implied conclusion of all of this is that birthdays depend on other real world events that occurred nine months prior to any given date. These real world events, such as seasonal variations in temperature, the admission practices of hospitals, or a particularly long power outage in a portion of rural Connecticut, are likely to be correlated with social characteristics in addition to birth dates. For example, the likelihood that individuals meet and/or form friendships is probably higher in that town in Connecticut.

Sample Hypothesis:
There will be a difference in birthday distributions across different individuals’ social networks.

Facebook provides an amazing opportunity to combine social network data with birthday data. While I haven’t seen anything like this, I am sure there is some application that aggregates your friends’ birthdays. Get a couple of people in a small group to do this and you have an interesting representation of both a larger population sample and individual social network samples. Some graphs and a couple of statistical tests later (maybe chi-squared on counts by month?) and you have a really interesting project.
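As a rough sketch of what the chi-squared step might look like, the Python below bins a list of birthdays by month and compares the counts against a baseline in which every day of the year is equally likely. Everything in it is an assumption for illustration: the function names are mine, the sample counts are made up, and the baseline could just as well be the pooled proportions from the whole group. The critical value 19.675 is the standard 5% cutoff for 11 degrees of freedom.

```python
from collections import Counter
from datetime import date

# Average days per month (February counted as 28.25 to allow for leap years).
DAYS_IN_MONTH = [31, 28.25, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
CRITICAL_VALUE_DF11_P05 = 19.675  # chi-squared cutoff, 11 d.o.f., alpha = 0.05


def monthly_counts(birthdays):
    """Turn a list of datetime.date objects into 12 counts, January first."""
    counts = Counter(b.month for b in birthdays)
    return [counts.get(month, 0) for month in range(1, 13)]


def chi_squared_by_month(observed):
    """Goodness-of-fit statistic against 'every day is equally likely'."""
    total = sum(observed)
    year = sum(DAYS_IN_MONTH)
    expected = [total * d / year for d in DAYS_IN_MONTH]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Made-up monthly counts standing in for one person's friend list.
    friends = [12, 9, 14, 10, 11, 13, 18, 20, 17, 12, 8, 9]
    stat = chi_squared_by_month(friends)
    print("chi-squared:", round(stat, 2))
    print("differs from baseline at 5%:", stat > CRITICAL_VALUE_DF11_P05)
```

The usual rule of thumb is that each expected monthly count should be at least 5, so a friend list of a hundred or so birthdays is about the minimum for this to mean much.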

Even if no one does this, I would be really interested in hearing what you have to say about this kind of hypothesis. Do you think that the friends you make are more or less likely to share your birthday based on other physical and sociological features?