Maven7 Blog: September 2012

Friday 21 September 2012

The Paradox Of Friendship – Why do our friends have more friends than we do?

What may look like a psychological phenomenon is actually basic maths.

In a colossal study of Facebook, Johan Ugander, Brian Karrer, Lars Backstrom and Cameron Marlow examined all of Facebook’s active users, which at the time included 721 million people (about 10 percent of the world’s population) with 69 billion friendships among them. They found that a user’s friend count was less than the average friend count of his or her friends 93 percent of the time. Next, they measured averages across Facebook as a whole and found that users had an average of 190 friends, while their friends averaged 635 friends of their own.

Studies of offline social networks show the same trend. It has nothing to do with personalities; it follows from basic arithmetic. For any network where some people have more friends than others, it’s a theorem that the average number of friends of friends is always greater than the average number of friends of individuals.

This phenomenon has been called the friendship paradox. Its explanation hinges on a numerical pattern, a particular kind of “weighted average”, that comes up in many other situations. Understanding that pattern will help you feel better about some of life’s little annoyances.

In this hypothetical example, Ross, Chandler, Phoebe and Rachel are four friends. Lines signify reciprocal friendships between them; two people are connected if they’ve named each other as friends.

Ross’s only friend is Chandler, a social butterfly who is friends with everyone. Phoebe and Rachel are friends with each other and with Chandler. So Ross has 1 friend, Chandler has 3, Phoebe has 2 and Rachel has 2. That adds up to 8 friends in total, and since there are 4 people, the average friend count is 2 friends per person. This average, 2, represents the “average number of friends of individuals” in the statement of the friendship paradox. Remember, the paradox asserts that this number is smaller than the “average number of friends of friends”. But is it? Part of what makes this question so dizzying is its sing-song language. Repeatedly saying, writing, or thinking about “friends of friends” can easily provoke nausea. So to avoid that, I’ll define a friend’s “score” to be the number of friends he or she has. Then the question becomes: What’s the average score of all the friends in the network?

Imagine each person calling out the scores of his/her friends. Meanwhile an accountant waits nearby to compute the average of these scores.

Ross: “Chandler has a score of 3.”

Chandler: “Ross has a score of 1. Phoebe has 2. Rachel has 2.”

Phoebe: “Chandler has 3. Rachel has 2.”

Rachel: “Chandler has 3. Phoebe has 2.”

These scores add up to 3 + 1 + 2 + 2 + 3 + 2 + 3 + 2, which equals 18. Since 8 scores were called out, the average score is 18 divided by 8, which equals 2.25.

Notice that 2.25 is greater than 2. The friends on average do have a higher score than the individuals themselves. That’s what the friendship paradox said would happen.
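The call-out procedure above can be reproduced in a few lines of Python. This is only a minimal sketch of the four-person example: the function name friend_averages and the variable names are made up for illustration and do not come from the original study.

```python
# Reciprocal friendships from the example; each pair is one line in the network.
friendships = [
    ("Ross", "Chandler"),
    ("Chandler", "Phoebe"),
    ("Chandler", "Rachel"),
    ("Phoebe", "Rachel"),
]


def friend_averages(pairs):
    """Return (average score of individuals, average score of friends)."""
    friends = {}
    for a, b in pairs:
        # Friendship is mutual, so record it in both directions.
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)

    # A person's "score" is the number of friends he or she has.
    score = {person: len(others) for person, others in friends.items()}

    # Plain average: total score divided by the number of people.
    individual_avg = sum(score.values()) / len(score)

    # Each person calls out the score of every one of his or her friends.
    called_out = [score[f] for others in friends.values() for f in others]
    friend_avg = sum(called_out) / len(called_out)
    return individual_avg, friend_avg


individual_avg, friend_avg = friend_averages(friendships)
print(individual_avg, friend_avg)  # 2.0 2.25
```

Chandler's score of 3 ends up in the called_out list three times, once for each person who names him, which is where the extra weight comes from.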

The key point is why this happens. It’s because popular friends like Chandler contribute disproportionately to the average, since besides having a high score, they’re also named as friends more frequently. Watch how this plays out in the sum that became 18 above: Ross was mentioned once, since he has a score of 1 (there was only 1 friend to call his name), and therefore he contributes a total of 1 x 1 to the sum; Chandler was mentioned 3 times because he has a score of 3, so he contributes 3 x 3; Phoebe and Rachel were each mentioned twice and contribute 2 each time, thus adding 2 x 2 apiece to the sum. Hence the total score of the friends is (1 x 1) + (3 x 3) + (2 x 2) + (2 x 2), and the corresponding average score is that total divided by the 8 mentions: (1 x 1 + 3 x 3 + 2 x 2 + 2 x 2) / (1 + 3 + 2 + 2) = 18 / 8 = 2.25.

Each individual’s score is multiplied by itself before being summed. In other words, the scores are squared before they’re added. That squaring operation gives extra weight to the largest numbers (like Chandler’s 3 in the example above) and thereby tilts the weighted average upward.

So that’s intuitively why friends have more friends, on average, than individuals do. The friends’ average — a weighted average boosted upward by the big squared terms — always beats the individuals’ average, which isn’t weighted in this way.
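For readers who like symbols, the same weighted average can be written for a general network. The notation is introduced here purely for illustration and is not taken from the article: d_i is the score of person i, n is the number of people, μ is the plain average of the scores and σ² is their variance.

```latex
\[
\frac{\sum_{i=1}^{n} d_i^{2}}{\sum_{i=1}^{n} d_i}
  = \frac{\mu^{2} + \sigma^{2}}{\mu}
  = \mu + \frac{\sigma^{2}}{\mu},
\qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} d_i,
\quad
\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n} \left(d_i - \mu\right)^{2}.
\]
```

In the four-person example μ = 2 and σ² = 0.5, so the friends' average is 2 + 0.5 / 2 = 2.25. The extra term σ² / μ is positive whenever some people have more friends than others, which is the condition the paradox needs.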

Like many of math’s beautiful ideas, the friendship paradox has led to exciting practical applications unforeseen by its discoverers. It recently inspired an early-warning system for detecting outbreaks of infectious diseases. In a study conducted at Harvard during the H1N1 flu pandemic of 2009, the network scientists Nicholas Christakis and James Fowler monitored the flu status of a large cohort of randomly chosen undergraduates and found that people with more connections were infected faster.

For more analogies check out the whole article at a New York Times blog.

Monday 17 September 2012

In the Mist of Drugs

A study from India takes a closer look at what our medicine cabinet is made of, with the help of network analysis.

It is well known that the demand for medicine increases from year to year (the market grows by about 6% annually!). The industry earns about 800 billion dollars per year, with India and China as the fastest-growing markets, where demand increases by over 15% annually. The top consumers are of course overseas. The Americans, with their 320 billion dollar annual drug spending, are responsible for more than one third of the industry’s income, a sum about three times larger than Germany’s. It’s hardly a coincidence that the number of prescription drug abuse victims is growing as well. Last year alone, about 27,000 people died from causes related to prescription medicine, one every 19 minutes. Livestock drugs are pretty common too, since factory farming procedures require the use of antibiotics on animals.

The goal of the research was to understand drug consumption from a network point of view, and to learn what drugs consist of. American drug label databases served as sources of information, making over 70 thousand chemicals subjects of the analysis.

The picture above shows the whole network of ingredients, with 16,444 dots and 32,627 edges. You can notice at first sight that clustering is present. The most common chemicals include Octinoxate, Titanium dioxide, Octisalate, Oxybenzone and Avobenzone, which are ingredients in drugs and chemicals, and sometimes even in food colouring materials. Another central point is Triclosan, a commonly used antibacterial and antifungal chemical.

Alcohol is number 3 in the centrality top 10. For more cool pictures and the top 10, check out the original article at Web 2.0.

Wednesday 12 September 2012

The Formula Of Doom

A recent interdisciplinary study shows how food poisoning might be the end of us all.

The collaboration, which included the University of Notre Dame and the Budapest Corvinus University, took a closer look at a darker future, with a serious methodological background. With Earth’s population exceeding 7 billion people, sustainable and safe food raises some serious concerns. The high demand for nutrition has turned the world food trade into a very complex system, with seven countries in central positions and the ability to reach 77% of the planet’s population on an everyday basis. But there is a serious price to be paid for stuffed grocery shelves: risk. The door is open not only for goods and services, but for infections as well. A massive food poisoning epidemic, like the Escherichia coli outbreak in Germany last year, could do serious damage and claim human lives.

The United Nations has monitored food trade since the sixties, focusing on the networks, qualities, and trends of the goods being transferred. An interesting development of the past decades is that the amount of food transferred is now larger than production itself. The main exports have shifted from raw agricultural materials to processed and branded foods. The research itself used a 2007 UN database as a source. The density of the network increased by 33% in the last ten years, and its most vulnerable parts are the dots (countries) in the centre with the most edges (connections). Through these countries, pathogens could spread widely within a few days, reaching millions and making it virtually impossible to locate the source of an infection (in the case of Germany, it took 3 weeks). Surprisingly, the most vulnerable dot was not an agricultural giant like the USA, but the Netherlands (based on per capita trade activity). Other weak links are the seven giants, including the USA, Germany, France, Italy, China and Spain.

For those of you interested in the numbers, the research was based on graph theory and used factors like consumption, population and production figures to build a dynamic model that ranks the danger level of individual countries. We already mentioned that the Netherlands came out on top. The researchers also calculated how fast a pathogen could spread in a country, and how vulnerable each country is.

For more figures and numbers check out the original article.

Friday 7 September 2012

The Cinephile’s Guide to The Galaxy

Jermain Kaminski and Michael Schober’s blog Movie Galaxies offers a quantitative analysis of popular films, drawing the social structure of each subject.

Every screenwriter uses a unique narrative structure in storytelling the same way the audience chooses a character they sympathize with while watching a movie. As previously seen in our X-Men article, fictional social structures share features with real-life ones.

In terms of network analysis, the density of a social structure has a strong impact on how a story unfolds. The definition of network density is the proportion of edges in a network relative to the total number of possible edges. The term – also used in sociology – shows how much an individual identifies with the group or people surrounding him/her, and is an indicator of social capital as well. But what does this have to do with movies?

Similar to sociology, narratology places strong emphasis on group membership, and on the social and behavioral patterns that keep these companionships intact. In a high-density group most of the members are in constant contact with one another, the same way they are in a classic romantic flick. In contrast, a movie with a larger cast, like The Lord of the Rings trilogy, has a lower level of density. This makes networks in the romantic genre smaller, with larger dots and stronger edges. The story usually revolves around the main conflict between the female and male leads, which plays out directly or through confessions to each party’s closest friends.
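To make the density definition concrete, here is a rough Python sketch that divides the number of edges present by the number of possible edges in an undirected network. Both casts are invented for illustration and are not data from Movie Galaxies; the function name network_density is likewise an assumption.

```python
def network_density(nodes, edges):
    """Proportion of possible undirected edges that are actually present."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return 0.0
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset((a, b)) for a, b in edges if a != b}
    possible_edges = len(nodes) * (len(nodes) - 1) / 2
    return len(unique_edges) / possible_edges


# Hypothetical romantic film: two leads and their closest friends,
# almost everyone talks to everyone.
romance_cast = ["lead_a", "lead_b", "friend_a", "friend_b"]
romance_edges = [
    ("lead_a", "lead_b"),
    ("lead_a", "friend_a"),
    ("lead_b", "friend_b"),
    ("friend_a", "friend_b"),
    ("lead_a", "friend_b"),
]

# Hypothetical ensemble film: eight characters who only meet in a chain.
ensemble_cast = [f"character_{i}" for i in range(1, 9)]
ensemble_edges = list(zip(ensemble_cast, ensemble_cast[1:]))

print(round(network_density(romance_cast, romance_edges), 2))  # 0.83
print(network_density(ensemble_cast, ensemble_edges))          # 0.25
```

The small cast uses 5 of its 6 possible edges, while the chain of eight uses only 7 of 28, which matches the contrast between a romantic flick and a large-cast epic described above.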

The following picture represents an interesting aspect of the research:

The infographic shows how various directors deal with their characters as narrative time passes, and how their direction affects the density of the social network within the movie. Oliver Stone and Steven Spielberg obviously like to resolve their conflicts by the end of the movie, eliminating irrelevant storylines and characters, thereby increasing the density of the network. Quentin Tarantino and David Lynch, however, like to confuse their viewers even more by adding some extra storylines around the halfway point, lowering density.

The site offers various movie social networks with films like 2001: A Space Odyssey, Twin Peaks, Pulp Fiction and many more. The second picture shows Paul Thomas Anderson’s epic, Magnolia. Those of you who have seen it know that its storytelling uses the colliding storyline technique that was made popular by Thornton Wilder, who first connected seemingly unrelated storylines in The Bridge Of San Luis Rey, and that has since been mastered by directors like Akira Kurosawa or Alejandro González Iñárritu in Babel. It is instantly obvious that the various storylines and their characters form clusters, each connected by a single edge.

For more movie networks, check out the site.