Simply the best?

Social Network Analysis (SNA) dates back over fifty years. The serious side of this work is that a network representation is a great simplification of complex social interactions, but hopefully one which captures a key aspect: the bilateral relationships. Mathematical measures can then be applied to the tangle of connections in order to reveal key features. One important use of SNA is to identify who is the most important person. Defining what is meant by “important” is a crucial part of the question, with many different answers for different contexts.

However it also means we can have a little fun. In a frivolous mode we can just apply the various tools SNA provides (PageRank, betweenness and so forth) to produce instant answers to our favourite collection of individuals.
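To make the idea of "instant answers" concrete, here is a minimal sketch of two such measures, in-degree and a simple power-iteration PageRank, using only the Python standard library. The names and links are made up for illustration, and the damping factor of 0.85 is the conventional choice rather than anything taken from a particular study.

```python
# Hypothetical toy data: each pair (a, b) means "the page for a links to the page for b".
links = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("dave", "bob"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]


def in_degree(edges):
    """Count incoming links for every node that appears in the edge list."""
    counts = {}
    for source, target in edges:
        counts.setdefault(source, 0)  # nodes with no incoming links still appear
        counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(edges, damping=0.85, iterations=100):
    """Simple power-iteration PageRank on a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A node with no outgoing links spreads its rank over all nodes.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


degrees = in_degree(links)
ranks = pagerank(links)
for name in sorted(degrees, key=degrees.get, reverse=True):
    print(name, degrees[name], round(ranks[name], 3))
```

Ranking by in-degree simply counts who is linked to most often, while PageRank also weights each link by the importance of the page it comes from, so the two orderings need not agree on larger networks.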

A recent study of biographies on Wikipedia (Biographical Social Networks on Wikipedia – A cross-cultural study of links that made history) was based on publicly available data from DBpedia. There is a serious point to the paper, which contains many interesting conclusions, especially about the relationships between different languages. However at a trivial level I couldn’t help but be drawn to a table of the “top 25 persons”, which is given below. The top 5 were (in descending order by in-degree) George W. Bush, Barack Obama, Bill Clinton, Ronald Reagan and then Adolf Hitler. Of course this tells us as much about who is using the English Wikipedia system as anything else. In fact looking at the top 25 shows most are recent US presidents. It’s those who are not who caught my eye.

First, in this age of celebrity, there are very few modern entertainers: just Elvis Presley, Frank Sinatra, Bob Dylan and Michael Jackson (again in descending order). They are, though, all American.

At least us Brits are the second most popular nation as we get William Shakespeare, Winston Churchill and the Queen. In fact it is probably the continuing obsession with the Second World War in America and Britain, and not their tyrannical nature, that sees two other foreigners make it into the list, with Hitler and Stalin appearing along with Churchill.

Religion makes a surprisingly small contribution given the US bias. We only have Pope John Paul II and Jesus as top 25 sort of guys, though the last Pope is higher than his boss. Still, while Jesus is near the bottom, that puts him above any of the Beatles, ‘disproving’ John Lennon’s famous quip.

I also thought it was interesting that anything before  1900 is clearly ancient history.  Lincoln is the only 19th century representative while Shakespeare and Jesus are the only contributions from earlier times.

Perhaps the saddest comment on our times is the near absence of women. I’m not sure if the Queen is there because of what she has done with the job or because of the fact she is head of state. The latter reason is just a matter of chance, although even there her sex made it harder to be in the position (male heirs are preferred over female heirs). The only other woman, at number 25 in fact, is Hillary Clinton.

The top 25 persons in the English Wikipedia ranked by in-degree. Taken from Table 2 of arXiv:1204.3799.

Open access academic publishing

Open access academic publishing made it into a UK national newspaper as an article (“Government welcomes calls to open up science”, The Guardian, 11 April 2012), as an editorial (“An open and shut case”, 11 April) and in its letters column (“Better models for open access”, 15 April). Yet for a working model of open and low cost academic publishing, arxiv.org has provided a successful example for over twenty years. Authors place articles for free, articles are open for all to read and index, while the minimal costs are covered directly by research funding agencies. Readers decide what is worth a look, posterity decides what is good.

My experience with arXiv (the X is meant to represent the Greek letter chi so that this is pronounced like archive) is that it provides me with everything I need. Why pay for the editorial staff when I provide journals with camera ready copy? Why pay for paper copies when no one uses them? The arXiv brand is now better known in theoretical physics than any single journal. There are no referees for my arXiv articles but I find most referee reports of limited use. Instead I provide input to other authors electronically when I think I have something constructive to say, for which I am paid the same as my reports for journals, i.e. zilch. Google Scholar, Microsoft Academic Search and other services exploit the open nature of arXiv to provide citation tracking and other search tools.

The only reason I use a journal for my work is that the bodies funding my research persist in using the publications in a journal and the citations to my journal article from other journal articles as a measure of the quality of my output. Thus it is the funders themselves who perpetuate a system in which they use scarce funds to support an old fashioned, expensive and unnecessary system for the propagation of research results.

Part of a screen shot from arXiv

Richard Dimbleby Lecture 2012 – The New Enlightenment (Sir Paul Nurse)

This year’s Richard Dimbleby Lecture was given by Sir Paul Nurse, President of the Royal Society. The thesis of Sir Paul’s lecture was that science and investment in science have given greatly to the society that we live in today, not just for scientists but also for technology, the economy and far beyond. He makes a well-reasoned argument, which politicians would be wise to listen to. Historically, scientists and researchers have been looked up to in society and have been considered to be those whose work underpins it. I think that to some large degree that perception is in decline and has been chipped away or damaged over the years. Certainly, working in science is getting ever harder, with more and more time spent on administration, teaching, form filling and justification of funding proposals, and less and less time actually spent on science itself. “The business of science” is now arguably a larger industry than “the practice of science”. A recent article in the Guardian by Professor Mike Duff of Imperial College London would appear to agree with this last point. Basic science is coming under fire and only popular science remains. Britain has always been a leader in innovation; it would be sad to lose that in favour of following the crowd.

Geographies of the world’s knowledge

This report came to my attention a short while ago when it was sent to me by a colleague at the University of Oxford. While most of this blog is about social media, science, networks and ranking, which we might think of as the “weather” in science, this report is more of a “climate” analysis. Anyone who is interested in understanding the trends of science on a century by century timescale would do well to read this report. Although it is very much an executive summary, there is some good commentary and the report contains some carefully considered and beautifully produced infographics.

The report is available on request from http://www.convoco.co.uk/editions.

The Connected Past: academia at its best

The Connected Past Logo

Just back from an excellent meeting which pulled together people interested in networks and complexity in archaeology and history. The Connected Past (Twitter hashtag #connectedpast) was held at Southampton as a two day symposium preceding the CAA 2012 conference (Computer Applications and Quantitative Methods in Archaeology). There was an incredible range of speakers and, as they were all recorded, you should be able to find the talks online eventually.

At one end Astrid Van Oyen from Cambridge talked about Actor Network Theory which seemed to me to be a good example of what researchers in social sciences understand to be a theory. That is it seemed to be largely about concepts and described in terms of words, certainly no equations, so nothing like the theories I am used to. However my experience over the last decade has been that physical scientists should not be too quick to dismiss these types of theory. The thoughts and ideas in social science theories can be used to mould the numerical models and theoretical equations which I associate with a theory.

My work on archaeological models lies at the other extreme, and we had many examples at this end too. The work of a University of Sheffield group, as described by Caitlin Buck, was a good example here. This uses Bayesian methods to produce models for the spread of agriculture across Europe, but ones which are firmly based on the data (including its uncertainty), in this case carbon-dated finds of cultivated cereal grains. It takes five days of computer time to produce a spatio-temporal map of the spread of agriculture across Europe, so it is a real challenge for the physical scientists.

I can’t resist mentioning my colleague, Ray Rivers, too. I am not sure if archaeologists liked his description of Knossos as being (in some models) the “Tescos of the Aegean” (you can replace Tescos with WalMart or any other appropriate supermarket chain), or if they found helpful the idea that Margaret Thatcher went wrong because she placed too much trust in Agent Based Modelling. However underneath his flowery turn of phrase was a serious message echoed by several others, that is, trying to understand what we can learn from models.

Overall one of the most enjoyable meetings I have been to. The range of topics and knowledge meant there was lots for me to take away and I hope I gave something new to others too. There is a new wave of archaeologists who can see the utility of these new ideas and tools as a complement to existing methods. This field is part of the Digital Humanities movement as information technology delivers new avenues of research for the social sciences. The popularity of the meeting showed that in archaeology this approach has now reached a critical mass.

Perhaps the main question I came away with is “what are we delivering with these new tools?” This was one of the themes suggested by Carl Knappett from Toronto (literally, as he gave his talk over Skype, a first for me at a conference). Carl suggested it was time for the field to mature and I agree with that. For this we need physical scientists to work with data and archaeologists to become confident in working with these new tools.

The Connected Past keynote speaker Carl Knappett

The overriding feeling though is one of excitement, new ideas, new opportunities, along with the challenge of what are we going to deliver. The organisers, Tom Brughmans (University of Southampton), Anna Collar (University of Liverpool) and Fiona Coward (Royal Holloway University of London) are to be praised for doing such a good job with such a timely meeting. The Connected Past meeting was academia at its best.

The Organisers of the Connected Past Meeting, Southampton 2012

Citeology

Well, we all know that adding “-ology” to a word makes it a science – geology, biology, scientology – oh, well, perhaps not scientology. The citeology project at Autodesk Research is a wonderful visualisation that shows the temporal relationship between references. The corpus to which the analysis is applied is currently quite small, extending to some 3502 papers in Human Computer Interaction conferences between 1982 and 2010, with 11699 citations tracked. The ensuing diagrams give a compelling visualisation showing quickly just how many citations have been made to articles in the corpus, which articles are uncited and what the temporal “reach” of an article has been. There is a nice app on the page that allows you to explore the data set. While this works well for smaller datasets, I wonder how this approach could be scaled to work with something of the size of the Web of Science or Scopus data sets?
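The quantities mentioned above are easy to compute once you have the citation list. The sketch below uses a made-up five-paper corpus, and it assumes that the “reach” of an article means the number of years between its publication and the most recent paper citing it. That definition is my own reading and is not taken from the Citeology paper.

```python
# Hypothetical corpus: paper id -> publication year.
papers = {"P1": 1985, "P2": 1992, "P3": 1998, "P4": 2005, "P5": 2010}

# Hypothetical citations as (citing paper, cited paper) pairs.
citations = [
    ("P2", "P1"),
    ("P3", "P1"),
    ("P4", "P2"),
    ("P5", "P1"),
    ("P5", "P3"),
]


def citation_summary(papers, citations):
    """Return, for each paper, how often it is cited and its temporal reach.

    Reach is assumed here to be the gap in years between a paper and the
    latest paper that cites it (0 if the paper is never cited).
    """
    cited_by = {paper_id: [] for paper_id in papers}
    for citing, cited in citations:
        cited_by[cited].append(citing)

    summary = {}
    for paper_id, year in papers.items():
        citing_years = [papers[c] for c in cited_by[paper_id]]
        summary[paper_id] = {
            "times_cited": len(citing_years),
            "reach_years": max(citing_years) - year if citing_years else 0,
        }
    return summary


summary = citation_summary(papers, citations)
uncited = sorted(p for p, info in summary.items() if info["times_cited"] == 0)
print(summary)
print("uncited:", uncited)
```

This is a single pass over the citation list, so the bookkeeping itself would cope with a much larger corpus; the hard part at Web of Science scale is more likely to be drawing the result legibly.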

Evidently, Justin Matejka is the force behind this work – a contact link can be found to him on the page mentioned above.  A paper describing the approach by Justin and his colleagues Tovi Grossman and George Fitzmaurice is available here http://autodeskresearch.com/publications/citeology2.

A lifetime in numbers

While this article isn’t particularly about network theory, it is an excellent study in what can be done with some data, a computer and a small amount of time. Wolfram explores the numbers (and distributions) behind his email… An amusing analysis: http://blog.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/. Candidly, the aim of the article is to show off the new Wolfram Alpha Pro offering. The graphs are cute and the commentary is fun. But, given the nature of the data and the power of the computational tools available to Dr Wolfram, why not a network analysis?