posted Apr 23, 2012, 8:55 AM by Symeon Papadopoulos   [ updated Apr 30, 2012, 1:56 AM ]
I had the great opportunity to attend all days of the 21st World Wide Web conference, held in Lyon last week (16-20 Apr). WWW probably needs no introduction, as it is considered the most important conference in the area of Web research and technologies. It is worth stressing, however, that in recent years the conference has progressively increased its focus on topics related to Social Networks. The following description contains some of the highlights that caught my attention. Unfortunately, many interesting workshops and presentations ran in parallel, so I inevitably had to miss some really interesting ones...

Keynotes by Sir Tim Berners-Lee, Neelie Kroes and Plenary Panel
Unsurprisingly, the keynote by Tim Berners-Lee was one of the conference highlights. Tim summarized the main principles and values that make the Internet an effective technological infrastructure for increasing social welfare and freedom. He started his talk with a reference to the principle of least power, which points to the use of declarative languages as the basis for the Web. Then, he stressed the importance of consensus when designing complex systems and drafting standards. Although consensus takes a lot of hard work to achieve, it is necessary in order to end up with high-quality designs and standards (characteristic quote: "If you happen to meet a guy coming from a W3C meeting in a pub, buy him a drink. He needs it!"). Furthermore, Tim encouraged the use of open platforms and systems as the basis for the Internet of the Future. As an example, he advocated HTML5 over native mobile apps. Finally, Tim warned of potential risks stemming from recent legislation efforts, such as CISPA, which centralize trust on infrastructures that do not necessarily reflect trust (e.g. PKI, DNS). Some nice highlights of the keynote are available on Storify.
Closely related to Tim's keynote was the talk by Neelie Kroes on the open Internet, standards and the innovation built on top of them, as well as the Plenary Panel that followed on the topic "Web as a human right". A recurring point in the discussion was the management of personal data and of the digital traces that people leave on the Web. An interesting remark by Tim Berners-Lee hinted that we will need to accept that in the future our personal data will be accessible by different parties, and that legislation should be in place to ensure that such data are used in limited and appropriate ways.

A snapshot of the Amphitheater where the keynotes took place. 

MSND, SMANE and TempWeb workshops
Mining Social Network Dynamics (MSND), Social Media Applications in News and Entertainment (SMANE) and TempWeb were three fascinating workshops containing interesting keynotes and research presentations. 
Eytan Bakshy from Facebook was the invited speaker for MSND. In his talk, he described a very large-scale (253 million users) empirical study on how social cues affect the sharing behavior of online users. A series of interesting research presentations followed, focusing on different aspects of the information diffusion problem: prediction of diffusion, maximization of information diffusion by appropriate targeting mechanisms, and visualization of information diffusion. 
The SMANE workshop opened with a very interesting talk by Nic Newman focusing on key issues and challenges related to social media and the news business. Research presentations covered different aspects of the workshop topic, ranging from the measurement of public mood on Twitter to the use of Twitter for news re-ranking. Of particular interest was the presentation of SocialSensor, focusing on the requirements that news professionals have of a social media mining system. This led to a panel discussion moderated by Jochen Spangenberg (DW), where news experts (Wilfried Runde, Denis Teyssou, Nic Newman) discussed the opportunities and challenges that news professionals face in their effort to harness social media. Two very important problems relate to tracking massive amounts of user-contributed content and to verifying information originating from social networks. A clear need was expressed for tools that can help with these problems.
The TempWeb workshop focused on the temporal aspects of the Web. A fascinating keynote was given by Staffan Truve, providing valuable insights into the main challenges involved in indexing temporal information in news feeds. A particularly interesting fact provided by Staffan was their use of "sentimomentum" (a measure expressing the popularity and sentiment of named entities on the Web) for stock trading. They reported a 10% gain in a single month, while the S&P500 index dropped by 10% in the same period.

Web Mining, Web Content and User Characterization in Social Networks
The technical tracks of the conference contained a multitude of interesting talks. Of those that I followed, I particularly liked a study by Robert West and Jure Leskovec, in which they devised a game, "Wikispeedia", asking users to navigate through Wikipedia articles (starting from a given seed article) with the goal of reaching a predefined target article. In that way, they could study the efficacy of the different navigation strategies employed by the game players. Another interesting work studied the temporal characteristics of hashtag popularity: four dynamical classes of popularity were defined and their correlation with social semantics was investigated.
A study showcasing the potential of social multimedia as sensors of the real world was presented by Crandall and colleagues: more specifically, the researchers made use of a large number of geotagged Flickr images with the goal of studying the presence of snow and vegetation cover in the US. The study offered many insights into the challenges and opportunities of mining social media content to extract information for a purpose other than the one it was created for.
Localizing Tweets was the main focus of a presentation by Hong and colleagues. The authors proposed a probabilistic graphical model of the term distributions of regions, users and topics, with the goal of localizing Tweets despite their brevity.
Finally, three very enjoyable presentations were included in the Web Mining session (chaired by Thomas Gottron). The first described a novel visualization mechanism for summarizing the evolution and linking of news stories. The second detailed a framework for mining events from news streams with the goal of uncovering causality structures between events and subsequently using them for prediction. The third presented the results of a historical study on a very large news collection with the goal of measuring the span of public attention and popularity through news media.

If I were to summarize WWW in a short list of topics that seem prominent these days, I would pick "information diffusion", "community detection" and "web mining" in terms of research, and "open data and systems", "trust" and "privacy" in terms of policy.