The Ever Evolving Web: The Power of Networks
Wendy Hall, University of Southampton
Reported by: Jaclyn Selby, Youngji Kim & Amanda Beacom
Dame Wendy Hall is Professor of Computer Science at the University of Southampton School of Electronics and Computer Science and a founding director of the Web Science Research Initiative, now the Web Science Trust (http://webscience.org). Her current research focuses on the Semantic Web and Web science. In her seminar presentation, Professor Hall provided a historical context for online networks, tracing the emergence and growth of the World Wide Web to the current development of the Semantic Web.
Hall began her presentation by discussing how, throughout history, people have written about linking information and how difficult it is to do. She noted that the brain does this very well, and so scholars have sought to develop tools that use the human brain as a model for the organization and management of information. With the creation of increasingly sophisticated machines, people began to think about how machines could be used to create cross-references, links, and associations between related units of information. In 1945, for example, Vannevar Bush, scientific advisor to U.S. President Franklin Roosevelt during World War II, wrote an Atlantic Monthly article titled “As We May Think,” which argued for new technologies that use the brain as a model for storing and finding information. This article, which Hall highlighted in her lecture as one of the inspirations for her own work, discusses how a machine could create a system of automatic “associative indexing” and uses terms such as “trails” and “web” to describe this system.
Hall described how innovations in computing beginning in the 1960s continued to reference, extend, or augment the human brain. She mentioned her colleague Ted Nelson, who coined the phrase “everything is deeply intertwingled” to express the complexity of interrelations in human knowledge. In the 1960s, Nelson first used the terms hypertext and hypermedia and created Xanadu, a hypermedia system, and Douglas Engelbart developed Augment, a project with hypertext features that envisioned the use of computers to enhance intellect. In the 1970s and ’80s, hypertext systems were further developed in research labs and, with the introduction of personal computers, commercially. Hall and her colleagues created Microcosm (the Mountbatten Archive Application) in the late 1980s to store information links in databases. These ‘linkbases,’ as she referred to them, were meant to capture all the relationships between different pieces of information. All the links were triples: source, destination, scripts. Hall noted that although the Internet existed, there was no real web. Her hypothesis was that hierarchical indexing is what is necessary to store information.
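The idea of a linkbase, in which links are stored apart from the documents they connect, can be illustrated with a minimal Python sketch. This is not Microcosm's actual design: the `Link` and `Linkbase` names, the file names, and the interpretation of the third field as a short action description are all assumptions made for illustration, based only on the "source, destination, scripts" triple mentioned in the talk.

```python
from collections import namedtuple

# Hypothetical link record. The three fields mirror the "source, destination,
# scripts" triple from the talk; what a "script" held in Microcosm is an
# assumption here (treated as a short action description).
Link = namedtuple("Link", ["source", "destination", "script"])


class Linkbase:
    """Stores links separately from the documents they connect."""

    def __init__(self):
        self._links = []

    def add(self, source, destination, script=""):
        """Record one link triple."""
        self._links.append(Link(source, destination, script))

    def links_from(self, source):
        """All links that start at the given document."""
        return [link for link in self._links if link.source == source]

    def links_to(self, destination):
        """All links that end at the given document."""
        return [link for link in self._links if link.destination == destination]


# Illustrative (invented) archive items.
linkbase = Linkbase()
linkbase.add("diary_entry.txt", "photograph.jpg", "show image")
linkbase.add("diary_entry.txt", "memo.txt", "open document")
linkbase.add("memo.txt", "photograph.jpg", "show image")

outgoing = linkbase.links_from("diary_entry.txt")
incoming = linkbase.links_to("photograph.jpg")
```

Because the links live in their own store rather than inside the documents, the same collection can be queried in either direction, which is something links embedded in a page cannot offer on their own.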
Apple’s HyperCard became available on Macintosh computers in 1987. In 1989, Tim Berners-Lee began development of the World Wide Web to facilitate information sharing among scientists, creating a system of open protocols and universal standards involving the Hypertext Transfer Protocol (HTTP) and the Hypertext Markup Language (HTML). He wrote a paper called “Information Management: A Proposal” and then went on to work on a demo of the ‘World Wide Web,’ which he debuted in 1991 to much skepticism. The ACM hypertext conference famously rejected Berners-Lee’s paper, but by 1993 the idea was widely accepted. In just a few years, the system became user-friendly with the introduction of Mosaic, and later the Netscape and Explorer browsers.
After outlining these key historical events, Hall offered some lessons learned in the development of the Web. First, she said, “big is beautiful,” meaning that, as Berners-Lee argued, the network is the most important feature of the Web as a hypertext information system. She emphasized that we had lost (for a time) conceptual and contextual linking and that the Web had been “a strangely linkless world,” with search engines filling the gap left by the missing links. Other hypertext systems developed around the same time as the Web operated on stand-alone workstations and could be accessed only at those workstations. The Web, in contrast, can be accessed from anywhere. Second, “scruffy works,” meaning that the system did not need to be perfect in order to be effective. Links could fail. The third lesson, Hall said, is that “democracy rules.” The Web is based on non-proprietary protocols and universal standards, and demonstrates that everyone has to use such a system, or no one will. Hall pointed out that, ironically, Web search engines such as Google, which are so integral to Web use today, operate in a spirit opposite to that of this third lesson. Whereas the Web is an open and transparent system, Google is a closed system with proprietary search algorithms and little transparency. The irony is that when Brin and Page published their paper on their page-ranking algorithm in 1997, they were told it would not scale. They had to get financial support and do the math to prove that it would, which they did in 1999. But then they could not make any money, so they came up with the idea of auctioning words, which turned out to be very successful. One of Hall’s key lessons regarding the rise of Google, which depends on the links we make to improve the accuracy of its results, is that links equal power: if more people point to you, then you are rewarded with status you do not have to pay for.
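The point that links equal power can be made concrete with a small Python sketch of link-based ranking by repeated redistribution of scores. This is a simplified textbook version of the PageRank idea, not Google's proprietary algorithm; the function name `rank_pages`, the four-page example web, and the damping value of 0.85 are assumptions chosen for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by incoming links.

    `links` maps each page to the list of pages it links to. Simplified
    sketch of the PageRank idea; parameter values are illustrative.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score, split evenly, to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical web: three pages point to "c", and "c" points only to "a".
web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = rank_pages(web)
most_linked = max(ranks, key=ranks.get)
```

In this toy web, page "c" ends up with the highest score because the most pages point to it, and "a" scores well only because the highly ranked "c" points to it. Neither page paid for that status; it comes entirely from links made by others.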
Hall then described asking her students whether the Web was truly a hypertext system, given that its links are unidirectional and do not point back to where they came from. The World Wide Web was so much better than what came before it, however, that researchers did not care and busied themselves exploring “the new universe.” Even so, the Web did not become a truly ‘social web’ until it completed the transformation from a Read-Only Web to a Read/Write Web. Hall cited a number of revolutionary social sites (Wikipedia, Galaxy Zoo, Twitter) that are a product of our growing ability to ‘write’ to the Web.
The lessons from the development of the Web, of course, also inform the development of the Semantic Web. Hall explained that whereas the Web is built on links between documents, the Semantic Web is built on links between data. This shift from documents to data allows for data re-use, reduces the requirements for human information processing, and releases the large quantity of currently inaccessible data stored in relational databases and Excel spreadsheets by allowing these data to be directly processed by machine. The building blocks of the Semantic Web are Uniform Resource Identifiers (URIs), which name the things being described, and the Resource Description Framework (RDF), which describes and links the data, and which Hall equated to HTML. (Nigel Shadbolt’s seminar lecture, which followed Professor Hall’s, provided additional detail on these concepts.) Hall suggested that the aggregation of all this information in a standard manner might make it possible for people to pose queries to the system such as “where is the best place to study journalism?” and receive structured and useful answers.
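The shift from linking documents to linking data can be sketched in a few lines of Python. RDF expresses each fact as a subject, predicate, and object, with URIs as the names. The sketch below uses plain tuples rather than a real RDF syntax or query language, and every URI, institution, and property in it (`http://example.org/`, `UniversityA`, `offersCourse`, and so on) is invented for illustration.

```python
# Hypothetical namespace; none of these URIs refer to real data.
EX = "http://example.org/"

# Each fact is one (subject, predicate, object) triple.
triples = [
    (EX + "UniversityA", EX + "offersCourse", EX + "Journalism"),
    (EX + "UniversityA", EX + "locatedIn", EX + "CityX"),
    (EX + "UniversityB", EX + "offersCourse", EX + "Journalism"),
    (EX + "UniversityB", EX + "offersCourse", EX + "Physics"),
    (EX + "UniversityC", EX + "offersCourse", EX + "Physics"),
]


def match(facts, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in facts
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Which places offer journalism?" expressed as a triple pattern.
journalism_providers = [
    s for (s, _, _) in match(triples, predicate=EX + "offersCourse", obj=EX + "Journalism")
]
```

A question like "where is the best place to study journalism?" would need further data (rankings, locations, costs) published with shared identifiers so that it could be joined to these facts. The sketch shows only the first step: answering a structured question directly from machine-readable data.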
Hall posed the question of what will be the tipping points for widespread adoption and use of the Semantic Web. One possible tipping point, she said, is the use of the Semantic Web by governments. Both the administration of U.K. Prime Minister Gordon Brown and the administration of U.S. President Barack Obama have announced initiatives for using the Semantic Web. (See the following sites for more information: http://data.gov.uk/ and http://www.sitepoint.com/blogs/2009/03/19/obama-groundbreaking-use-semantic-web/.)
Hall concluded her talk by introducing the emerging interdisciplinary field of Web science, which she also described as part of Web 3.0. She envisions Web science as “a process of creative innovation, design and engineering, the social and the technical, and interpretation and analysis,” and as “inter-/multi-/trans-disciplinary”: not the union of disciplines, but their intersection. Understanding the Web, according to Hall, is a challenge as large as any other global cause, because nobody (as of yet) owns the Web and there are possible scenarios that could end in its demise. She argued that the field, and the questions it will investigate, matter because the Web has become our cultural legacy and social heritage, and because we cannot afford to take the freedom to exchange information online for granted.
Discussion
Q: What are the limits of the knowledge available online?
A: Aside from some archival data, most information is accessible on the Web.
Q: Is a limitation of the Semantic Web its objective view of associations, given that the association one person makes between two pieces of information may differ from the association another person makes, depending on different ontologies?
A: Given that the World Wide Web functions without every link to every document, the Semantic Web should be able to function without all possible associations.

