Sir Tim Berners-Lee, the inventor of the World Wide Web, has an even grander vision for what the web can be. He and his allies have been working through the World Wide Web Consortium on an evolving initiative they call the semantic web. Berners-Lee and co-authors wrote in Scientific American in 2001 that they see a future when machines can understand data and make meaningful connections between related items, all in the service of helping people get the material they want and complete the tasks that matter to them:
The semantic web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users… The semantic web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation… To date, the Web has developed most rapidly as a medium of documents for people rather than for data and information that can be processed automatically. The semantic web aims to make up for this… The real power of the semantic web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs.
In the article, Berners-Lee and co-authors James Hendler and Ora Lassila sketched a scenario of how the semantic web would work. They highlighted the keywords whose semantics, or meaning, were defined for the computer “agent” through the Semantic Web:
The entertainment system was belting out the Beatles’ “We Can Work It Out” when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor’s office: “Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I’m going to have my agent set up the appointments.” Pete immediately agreed to share the chauffeuring.
At the doctor’s office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom’s prescribed treatment from the doctor’s agent, looked up several lists of providers, and checked for the ones in-plan [eligible for insurance coverage] for Mom’s insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete’s and Lucy’s busy schedules.
In a few minutes the agent presented them with a plan. Pete didn’t like it—University Hospital was all the way across town from Mom’s place, and he’d be driving back in the middle of rush hour. He set his own agent to redo the search with stricter preferences about location and time. Lucy’s agent, having complete trust in Pete’s agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already sorted through. Almost instantly the new plan was presented: a much closer clinic and earlier times—but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were—not a problem. The other was something about the insurance company’s list failing to include this provider under physical therapists: “Service type and insurance plan status securely verified by other means,” the agent reassured him. “(Details?)”
Lucy registered her assent at about the same moment Pete was muttering, “Spare me the details,” and it was all set. (Of course, Pete couldn’t resist the details and later that night had his agent explain how it had found that provider even though it wasn’t on the proper list.)
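The matching step in the scenario above, in which the agent keeps only in-plan providers within a 20-mile radius that are rated excellent or very good, can be sketched as a simple filter. This is a minimal illustration only, not how any real semantic web agent is implemented. The `Provider` record, the `eligible_providers` function, and all sample distances and ratings are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Hypothetical record an agent might assemble from providers' Web sites."""
    name: str
    in_plan: bool          # eligible for coverage under Mom's insurance
    distance_miles: float  # distance from Mom's home
    rating: str            # rating reported by a trusted rating service


# Ratings the scenario treats as acceptable.
ACCEPTABLE_RATINGS = {"excellent", "very good"}


def eligible_providers(providers, max_distance_miles=20.0):
    """Keep in-plan, nearby, well-rated providers, closest first."""
    matches = [
        p for p in providers
        if p.in_plan
        and p.distance_miles <= max_distance_miles
        and p.rating in ACCEPTABLE_RATINGS
    ]
    return sorted(matches, key=lambda p: p.distance_miles)


# Made-up sample data for illustration.
providers = [
    Provider("University Hospital", True, 18.0, "excellent"),
    Provider("Neighborhood Clinic", True, 3.5, "very good"),
    Provider("Far Away Therapy", True, 42.0, "excellent"),
    Provider("Out-of-Plan Rehab", False, 2.0, "excellent"),
    Provider("Mediocre Therapy", True, 5.0, "fair"),
]

first_plan = [p.name for p in eligible_providers(providers)]
# Pete's stricter location preference, modeled here as a smaller radius.
stricter_plan = [p.name for p in eligible_providers(providers, max_distance_miles=10.0)]
```

Rerunning the same filter with a smaller radius mirrors Pete asking his agent to redo the search with stricter preferences. The hard part in practice is not the filter but getting every provider to publish plan status, location, and ratings with agreed meanings.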
The Wikipedia entry on the semantic web elaborates the idea this way: “The semantic web is a vision of information that is understandable by computers, so computers can perform more of the tedious work involved in finding, combining, and acting upon information on the web.” In his 1999 book “Weaving the Web,” Berners-Lee said: “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘semantic web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.”
People commonly refer to the semantic web as the next generation of the World Wide Web or, sometimes, Web 3.0. The hope is that over the next decade it might improve data aggregation to such an extent that an internet search that now yields hundreds, thousands, or millions of responses (many unrelated to the searcher’s needs) will generally deliver only the specific information she seeks.
The concept is so revolutionary that people have difficulty describing it in a few words, and its proponents self-consciously struggle to name the “killer app” that will make users understand the semantic web’s power and support its creation. Berners-Lee describes it as “getting one format across applications,” so that semantic web standards can let people gain access to the information they want and use it any way they want; for instance, they could mesh data from a personal bank statement with a personal calendar. He has said he would like to see a future web that allows people to connect their ideas with the ideas of others, building a system for people to share parts of ideas in a way that can make them whole.
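The idea of “one format across applications” can be illustrated with a small sketch. It assumes that both the bank statement and the calendar are exported as subject-predicate-object triples, the general shape of data that semantic web standards describe. The identifiers (`txn:1001`, `event:7`), the predicate names, and the sample values are all invented for the example, and plain Python tuples stand in for a real triple store.

```python
from collections import defaultdict

# Hypothetical bank-statement data as (subject, predicate, object) triples.
bank_triples = [
    ("txn:1001", "date", "2008-03-14"),
    ("txn:1001", "amount", "42.50"),
    ("txn:1001", "payee", "Neighborhood Clinic"),
    ("txn:1002", "date", "2008-03-20"),
    ("txn:1002", "amount", "18.00"),
    ("txn:1002", "payee", "Parking Garage"),
]

# Hypothetical calendar data in the same triple format.
calendar_triples = [
    ("event:7", "date", "2008-03-14"),
    ("event:7", "title", "Physical therapy session"),
    ("event:8", "date", "2008-03-21"),
    ("event:8", "title", "Dinner with Lucy"),
]


def describe(triples):
    """Group triples by subject into {subject: {predicate: object}}."""
    items = defaultdict(dict)
    for subject, predicate, obj in triples:
        items[subject][predicate] = obj
    return dict(items)


def mesh_by_date(triples):
    """Map each date to the sorted subjects (from any source) that share it."""
    by_date = defaultdict(list)
    for subject, properties in describe(triples).items():
        if "date" in properties:
            by_date[properties["date"]].append(subject)
    return {date: sorted(subjects) for date, subjects in by_date.items()}


# Because both sources use one format, combining them is list concatenation.
merged = bank_triples + calendar_triples
shared_days = {d: s for d, s in mesh_by_date(merged).items() if len(s) > 1}
```

Here `shared_days` links the clinic payment to the therapy appointment because both carry the same `date` value. The sketch works only because both sources agree on what `date` means, which is the agreement the standards and ontologies are meant to supply.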
Success for the semantic web will depend upon people working together to accept its standards (GRDDL, RDFa, OWL, SPARQL, and others) and its naming and tagging ontologies. Sites already implementing semantic web elements include DBpedia, Twine, Garlik, GeoNames, RealTravel, and MetaWeb.
The semantic web has some vocal opponents and critics. One is blogger, author, and speaker Cory Doctorow, who described in a 2001 essay titled “Metacrap: Putting the Torch to Seven Straw-Men of the Meta-topia” what he considered to be the seven obstacles to getting reliable data: people are lazy; people cannot accurately observe themselves; people lie; people are stupid; schemas are not neutral; metrics influence results; and there’s more than one way to describe something.
Another critic is author, blogger, and professor Clay Shirky. In a 2003 column titled “The Semantic Web, Syllogism, and Worldview,” he wrote, “The semantic web is a machine for creating syllogisms…despite their appealing simplicity, syllogisms don’t work well in the real world because most of the data we use is not amenable to such effortless recombination …. There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the semantic web … like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.”
Shirky did concede, however, that some positive results will come from the efforts made toward creating a semantic web.
Accompanying the hopes for the semantic web and other Internet development are fears about the future of privacy and identity, and about the need for security in an architecture in which growing amounts of information are shared in a worldwide data cloud. Security is vital, for instance, for companies and individuals who employ semantic web products and technologies. As searches of aggregated data become more specific, a simple search for a person’s name can yield health records, parking tickets, mortgages, signatures, travel records, video-viewing habits, and any other recordable, databased information. People’s Web-search histories, financial transactions, mailing lists, and surveillance photographs of them and their homes are being collected and can be accessed, forming a “digital shadow” for every individual and group on record.