Abstract
Chapter 2 introduces Semantic Web concepts in the context of library data and defines common Semantic Web terminology and acronyms, as well as the primary components of linked data: open data, machine-readable data formats, and the use of Uniform Resource Identifiers (URIs).
The World Wide Web was invented because Tim Berners-Lee, a scientist at the CERN nuclear research laboratory in Switzerland, wanted a way for himself and his colleagues to share documents over the Internet. The genius of the invention was the ability to link from one document to another, thus creating a digital and actionable version of the classic citation. The Semantic Web is also about linking, but it adds to the original Web the linking of data, not just documents. It also changes the nature of the link: whereas the link between documents has no meaning other than "link," in the Semantic Web the links themselves have a specific meaning. We can illustrate this using the citation example: in a standard document, a citation is simply a number in the text and a bibliographic entry at the end of the page. You don't know why the author is citing that work other than what you can glean from the surrounding text. Using the richly semantic links of the Semantic Web, you could characterize each citation with a meaning such as "cites as evidence," "disagrees with," or "extends." (Those examples are from an actual Semantic Web vocabulary, CiTO, which will be described later in this report.)
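To make the idea of a link with meaning more concrete, here is a minimal sketch in Python of how typed citations might be represented as data. It uses the three CiTO relationships named above. The document URIs under example.org, the `Triple` type, and the helper functions are hypothetical and exist only for illustration; a real application would more likely use an RDF library than hand-built tuples.

```python
from typing import NamedTuple

# Namespace of the CiTO citation vocabulary.
CITO = "http://purl.org/spar/cito/"


class Triple(NamedTuple):
    """One typed link: a subject, the meaning of the link, and its target."""
    subject: str
    predicate: str
    obj: str


# Hypothetical document URIs, invented for this illustration.
PAPER = "http://example.org/papers/new-study"

links = [
    Triple(PAPER, CITO + "citesAsEvidence", "http://example.org/papers/dataset-report"),
    Triple(PAPER, CITO + "disagreesWith", "http://example.org/papers/earlier-theory"),
    Triple(PAPER, CITO + "extends", "http://example.org/papers/prior-method"),
]


def targets(triples, subject, predicate):
    """Return the works that `subject` links to with the given meaning."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def to_ntriples(triples):
    """Serialize the links, one statement per line, in N-Triples style."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ."
                     for t in triples)


if __name__ == "__main__":
    # Ask a question that an untyped hyperlink cannot answer:
    # which works does this paper disagree with?
    print(targets(links, PAPER, CITO + "disagreesWith"))
    print(to_ntriples(links))
```

Because each link carries its own meaning, a program can select only the citations of one kind, which is not possible when every link means nothing more than "link."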
There are two ways that the Semantic Web will be built: by linking information that exists within documents, and by allowing data itself to be on the Web. Markup of information in documents could allow smarter access to that information than we get with keyword searching. For example, markup could identify the author of a document so that an actual author search could be done, something that our search engines do not provide today. It could also add machine actionability to information in texts. While you and I can easily interpret "Herman Melville, the author of Moby-Dick," an indexing algorithm sees that as merely a string of seven units that can be indexed. Adding markup that specifies that the string "Herman Melville" represents a person, that...