Abstract
The paper discusses an ontology-based model for the knowledge transactions that are part of every organization, be it a university, business house, or a government department. The framework is centered on semantic web-based technologies. It uses ontologies and standards like Personal Information Model (PIMO) Ontology and Dublin Core (DC) for categorization and description of the knowledge transactions, respectively. The framework will be helpful in many ways for all kinds of organizations that wish to go in for automation.
Keywords: Ontology-based framework, Framework for knowledge transactions, Semantic web-based framework, PIMO, RDF, NEPOMUK Representational Language (NRL)
1. Introduction
Every organization, whether it is an academic institution, an administrative body, a research institute, or a business house, produces knowledge at different levels and in different forms for a variety of purposes. This knowledge is used to develop products and services, or to provide expertise and consultation, or is documented for future use. But not all knowledge is necessarily documented. Much of it is passed on or disseminated through informal communication channels, such as face-to-face interactions, mails, consultations, presentations at meetings, seminars or conferences, and even through invisible colleges, information gatekeepers, etc. In other words, knowledge in organizations is produced in both tacit and explicit forms. Managing and preserving knowledge in documented and undocumented forms has long been a challenge for the knowledge managers of organizations. Knowledge management is now an integral part of every organization; however, it also presents certain issues.
2. Knowledge Transactions in an Organization
It is important for organizations to document and organize knowledge, irrespective of whether it is tacit or explicit. This knowledge helps everyone associated with the organization, from employees and stakeholders to the customers/users of the system. The knowledge should be well-documented, well-organized, and easily and quickly retrievable. This helps in providing services, developing products, troubleshooting, and resolving issues in the organization.
In order to document and organize knowledge, it is essential to know the components involved and the process of knowledge transactions that goes on in the organization. The knowledge transactions taking place in an organization will understandably be complicated with a variety of people, complex formal and informal transactions/exchanges amongst them, and the various documents, services, and products involved. To understand the intricacies involved in such transactions, a model or framework needs to be developed, which specifies all the components and stakeholders and is also able to define the multipronged and complex relationships among these components.
3. Proposed Model for Organizational Knowledge Transactions
This article proposes a model for knowledge transactions in an organization for a semantic web-based system (Semantic Web, 2016) for managing knowledge within an organization like a university or business organization. This kind of a model/framework helps in searching knowledge which is spread across various departments within the organization in the form of documents, expertise or service/product. It will function as a support system for the people associated with the system.
3.1 Metadata
Metadata is often simplistically described as 'data about data'; however, the term metadata is much more than that. Metadata can be used to provide a standard description of entities such as data, people, institutions, events, and processes. In fact, metadata plays a crucial role in information retrieval. Metadata makes it possible to know the identification, location, availability, mode or method of access, provenance information, and rights information related to the entities. Hence, metadata can be more meaningfully described as 'structured data about data'. It enables a meaningful description of the entity being described. The concept of metadata is not new to librarians. Librarians have been using catalogues for centuries to describe, as minutely as possible, the documents in their holdings in order to give library users easy and quick access to them. It is bibliographical metadata that makes library-based tools like the library catalogue/Online Public Access Catalogue (OPAC) more efficient in retrieval than Web search engines, such as Google.
The websites on the World Wide Web (WWW) lack proper description of the web pages and other content that they host. This is because Hyper Text Markup Language (HTML) is used to host content on websites. HTML does not offer any kind of mechanism to meaningfully describe content on the websites. Any description of a web page is limited to the use of the META tag, for example to list keywords. Hence, filtering out relevant information from websites, blogs, etc., through web search engines is a difficult task. Realizing this, a few search engines have started focusing on scholarly information available on the WWW. Google Scholar is one such example. A few others, such as yippy.com, have tried to adopt rudimentary classificatory structures. But again, these search engines depend on the metadata from the journal websites or the digital libraries.
Hence, metadata has now become the focus of many research studies, particularly in the areas of information retrieval and semantic web technologies, as communities outside of the library and information science community have now realized its importance.
3.2 Dublin Core
The Dublin Core Metadata Element Set is a vocabulary of 15 elements which can be used in resource description. Dublin Core gained popularity when it was disseminated as part of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Since then, it has been ratified as IETF RFC 5013, ANSI/NISO Standard Z39.85-2007, and ISO Standard 15836:2009. The fifteen-element Dublin Core is part of a larger set of metadata vocabularies and technical specifications maintained by the Dublin Core Metadata Initiative (DCMI). The complete set of vocabularies, DCMI Metadata Terms [DCMI-TERMS], also includes sets of resource classes (including the DCMI Type Vocabulary [DCMI-TYPE]), vocabulary encoding schemes, and syntax encoding schemes. The terms in DCMI vocabularies are intended to be used in combination with terms from other, compatible vocabularies in the context of application profiles, and on the basis of the DCMI Abstract Model [DCAM] (DCMI Metadata Basics, 2016).
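As an illustration of how the fifteen elements might be used to describe a resource, the following is a minimal Python sketch. The element names are those of the Dublin Core Metadata Element Set; the sample record and the helper `unknown_elements` are hypothetical and are not part of the framework described in this article.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def unknown_elements(record):
    """Return the sorted keys of `record` that are not Dublin Core elements."""
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Hypothetical record describing an organizational document.
report = {
    "title": "Annual Report",
    "creator": "Planning Department",
    "date": "2016",
    "type": "Text",
    "language": "en",
}

print(unknown_elements(report))
```

A record need not use all fifteen elements; the check above only flags keys that fall outside the element set.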
3.3 Ontology
To understand the knowledge transactions, a formal specification is required to explicitly define various entities (like people, job descriptions, services, products, documents, etc.) and their interrelationships in an organization. Ontologies can be used to define such entities and their interrelations. Hence, this framework is based on ontology which is one of the semantic web technologies.
In the context of knowledge sharing, Gruber (1993) defines the term as follows: 'A specification of a representational vocabulary for a shared domain of discourse - definitions of classes, relations, functions, and other objects - is called an ontology'. In simple terms, an ontology can be said to be the definition of entities and their relationships with each other. It is based on semantic nets used in Artificial Intelligence. Ontologies define data models in terms of classes, subclasses, and properties. Noy and McGuinness state the following reasons for developing an ontology (Ontology Development 101, 2016):
* To share common understanding of the structure of information among people or software agents
* To enable reuse of domain knowledge
* To make domain assumptions explicit
* To separate domain knowledge from operational knowledge
* To analyse domain knowledge
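The idea that ontologies define data models in terms of classes, subclasses, and properties can be sketched in a few lines of Python. This is only an illustrative sketch; the class names, the `creator` property, and both helper functions are assumptions made for the example, not definitions taken from any published ontology.

```python
# Hypothetical class hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Employee": "Person",
    "Client": "Person",
    "Report": "Document",
}

# Hypothetical property with its domain (subject class) and range (value class).
PROPERTIES = {"creator": ("Document", "Person")}


def is_subclass(cls, ancestor):
    """Return True if `cls` equals `ancestor` or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def property_allows(prop, subject_cls, value_cls):
    """Check a statement against the property's declared domain and range."""
    domain, value_range = PROPERTIES[prop]
    return is_subclass(subject_cls, domain) and is_subclass(value_cls, value_range)


print(property_allows("creator", "Report", "Employee"))
```

Making the domain and range explicit in this way is one example of what the list above calls making domain assumptions explicit.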
3.4 PIMO Ontology
PIMO is the abbreviation for the Personal Information Model of a user. The PIMO Ontology is both a Resource Description Framework (RDF) vocabulary to express such a model and an upper ontology defining basic classes and properties to use. The scope of a PIMO is to model data that is within the attention of the user and needed for knowledge work or private use.
3.5 RDF
The Resource Description Framework (RDF) is a general-purpose language for representing information on the Web. RDF is a standard model for data interchange on the Web. RDF has features that facilitate merging of data even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. RDF extends the linking structure of the Web to use Uniform Resource Identifiers (URIs) to name the relationship between things as well as the two ends of the link (this is usually referred to as a 'triple'). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications. This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations (Resource Description Framework, 2016).
3.6 RDF Data Model
Representation of data through RDF is very easy as it is based on a tri-pronged approach of resource, property, and value. A simple RDF model comprises the following three parts:
* Resource: Any entity which has to be described is known as resource. It can be a 'web page' on Internet or a 'person' in society or any object.
* Property: Any characteristic of 'resource' or its attribute which is used for the description of the same is known as property. For example, a web page can be recognized by 'Title' or a person can be recognized by 'Name'. So, both are attributes for recognition of resource 'web page' and 'person', respectively.
* Value: A 'property' must have a value. For example, Prolegomena to Library Classification can be the value of the property Title and S R Ranganathan can be the value of the property Name.
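The three parts listed above can be sketched as a small Python structure. This is a minimal illustration rather than an RDF implementation; the `example.org` URIs and the helper `values_of` are hypothetical.

```python
from collections import namedtuple

# One statement in the resource / property / value model.
Triple = namedtuple("Triple", ["resource", "property", "value"])

# URIs under example.org are hypothetical placeholders.
BOOK = "http://example.org/book/prolegomena"
AUTHOR = "http://example.org/person/ranganathan"

triples = [
    Triple(BOOK, "Title", "Prolegomena to Library Classification"),
    Triple(AUTHOR, "Name", "S R Ranganathan"),
]


def values_of(resource, prop):
    """Return every value recorded for `prop` on `resource`."""
    return [t.value for t in triples if t.resource == resource and t.property == prop]


print(values_of(BOOK, "Title"))
```

Here the book is described by the property Title and the person by the property Name, each with its own value.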
3.7 NEPOMUK Representational Language (NRL)
NEPOMUK stands for Networked Environment for Personalized, Ontology-based Management of Unified Knowledge. The Nepomuk Representational Language (NRL) was designed for knowledge representation in NEPOMUK Social Semantic Desktop applications. It is built on top of the Resource Description Framework (RDF) and the associated RDF Schema (RDFS). NRL includes support for two main additional concepts: Named Graphs and Graph Views. Named Graphs help in coping with the heterogeneity of knowledge models and ontologies, especially multiple knowledge modules, with potentially different interpretations. The graph view concept allows for the customizing of ontologies towards different needs in various applications and also provides the mechanism to impose different semantics on the same syntactical structure.
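The general idea of named graphs can be illustrated with a short Python sketch. It does not use NRL syntax or reproduce NRL semantics; the graph names, the `ex:` identifiers, and the simplified `view` function (which merely restricts a graph to chosen predicates) are assumptions made for illustration.

```python
# Each named graph maps a graph name to its own set of triples.
# Graph names and identifiers are hypothetical.
HR_GRAPH = "http://example.org/graph/hr"
PROJECT_GRAPH = "http://example.org/graph/projects"

named_graphs = {
    HR_GRAPH: {
        ("ex:alice", "ex:position", "Librarian"),
    },
    PROJECT_GRAPH: {
        ("ex:alice", "ex:position", "Project Lead"),
        ("ex:alice", "ex:worksOn", "ex:catalogue-migration"),
    },
}


def objects(graph_name, subject, predicate):
    """Look up values for a subject and predicate within one named graph only."""
    graph = named_graphs[graph_name]
    return {o for s, p, o in graph if s == subject and p == predicate}


def view(graph_name, keep_predicates):
    """A simplified 'view': the part of a graph restricted to some predicates."""
    return {t for t in named_graphs[graph_name] if t[1] in keep_predicates}


print(objects(HR_GRAPH, "ex:alice", "ex:position"))
```

Because each statement is kept in its own graph, the two differing descriptions of the same person can coexist without being merged into a single, contradictory model.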
3.8 Workflow Components of an Organization
The workflow components of such organizational transactions essentially involve people, documents, and services/products. Each of these components has its own attributes and is also interrelated to the other components. The attributes and interrelations of each of these components can be explicitly defined and explained using ontologies. The workflow that takes place in an ontology-based system can be represented diagrammatically as shown in Figure 1.
For instance, let us consider the people component from the above example. The profiles of all the employees and clientele of the organization and their interrelationships can be formalized and represented using ontologies. This conceptual framework is represented as an ontology relationship in Figure 2. A person can either be the creator or collaborator of a document. The person and the document profiles are linked to each other by the subject, that is, the person specializes in the given subject. The person also holds some position in the organization, so his/her profile will also be linked to the organizational chart.
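The relationships described in this paragraph can be sketched with Python dataclasses. This is only one possible reading of the relationships; the field names, the sample people and document, and the helper `specialists_for` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    position: str                                # link to the organizational chart
    subjects: set = field(default_factory=set)   # subjects the person specializes in


@dataclass
class Document:
    title: str
    subject: str
    creator: Person
    collaborators: list = field(default_factory=list)


def specialists_for(document, people):
    """Names of the people linked to the document through its subject."""
    return [p.name for p in people if document.subject in p.subjects]


# Hypothetical sample data.
rao = Person("A. Rao", "Librarian", {"Cataloguing"})
sen = Person("B. Sen", "Accounts Officer", {"Finance"})
manual = Document("Cataloguing Manual", "Cataloguing", creator=rao, collaborators=[sen])

print(specialists_for(manual, [rao, sen]))
```

The subject acts as the link between person and document profiles, so a search for a document's subject can also surface the people who specialize in it.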
3.9 An Ontology-based Model for Knowledge Transactions in an Organization
The conceptual framework presented in this article is based on ontologies (see Figure 3). Ontologies are developed for Person, Document, Product, Subject, and Organization. The Document, Product, Subject, and Organization ontologies are described using the Dublin Core (DC) metadata standard (Dublin Core Metadata Initiative, 2016). This framework uses the PIMO Ontology for 'person'. PIMO stands for Personal Information Model of a user, and the PIMO Ontology can be used to express the Personal Information Models of individuals. It is based on the Resource Description Framework (RDF) (Resource Description Framework, 2016), the NEPOMUK Representational Language (NRL) (Nepomuk Representational Language Specification, 2016), and other Semantic Web ontologies. The PIMO Ontology is both an RDF vocabulary to express such a model and an upper ontology defining basic classes and properties to use. Data stored in files, in Personal Information Management (PIM) or groupware systems, or in an Enterprise Resource Planning (ERP) system, in formats such as text documents, contact information, e-mails, appointments, task lists, and project plans, lies within the scope of PIMO; abstract concepts (such as 'Love' or 'Language') can also be represented, if required (Personal Information Model [PIMO], 2016).
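A minimal Python sketch of how a document described with Dublin Core and a person typed with PIMO might be written out as N-Triples statements is given below. It is an illustration under stated assumptions, not the implementation of the framework: the `example.org` namespace and identifiers are hypothetical, and the PIMO namespace URI and the class name `Person` are assumed here and should be verified against the PIMO guide.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed PIMO namespace; verify against the PIMO guide before use.
PIMO = "http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#"
EX = "http://example.org/"  # hypothetical organizational namespace


def uri(value):
    """Format a URI as an N-Triples term."""
    return "<" + value + ">"


def literal(text):
    """Format a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


statements = [
    (uri(EX + "person/1"), uri(RDF_TYPE), uri(PIMO + "Person")),
    (uri(EX + "doc/42"), uri(DC + "title"), literal("Annual Report")),
    (uri(EX + "doc/42"), uri(DC + "creator"), uri(EX + "person/1")),
]


def to_ntriples(rows):
    """Serialize (subject, predicate, object) rows, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(statements))
```

The third statement links the document to the person, which is the kind of cross-component relationship the model relies on.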
4. Conclusion
The framework discussed in this article is based on semantic web concepts, which are helpful for any organization that wishes to go in for automation, be it a university, a business house, or a ministry. The framework divides the system into different components and applies standards for the categorization and description of each and every object. The standard framework can work across organizations for object description and categorization, which will lead to interoperability in the functioning of organizations. It will further enable the policymakers of the organizations to extract and reuse data. Dependence on third-party individuals or companies for data collection can be dispensed with, and time and effort will also be saved. Establishment of such a system at the national level with a standard framework may also prove very helpful in e-governance.
5. References
DCMI Metadata Basics. Available at <http://dublincore.org/metadata-basics/>, last accessed on March 11, 2016.
Dublin Core Metadata Initiative. Available at <http://dublincore.org/>, last accessed on March 9, 2016.
Gruber, Thomas R. 1993. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition 5(2):199-220. Available at <http://tomgruber.org/writing/ontolinguakaj-1993.pdf>, last accessed on March 9, 2016.
Nepomuk Representational Language Specification. Available at <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl/>, last accessed on March 9, 2016.
Ontology Development 101: A Guide to Creating Your First Ontology. Available at <http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html>, last accessed on March 11, 2016.
Personal Information Model (PIMO) Ontology Guide, NEPOMUK Recommendation v1.1. Available at <https://nlp.fi.muni.cz/projects/ole/deri/pimo.pdf>, last accessed on March 9, 2016.
RDF - Semantic Web Standards. Available at <https://www.w3.org/RDF/>, last accessed on March 11, 2016.
Semantic Web - W3C. Available at <https://www.w3.org/standards/semanticweb/>, last accessed on March 9, 2016.
Dimple Patel
Assistant Professor, Department of Library & Information Science, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh
(E): [email protected]
Aditya Tripathi
Associate Professor, Department of Library & Information Science, Banaras Hindu University, Varanasi
(E): [email protected]
Copyright The Energy and Resources Institute Dec 2016