Ontology learning
With the basic concepts of Ontologies, and their significance in building intelligent systems, covered in this chapter, it is imperative for a seamlessly connected world that knowledge assets are consistently represented as domain Ontologies. However, creating domain-specific Ontologies by hand requires a great deal of effort, validation, and approval. Ontology learning is an attempt to automate the generation of Ontologies by applying algorithmic approaches to natural language text, which is available at internet scale. There are various approaches to Ontology learning, as follows:
- Ontology learning from text: In this approach, textual data is extracted from various sources in an automated manner, and keywords are extracted and classified based on their frequency of occurrence, word sequencing, and patterns (a minimal extraction sketch follows this list).
- Linked data mining: In this process, links are identified in published RDF graphs in order to derive Ontologies based on implicit reasoning (see the RDF sketch after this list).
- Concept learning from OWL: In this approach, existing domain-specific Ontologies are leveraged to extend coverage to new domains using an algorithmic approach.
- Crowdsourcing: This approach combines automated Ontology extraction and discovery based on textual analysis with collaboration with domain experts to define new Ontologies. It works well because it pairs the processing power and algorithmic approaches of machines with the domain expertise of people, which improves both speed and accuracy.
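To make the text-based approach concrete, here is a minimal sketch (not from the book) of how candidate terms might be extracted from raw text using frequency of occurrence and simple word-sequence (bigram) patterns. The `extract_candidate_terms` function, the tiny stop list, and the sample sentence are illustrative assumptions; a real pipeline would add part-of-speech filtering and classification of the candidates.

```python
import re
from collections import Counter
from itertools import islice

def extract_candidate_terms(text, top_n=10):
    """Extract candidate Ontology terms from raw text using
    frequency and adjacency (bigram) statistics."""
    # Tokenize into lowercase words, dropping punctuation and digits.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Remove very common function words (a tiny illustrative stop list).
    stopwords = {"the", "a", "an", "of", "and", "in", "is", "to", "for"}
    tokens = [t for t in tokens if t not in stopwords]

    # Single-word candidates ranked by frequency of occurrence.
    unigrams = Counter(tokens)
    # Two-word sequences capture simple word-ordering patterns
    # such as "knowledge graph" or "data source".
    bigrams = Counter(zip(tokens, islice(tokens, 1, None)))

    return unigrams.most_common(top_n), bigrams.most_common(top_n)

if __name__ == "__main__":
    sample = ("An ontology describes the concepts and relationships of a domain. "
              "A knowledge graph links concepts across data sources, and a "
              "knowledge graph is built from domain concepts and relationships.")
    terms, phrases = extract_candidate_terms(sample)
    print(terms)
    print(phrases)
```

The ranked unigrams and bigrams would then be reviewed or classified into entities, attributes, and relationships before being promoted into an Ontology.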
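Similarly, the linked data mining approach can be hinted at with the following sketch, which loads a published RDF graph using the rdflib library and inspects class declarations and rdfs:subClassOf links as raw material for deriving an Ontology. The dataset URL is a placeholder, and real linked data mining would apply reasoning over the discovered links rather than simply printing them.

```python
from rdflib import Graph
from rdflib.namespace import RDF, RDFS

g = Graph()
# Placeholder URL; any published RDF/Turtle dataset could be used here.
g.parse("https://example.org/dataset.ttl", format="turtle")

# Resources explicitly declared as classes are direct Ontology candidates.
candidate_classes = set(g.subjects(RDF.type, RDFS.Class))

# Explicit subclass links give the skeleton of a concept hierarchy.
for sub, sup in g.subject_objects(RDFS.subClassOf):
    print(f"{sub} is a subclass of {sup}")

# Type assertions on instances hint at further (implicit) classes.
for instance, cls in g.subject_objects(RDF.type):
    candidate_classes.add(cls)

print(f"Found {len(candidate_classes)} candidate classes")
```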
Here are some of the challenges of Ontology learning:
- Dealing with heterogeneous data sources: The data sources on the internet, and within application data stores, differ in their forms and representations. Because of this heterogeneity, Ontology learning faces the challenge of extracting knowledge, and consistent meaning, from them.
- Uncertainty and lack of accuracy: Due to the inconsistent data sources, when Ontology learning attempts to define Ontology structures, there is a level of uncertainty about the intent and representation of entities and attributes. This results in lower accuracy and requires intervention from human domain experts for realignment.
- Scalability: One of the primary sources for Ontology learning is the internet, which is an ever-growing knowledge repository. The internet is also, for the most part, an unstructured data source, which makes it difficult to scale the Ontology learning process to cover the breadth of a domain from large text extracts. One way to address scalability is to leverage open source, distributed computing frameworks (such as Hadoop); see the sketch after this list.
- Need for post-processing: While Ontology learning is intended to be an automated process, a level of post-processing is required to overcome quality issues. This post-processing needs to be planned and governed in detail in order to optimize the speed and accuracy of new Ontology definitions.
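As an illustration of the scalability point above, term extraction can be distributed with Hadoop Streaming, which runs plain scripts as map and reduce tasks over text stored in HDFS. The sketch below is a minimal, assumed term-count job (the file name, paths, and jar name in the comment are illustrative), not a complete Ontology learning pipeline.

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming job for distributed term counting.

Illustrative invocation (paths and jar name are placeholders):
  hadoop jar hadoop-streaming.jar \
    -input /corpus -output /term-counts \
    -mapper "term_count.py map" -reducer "term_count.py reduce" \
    -file term_count.py
"""
import re
import sys

def map_phase():
    # Emit "term<TAB>1" for every word in every input line.
    for line in sys.stdin:
        for term in re.findall(r"[a-z]+", line.lower()):
            print(f"{term}\t1")

def reduce_phase():
    # Hadoop sorts mapper output by key, so equal terms arrive together.
    current, count = None, 0
    for line in sys.stdin:
        term, value = line.rstrip("\n").split("\t")
        if term != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = term, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    map_phase() if sys.argv[1:] == ["map"] else reduce_phase()
```

Splitting the work into independent map tasks lets the same extraction logic run over an internet-scale corpus, with the framework handling partitioning, sorting, and fault tolerance.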