Unstructured

Unstructured data consists of any dataset that does not have a predefined organizational schema, unlike the table in the prior section. Spoken words, music, videos, and even books, including this one, would be considered unstructured. This by no means implies that the content doesn’t have organization. Indeed, a book has a table of contents, chapters, subchapters, and an index; in that sense, it follows a definite organization.

However, it would be futile to represent every word and sentence as being part of a strict set of rules. A sentence can consist of words, numbers, punctuation marks, and so on, and it does not have a predefined data type the way a spreadsheet column does. To be structured, the book would need to have an exact set of characteristics in every sentence, which would be both unreasonable and impractical.

Data from social media, such as posts on Twitter, messages from friends on Facebook, and photos on Instagram, are all examples of unstructured data.

Unstructured data can be stored in various formats. It can be stored as blobs (binary large objects) or, in the case of textual data, as freeform text held in a data storage medium. For textual data, technologies such as Lucene/Solr, Elasticsearch, and others are generally used to index, query, and perform other operations on the text.
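To make the idea of indexing and querying freeform text more concrete, here is a minimal sketch of an inverted index in plain Python. It is not the Lucene, Solr, or Elasticsearch API; it only illustrates the general technique such tools build on, which maps each word to the documents that contain it. The function names (`tokenize`, `build_index`, `search`) and the sample documents are hypothetical and invented for this illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Start from the first token's documents, then intersect with the rest.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents (assumed data, not from the book).
docs = {
    1: "Photos from the beach, posted on Instagram.",
    2: "A message from a friend about the beach trip.",
    3: "Chapter 1: An index of terms used in this book.",
}

index = build_index(docs)
print(sorted(search(index, "beach")))           # [1, 2]
print(sorted(search(index, "the beach trip")))  # [2]
```

Note that the documents themselves have no schema; the structure lives entirely in the index built on top of them. Production search engines add much more on top of this idea, such as relevance ranking and language-aware text analysis.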