
Unstructured

Unstructured data is any dataset that lacks a predefined organizational schema, unlike the table in the prior section. Spoken words, music, videos, and even books, including this one, would be considered unstructured. This by no means implies that the content has no organization. Indeed, a book has a table of contents, chapters, subchapters, and an index; in that sense, it follows a definite organization.

However, it would be futile to force every word and sentence to conform to a strict set of rules. A sentence can consist of words, numbers, punctuation marks, and so on, and it does not have a predefined data type the way spreadsheet columns do. To be structured, the book would need every sentence to have an exact set of characteristics, which would be both unreasonable and impractical.
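As a rough illustration of why sentences resist a fixed schema, the following Python sketch labels each token of a sentence as a word, a number, or a punctuation mark. The helper name `token_kinds` and the sample sentences are assumptions made for this example, and the regular expression is deliberately simplistic.

```python
import re

# Hypothetical tokenizer for illustration only: matches numbers, then words,
# then any single punctuation character.
TOKEN_PATTERN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+(?:'[A-Za-z]+)?|[^\w\s]")

def token_kinds(sentence):
    """Return the kind of each token in the sentence, in order."""
    kinds = []
    for token in TOKEN_PATTERN.findall(sentence):
        if token[0].isdigit():
            kinds.append("number")
        elif token[0].isalpha():
            kinds.append("word")
        else:
            kinds.append("punctuation")
    return kinds

# Two made-up sentences: each yields a different sequence of token kinds.
first = token_kinds("The book has 12 chapters.")
second = token_kinds("Wait, really?")
print(first)
print(second)
```

The two sentences produce sequences of different lengths and different orderings of kinds, whereas every row of a spreadsheet has the same columns with the same types.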

Data from social media, such as posts on Twitter, messages from friends on Facebook, and photos on Instagram, are all examples of unstructured data.

Unstructured data can be stored in various formats: as blobs (binary large objects) or, in the case of textual data, as freeform text held in a data storage medium. For textual data, technologies such as Lucene/Solr, Elasticsearch, and others are generally used to index and query the content, among other operations.
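To give a feel for what indexing and querying freeform text involves, here is a minimal Python sketch of an inverted index, the general data structure that Lucene-based search engines build on. It is a toy illustration and not the API of Lucene, Solr, or Elasticsearch; the function names and the sample posts are assumptions made for this example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical freeform posts, invented for this example.
documents = {
    "post-1": "Photos from the weekend hike",
    "post-2": "New photos of the city at night",
    "post-3": "Weekend plans: hike, then dinner",
}

index = build_index(documents)
print(sorted(search(index, "weekend hike")))  # ['post-1', 'post-3']
print(sorted(search(index, "photos")))        # ['post-1', 'post-2']
```

Production search engines add much more on top of this idea, such as relevance scoring, stemming, and distributed storage, but the core lookup from term to documents is the same in spirit.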
