- Mastering PostgreSQL 10
- Hans-Jürgen Schönig
- 2021-06-30 19:03:59
Understanding full-text search - FTS
If you are looking up names or simple strings, you are usually querying the entire content of a field. Full-text search (FTS) is different. Its purpose is to look for words or groups of words that can be found in a text. FTS is therefore more of a "contains" operation, as you are basically never looking for an exact string.
In PostgreSQL, FTS can be done using GIN indexes. The idea is to dissect a text, extract valuable lexemes (preprocessed tokens of words), and index those elements rather than the underlying text. This preprocessing is what makes the search more successful.
Here is an example:
test=# SELECT to_tsvector('english', 'A car, I want a car. I would not even mind having many cars');
to_tsvector
---------------------------------------------------------------
'car':2,6,14 'even':10 'mani':13 'mind':11 'want':4 'would':8
(1 row)
The example shows a simple sentence. The to_tsvector function takes the string, applies English rules, and performs a stemming process. Based on the configuration (english), PostgreSQL parses the string, throws away stop words, and stems individual words. For example, car and cars are both transformed to car. The numbers after each lexeme are the positions of the word in the original text. Note that this is not about finding the linguistically correct word stem. In the case of many, PostgreSQL simply transforms the string to mani by applying standard rules that work nicely with the English language.
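To connect the stemming step with the GIN index mentioned earlier, here is a minimal sketch. The table t_document, its columns, and the index name are hypothetical and serve only as an illustration. The query word is stemmed the same way as the indexed text, so searching for cars also finds rows containing car:

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE t_document (
    id      serial PRIMARY KEY,
    payload text
);

INSERT INTO t_document (payload)
VALUES ('A car, I want a car. I would not even mind having many cars'),
       ('Bicycles are a fine alternative');

-- Index the lexemes produced by to_tsvector rather than the raw text
CREATE INDEX idx_document_fts
    ON t_document
    USING gin (to_tsvector('english', payload));

-- The WHERE clause repeats the indexed expression so that the
-- planner is able to consider the GIN index
SELECT id, payload
FROM   t_document
WHERE  to_tsvector('english', payload) @@ to_tsquery('english', 'cars');
```

On a table this small, the planner will most likely prefer a sequential scan. The index only pays off once the table has grown.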
Note that the output of the to_tsvector function is highly language dependent. If you tell PostgreSQL to treat the string as dutch, the result will be totally different:
test=# SELECT to_tsvector('dutch', 'A car, I want a car. I would not even mind having many cars');
to_tsvector
-----------------------------------------------------------------
'a':1,5 'car':2,6,14 'even':10 'having':12 'i':3,7 'many':13
'mind':11 'not':9 'would':8
(1 row)
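Because the result depends so heavily on the configuration, it can help to know which one is used when none is passed explicitly. As a small sketch: the one-argument form of to_tsvector falls back to the default_text_search_config setting, which can be inspected and changed per session. The choice of english here is just an example:

```sql
-- Show the configuration used when none is given explicitly
SHOW default_text_search_config;

-- Change it for the current session only
SET default_text_search_config = 'english';

-- With no configuration argument, the session default applies
SELECT to_tsvector('A car, I want a car. I would not even mind having many cars');
```

An expression index should still name the configuration explicitly, since the one-argument form depends on a session setting and therefore cannot be used in an index definition.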
To figure out which configurations are supported, consider running the following query:
SELECT cfgname FROM pg_ts_config;