  • Mastering PostgreSQL 9.6
  • Hans Jurgen Schonig
  • 2021-07-09 19:57:23

Understanding full-text search - FTS

When you look up names or other simple strings, you usually compare against the entire content of a field. Full-text search (FTS) is different: its purpose is to find words or groups of words inside a text. FTS is therefore more of a "contains" operation, as you are basically never looking for an exact string.
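To make the difference concrete, here is a minimal sketch that compares an exact string comparison with a full-text match on the same text. The sample sentence is an assumption chosen for illustration; @@ is PostgreSQL's text search match operator.

```sql
-- Exact comparison: the whole field has to equal the search string.
-- Full-text match: the text only has to contain the (stemmed) word.
SELECT 'I would not even mind having many cars' = 'cars' AS exact_match,
       to_tsvector('english', 'I would not even mind having many cars')
           @@ to_tsquery('english', 'car') AS fts_match;
```

The first column is false because the strings differ, while the second is true because the text contains a word that matches car.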

In PostgreSQL, FTS is typically backed by GIN indexes. The idea is to dissect a text, extract valuable lexemes (normalized word forms), and index those elements rather than the underlying text. To make your search even more successful, those words are preprocessed before they are indexed.
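A minimal sketch of such an index might look as follows. The table t_document, its columns, and the index name are hypothetical and only serve as an illustration.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE t_document (
    id   serial PRIMARY KEY,
    body text
);

-- Expression index over the lexemes extracted from body.
-- The two-argument form of to_tsvector is used because
-- index expressions must be immutable.
CREATE INDEX idx_document_fts
    ON t_document
    USING gin (to_tsvector('english', body));

-- The query has to use the same expression for the index to be considered.
SELECT id
FROM   t_document
WHERE  to_tsvector('english', body) @@ to_tsquery('english', 'car');
```

Whether the planner actually uses the index depends on table size and statistics, so checking the plan with EXPLAIN is advisable.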

Here is an example:

test=# SELECT to_tsvector('english', 'A car, I want a car. I would not even mind having many cars'); 
to_tsvector
---------------------------------------------------------------
'car':2,6,14 'even':10 'mani':13 'mind':11 'want':4 'would':8
(1 row)

The example shows a simple sentence. The to_tsvector function takes the string, applies English rules, and performs a stemming process. Based on the configuration (english), PostgreSQL parses the string, throws away stop words, and stems the individual words. For example, car and cars are both transformed to car. The numbers after each lexeme are the positions of the word in the original text. Note that this is not about finding the linguistically correct word stem: in the case of many, PostgreSQL simply transforms the string to mani by applying standard rules that work nicely with the English language.
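To inspect the stemming step in isolation, the ts_lexize function can be called on a single word. The following sketch assumes the english_stem dictionary that ships with a default PostgreSQL installation:

```sql
-- Plural form: the listing above shows cars being reduced to car.
SELECT ts_lexize('english_stem', 'cars');

-- Rule-based stemming: the listing above shows many becoming mani.
SELECT ts_lexize('english_stem', 'many');

-- Stop word: the dictionary recognizes it and returns an empty array.
SELECT ts_lexize('english_stem', 'a');
```

Note that ts_lexize works on one token at a time and does not parse text, so it is a debugging aid rather than a replacement for to_tsvector.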

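Note that the output of the to_tsvector function is highly language-dependent. If you tell PostgreSQL to treat the string as Dutch, the result is totally different, because English stop words such as a, I, and not are no longer filtered out and the English stemming rules no longer apply: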

test=# SELECT to_tsvector('dutch', 'A car, I want a car. I would not even mind having many cars'); 
to_tsvector
-----------------------------------------------------------------
'a':1,5 'car':2,6,14 'even':10 'having':12 'i':3,7 'many':13
'mind':11 'not':9 'would':8
(1 row)

To figure out which configurations are supported, consider running the following query:

SELECT cfgname FROM pg_ts_config;
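
In psql, the \dF command lists the same configurations. It can also be useful to know which configuration is applied when none is named explicitly. A small sketch, assuming a session with default settings:

```sql
-- Shows the configuration used when to_tsvector is called
-- without an explicit configuration argument.
SHOW default_text_search_config;

-- One-argument form: relies on default_text_search_config.
SELECT to_tsvector('A car, I want a car.');
```

Because the one-argument form depends on a session setting, naming the configuration explicitly keeps results reproducible across sessions.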