- Mastering PostgreSQL 9.6
- Hans Jurgen Schonig
- 294 words
- 2021-07-09 19:57:23
Understanding full-text search - FTS
When you look up a name or a simple string, you usually compare against the entire content of a field. Full-text search (FTS) is different: its purpose is to find words or groups of words inside a text. FTS is therefore more of a "contains" operation, as you are basically never looking for an exact string.
In PostgreSQL, FTS can be done using GIN indexes. The idea is to dissect a text, extract valuable lexemes, and index those elements rather than the underlying text. To make searches more successful, those words are preprocessed (normalized) before they are indexed.
Here is an example:
test=# SELECT to_tsvector('english', 'A car, I want a car. I would not even mind having many cars');
to_tsvector
---------------------------------------------------------------
'car':2,6,14 'even':10 'mani':13 'mind':11 'want':4 'would':8
(1 row)
The example shows a simple sentence. The to_tsvector function takes the string, applies English rules, and performs a stemming process. Based on the configuration (english), PostgreSQL parses the string, throws away stop words, and stems individual words. For example, car and cars are both transformed to car. Note that stemming here does not mean finding the linguistically correct word stem: in the case of many, PostgreSQL simply transforms the string to mani by applying standard rules that work nicely with the English language. The numbers after each lexeme are the positions of the corresponding words in the original text.
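To look at these steps one word at a time, a small sketch along the following lines may help. It assumes the stock english configuration and its english_stem Snowball dictionary as shipped with PostgreSQL; the sample sentence in the last query is made up for illustration.

```sql
-- Ask the English Snowball stemmer what it does with single words.
SELECT ts_lexize('english_stem', 'cars');   -- {car}
SELECT ts_lexize('english_stem', 'many');   -- {mani}
SELECT ts_lexize('english_stem', 'a');      -- {} (a stop word, so it is dropped)

-- Query terms are normalized the same way as the document, so a search
-- for "car" matches a text that only contains "cars".
SELECT to_tsvector('english', 'I would not mind having many cars')
       @@ to_tsquery('english', 'car');     -- t
```

The last query is the "contains" behavior described earlier: the @@ operator checks whether the normalized document contains the normalized search term, not whether the strings are equal.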
Note that the output of the to_tsvector function is highly language-dependent. If you tell PostgreSQL to treat the string as Dutch, the result is quite different: English stop words such as a, i, and not are no longer removed, and many is left unstemmed:
test=# SELECT to_tsvector('dutch', 'A car, I want a car. I would not even mind having many cars');
to_tsvector
-----------------------------------------------------------------
'a':1,5 'car':2,6,14 'even':10 'having':12 'i':3,7 'many':13
'mind':11 'not':9 'would':8
(1 row)
To figure out which configurations are supported, consider running the following query:
SELECT cfgname FROM pg_ts_config;
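Coming back to the GIN indexes mentioned at the start of this section, here is a minimal sketch of how the pieces might fit together. The table t_doc, its column body, the index name idx_doc_fts, and the sample rows are hypothetical names and data chosen for illustration; to_tsvector, to_tsquery, the @@ operator, and USING gin are the actual PostgreSQL features involved.

```sql
-- Hypothetical table holding the text to be searched.
CREATE TABLE t_doc (id serial PRIMARY KEY, body text);

INSERT INTO t_doc (body) VALUES
    ('A car, I want a car.'),
    ('I would not even mind having many cars'),
    ('Bicycles are fine too');

-- Index the lexemes rather than the raw text. The configuration is spelled
-- out explicitly because an index expression has to be immutable.
CREATE INDEX idx_doc_fts ON t_doc USING gin (to_tsvector('english', body));

-- The WHERE clause repeats the indexed expression so that the planner
-- is able to consider the index for this search.
SELECT id, body
FROM   t_doc
WHERE  to_tsvector('english', body) @@ to_tsquery('english', 'car');
```

With these three rows, the query should return the first two, since both contain a word that is normalized to car. On a table this small the planner will most likely prefer a sequential scan; the index only pays off as the amount of text grows.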