- Solr Cookbook (Third Edition)
- Rafał Kuć
- 2021-08-06 19:39:24
Updating document fields
Imagine that you have a system that stores the documents your users upload. In addition to this, users can give other users access to the files they uploaded. Before Solr 4, you had to reindex the whole document to update it. Since the release of Solr 4, we can update a single field, provided that we fulfill some basic requirements. This recipe will show you how to do this.
How to do it...
Let's look at the steps we need to take to update the document field:
- For the purpose of the recipe, let's assume we have the following index structure (put the following entries into your schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="file" type="text_general" indexed="true" stored="true"/>
<field name="count" type="int" indexed="true" stored="true"/>
<field name="user" type="string" indexed="true" stored="true" multiValued="true" />
- In addition to this, we need the _version_ field:
<field name="_version_" type="long" indexed="true" stored="true"/>
That's all when it comes to the schema.xml file.
- In addition to this, let's assume we have the following data indexed:
<add>
  <doc>
    <field name="id">1</field>
    <field name="file">Sample file</field>
    <field name="count">2</field>
    <field name="user">gro</field>
    <field name="user">negativ</field>
  </doc>
</add>
- So, we have a sample file and two usernames specifying which users in our system can access the file. However, what if we want to add another user called jack? Is it possible? To add the value to a field that has multiple values, we should send the following command:
curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","user":{"add":"jack"}}]'
Let's see if it works by sending the following query:
http://localhost:8983/solr/cookbook/select?q=*:*&indent=true
The response sent by Solr is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="file">Sample file</str>
      <int name="count">2</int>
      <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
      </arr>
      <long name="_version_">1467522939960164352</long>
    </doc>
  </result>
</response>
As you can see, it works without any problems.
- Now, imagine that one of the users changed the name of the document, and we want to update the file field of this document to match the change. In order to do so, we should send the following command:
curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","file":{"set":"New file name"}}]'
- Again, we send the same query as before to see if the command succeeds:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="file">New file name</str>
      <int name="count">2</int>
      <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
      </arr>
      <long name="_version_">1467522994255429632</long>
    </doc>
  </result>
</response>
- Finally, let's increment the count field, which specifies how many times the file is accessed. To do this, we run the following command:
curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","count":{"inc":1}}]'
- Again, we send the same query as before to see if the command succeeds:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="file">New file name</str>
      <int name="count">3</int>
      <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
      </arr>
      <long name="_version_">1467523747367878656</long>
    </doc>
  </result>
</response>
Again, the command works well. So, let's see how Solr does this.
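The checks done by eye in the steps above can also be scripted. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; parse_docs is a hypothetical helper name, and the response text is a trimmed copy of the last response shown in the recipe (the XML declaration and the responseHeader section are left out for brevity).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the last response from the recipe (assumption: the XML
# declaration and responseHeader are omitted, they are not needed here).
RESPONSE_XML = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">1</str>
      <str name="file">New file name</str>
      <int name="count">3</int>
      <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
      </arr>
      <long name="_version_">1467523747367878656</long>
    </doc>
  </result>
</response>"""


def parse_docs(xml_text):
    """Turn the <doc> elements of a Solr XML response into plain dicts."""
    root = ET.fromstring(xml_text)
    docs = []
    for doc_el in root.findall("./result/doc"):
        doc = {}
        for child in doc_el:
            name = child.get("name")
            if child.tag == "arr":
                # Multivalued field: collect the text of every nested value.
                doc[name] = [item.text for item in child]
            elif child.tag in ("int", "long"):
                doc[name] = int(child.text)
            else:
                doc[name] = child.text
        docs.append(doc)
    return docs


docs = parse_docs(RESPONSE_XML)
print(docs[0]["file"], docs[0]["count"], docs[0]["user"])
```

Comparing the parsed values against what each update was meant to change gives a repeatable check instead of a visual one.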
How it works...
As you can see, the index structure is pretty simple; we have the document identifier, the file name, the access counter, and the users that can access the file. All the fields in the index are marked as stored (stored="true"). This is required for the partial update functionality to work because, under the hood, Solr reads all the stored field values of the document, changes the one we tell it to update, and indexes the result. So, it is just typical document indexing, but instead of you having to provide all the information, it's Solr's responsibility to get it from the index.
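To make the read, modify, and reindex idea concrete, here is a minimal in-memory sketch in Python. It is not Solr's implementation; apply_atomic_update is a hypothetical name, and the stored document is modeled as a plain dictionary. It only mimics the three modifiers used in this recipe.

```python
import copy


def apply_atomic_update(stored_doc, update):
    """Simulate a partial update: start from the stored values, change only
    the fields named in the update, and return the full new document."""
    new_doc = copy.deepcopy(stored_doc)
    for field, change in update.items():
        if field == "id":
            continue  # the identifier only selects the document
        for modifier, value in change.items():
            if modifier == "set":
                new_doc[field] = value
            elif modifier == "add":
                current = new_doc.get(field, [])
                if not isinstance(current, list):
                    current = [current]
                extra = value if isinstance(value, list) else [value]
                new_doc[field] = current + extra
            elif modifier == "inc":
                new_doc[field] = new_doc.get(field, 0) + value
            else:
                raise ValueError(f"unsupported modifier: {modifier}")
    return new_doc


stored = {"id": "1", "file": "Sample file", "count": 2, "user": ["gro", "negativ"]}
after_add = apply_atomic_update(stored, {"id": "1", "user": {"add": "jack"}})
after_set = apply_atomic_update(after_add, {"id": "1", "file": {"set": "New file name"}})
after_inc = apply_atomic_update(after_set, {"id": "1", "count": {"inc": 1}})
print(after_inc)
```

The sketch also shows why unstored fields would be a problem: anything missing from the stored dictionary could not be carried over into the new document.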
Another thing that is required for the atomic update functionality to work is the _version_ field. You don't have to set it during indexing; it is used internally by Solr. The example data we index is also very simple. It is a single document with two users defined.
The interesting stuff comes with the update command. As you can see, this command is run against the standard update handler, the same one you send documents to for indexing. The commit=true parameter tells Solr to perform the commit operation right after the update. The -H 'Content-type:application/json' part sets the HTTP header that tells Solr the request body is JSON.
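For readers who prefer to script the request instead of using curl, the following is a rough Python equivalent built with the standard urllib modules. It assumes the same host, port, and cookbook core as the curl examples; build_update_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same Solr location and core name as in the curl examples.
SOLR_UPDATE_URL = "http://localhost:8983/solr/cookbook/update"


def build_update_request(docs, commit=True):
    """Build (but do not send) a POST request for the update handler."""
    query = urllib.parse.urlencode({"commit": "true" if commit else "false"})
    return urllib.request.Request(
        url=f"{SOLR_UPDATE_URL}?{query}",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_update_request([{"id": "1", "user": {"add": "jack"}}])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running Solr: urllib.request.urlopen(request)
```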
Next, we have the request contents. It is sent as a JSON array that holds a single object. We specify that we are interested in the document with the identifier 1 ("id":"1"). We want to change the user field and add the jack value to this field (the add command). So, as you can see, the add command is used when we want to add a new value to a field that can hold multiple values.
The second command shows how to change the value of a single-valued field. It is very similar to what we had before, but instead of using the add command, we use the set command. Again, as you can see, it works perfectly.
The third command shown in the recipe illustrates how to increment a field. We can run this command against any numeric field. We need to use the inc command and specify a number that will be added to the value of the field in the index. In our case, we add 1.
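The three request bodies share one shape: the document identifier plus a field whose value is an object mapping a modifier to a value. A small Python sketch of a builder for that shape is shown here; atomic_update and ALLOWED_MODIFIERS are hypothetical names introduced only for illustration.

```python
import json

# Only the modifiers demonstrated in this recipe.
ALLOWED_MODIFIERS = ("add", "set", "inc")


def atomic_update(doc_id, field, modifier, value):
    """Return the request body for a single-field update as a Python list."""
    if modifier not in ALLOWED_MODIFIERS:
        raise ValueError(f"unsupported modifier: {modifier}")
    return [{"id": doc_id, field: {modifier: value}}]


bodies = [
    atomic_update("1", "user", "add", "jack"),
    atomic_update("1", "file", "set", "New file name"),
    atomic_update("1", "count", "inc", 1),
]
for body in bodies:
    # Compact separators reproduce the exact strings passed to curl -d.
    print(json.dumps(body, separators=(",", ":")))
```

Generating the body with json.dumps rather than string concatenation avoids quoting mistakes when a value contains quotes or non-ASCII characters.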
Note that apart from the add, set, and inc commands, we can also remove given values from a field (the remove command) or remove the values that match a regular expression (the removeregex command). The number of commands can grow with time, so keep an eye on https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents.
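As an illustration of what the two removal commands do to a multivalued field, here is a small in-memory Python sketch. It is a simulation, not Solr code: remove_values and remove_regex are hypothetical names, the sketch assumes a value is dropped only when the whole value matches the pattern, and Python's regex dialect differs from Java's in some details. The corresponding request body would follow the same shape as before, for example [{"id":"1","user":{"remove":"jack"}}].

```python
import re


def remove_values(values, to_remove):
    """Drop every occurrence of the given value(s) from a list of values."""
    targets = to_remove if isinstance(to_remove, list) else [to_remove]
    return [v for v in values if v not in targets]


def remove_regex(values, patterns):
    """Drop every value that matches one of the patterns.

    Assumption: the whole value has to match, hence fullmatch.
    """
    if not isinstance(patterns, list):
        patterns = [patterns]
    compiled = [re.compile(p) for p in patterns]
    return [v for v in values if not any(c.fullmatch(v) for c in compiled)]


users = ["gro", "negativ", "jack"]
print(remove_values(users, "jack"))
print(remove_regex(users, "n.*"))
```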