Updating document fields

Imagine a system that stores documents uploaded by your users. Users can also grant other users access to the files they uploaded. Before Solr 4, changing anything in a document meant reindexing the whole document. Starting with Solr 4, we can update a single field, provided that a few basic requirements are met. This recipe shows you how to do this.

How to do it...

Let's look at the steps we need to take to update the document field:

  1. For the purpose of the recipe, let's assume we have the following index structure (put the following entries into your schema.xml file):
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="file" type="text_general" indexed="true" stored="true"/>
    <field name="count" type="int" indexed="true" stored="true"/>
    <field name="user" type="string" indexed="true" stored="true" multiValued="true" />
  2. In addition to this, we need the _version_ field:
    <field name="_version_" type="long" indexed="true" stored="true"/>
    

    That's all when it comes to the schema.xml file.

  3. In addition to this, let's assume we have the following data indexed:
    <add>
     <doc>
      <field name="id">1</field>
      <field name="file">Sample file</field>
      <field name="count">2</field>
      <field name="user">gro</field>
      <field name="user">negativ</field>
     </doc>
    </add>
  4. So, we have a sample file and two usernames specifying which users in our system can access the file. However, what if we want to add another user called jack? Is it possible? To add the value to a field that has multiple values, we should send the following command:
    curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","user":{"add":"jack"}}]'
    

    Let's see if it works by sending the following query:

    http://localhost:8983/solr/cookbook/select?q=*:*&indent=true

    The response sent by Solr is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <response>
     <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">0</int>
      <lst name="params">
       <str name="q">*:*</str>
      </lst>
     </lst>
     <result name="response" numFound="1" start="0">
      <doc>
       <str name="id">1</str>
       <str name="file">Sample file</str>
       <int name="count">2</int>
       <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
       </arr>
       <long name="_version_">1467522939960164352</long></doc>
     </result>
    </response>

    As you can see, it works without any problems.

  5. Now, imagine that one of the users changed the name of the document, and we want to update the file field of this document to match the change. In order to do so, we should send the following command:
    curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","file":{"set":"New file name"}}]'
    
  6. Again, we send the same query as before to see if the command succeeded. The response is as follows:
    <?xml version="1.0" encoding="UTF-8"?>
    <response>
     <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">1</int>
      <lst name="params">
       <str name="q">*:*</str>
      </lst>
     </lst>
     <result name="response" numFound="1" start="0">
      <doc>
       <str name="id">1</str>
       <str name="file">New file name</str>
       <int name="count">2</int>
       <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
       </arr>
       <long name="_version_">1467522994255429632</long></doc>
     </result>
    </response>
  7. Finally, let's increment the count field, which specifies how many times the file is accessed. To do this, we run the following command:
    curl 'localhost:8983/solr/cookbook/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","count":{"inc":1}}]'
    
  8. Again, we send the same query as before to see if the command succeeded. The response is as follows:
    <?xml version="1.0" encoding="UTF-8"?>
    <response>
     <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">1</int>
      <lst name="params">
       <str name="q">*:*</str>
      </lst>
     </lst>
     <result name="response" numFound="1" start="0">
      <doc>
       <str name="id">1</str>
       <str name="file">New file name</str>
       <int name="count">3</int>
       <arr name="user">
        <str>gro</str>
        <str>negativ</str>
        <str>jack</str>
       </arr>
       <long name="_version_">1467523747367878656</long></doc>
     </result>
    </response>

Again, the command works well. So, let's see how Solr does this.

How it works...

The index structure is simple: each document has an identifier, the file name, an access counter, and the list of users who can access the file. All the fields in the index are marked as stored (stored="true"). This is required for the partial update functionality to work because, under the hood, Solr reads the stored values of all the fields, applies the change to the field we tell it to update, and indexes the resulting document again. So, it is just typical document indexing, but instead of you having to provide all the information, it's Solr's responsibility to get it from the index.
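The reason why stored values matter can be sketched with a small Python model. This is only a conceptual illustration, not Solr's actual implementation: the `SCHEMA` dictionary, the helper functions, and the non-stored `notes` field are all hypothetical and are not part of the recipe's schema.

```python
import copy

# Hypothetical schema model: field name -> whether the field is stored.
# The "notes" field is invented here to show what happens to non-stored data.
SCHEMA = {"id": True, "file": True, "count": True, "user": True, "notes": False}


def stored_view(doc, schema):
    """Return only the values that could be read back from the index."""
    return {
        name: copy.deepcopy(value)
        for name, value in doc.items()
        if schema.get(name, False)
    }


def rebuild_with_set(original_doc, schema, field, new_value):
    """Model a 'set' update: start from the stored values, replace one field."""
    rebuilt = stored_view(original_doc, schema)
    rebuilt[field] = new_value
    return rebuilt


original = {
    "id": "1",
    "file": "Sample file",
    "count": 2,
    "user": ["gro", "negativ"],
    "notes": "indexed but not stored",  # hypothetical non-stored field
}

rebuilt = rebuild_with_set(original, SCHEMA, "file", "New file name")
print(rebuilt)
```

In this model, the rebuilt document keeps `id`, `count`, and `user` because they can be read back, while the hypothetical `notes` value is lost. That is why the recipe marks every field as stored.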

Another thing that is required for the partial update functionality to work is the _version_ field. You don't have to set it during indexing; it is used internally by Solr. The example data we index is also very simple: a single document with two users defined.

The interesting stuff comes with the update command. This command is sent to the standard update handler, the same one you use for indexing. The commit=true parameter tells Solr to perform the commit operation right after the update. The -H 'Content-type:application/json' part sets the Content-Type HTTP header, which tells Solr that the request body is JSON.
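For readers who prefer to send the update from code rather than curl, the same request can be sketched with Python's standard library. The URL mirrors the one used in the recipe and assumes a local Solr instance with a core named cookbook. The `build_update_request` and `send` helpers are illustrative names, not part of any Solr client.

```python
import json
import urllib.request

# Assumed to match the recipe's curl command: local Solr, core named "cookbook".
UPDATE_URL = "http://localhost:8983/solr/cookbook/update?commit=true"


def build_update_request(commands, url=UPDATE_URL):
    """Build (but do not send) a POST request carrying the JSON update commands."""
    body = json.dumps(commands).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the response body (needs a running Solr)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


request = build_update_request([{"id": "1", "user": {"add": "jack"}}])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With a running Solr instance, the update could be sent with: print(send(request))
```

Building the request separately from sending it makes it easy to inspect the URL, header, and body before anything reaches Solr.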

Next, we have the request contents. It is sent as a JSON array that contains a single object describing the update. We specify that we are interested in the document with identifier 1 ("id":"1"). We want to change the user field and add the jack value to this field (the add command). So, the add command is used when we want to add a new value to a field that can hold multiple values.

The second command shows how to change the value of a single-valued field. It is very similar to what we had before, but instead of using the add command, we use the set command. Again, as you can see, it works perfectly.

The third command shown in the recipe illustrates how to increment a field. We can run this command against any numeric field. We need to use the inc command and specify a number that will be added to the value of the field in the index. In our case, we add 1.
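The effect of the three commands can be summarized with a small local simulation. This is a minimal sketch of the semantics described above, not Solr code: the `apply_commands` function is hypothetical and works on plain Python dictionaries.

```python
def apply_commands(doc, update):
    """Apply add/set/inc commands from one update object to a copy of doc."""
    result = {
        name: (list(value) if isinstance(value, list) else value)
        for name, value in doc.items()
    }
    for field, change in update.items():
        if not isinstance(change, dict):
            continue  # plain values such as "id" only identify the document
        for command, value in change.items():
            if command == "add":
                current = result.get(field, [])
                if not isinstance(current, list):
                    current = [current]
                result[field] = current + [value]
            elif command == "set":
                result[field] = value
            elif command == "inc":
                result[field] = result.get(field, 0) + value
            else:
                raise ValueError("unsupported command: " + command)
    return result


doc = {"id": "1", "file": "Sample file", "count": 2, "user": ["gro", "negativ"]}

# The three update bodies used in the recipe, applied in order.
updates = [
    {"id": "1", "user": {"add": "jack"}},
    {"id": "1", "file": {"set": "New file name"}},
    {"id": "1", "count": {"inc": 1}},
]
for update in updates:
    doc = apply_commands(doc, update)

print(doc)
```

After the three updates, the simulated document holds the same field values as the last response shown in the recipe: the new file name, a count of 3, and three users.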

Note that apart from the add, set, and inc commands, we can also remove given values from a field (the remove command) or remove the values that match a regular expression (the removeregex command). The number of commands can grow with time, so keep an eye on https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents.
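The two removal commands can be illustrated in the same spirit. The sketch below is a local model only, and it makes two assumptions: that a pattern has to match the whole value for that value to be removed, and that Python's `re` syntax is close enough to the regular expression syntax Solr uses for these simple patterns. The function names are hypothetical. The corresponding request bodies would look like `[{"id":"1","user":{"remove":"jack"}}]` and `[{"id":"1","user":{"removeregex":"g.*"}}]`.

```python
import re


def remove_values(values, to_remove):
    """Model of 'remove': drop every value equal to one of the given values."""
    targets = to_remove if isinstance(to_remove, list) else [to_remove]
    return [value for value in values if value not in targets]


def remove_matching(values, patterns):
    """Model of 'removeregex': drop values fully matched by any pattern.

    Full-match semantics are an assumption made for this sketch.
    """
    pattern_list = patterns if isinstance(patterns, list) else [patterns]
    compiled = [re.compile(pattern) for pattern in pattern_list]
    return [
        value
        for value in values
        if not any(regex.fullmatch(value) for regex in compiled)
    ]


users = ["gro", "negativ", "jack"]
print(remove_values(users, "jack"))
print(remove_matching(users, "g.*"))
```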
