Can't create index using custom analyzer with ES 0.90.0RC1

Hi all,

I'm using a custom analyzer which is essentially a list of my own
stopwords. I use this in string fields (using multi-field to do
language-specific analysis on the 1st sub-field and then my own generic
stopword analysis on the 2nd sub-field).
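
For readers following along, here is a minimal sketch of what such a setup
could look like as a create-index body. The stopword list, the analyzer and
filter names (my_stop, my_stop_analyzer), the field name (title) and the
choice of the built-in "english" analyzer are all assumptions for
illustration; the actual schema is only in the gist and is not reproduced
here.

```python
import json

# Hypothetical stopword list; the original post does not show the real one.
MY_STOPWORDS = ["foo", "bar", "baz"]


def build_index_body(stopwords):
    """Return a create-index body with a custom stopword analyzer
    and a multi_field mapping that uses it on the 2nd sub-field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Stop filter carrying the custom stopword list.
                    "my_stop": {"type": "stop", "stopwords": list(stopwords)}
                },
                "analyzer": {
                    # Custom analyzer: standard tokenizer, lowercase, then stopwords.
                    "my_stop_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_stop"],
                    }
                },
            }
        },
        "mappings": {
            "item": {
                "properties": {
                    "title": {
                        "type": "multi_field",
                        "fields": {
                            # 1st sub-field: language-specific analysis.
                            "title": {"type": "string", "analyzer": "english"},
                            # 2nd sub-field: generic stopword analysis.
                            "generic": {
                                "type": "string",
                                "analyzer": "my_stop_analyzer",
                            },
                        },
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(MY_STOPWORDS), indent=2))
```

The printed JSON is the kind of body that would be sent when the index is
created; the multi_field type shown here is the pre-1.0 mapping syntax.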

I've been using this since ES 0.20.x and each time simply upgraded my
instance (single node in a cluster). After each upgrade the analyzer tests
fine (e.g. the indexes are all there, counts are correct and searches work
as expected).

To do some data migration, I've run my schema creation script on ES
0.90.0RC1 and get no errors on creation. However, setting the gateway log
to TRACE, I see lots of:
BroadcastShardOperationFailedException[[users][3] No active shard(s)]

When I go to insert data into the index (or do any kind of operation on the
index, e.g. count), I get (after a 1 minute wait):

{"error":"UnavailableShardsException[[items][2] [2] shardIt, [0] active :
Timeout waiting for [1m], request: index
{[items][item][_XsJJvbXSoqRQqzhgKevUQ],
source[{"item_id":"1"}]}]","status":503}

No entry is seen in the ES logs while this is happening.
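
One way to surface this kind of problem without waiting for the 1 minute
timeout is to ask the cluster health API about the index right after the
schema script runs. The sketch below only builds the URL and interprets an
already-parsed response; the base URL, the 5s timeout and the helper names
are assumptions, not something taken from the gist.

```python
from urllib.parse import quote, urlencode


def health_url(base_url, index, timeout="5s"):
    """Build a cluster-health URL that waits briefly for the index
    to reach at least yellow status."""
    query = urlencode({"wait_for_status": "yellow", "timeout": timeout})
    return "{}/_cluster/health/{}?{}".format(
        base_url.rstrip("/"), quote(index, safe=""), query
    )


def index_is_usable(health):
    """Interpret a parsed health response: usable when the status is
    yellow or green and the wait did not time out."""
    status_ok = health.get("status") in ("yellow", "green")
    return status_ok and not health.get("timed_out", False)


if __name__ == "__main__":
    print(health_url("http://localhost:9200", "items"))
    # Hypothetical response for an index whose shards never became active.
    print(index_is_usable({"status": "red", "timed_out": True}))
```

A red status or a timed-out wait would point at shards that never started,
which matches the "[0] active" part of the error above.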

I've added a gist (https://gist.github.com/derryos/7a64c1fcc9416f91f561)
where I recreated the flow using the following steps:

  1. Set up a new ES instance (elasticsearch-test)
  2. Use the GitHub example to make a new index (twitter/user/kimchy)
  3. Verify that all works ok (using search/count)
  4. Run my schema/index creation script
  5. Note the errors in the ES log with gateway.local set to TRACE
  6. Try to do insert/search/count operations and note the timeout error
    reported above.
  7. Repeat step 2 using a new index (newindex) and verify that it is working
    ok

So it seems that:
a) Upgrading an index with this analyzer in it works ok
b) The instance itself seems ok for the other indexes (twitter, created
before my index schema creation, and newindex, created after)
c) I can't interact with my newly created indexes, which have 0 data in
them (twitter/newindex are unaffected)

Any help greatly appreciated. I've tested this on multiple machines (over
both windows/OS) and get similar outcomes.

Derry

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

After discussions with clinton on the IRC channel, I've created a test
script which, when run on both 0.20.4 and 0.90.0RC1, shows the issue.

Derry

On Wed, 2013-03-27 at 07:22 -0700, Derry O' Sullivan wrote:

After discussions with clinton on the IRC channel, I've created a test
script which when run on both 0.20.4 and 0.90.0RC1 shows the issue.

Testing regression issue on 0.20.4 vs 0.90.RC1 with script testing and sample output. Just create 0.20.4 and 0.90.RC1 and then run script across both. · GitHub

As a follow-up, the issue was that Derry was creating an index, closing
it, attempting to update the settings (including analyzers, which is now
not allowed), then trying to open it.

Instead, creating the index with the appropriate settings solved the
issue.

clint
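
To illustrate the fix described above, here is a minimal sketch that builds
a single create-index request carrying the analysis settings up front,
instead of the create / close / update settings / open sequence. The base
URL, index name and analyzer definition are assumptions for illustration,
and the request is only constructed, not sent.

```python
import json
from urllib.request import Request


def create_index_request(base_url, index, body):
    """Build (but do not send) one PUT that creates the index together
    with its settings, so no later settings update is needed."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        "{}/{}".format(base_url.rstrip("/"), index),
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical analysis settings; the real ones live in the gist.
BODY = {
    "settings": {
        "analysis": {
            "filter": {"my_stop": {"type": "stop", "stopwords": ["foo", "bar"]}},
            "analyzer": {
                "my_stop_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_stop"],
                }
            },
        }
    }
}

if __name__ == "__main__":
    req = create_index_request("http://localhost:9200", "items", BODY)
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The design point is simply that the analyzers are part of the first and only
request that defines the index, which is the approach the reply reports as
working on 0.90.0RC1.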
