Hi,
I am using elasticsearch 0.19.4 and I've set up a wikipedia river on
top of it. I have three questions (sorry to pack them into one thread):
- The wikipedia dump is around 8GB unzipped, but my data folder after
indexing wikipedia is around 7GB. I assume the indexing process was
interrupted, so it is not complete yet. Is there a way to make
elasticsearch continue indexing?
- I am using elasticsearch to run tweets as queries, to see whether I can
get topic assignments from the search engine this way. I am currently
using the pyelasticsearch Python library and its basic search function.
Can you think of a better search type that might be useful for queries
like the following?
res = conn.search("Donna Karan To Attend Haiti Charity Fundraiser")
Top 10 results and their scores:
Donna Karan 0.43873835
Haiti/Government 0.1892942
Haiti/People 0.1892942
Haiti/Transportation 0.1892942
Donna Summers 0.18910009
Haiti/History 0.1801775
Haiti/Communications 0.1801775
Haitian music 0.1600614
ISO 3166-1:HT 0.15908843
Haiti/Transnational issues 0.15862915
- Does elasticsearch support query rewriting/pre-processing? If I search
for "Obama" and "Obama's", the results are quite different.
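To illustrate the kind of pre-processing I mean in the last question, here is a minimal client-side sketch. It only strips English possessive suffixes before the query is sent. The helper name normalize_query is hypothetical and not part of elasticsearch or pyelasticsearch, and I am assuming a plain ASCII apostrophe.

```python
import re

# Hypothetical helper, not part of elasticsearch or pyelasticsearch.
# Matches a possessive 's that directly follows a word character.
_POSSESSIVE = re.compile(r"(?<=\w)'s\b", re.IGNORECASE)


def normalize_query(text):
    """Strip possessive suffixes and collapse runs of whitespace."""
    stripped = _POSSESSIVE.sub("", text)
    return " ".join(stripped.split())


print(normalize_query("Obama's"))
print(normalize_query("Obama"))
```

With this, both "Obama" and "Obama's" would be sent to conn.search as the same string, but I would prefer to have the search engine handle it if that is possible.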
Thanks,
Pinar