Hi, I observed something and I don't know whether it is normal or not.
I use Elasticsearch 1.7.2, and my observation is as below.
I set up an index with a refresh interval of 6s. When I start adding documents to this index, I see a filesystem change every 6s: new *.cfe and *.cfs files are created.
If I keep adding documents to the index (one every 2s), files are created every 6s.
Is this normal?
Will a refresh trigger a flush if a change has occurred? That is, if there is any pending document in the buffer (or a translog change), will the refresh flush to disk and generate a small segment file?
PS: If I stop adding documents to the index, no files are created every 6s.
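To make the observation concrete, here is a toy model in Python of what I think I am seeing. It is only a sketch of the timing and is not Elasticsearch code: the function name `simulate_refresh` and the rule "a refresh writes a segment only when documents are pending" are my own assumptions based on the behaviour described above.

```python
def simulate_refresh(doc_times, refresh_interval, duration):
    """Return (time, doc_count) for each refresh that would write a segment.

    Assumption: a refresh writes a new segment only if at least one
    document arrived since the previous refresh.
    """
    pending = sorted(doc_times)
    segments = []
    idx = 0
    t = refresh_interval
    while t <= duration:
        # Count the documents buffered since the previous refresh.
        buffered = 0
        while idx < len(pending) and pending[idx] <= t:
            buffered += 1
            idx += 1
        if buffered:
            segments.append((t, buffered))
        t += refresh_interval
    return segments


# One document every 2s for the first 18s, then nothing until 30s.
docs = list(range(2, 19, 2))
result = simulate_refresh(docs, refresh_interval=6, duration=30)
print(result)  # [(6, 3), (12, 3), (18, 3)]
```

In this model, files appear at 6s, 12s and 18s while documents keep arriving, and nothing is written at 24s or 30s once indexing stops, which matches what I see on disk.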
@warkolm thanks.
But if refresh calls a flush and writes files to the filesystem periodically at the configured interval (say 6s), does that mean a flush to real segment files happens every time the interval is reached, no matter whether the index buffer/translog (512MB by default) is full or not?
I suppose the "flush" you are talking about is NOT "fsync", right?
That guide says:
"...Once every second, the shard is refreshed:
The docs in the in-memory buffer are written to a new segment, without an fsync. ..."
That means the "flush" done by a refresh just moves the docs from the memory buffer to the file system cache, and they are fully committed to disk only after a real "flush" operation.
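As a rough illustration of "written without an fsync", here is a minimal Python sketch. It is not how Lucene writes segments; the helper names `write_segment` and `read_back` and the file names are hypothetical. It only shows the general OS behaviour: a file written without `os.fsync` is already visible and readable through the file system cache, while `os.fsync` is the extra step that asks the OS to persist the bytes to the storage device.

```python
import os
import tempfile


def write_segment(path, data, durable):
    """Write data to path; call fsync only when durable is True."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # hand the bytes from Python's buffer to the OS (page cache)
        if durable:
            os.fsync(f.fileno())  # ask the OS to push the bytes to the storage device


def read_back(path):
    """Read the whole file back as bytes."""
    with open(path, "rb") as f:
        return f.read()


tmpdir = tempfile.mkdtemp()
refreshed_path = os.path.join(tmpdir, "segment_after_refresh.bin")  # hypothetical name
flushed_path = os.path.join(tmpdir, "segment_after_flush.bin")      # hypothetical name

write_segment(refreshed_path, b"doc-1", durable=False)  # like a refresh: no fsync
write_segment(flushed_path, b"doc-1", durable=True)     # like a flush: with fsync

refreshed_bytes = read_back(refreshed_path)
flushed_bytes = read_back(flushed_path)
print(refreshed_bytes == flushed_bytes)
```

Both files show up in the directory and return the same bytes when read. The difference is only in durability: without the fsync, the data may still be lost if the machine loses power before the OS writes it out.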
@Youxu, that guide says:
"...Once every second, the shard is refreshed:
The docs in the in-memory buffer are written to a new segment, without an fsync. ..."
I don't clearly understand this paragraph. Where does the new segment go without an fsync? I mean, is the segment moved to the heap or to virtual memory somewhere, waiting for fsync() to be called, and by whom?
What I really want to know is: "will frequent refreshes hurt disk I/O performance or not?"
I assume that frequent refreshes will generate more segment files (and that they really are flushed to disk) during indexing.
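For reference, this is roughly how the refresh interval I am talking about can be changed through the update index settings API (`PUT /<index>/_settings` with `index.refresh_interval`). The sketch below only builds the request and does not send it. The host `http://localhost:9200`, the index name `my_index` and the helper name `build_refresh_request` are assumptions for illustration.

```python
import json
import urllib.request


def build_refresh_request(host, index, interval):
    """Build (but do not send) a PUT request that sets index.refresh_interval."""
    body = {"index": {"refresh_interval": interval}}
    return urllib.request.Request(
        url="{}/{}/_settings".format(host, index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed host and index name; adjust for a real cluster.
request = build_refresh_request("http://localhost:9200", "my_index", "6s")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually apply the setting, the request would be sent with:
# urllib.request.urlopen(request)
```

A longer interval such as "30s" should mean fewer, larger segments being written during heavy indexing, at the cost of new documents taking longer to become searchable.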