Elasticsearch configuration for high sustainable bulk feed
Tested on a single node: MacBook Pro, 16 GB RAM, 1 TB SSD, OS X Mavericks
ES 1.1.0 with Java 8, G1 GC, 12 GB heap
    /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/bin/java -Xms12g -Xmx12g -Djava.awt.headless=true -XX:+UseG1GC -Delasticsearch -Des.foreground=yes -Des.path.home=/Users/es/elasticsearch-1.1.0 -cp :/Users/es/elasticsearch-1.1.0/lib/elasticsearch-1.1.0.jar:/Users/es/elasticsearch-1.1.0/lib/*:/Users/es/elasticsearch-1.1.0/lib/sigar/* org.elasticsearch.bootstrap.Elasticsearch
no bloom filter cache
concurrent merge scheduler
max 4 threads for merge, also for optimize API
max 4 segments per tier
max 1gb segment size
1/3 of heap for index buffer
for SSD, disable store throttling
adjust merge and bulk thread pools
    index:
      codec:
        bloom:
          load: false
      merge:
        scheduler:
          type: concurrent
          max_thread_count: 4
        policy:
          type: tiered
          max_merged_segment: 1gb
          segments_per_tier: 4
          max_merge_at_once: 4
          max_merge_at_once_explicit: 4
    indices:
      memory:
        index_buffer_size: 33%
      store:
        throttle:
          type: none
    threadpool:
      merge:
        type: fixed
        size: 4
        queue_size: 32
      bulk:
        type: fixed
        size: 8
        queue_size: 32
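As a rough sanity check on the numbers above, the following sketch works out what the index buffer percentage and the fixed thread pool sizes mean in absolute terms. It assumes the 12 GB heap from the test setup and the usual behaviour of a fixed pool (up to `size` requests running, up to `queue_size` waiting, anything beyond that rejected). The function names are made up for illustration.

```python
# Rough sizing implied by the settings above (figures from the test setup).
HEAP_GB = 12                # -Xms12g / -Xmx12g
INDEX_BUFFER_PERCENT = 33   # indices.memory.index_buffer_size: 33%


def index_buffer_gb(heap_gb, percent):
    """Heap share reserved for the indexing buffer, in GB."""
    return heap_gb * percent / 100


def max_pending(size, queue_size):
    """Requests a fixed thread pool can hold at once: running plus queued."""
    return size + queue_size


print(index_buffer_gb(HEAP_GB, INDEX_BUFFER_PERCENT))  # 3.96
print(max_pending(8, 32))   # bulk pool: 40
print(max_pending(4, 32))   # merge pool: 36
```

So a client that keeps more than about 40 bulk requests in flight against this node should expect rejections and needs to back off and retry.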
1 shard
0 replicas
no refresh interval (-1)
    index.number_of_shards: 1
    index.number_of_replicas: 0
    index.refresh_interval: -1
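With the refresh interval at -1, newly indexed documents are not made searchable by a periodic refresh, so a feed usually restores an interval once it is done. The sketch below only builds the two JSON bodies that could be sent to the index settings update endpoint (`PUT /<index>/_settings`). The helper name is hypothetical, and the restore value of `1s` is an assumption (it is the Elasticsearch default, not something stated in these notes).

```python
import json


def refresh_settings(interval):
    """Body for an index settings update that sets refresh_interval."""
    return {"index": {"refresh_interval": interval}}


before_feed = refresh_settings("-1")  # disable periodic refresh, as in the notes
after_feed = refresh_settings("1s")   # assumed restore value (the ES default)

print(json.dumps(before_feed))
print(json.dumps(after_feed))
```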
Mapping for string fields: norms and term frequencies can be disabled on all of them because of the nature of the input data
    "mappings" : {
      "_default_" : {
        "dynamic_templates" : [
          {
            "string_template" : {
              "match_mapping_type" : "string",
              "path_match" : "*",
              "mapping" : {
                "type" : "string",
                "norms" : { "enabled" : false },
                "index_options" : "docs"
              }
            }
          }
        ]
      }
    }
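The per-index settings and the mapping fragment can be combined into a single create-index request body. The following is a minimal sketch of how that body might be assembled and printed as JSON. It assumes the index is created with one request (`PUT /<index>`), and the function name `create_index_body` is hypothetical.

```python
import json


def create_index_body():
    """Assemble a create-index body from the settings and mapping in these notes."""
    settings = {
        "index": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "refresh_interval": "-1",
        }
    }
    # Dynamic template applied to every string field: no norms, doc IDs only.
    string_template = {
        "match_mapping_type": "string",
        "path_match": "*",
        "mapping": {
            "type": "string",
            "norms": {"enabled": False},
            "index_options": "docs",
        },
    }
    mappings = {
        "_default_": {
            "dynamic_templates": [{"string_template": string_template}]
        }
    }
    return {"settings": settings, "mappings": mappings}


print(json.dumps(create_index_body(), indent=2))
```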