Distributed document store and search server based on Apache Lucene
$ curl -XGET localhost:9200/jug/customers/1?pretty
{
  "_index" : "jug",
  "_type" : "customers",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source":{"name": "Sunrise Inc.", "location": "Dallas, TX"}
}
Elasticsearch | RDBMS |
---|---|
Node | Database server |
Index | Database/namespace |
Type | Table |
Field | Column |
Document | Row in a table |
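As an illustration of the mapping, the document fetched at the top could have been created like this (same index jug, type customers, and id 1 as in the GET example; a minimal sketch of the ES 1.x indexing API):

$ curl -XPUT localhost:9200/jug/customers/1 -d '{
  "name": "Sunrise Inc.",
  "location": "Dallas, TX"
}'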
Doc | Terms |
---|---|
#1 | [big, brown, fox] |
#2 | [attract, brown, dog, window] |
Q | [fox, say] |
Term | Docs |
---|---|
attract | #2 |
big | #1 |
brown | #1, #2 |
dog | #2 |
fox | #1 |
window | #2 |
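Query terms are looked up one by one in this structure: fox points to #1, say matches nothing, so only document #1 is returned. A hypothetical full-text search that would exercise such an index (the index, type, and field names here are made up for illustration):

$ curl -XGET 'localhost:9200/jug/docs/_search?pretty' -d '{
  "query" : { "match" : { "text" : "fox say" } }
}'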
In the presence of network partitions, we can't have both consistency and availability.
Watch out for:
minimum_master_nodes = n/2 + 1 (where n is the number of master-eligible nodes)
discovery:
  zen:
    minimum_master_nodes: 2
You should store your data in a real database and replicate it to Elasticsearch
— @aphyr #CraftConf
https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0
Scenario:
The cluster was available for writes (and reads) the whole time, but the data ended up inconsistent.
How to avoid it?
Do not let Elasticsearch recreate missing shards. Make it drop write requests and return partial results on reads.
Example config for a 4-data-node cluster:
gateway:
  expected_data_nodes: 4
  recover_after_time: 48h
Use mlockall to avoid swapping and to make sure all the memory is locked at startup.
elasticsearch.yml
bootstrap:
  mlockall: true
But make sure that the OS allows it.
[2015-02-27 13:58:00,828][WARN ][common.jna ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
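One way to allow it on Linux (assuming the node runs as an elasticsearch user; the exact mechanism depends on how the process is started):

# /etc/security/limits.conf: raise the memlock limit for the Elasticsearch user
elasticsearch - memlock unlimited

# or, when starting the node from a shell/init script:
ulimit -l unlimited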
curl -s $URL/_nodes/process?pretty | grep mlockall
  "mlockall" : true
  "mlockall" : true
  "mlockall" : true
  "mlockall" : true
  "mlockall" : true
Elasticsearch uses a circuit breaker to avoid OOM
CircuitBreakingException[Data too large, data for field [canonicalFolderPath] would be larger than limit of [19285514649/17.9gb]]
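The breaker limits are dynamic cluster settings and can be tuned (a sketch for ES 1.4+; the 40% value is only an example, and $URL is a node address as used elsewhere in these notes):

curl -XPUT $URL/_cluster/settings -d '{
  "persistent" : {
    "indices.breaker.fielddata.limit" : "40%"
  }
}'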
Long stop-the-world pauses may result in nodes leaving the cluster.
[2015-04-10 15:42:20,057][WARN ][monitor.jvm ] [sjc-elasticsearch-data03-si] [gc][old][708332][307] duration [7.3m], collections [37]/[7.3m], total [7.3m]/[25.6m], memory [29.9gb]->[26.7gb]/[29.9gb], all_pools {[young] [532.5mb]->[341.3mb]/[532.5mb]} {[survivor] [62.4mb]->[63.9mb]/[66.5mb]}{[old] [29.3gb]->[26.3gb]/[29.3gb]}
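If long pauses cannot be avoided entirely, one mitigation is to relax the zen fault-detection settings so a briefly unresponsive node is not dropped right away (a sketch; the values are illustrative, the ES 1.x defaults are a 30s timeout and 3 retries):

discovery:
  zen:
    fd:
      ping_timeout: 60s
      ping_retries: 6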
Causes of long GC:
Watch out for OOMs!
Kopf — cluster overview
HQ — basic diagnostics
Bigdesk — node-level monitoring
Nagios:
Other:
_cluster/stats
_cat
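Both are handy for quick command-line checks (?v adds column headers; $URL is a node address as above):

curl -s $URL/_cat/health?v
curl -s $URL/_cat/nodes?v
curl -s $URL/_cluster/stats?pretty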