Geek Igor

Accessing Elasticsearch Cluster over https

26 Nov 2014 on elasticsearch, ssl

Access your cluster securely from Python or Java.

Elasticsearch offers no security out of the box. Connections use either http or the native protocol, both unencrypted, for all the world to see. This is not a problem if your cluster sits in the same datacenter as the rest of your infrastructure, safely behind a firewall. Quite often this is not the case: you might run the cluster in Amazon EC2, Google Compute Engine or something similar. If you do, you need to encrypt the connection. Arguably, the easiest solution is to use https. There are various ways to configure the cluster, usually through some third-party proxy like nginx (please see the discussion). Recently, we decided to spin up a new cluster in Google Compute Engine and to allow only https access. I was curious whether my existing Java and Python client code would work out of the box.

For Java we use the Jest client. Under the hood it uses the Apache http client, so the only change was to substitute http with https. If you use a self-signed cert, you’ll need an additional step. It is possible to convince the http client to accept a self-signed cert, but you will end up extending JestClientFactory to provide your own implementation of the configureHttpClient method.

The Python client is not that smart. It strips the protocol part of the host URL (see the _normalize_hosts method), so https://elasticsearch.me.org becomes elasticsearch.me.org and is accessed over a regular, unencrypted http connection. You can switch it to https by passing an additional parameter, use_ssl=True. If you already have https:// in the host address, this is duplication and a source of errors. You can use the following snippet to avoid it and “guess” the correct scheme automatically.

es = Elasticsearch(host, use_ssl=host.startswith("https://"))
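If the host string may or may not carry a scheme, or may include a port, the same guess can be made a bit more robust by parsing the URL instead of checking the prefix. This is only a sketch using the standard library: the helper name es_client_kwargs is made up for illustration, and it assumes a client version that accepts use_ssl and a list of host dictionaries, like the one discussed here.

```python
from urllib.parse import urlsplit


def es_client_kwargs(url):
    """Hypothetical helper: turn a URL into keyword arguments for Elasticsearch()."""
    # urlsplit only recognises the host when a scheme is present, so bare
    # names such as "es.example.org:9200" get "http://" prepended first.
    parts = urlsplit(url if "://" in url else "http://" + url)

    node = {"host": parts.hostname}
    if parts.port is not None:
        # Only pass a port when the URL names one; otherwise the client's
        # own default applies.
        node["port"] = parts.port

    # urlsplit lowercases the scheme, so "HTTPS://..." is handled too.
    return {"hosts": [node], "use_ssl": parts.scheme == "https"}
```

With the client installed, usage would look like es = Elasticsearch(**es_client_kwargs(host)). Newer releases of the client may treat the scheme differently, so check the documentation for the version you run.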

Update 2014-11-27: By the way, the Python client doesn’t verify certs out of the box. IMHO this is not a very good practice; you can force verification with verify_certs=True.

es = Elasticsearch(host, use_ssl=host.startswith("https://"), verify_certs=True)
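Before turning verification on, it can help to check whether the server's certificate validates against the system trust store at all, since a self-signed cert would not. The sketch below uses only the Python standard library and is independent of the Elasticsearch client. The function names strict_context and certificate_is_trusted are made up for illustration, and the default port 443 is an assumption about how the https endpoint is exposed.

```python
import socket
import ssl


def strict_context():
    """Hypothetical helper: TLS context that verifies the chain and the hostname."""
    # create_default_context() loads the system CA certificates and sets
    # verify_mode=CERT_REQUIRED and check_hostname=True.
    return ssl.create_default_context()


def certificate_is_trusted(hostname, port=443, timeout=5.0):
    """Return True if the TLS handshake with hostname:port passes verification.

    Network errors (DNS failure, refused connection, timeout) are not caught
    here and propagate to the caller.
    """
    context = strict_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        try:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
        except ssl.SSLCertVerificationError:
            # Raised for an untrusted issuer (e.g. self-signed) or a
            # hostname that does not match the certificate.
            return False
```

Calling certificate_is_trusted("elasticsearch.me.org") needs network access to the cluster. If it returns False, verify_certs=True is likely to fail as well until the client is pointed at the right CA certificate.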