Comparison with elasticsearch-py, the “Official Client”
pyelasticsearch was created before Elasticsearch-the-company provided its own
client libraries for anything other than Java. There was no reliable,
large-scale ES client for Python: pyes was closest, but it suffered from
unreliability and pervasive weirdness, like closing sockets in
doing things which were obvious no-ops. We adapted pyelasticsearch from an
older, very simple client library and gave it a complete API overhaul in
version 0.2, inspired by the principles of poetic API design.
Elasticsearch-the-company later created its own clients, with a strong leaning toward keeping them similar across languages for ease of support and maintenance. The upside is that their libraries always support the latest ES features, down to every last nook and cranny, because the relevant parts are autogenerated from a generic API description language. The downside is that they feel autogenerated: some things end up less than Pythonic.
Which Should You Use?
The official Python client borrows much design—and code—from pyelasticsearch. Starting in 1.0, we return the favor, using elasticsearch-py’s transport layer rather than maintaining our own. The important differences remain at the API level.
In general, pyelasticsearch focuses on...
pyelasticsearch is designed to feel elegant to the caller. For example, we strive for symmetry: creating an index is es.create(), and searching one is es.search(). In elasticsearch-py, creating an index is nested inside es.indices.create(<index name>), an artifact of code organization. The tradeoff for added design thought is that the project moves slower.
Good defaults and simple interfaces
For example, there is only a single transport, HTTP(S), but it is almost always the right one. Thrift, the leading alternative, yields a 15% speed boost but only when using many small requests. It doesn’t help at all for bulk indexing, where speed is most often a concern, and it complicates troubleshooting, proxying, and setup. In fact, it’s deprecated in ES 1.5 and will be removed in 2.0.
For another example, if you use an HTTPS URL, the authenticity of the server certificate will be automatically verified using Mozilla’s certificate authority store. You neither have to manually enable verification nor provide your own store.
The tradeoff here is that we don’t expose as many knobs to twiddle as the official client. If you have unusual needs, like using self-signed SSL certificates, we might not be for you. Otherwise, you can enjoy less verbose code.
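As a rough illustration of what "verified by default" means, here is a minimal standard-library sketch. It is not pyelasticsearch's code: the helper name make_context is hypothetical, and the sketch assumes the platform's trust store instead of Mozilla's bundle. The optional cafile argument stands in for the kind of knob you would need for a self-signed certificate.

```python
import ssl


def make_context(cafile=None):
    """Return a TLS context that verifies server certificates.

    Hypothetical helper for illustration only. With cafile=None the
    platform's default trust store is used; passing a path trusts the
    CA certificates in that file as well.
    """
    # create_default_context() turns on certificate and hostname checks.
    return ssl.create_default_context(cafile=cafile)


context = make_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

The point of the default is that a caller who passes nothing still gets a verifying context; disabling verification has to be a deliberate act.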
If something fails, it always raises an exception, making it hard to accidentally ignore. elasticsearch-py doesn’t always do this: you need to check for errors explicitly when using its bulk indexing helper, for example.
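To make the contrast concrete, here is a minimal sketch of the explicit check a caller has to write when a bulk call reports per-item failures in its response instead of raising. The names raise_on_bulk_errors and BulkIndexError are hypothetical and belong to neither library. The response shape is assumed to follow the ES bulk API, where each item carries a status and, on failure, an error.

```python
class BulkIndexError(Exception):
    """Hypothetical exception carrying the items that failed in a bulk call."""

    def __init__(self, failed_items):
        super().__init__("%d bulk item(s) failed" % len(failed_items))
        self.failed_items = failed_items


def raise_on_bulk_errors(response):
    """Raise if any item in a bulk-style response reports a failure.

    Assumes each entry in response["items"] is a one-key dict such as
    {"index": {"_id": "1", "status": 201}}, with an "error" key on failure.
    """
    failed = []
    for item in response.get("items", []):
        # Each item has a single key naming the action (index, delete, ...).
        for action, result in item.items():
            if "error" in result or result.get("status", 200) >= 400:
                failed.append({action: result})
    if failed:
        raise BulkIndexError(failed)
    return response
```

Wrapping every bulk call in a check like this turns a silent partial failure into an exception, which is the behavior described above.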
In addition, data loss is hard to stumble into; we put up guiderails. For example, calling the update-settings API with no indices would, if we simply followed the ES REST API, update all indices, a far-reaching destructive action caused by an omission. We require the explicit use of an update_all_settings() method if you want to do this.
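The guardrail can be sketched roughly as follows. This is an illustration of the idea, not pyelasticsearch's actual source: the SettingsClient class and its injected send callable are hypothetical, and the paths assume the ES REST convention of /<indices>/_settings for named indices and /_settings for all of them.

```python
class SettingsClient:
    """Hypothetical client illustrating the guardrail described above."""

    def __init__(self, send):
        # `send` is any callable taking (method, path, body). It is injected
        # so the sketch runs without a server.
        self._send = send

    def update_settings(self, indexes, settings):
        """Update settings on the named indexes only."""
        if not indexes:
            # An empty value would otherwise map to the all-indexes URL.
            raise ValueError(
                "No indexes given. Use update_all_settings() to change them all."
            )
        if isinstance(indexes, str):
            indexes = [indexes]
        path = "/%s/_settings" % ",".join(indexes)
        return self._send("PUT", path, settings)

    def update_all_settings(self, settings):
        """Explicit opt-in for the form of the call that touches every index."""
        return self._send("PUT", "/_settings", settings)
```

With this shape, forgetting the index argument produces a ValueError instead of a cluster-wide change, and the destructive form has its own clearly named method.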
You should never need to read the source code to figure out what to do. In order to twiddle many of the aforementioned knobs in elasticsearch-py, you must squirrel kwargs down through multiple undocumented layers, from constructor to constructor, until something finally understands them. On the way, it’s often unclear what’s public and what’s private.
Our top-level docs are comprehensive with regard to our API, we link to the ES docs for details about their system, and we try to respect the Law of Demeter in our layering.
Conversely, elasticsearch-py focuses on...
It provides explicit hooks into every corner of ES and keeps up to date with ES releases.
If you’re using ES from multiple languages every day, you might enjoy an API that looks similar across them.
Conversely, we aim for idiomatic Python.