- Add support for custom certificate authorities via the ca_certs arg to the
- Add support for client certificates via the
- Add support for HTTPS.
- Add username, password, and port kwargs to the constructor so you don’t have to repeat their values if they’re the same across many servers.
- Don’t crash when the query_params kwarg is omitted from calls to
- Fix a bug in which specifying _all as an index name sometimes caused doctype names to be treated as index names.
- Correct a typo in the
- Update ES doc links, now that Elastic has changed domains and reorganized its docs.
- Require elasticsearch lib 1.3 or greater, as that’s when it started exposing
- Make sure the Content-Length header gets set when calling create_index() with no explicit settings arg. This solves 411s when using nginx as a proxy.
- Make bulk_chunks() compute perfectly optimal results, no longer ever exceeding the byte limit unless a single document is over the limit on its own.
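As a rough illustration of the byte-limit behavior described above, here is a minimal sketch of a chunker that never exceeds the limit unless a single action is too big on its own. This is not pyelasticsearch's implementation: the name chunk_by_bytes is hypothetical, the parameter names and the default of 300 docs are assumptions made for the example, and it assumes each action is an already-serialized string that costs one extra newline byte in the request body.

```python
import json


def chunk_by_bytes(actions, docs_per_chunk=300, bytes_per_chunk=None):
    """Yield lists of serialized actions that stay within both limits.

    A chunk goes over bytes_per_chunk only when one action is larger
    than the limit by itself. (Hypothetical helper, for illustration.)
    """
    chunk = []
    chunk_bytes = 0
    for action in actions:
        # Assumption: one trailing newline byte per action in the body.
        size = len(action.encode("utf-8")) + 1
        too_many_docs = len(chunk) >= docs_per_chunk
        too_many_bytes = (
            bytes_per_chunk is not None
            and chunk_bytes + size > bytes_per_chunk
        )
        # Close the current chunk before adding an action that would
        # push it past either limit.
        if chunk and (too_many_docs or too_many_bytes):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(action)
        chunk_bytes += size
    if chunk:
        yield chunk


docs = [json.dumps({"id": i, "body": "x" * 20}) for i in range(5)]
chunks = list(chunk_by_bytes(docs, bytes_per_chunk=100))
oversized = list(chunk_by_bytes(["y" * 500], bytes_per_chunk=100))
```

An oversized action still comes out, alone in its own chunk, so the caller can decide whether to send or reject it.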
- Introduce new bulk API, supporting all types of bulk operations (index, update, create, and delete), providing chunking via bulk_chunks(), and introducing per-action error-handling. All errors raise exceptions, even individual failed operations, and the exceptions expose enough data to identify operations for retrying or reporting. The design is decoupled in case you want to create your own chunkers or operation builders.
bulk_index() in favor of the more capable
- Make one last update to bulk_index(). It now catches individual operation failures, raising BulkError. Also add the type_field args, allowing you to index across different indices and doc types within one request.
- The ElasticSearch object now defaults to http://localhost:9200/ if you don’t provide any node URLs.
- Improve docs: give a better overview on the front page, and document how to customize JSON encoding.
- Switch to elasticsearch-py’s transport and downtime-pooling machinery, much of which was borrowed from us anyway.
- Make bulk indexing (and likely other network things) 15 times faster.
- Add a comparison with the official client to the docs.
delete_by_query() to work with ES 1.0 and later.
percolate() es_kwargs up to date.
- Fix all tests that were failing on modern versions of ES.
- Tolerate errors that are non-strings and create exceptions for them properly.
- Drop compatibility with elasticsearch < 1.0.
cluster_state() to work with ES 1.0 and later. Arguments have changed.
- InvalidJsonResponseError no longer provides access to the HTTP response (the response property): just the bad data (the
- Change from the logger “pyelasticsearch” to “elasticsearch.trace”.
revival_delay param from ElasticSearch object.
send_request(). Now all dicts are JSON-encoded, and all strings are left alone.
- Brings tests up to date with
- When an id_field is specified for bulk_index(), don’t index it under its original name as well; use it only as the
get_aliases() for consistency with other methods. Original name still works but is deprecated. Add an alias kwarg to the method so you can fetch specific aliases.
- update_aliases() no longer requires a dict with an actions key; that much is implied. Just pass the value of that key.
- Update package requirements to allow requests 2.0, which is in fact compatible. (Natim)
- Properly raise IndexAlreadyExistsException even if the error is reported by a node other than the one to which the client is directly connected. (Jannis Leidel)
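To picture the update_aliases() change noted above, here is a small sketch of how a client could add the implied actions wrapper itself, so callers pass only the list. The helper name build_alias_request_body and the index and alias names are hypothetical; only the {"actions": [...]} envelope is taken from the Elasticsearch aliases API.

```python
import json


def build_alias_request_body(actions):
    """Wrap a bare list of alias actions in the envelope ES expects.

    Hypothetical helper: callers pass just the list, and the "actions"
    key is supplied here.
    """
    return json.dumps({"actions": list(actions)})


# Made-up index and alias names, for illustration only.
alias_actions = [
    {"add": {"index": "articles_v2", "alias": "articles"}},
    {"remove": {"index": "articles_v1", "alias": "articles"}},
]
body = build_alias_request_body(alias_actions)
```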
Note the change in behavior of
bulk_index() in this release. This change
probably brings it more in line with your expectations. But double check,
since it now overwrites existing docs in situations where it didn’t before.
Also, we made a backward-incompatible spelling change to a little-used
- bulk_index() now overwrites any existing doc of the same ID and doctype. Before, in certain versions of ES (like 0.90RC2), it did nothing at all if a document already existed, probably much to your surprise. (We removed the 'op_type': 'create' pair, whose intentions were always mysterious.) (Gavin Carothers)
- Rename the
overwrite_existing. The old name implied the opposite of what it actually did. (Gavin Carothers)
- Support multiple indices and doctypes in delete_by_query(). Accept both string and JSON queries in the query arg, just as search() does. Passing the q arg explicitly is now deprecated.
percolate. Thanks, Adam Georgiou and Joseph Rose!
- Add ability to specify the parent document in bulk_index(). Thanks, Gavin Carothers!
- Remove the internal, undocumented from_python method. django-haystack users will need to upgrade to a newer version that avoids using it.
- Refactor JSON encoding machinery. Now it’s clearer how to customize it: just
plug your custom JSON encoder class into
- Don’t crash under
- Support non-ASCII URL path components (like Unicode document IDs) and query string param values.
- Switch to the nose testrunner.
- Fix a bug introduced in 0.4 wherein “None” was accidentally sent to ES when
an ID wasn’t passed to
- Support Python 3.
- Support more APIs:
update (existed but didn’t work before)
- Support the size param of the search method. (You can now change size in your code if you like.)
- Support the
update methods, new since ES 0.20.
- Maintain better precision of floats when passed to ES.
- Change endpoint of bulk indexing so it works on ES < 0.18.
- Support documents whose ID is 0.
- URL-escape path components, so doc IDs containing funny chars work.
- Add a dedicated IndexAlreadyExistsError exception for when you try to create an index that already exists. This helps you trap this situation unambiguously.
- Add docs about upgrading from pyes.
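The ID and URL-escaping fixes above come down to two habits: test for None rather than truthiness, and percent-encode each path component separately. A minimal sketch using only the standard library follows; the function name join_path is hypothetical and this is not the library's own code.

```python
from urllib.parse import quote


def join_path(*components):
    """Join URL path components, percent-encoding each one.

    None means "component not given" and is skipped. Falsy but valid
    values such as the ID 0 are kept. (Hypothetical helper.)
    """
    parts = [
        # safe="" also escapes "/", so an ID containing a slash cannot
        # be mistaken for an extra path segment.
        quote(str(component), safe="")
        for component in components
        if component is not None
    ]
    return "/" + "/".join(parts)


zero_id_path = join_path("my_index", "my_type", 0)
funny_id_path = join_path("my_index", "my_type", "a/b c")
unicode_id_path = join_path("my_index", "my_type", "é")
```

Checking `if doc_id:` instead of `if doc_id is not None:` is the kind of test that silently drops an ID of 0.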
- Remove the undocumented and unused
- Correct the requests requirement to require a version that has everything we need. In fact, require requests 1.x, which has a stable API.
- Make the send_request method public so you can use ES APIs we don’t yet explicitly support.
- Handle JSON translation of Decimal class and sets.
- Make more_like_this() take an arbitrary request body so you can filter the returned docs.
- Replace the
mlt_fields. This makes it actually work, as it’s the param name ES expects.
- Make explicit our undeclared dependency on simplejson.
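For readers wondering what JSON translation of Decimal and sets can look like, here is a minimal sketch built on the standard library's json.JSONEncoder. The class name SketchEncoder is hypothetical, and turning Decimal into float and sets into lists are assumptions chosen for the example rather than a description of what pyelasticsearch does internally.

```python
import json
from decimal import Decimal


class SketchEncoder(json.JSONEncoder):
    """Encode a few types the stock encoder rejects (illustrative only)."""

    def default(self, value):
        if isinstance(value, Decimal):
            # Assumption: a float is acceptable here. Return str(value)
            # instead if exact decimal digits must survive.
            return float(value)
        if isinstance(value, (set, frozenset)):
            # JSON has no set type, so emit an array.
            return list(value)
        # Let the base class raise TypeError for anything else.
        return super().default(value)


encoded = json.dumps(
    {"price": Decimal("9.5"), "tags": {"sale"}},
    cls=SketchEncoder,
)
```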
Many thanks to Erik Rose for almost completely rewriting the API to follow best practices, improve the API user experience, and make pyelasticsearch future-proof.
This release is backward-incompatible in numerous ways, so please read the following section carefully. If in doubt, you can easily stick with pyelasticsearch 0.1.
count() calling conventions. Each now supports either a textual or a dict-based query as its first argument. There’s no longer a need to, for example, pass an empty string as the first arg in order to use a JSON query (a common case).
- Standardize on the singular for the names of the doc_type kwargs. It’s not always obvious whether an ES API allows for multiple indexes. This was leading me to have to look aside to the docs to determine whether the kwarg was called indexes. Using the singular everywhere will result in fewer doc lookups, especially for the common case of a single index.
more_like_this for consistency with other methods.
(index, doc_type, doc) rather than (doc, index, doc_type), for consistency with bulk_index() and other methods.
(index, doc_type, mapping) rather than (doc_type, mapping, index).
- To prevent callers from accidentally destroying large amounts of data...
- delete() no longer deletes all documents of a doctype when no ID is specified; use
- delete_index() no longer deletes all indexes when none are given; use
- update_settings() no longer updates the settings of all indexes when none are specified; use
- setup_logging() is gone. If you want to configure logging, use the logging module’s usual facilities. We still log to the “pyelasticsearch” named logger.
- Rethink error handling:
- Raise a more specific exception for HTTP error codes so callers can catch it without examining a string.
- Catch non-JSON responses properly, and raise the more specific NonJsonResponseError instead of the generic
- Remove mentions of nonexistent exception types that would cause crashes
- Crash harder if JSON encoding fails: that always indicates a bug in pyelasticsearch.
- Remove the ill-defined
ElasticSearchError if we can’t connect to a node (and we’re out of auto-retries).
ElasticSearchError if no documents are passed to
- All exceptions are now more introspectable, because they don’t
immediately mash all the context down into a string. For example, you can
recover the unmolested response object from
quiet kwarg, meaning we always expose errors.
- Add Sphinx documentation.
- Add load-balancing across multiple nodes.
- Add failover in the case where a node doesn’t respond.
- Support passing arbitrary kwargs through to the ES query string. Known ones are taken verbatim; unanticipated ones need an “es_” prefix to guarantee forward compatibility.
- Automatically convert datetime objects when encoding JSON.
- Recognize and convert datetimes and dates in pass-through kwargs. This is
- In routines that can take either one or many indexes, don’t require the caller to wrap a single index name in a list.
- Many other internal improvements
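Two of the conveniences listed above, pass-through query-string kwargs and bare index names, can be sketched in a few lines. Everything here is illustrative: the helper names split_query_params and join_indexes are hypothetical, the set of known params is made up, and stripping the "es_" prefix before sending is an assumption about how such a scheme would typically work.

```python
def split_query_params(kwargs, known=("routing", "refresh")):
    """Separate query-string params from ordinary keyword arguments.

    Known names pass through verbatim. Names with an "es_" prefix are
    treated as query params too, with the prefix stripped (assumption).
    Returns a (query_params, other_kwargs) pair.
    """
    query_params = {}
    other_kwargs = {}
    for name, value in kwargs.items():
        if name in known:
            query_params[name] = value
        elif name.startswith("es_"):
            query_params[name[len("es_"):]] = value
        else:
            other_kwargs[name] = value
    return query_params, other_kwargs


def join_indexes(index):
    """Accept one index name or an iterable of names.

    Returns the comma-separated form used in ES URLs, so callers need
    not wrap a single name in a list.
    """
    if isinstance(index, str):
        return index
    return ",".join(index)


params, rest = split_query_params(
    {"refresh": True, "es_preference": "_local", "doc": {"title": "hi"}}
)
```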
Initial release based on the work of Robert Eanes and other authors