Index
=====

This class represents an Index inside an Elasticsearch cluster. It provides a
set of methods that allow the user to query the index and add new data.

The class also keeps a buffer of documents waiting to be pushed to the index,
the user can add documents to the buffer and the class will push them as soon as
the buffer is full. The user can also force the push of the records by flushing
the buffer.

To initialize an index:

.. code-block:: ruby

  client = JayAPI::Elasticsearch::ClientFactory.new(
    cluster_url: 'https://my-cluster.elastic.io'
  ).create(max_attempts: 3, wait_strategy: :constant, wait_interval: 2)

  index = JayAPI::Elasticsearch::Index.new(
    client: client,
    index_name: 'my_index'
  )

The ``cluster_url`` and the ``index_name`` are the only required parameters. If
the cluster is configured to use Elasticsearch's default port (``9200``) and has
no authentication in place this is all you need. However in most cases that
would not be enough, so you can also provide the following extra parameters:

* ``port``: The port number where the Elasticsearch cluster is listening for
  connections.
* ``username``: The username to use when authentication against the cluster.
* ``password``: The user's password
* ``batch_size``: The amount of documents the ``Index`` will store in its buffer
  before triggering an automatic flush.
* ``logger``: If you want the messages to be logged to a particular logger. If
  you don't pass a logger then the class will create one.

The ``create`` method, that returns the client object, also takes optional arguments,
which define connection re-try behaviour:

* ``max_attempts``: Sets the maximum number of reconnection attempts in
  response to server errors.
* ``wait_strategy``: Determines the strategy for wait intervals between
  reconnection attempts. Options are:

  * ``:constant`` - Maintains a consistent wait time specified by ``wait_time``.
  * ``:geometric`` - Increases the wait time geometrically based on ``wait_time``.

* ``wait_time``: Specifies the base wait time (in seconds) for the chosen
  ``wait_strategy``.

#push
*****

The ``push`` method stores a document in the ``Index``'s buffer. If the buffer
reaches the maximum number of records the buffer will be flushed automatically.

``push`` takes a single ``Hash``, the document you want to send to the index.

.. warning::

  When using the ``push`` method make sure to call ``flush`` at the end.
  Automatic flushing only occurs when the buffer is full, if you do not call
  ``flush`` at the end of the run you might lose some documents.

Example:

.. code-block:: ruby

  documents.each do |document|
    # do something with your document, then push it
    index.push(document)
  end

  index.flush # Do not forget to flush the index at the end.

#index
******

``index`` pushes a document directly to the Elasticsearch cluster without adding
it to the buffer first. So you don't need to call ``flush``:

``index`` takes a single ``Hash``, the document you want to send to the index.

Example:

.. code-block:: ruby

  index.index(my_document)

.. note::

  Pushing documents one at a time is very inefficient because the ``Index``
  needs to perform an HTTP Request for each one. If you want to send many
  documents use ``push`` instead.

.. _`Index#search`:

#search
*******

The ``search`` method allows you to search the Elasticsearch index for documents
matching the provided query. This method takes two arguments:

* ``query`` A ``Hash`` with the query you want to execute, this Hash will be
  converted to JSON before being sent to Elasticsearch. It must follow
  `Elasticsearch's DSL`_. There is no limit to what you can put in this
  Hash, no validation, nor transformation is performed. Queries can be as simple
  or as complex as you want.
* ``type`` (optional): Specify `:search_after` for using the `Search After`_ feature.
  This is needed if you have more than 10,000 matching documents.

You can compose the ``query`` by yourself or you can use the
:doc:`query_builder`, which offers an easier, albeit limited interface.

The ``search`` method returns a :doc:`query_results` class which you can use to
iterate the result set in batches.

Example:

.. code-block:: ruby

  index.search(
    query: {
      match_all: { }
    },
    sort: [
      {
        '@timestamp': 'desc'
      }
    ],
    type: :search_after
  )

#flush
******

Flushes the current buffer to Elasticsearch, pushing all the documents currently
stored in the queue (if there are any).

Example:

.. code-block:: ruby

  documents.each do |document|
    index.push(document)
  end

  index.flush

#queue_size
***********

Returns the current number of documents currently waiting to be flushed to
Elasticsearch:

Example

.. code-block:: ruby

  index.queue_size # => 16

#delete_by_query
****************

This method allows you to remove the documents that match the given query from
the index. The method has a single parameter:

* ``query``: A ``Hash`` with the query you want to use to match documents for
  deletion. For more information on this parameter or how to create queries see
  the :ref:`Index#search` method documentation.

On success the method will return a ``Hash`` with information about the executed
command, for example:

.. code-block:: ruby

   {
     took: 740,
     timed_out: false,
     total: 1748,
     deleted: 1748,
     batches: 2,
     version_conflicts: 0,
     noops: 0,
     retries: { bulk: 0, search: 0 },
     throttled_millis: 0,
     requests_per_second: -1.0,
     throttled_until_millis: 0,
     failures: []
   }

On error an ``Elasticsearch::Transport::Transport::ServerError`` will be raised.

.. _`Elasticsearch's DSL`: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
.. _`Search After`: https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#search-after