EVERYTHING YOU NEED TO KNOW WHEN CONSIDERING ELASTICSEARCH REPORTING FOR YOUR BUSINESS

The Best Tools for Exporting Elasticsearch Data to CSV

export-tools

Introduction

This blog post shows you how to export data from Elasticsearch to a CSV file. Imagine that you have infrastructure or security log data in Elasticsearch that you would like to export as a CSV and open in Excel or other tools for further analysis. In this post, we’ll introduce the ways to export Elasticsearch data to a CSV using the top available tools on the market.

Possible Scenarios

There are multiple ways you can extract data from Elasticsearch. We will look at the following scenarios:

Export data directly from Elasticsearch

We wrote an honest and unbiased review of the following tools that are available for exporting data directly from Elasticsearch.

  1. Es2csv – A CLI tool for exporting data from Elasticsearch to a CSV file
  2. Python pandas – A python software library that has built in functions for exporting elasticsearch data in a CSV, Excel or HTML format.
  3. Elasticsearch Data Format Plugin

1. es2csv

https://github.com/taraslayshchuk/es2csv

Es2csv is the command-line utility, written in Python, for querying Elasticsearch in Lucene query syntax or Query DSL syntax and exporting the results as documents into a CSV file. This tool can query bulk docs in multiple indices and get only selected fields, which reduces query execution time.

Here are the major pros and cons of es2csv :

Pros of es2csv

Here are the most essential advantages of es2csv.

  • It’s simple to install and configure
  • This tool can query bulk docs in multiple indices and get only selected fields.
  • It reduces query execution time.

Cons of es2csv

After the advantages, it’s time to throw some light on the disadvantages of es2csv.

  • This tool supports only 2x and 5x versions of Elasticsearch. It does not support 6.x or 7.x
  • You need Python 2.7.x and pip. So you must install a python environment in your system.
  • For non-technical users it’s difficult to use
  • To automate the exporting of data on a periodic basis, you need to write your own cron job.

2. Python-pandas

https://kb.objectrocket.com/elasticsearch/export-elasticsearch-documents-as-csv-html-and-json-files-in-python-using-pandas-348

One of the advantages of having a flexible database and using Python’s Pandas Series is being able to export documents in a variety of formats. When you use Pandas IO Tools Elasticsearch to export Elasticsearch files in Python, you can analyze documents faster.
This requires the following prerequisites

  1. Install Python
  2. Install pip
  3. Pip install elasticsearch
  4. Pip install numpy
  5. Pip install pandas

Here are the major pros and cons of python pandas.

Pros of python pandas

  • Faster writing of documents
  • Since it is written in python, when we analyze it in terms of code level, the amount of code is less when compared to the amount of code written in nodejs
  • Supports elasticsearch version 7.x as well

Cons of Python pandas

  • Python needs to be installed properly.
  • Not able to export values with queries.
  • Automation requires scripting
  • It is a tool for developers and data scientists, not for non-technical users.

3. Elasticsearch Data format

https://github.com/codelibs/elasticsearch-dataformat

This is an elastic search plugin. You need to add and configure it into your elasticsearch plugins. It provides a feature to download the response of a search result in several formats other than JSON. The supported formats are CSV, Excel and JSON(Bulk).
For this, there are the following prerequisites

  1. Elastic search 5.X or below
  2. Java installed and JAVA_HOME path configured

Here are the major pros and cons of the Elasticsearch Data format.

Pros of Elasticsearch Data format

  • Easy to install.
  • Added as an elastic search plugin.
  • Uses simple curl commands and arguments.

Cons of Elasticsearch Data format

  • The response format is poor.
  • Supports only up to elasticsearch 5.x.
  • Difficult to use for non-technical users

Overall Summary

When we tried these tools, we were unable to export only the fields that are given in the query. So it takes all the values in the index. Whereas, with products like Skedler Reports, Kibana, and Grafana, it is possible to export the selected fields as a CSV/Excel file. Furthermore, only python pandas works with the latest versions of elasticsearch(>5.x). Last, but not least, a major drawback of these open source tools is that they are designed for use by technical users.

If you are looking for an easy and automated way to export Elastic Stack data to CSV, XLS or PDF, we invite you to try Skedler Reports. It is free to try for 21 days, and it could save you a ton of time. You can download Skedler Reports here.