Episode 5 – Elasticsearch Data Leaks: Top 5 Prevention Steps
For this week’s episode, Shankar discussed Elasticsearch data
Recent Elasticsearch Data Leaks
There were three instances of massive data leaks involving Elasticsearch databases just in the week prior to our interview with Simone.
- An Elasticsearch database containing the records of 2.5 million customers of Yves Rocher, a cosmetics company, was found unsecured.
- A database containing the personal data of the entire population of Ecuador (16.6 million people) was found unsecured.
- An Elasticsearch database containing personally identifiable information linked to 198 million car buyer records was found unsecured.
The frequency of Elasticsearch data leaks raises the question, “How can we prevent a data leak in Elasticsearch data stores?” For the answer, we interviewed an Elasticsearch security expert and asked for his top 5 data leak prevention techniques.
What are the Root Causes of These Data Leaks?
The common cause behind these data leaks was related to outsourcing contracts. Contracts should include not only functional requirements but also security requirements. The good news is that solutions already exist, and they are free.
Consider Amazon Elasticsearch Service: it’s very cheap and convenient. However, you can’t install plugins on it because Amazon blocks them. A developer will therefore work around the problem without a viable security plugin, which ultimately leaves the database vulnerable. Much of the issue comes down to how Amazon built the service. Amazon splits the responsibility for security between the user and the infrastructure manager (Amazon itself), so Amazon is not contractually liable for security problems that arise.
Amazon allows anyone to open up an Elasticsearch cluster without any warning. Simone says he “does not agree with this practice. Amazon should either avoid it or have a very big warning” so that data leaks like the three recent ones can be avoided.
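One way to catch an accidentally open cluster is to send it a request with no credentials and see what comes back. The following is a minimal sketch of that idea using only the Python standard library. The function names `classify_status` and `probe` are hypothetical, and the verdicts are a rough heuristic, not a full security audit.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map the HTTP status of a credential-free request to a rough verdict."""
    if status in (401, 403):
        return "protected"  # the server asked for credentials or refused
    if 200 <= status < 300:
        return "open"       # a response was served without any credentials
    return "unknown"        # other statuses need a manual look


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Send one GET without credentials to the cluster root and classify the reply."""
    request = urllib.request.Request(base_url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx replies; the code is still useful
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Calling `probe("http://localhost:9200")` against a cluster you own (the address here is only an illustration) returns `"open"` if the node answers without asking for credentials, which is the situation described above.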
Another problem is that the companies that had these clusters exposed had a massive amount of data accumulated, and Simone says that “even if it was secure, it is not a good practice and the entities that created the GDPR would not agree with the practice” of holding that much data in such
5 Ways To Prevent An Elasticsearch Data Leak
If you have an Elasticsearch cluster and want to keep it protected, follow these rules:
- Remember that data accumulation is a liability and you should only collect what is necessary at all times. Every piece of data should have an expiration date.
- Every company, from the minute it obtains user data, should accept the responsibility that comes with it and focus on data handling and data management. Outsource access to the data less, and keep the objectives of all the actors involved aligned at all times.
- Use security plugins. When you accumulate data, the security layer should be as close as possible to the data itself.
- Use encryption on the HTTP interface and between the Elasticsearch nodes for next-level security.
- Rigorously implement local data regulations and laws like the GDPR in the European Union.
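The first rule, giving every piece of data an expiration date, can be sketched as a small retention check. This example assumes daily indices named with a hypothetical `logs-YYYY.MM.DD` pattern and a hypothetical helper `expired_indices`; it only selects candidates for deletion and does not talk to a cluster. Elasticsearch's own index lifecycle management can also delete indices by age, so treat this as an illustration of the idea, not a recommended implementation.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, retention_days, prefix="logs-"):
    """Return the daily indices whose date suffix is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a managed index; never touch unknown names
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, so no expiry can be inferred
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


# Illustrative index names, not taken from any real cluster
names = ["logs-2019.09.01", "logs-2019.09.20", "users", "logs-latest"]
to_delete = expired_indices(names, today=date(2019, 9, 25), retention_days=14)
```

With a 14-day retention window, only `logs-2019.09.01` is selected here. Indices that do not follow the naming pattern are left alone, which is the safer default.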
If you are looking to increase the security of your Elasticsearch cluster, a security plugin is a great place to start and can help you prevent a data leak from exposing your clients’ data. Learn more about ReadOnlyREST’s security plugin for Elasticsearch and Kibana here.
The Infralytics Show
Thanks for reading our article and tuning in to episode 5 of the Infralytics Show. We have a great show planned for next week as well, so be sure to come back! Interested in checking out our past episodes? Here’s a link to episode 4.