In the previous post, we presented a system architecture to convert audio and voice into written text with AWS Transcribe, extract useful information for quick understanding of content with AWS Comprehend, index this information in Elasticsearch 6.2 for fast search and visualize the data with Kibana 6.2.

In this post we are going to see how to implement the previosly described architecture.
The main steps performed in the process are:

  1. Configure S3 Event Notification
  2. Consume messages from Amazon SQS queue
  3. Convert the recording to text with AWS Transcribe
  4. Entities/key phrases/sentiment detection using AWS Comprehend
  5. Index to Elasticsearch 6.2
  6. Search in Elasticsearch by entities/sentiment/key phrases/customer
  7. Visualize, report and monitor with Kibana dashboards
  8. Use Skedler and Alerts for reporting, monitoring and alerting

1. Configure S3 Event Notification

When a new recording has been uploaded to the S3 bucket, a message will be sent to an Amazon SQS queue.

You can read more information on how to configure the S3 Bucket and read the queue programmatically here: Configuring Amazon S3 Event Notifications.

This is how a message notified from S3 looks. The information we need are the object key and bucket name.

2. Consume messages from Amazon SQS queue

Now that the S3 bucket has been configured, a notification will be sent to the SQS queue when a recording is uploaded to the bucket. We are going to build a consumer that will perform the following operations:

  • Start a new AWS Transcribe transcription job
  • Check the status of the job
  • When the job is done, perform text analysis with AWS Comprehend
  • Index the results to Elasticsearch

With this code you can read the messages from a SQS queue, fetch the bucket and key (used in S3) of the uploaded document and use them to invoke AWS Transcribe for the speech to text task:

3. AWS Transcribe – Start Transcription Job

Once we have consumed a S3 message and we have the url of the new uploaded document, we can start a new transcription job (asynchronous) to perform the speech to text task.

We are going to use the start_transcription_job method.

It takes a job name, the S3 url and the media format as parameters.

To use the AWS Transcribe API be sure that your AWS Python SDK – Boto3 is updated.

Read more details here: Python Boto3 AWS Transcribe.

3a. AWS Transcribe – Check Job Status

Due to the asynchronous nature of the transcription job (it could take a while depending on the length and complexity of your recordings), we need to check the job status.

Once the stauts is “COMPLETED” we can retrieve the result of the job (the text converted from the recording).

Here’s how the output looks:

4. AWS Comprehend – Text Analysis

We have converted our recording to text. Now, we can run the text analysis using AWS Comprehend. The analysis will extract the following elements from the text:

  • Sentiment
  • Entities
  • Key phreses

Read more details here: Python Boto3 AWS Comprehend.

5. Index to Elasticsearch

Given a recording, we now have a set of elements that characterize it. Now, we want to index this information to Elasticsearch 6.2. I created a new index called audioarchive and a new type called recording.

The recording type we are going to create will have the following properties:

  • customer id: the id of the customer who submitted the recording (substring of the s3 key)
  • entities: the list of entities detected by AWS Comprehend
  • key phrases: the list of key phrases detected by AWS Comprehend
  • sentiment: the sentiment of the document detected by AWS Comprehend
  • s3Location: link to the document in the S3 bucket

Create the new index:

Add the new mapping:

We can now index the new document:

6. Search in Elasticsearch by entities, sentiment, key phrases or customer

Now that we indexed the data in Elasticsearch, we can perform some queries to extract business insights from the recordings.


Number of positive recordins that contains the _feedback_ key phrases by customer.

Number of recordings by sentiment.

What are the main key phares in the nevative recordings?

7. Visualize, Report, and Monitor with Kibana dashboards and search

With Kibana you can create a set of visualizations/dashboards to search for recording by customer, entities and to monitor index metrics (like number of positive recordings, number of recordings by customer, most common entities/key phreases in the recordings).

Examples of Kibana dashboards:

Percentage of documents by sentiment, percentage of positive feedback and key phrases:

kibana report dashboard

Number of recordings by customers, and sentiment by customers:

kibana report dashboard

Most common entities and heat map sentiment-entities:

kibana report

8. Use Skedler Reports and Alerts to easily monitor data

Using Skedler, an easy to use report scheduling and distribution application for Elasticsearch-Kibana-Grafana, you can centrally schedule and distribute custom reports from Kibana Dashboards and Saved Searches as hourly/daily/weekly/monthly PDF, XLS or PNG reports to various stakeholders. If you want to read more about it: Skedler Overview.

If you want to get notified when something happens in your index, for example, a certain entity is detected or the number of negative recording by customer reaches a certain value, you can use Skedler Alerts. It simplifies how you create and manage alert rules for Elasticsearch and it provides a flexible approach to notification (it supports multiple notifications, from Email to Slack and Webhook).


In this post we have seen how to use Elasticsearch as the search engine for customer recordings. We used the speech to text power of AWS Transcribe to convert our recording to text and then AWS Comprehend to extract semantic information from the text. Then we used Kibana to aggregate the data and create useful visualizations and dashboards. Then scheduled and distribute custom reports from Kibana Dashboards using Skedler Reports.

Environment configurations:

  • Elasticsearch and Kibana 6.2
  • Python 3.6.3 and AWS SDK Boto3 1.6.3
  • Ubuntu 16.04.3 LTS
  • Skedler Reports & Alerts
March 19, 2018

How to Extract Business Insights from Audio Using AWS Transcribe, AWS Comprehend and Elasticsearch – Part 2 of 2

In the previous post, we presented a system architecture to convert audio and voice into written text with AWS Transcribe, extract useful information for quick understanding […]
March 12, 2018
aws transcribe comprehend elasticsearch architecture

Extract business insights from audio using AWS Transcribe, AWS Comprehend and Elasticsearch – Part 1

Many businesses struggle to gain actionable insights from customer recordings because they are locked in voice and audio files that can’t be analyzed. They have a […]
January 26, 2018
APM with Elasticsearch Kibana Skedler

Application Performance Monitoring with Elasticsearch 6.1, Kibana and Skedler Alerts

Have you ever wonder how to easily monitor the performance of your application and how to house your application metrics in Elasticsearch? The answer is Elastic […]
December 19, 2017
Document Text Analytics using Amazon(AWS) Comprehend - Elasticsearch 6.0

How to Combine Text Analytics and Search using AWS Comprehend and Elasticsearch 6.0

How to automatically extract metadata from documents? How to index them and perform fast searches? In this post, we are going to see how to automatically […]