Determining data sentiment in the Amazon Aurora database using Amazon Comprehend

Oleksii Bebych

February 8, 2024

Problem statement

Collecting customer feedback is essential for every business. But collecting is not enough. We must understand and react to them to improve our product and grow the business. Processing feedback manually was a common thing earlier, and many companies still do it nowadays, but we live in the era of machine learning, and we have many ways to take advantage of it, including text analysis.

Let’s imagine we have a web application where users can leave feedback. Feedback is stored in a database, and later, we process feedback to understand if a customer is happy or if we have to improve something.

Amazon Aurora databases (both MySQL and PostgreSQL) can be integrated with Amazon Comprehend, a managed natural language processing (NLP) service used to extract insights from documents. With Amazon Comprehend, you can deduce sentiment based on the content of documents by analyzing entities, key phrases, language, and other features.

Configuring Amazon Aurora MySQL

I skip explaining how to create an Aurora database. Assume you already have it. We first need to create an IAM role for the Aurora Cluster, which allows using Amazon Comprehend.

Go to the “Manage IAM roles” tab, select Amazon Comprehend, and click “Connect service“:

IAM role will be created automatically:

IAM policy allows only two actions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAuroraToComprehendService",
            "Effect": "Allow",
            "Action": [
                "comprehend:DetectSentiment",
                "comprehend:BatchDetectSentiment"
            ],
            "Resource": "*"
        }
    ]
}

Trust relationships allow Amazon Aurora service to assume this role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "rds.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Then, we need to change one parameter of the MySQL engine. We can do it only in a custom parameter group, so let’s create it:

Change parameter aws_default_comprehend_role and set the previously created IAM role ARN:

Modify your Aurora cluster and change the parameter group from the default to the newly created custom one:

Apply changes and reboot cluster:

I’ve created a sample table in the database, which contains customer names, feedback, and date:

For Aurora MySQL, Aurora machine learning provides the following two built-in functions for working with Amazon Comprehend and your text data:

aws_comprehend_detect_sentiment
This function identifies the text as having a positive, negative, neutral, or mixed emotional posture. This function’s reference documentation is as follows.

aws_comprehend_detect_sentiment_confidence
This function measures the confidence level of the sentiment detected for a given text.

Let’s try it in MySQL:

SELECT feedback AS 'Customer feedback', 
    aws_comprehend_detect_sentiment(feedback, 'en') AS Sentiment, 
    ROUND(aws_comprehend_detect_sentiment_confidence(feedback, 'en'), 6) 
    AS Confidence FROM customer_feedback 
    ORDER BY Confidence DESC;

So, feedback “Disappointed with the product…” is NEGATIVE with 0.999 confidence, which seems correct =)

Meanwhile, “Great service! …” is POSITIVE with 0.999 confidence.

We can see calls to Amazon Comprehend in CloudWatch, so don’t forget to monitor the usage of such services to avoid unexpected costs.

Configuring Amazon Aurora PostgreSQL

Let’s check how to do it in PostgreSQL, as it’s a bit different. You still need the same IAM role, but you don’t need any changes in the Parameter group.

Select the Comprehend service and add the role:

Next, you need to install the Aurora machine-learning extension

demo=> CREATE EXTENSION IF NOT EXISTS aws_ml CASCADE;
NOTICE:  installing required extension "aws_commons"
CREATE EXTENSION


demo=> \dx
                        List of installed extensions
    Name     | Version |   Schema   |              Description              
-------------+---------+------------+-------------------------------------
 aws_commons | 1.2     | public     | Common data types across AWS services
 aws_ml      | 1.0     | public     | ml integration
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

We use a similar dataset:

For Aurora PostgreSQL, Aurora machine learning provides the following Amazon Comprehend function for working with your text data:

aws_comprehend.detect_sentiment

This function takes text as input and evaluates whether the text has a positive, negative, neutral, or mixed emotional posture. It outputs this sentiment along with a confidence level for its evaluation.

demo=> SELECT feedback, s.sentiment,s.confidence 
     	FROM customer_feedback,aws_comprehend.detect_sentiment(feedback, 'en') s 
     	ORDER BY s.confidence DESC;

Conclusion

In this post, we looked at how Amazon Aurora can be integrated with Amazon Comprehend to evaluate the sentiment of customer feedback. It can be integrated with other AWS Machine learning services as well, for example:

For MySQL, it can be SageMaker.

For PostgreSQL, you can use SageMaker and Amazon Bedrock.

This integration may not be supported in some AWS regions and some versions of DB engines, so check the availability here.

You don’t need to create any infrastructure for Amazon Comprehend. Just provide Aurora with permissions (IAM role), configure the DB engine (parameter group for MySQL and extension for PostgreSQL), and have fun =)

Determining data sentiment in the Amazon Aurora database using Amazon Comprehend

Table of Contents

Problem statement

Configuring Amazon Aurora MySQL

Configuring Amazon Aurora PostgreSQL

Conclusion

Scale with Confidence