Intro
Welcome! If, like me, you’ve jumped between countless posts trying to understand how DynamoDB transactions work with Amplify, and still haven’t found a concise explanation, then congratulations! You’ve stumbled upon the right article.
If you're here hoping to learn the secret recipe for making pancakes that double as frisbees, or perhaps how to teach your goldfish to play chess, then I'm afraid I won't be able to help.
This article is meant for people who have a basic understanding of Amplify and React. But don't worry, I’ll do my best to explain everything clearly.
Overview of the Tools & Technologies
Before diving into transactions, let’s first understand the tools we will be using:
- React: takes care of our UI.
- AWS Amplify: manages our backend and provides everything needed to build web and mobile apps.
- AppSync: connects apps to data and events with secure, serverless, and performant GraphQL APIs.
- DynamoDB: a serverless, NoSQL, fully managed database with single-digit millisecond performance at any scale.
These tools work together seamlessly to enable developers to build full-stack applications. React handles the frontend user interface, while Amplify abstracts away the complexity of managing backend infrastructure by providing services like authentication, storage, and API integration. AWS AppSync acts as the glue between the frontend and backend, allowing developers to define GraphQL APIs and perform real-time data updates.
Now, Let’s Rock🤘
What Are Transactions?
In the world of databases, transactions are like magical spells. They allow you to group multiple operations into a single, all-or-nothing batch. This means that if one part of the transaction fails, the entire operation is rolled back, ensuring data integrity. This is especially important for complex applications where multiple related updates need to happen simultaneously.
Put simply, a transaction is a single unit of work made up of multiple operations that are guaranteed to succeed or fail together.
Why bother with transactions?
Transactions provide several key advantages:
- Atomicity: ensures that all parts of your operation complete successfully, or none at all.
- Consistency: guarantees that the database remains in a valid state before and after your operations.
These properties make transactions reliable and essential for applications that require consistent and accurate data.
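To make the all-or-nothing idea concrete, here is a tiny in-memory sketch. It has nothing to do with DynamoDB yet, and `runAtomically` is a hypothetical helper invented for this illustration: every operation runs against a copy of the state, and the copy is only returned if none of them throws.

```javascript
// Apply every operation to a working copy; commit only if all of them succeed.
function runAtomically(state, operations) {
  const draft = { ...state }; // shallow copy, enough for this flat example
  for (const operation of operations) {
    operation(draft); // any throw aborts the whole batch
  }
  return draft; // reached only when every operation succeeded
}

const before = { status: "PENDING", remainingVacationDays: 10 };

let after;
try {
  after = runAtomically(before, [
    (draft) => { draft.status = "ACCEPTED"; },
    () => { throw new Error("simulated failure in the second operation"); },
  ]);
} catch (error) {
  after = before; // rolled back: the original state was never touched
}

console.log(after.status);
```

Because the second operation fails, the status change from the first one is discarded along with it.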
A Practical Example Where a Transaction is Needed
Imagine you’re managing a complex application where you need to ensure that multiple operations are completed together. For example, you might have a vacation request system where employees submit vacation requests and, upon approval, their remaining vacation days are updated. This involves updating both the vacation request status and the user’s remaining vacation days in the database.
Handling these updates separately can lead to inconsistencies if something goes wrong in between. That’s where DynamoDB transactions come to the rescue!
With the provided data models, you can create powerful GraphQL queries to retrieve users and their associated vacations or vice versa. This flexibility allows you to efficiently fetch related data in a single query, improving performance and simplifying data retrieval.
But wait, can’t I do the same with mutations?
GraphQL queries and mutations serve different purposes in managing data. Queries are designed to retrieve data, allowing you to fetch nested related information in a single request. With the provided data models, queries like `getUserWithVacations` work seamlessly.
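As a rough idea of what such a nested query might look like, here is a sketch written the way Amplify's generated query files store documents (as tagged template strings). The field names (`remainingVacationDays`, `vacations.items`, `start`, `end`, `status`) are assumptions based on the vacation example in this article, not a copy of the real schema.

```javascript
// Hypothetical query document: one request returns a user and their vacations.
// All field names are assumptions for illustration.
const getUserWithVacations = /* GraphQL */ `
  query GetUserWithVacations($id: ID!) {
    getUser(id: $id) {
      id
      remainingVacationDays
      vacations {
        items {
          id
          start
          end
          status
        }
      }
    }
  }
`;

// Variables for the query; "user-123" is a made-up id.
const variables = { id: "user-123" };

console.log(getUserWithVacations.trim().split("\n")[0]);
```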
However, when it comes to mutations, the story is different. In GraphQL, mutations are used to modify data. Ideally, you would want to perform complex operations, such as updating a user's vacation status and their remaining vacation days, in a single, nested mutation. This would ensure data consistency and reduce the risk of errors during the update process.
The Limitations of Nested Mutations in Amplify GraphQL
In GraphQL, the ability to perform nested mutations, where you update related resources in a single operation, can be incredibly convenient and efficient. However, when working with AWS Amplify, this feature is not directly supported.
While Amplify simplifies many aspects of building GraphQL APIs, it does have its limitations, particularly when it comes to nested mutations.
My opinion on why it was done this way
In a typical GraphQL schema, you define your data models and their relationships using types and fields. While GraphQL itself supports nested mutations, the implementation and handling of these mutations depend on the server-side framework or service you’re using.
- Custom Resolvers: To handle nested mutations in GraphQL, you often need to implement custom resolvers on the server side. These resolvers interpret the nested mutation requests and perform the necessary operations on the underlying data.
- Data Integrity and Consistency: Nested mutations can introduce challenges related to data integrity and consistency. Ensuring that related resources are updated atomically and consistently requires careful planning and implementation, which may not align with the goals of simplicity and ease of use that Amplify prioritizes.
Amplify’s simplicity model
💡 Amplify’s Simplified API: Amplify aims to simplify the process of building and deploying GraphQL APIs by providing a straightforward API for common use cases. While this makes development easier for many scenarios, it also means sacrificing some flexibility, such as the ability to perform nested mutations.
💡 Resource Management: Amplify manages your backend resources, including databases like DynamoDB. While this abstracts away much of the infrastructure management, it also means that you have less control over the underlying data operations, including nested mutations.
💡 Complexity and Performance: Implementing nested mutations can introduce complexity and potential performance issues, especially when dealing with large datasets or complex relationships between resources. Amplify’s focus on simplicity and ease of use may prioritize avoiding such complexities.
Now that we have understood why Amplify does not support nested mutations directly, it’s clear that this decision is rooted in the goal of maintaining simplicity and avoiding potential complexities and performance issues. While this limitation may seem restrictive, it actually promotes better practices by encouraging developers to implement transactional logic in a more reliable and robust manner.
The Naive implementation without a transaction: (Sequential Updates on the client)
You might think you have the solution: implement this kind of logic on your frontend. Let’s go over a typical implementation you might come up with. In this approach, separate operations are performed:
- 🔖 Fetch Vacation Details with the user’s data:
  - ✅ Perform a GraphQL API call to retrieve the vacation and user details using the provided `vacationId`.
- 🔖 Calculate Vacation Days:
  - ✅ Calculate the number of days between `start` and `end`.
- 🔖 Update Vacation Status:
  - ✅ Perform an API call to update the vacation status to `ACCEPTED`.
- 🔖 Update User’s Remaining Vacation Days:
  - ✅ Calculate the new `remainingVacationDays` by subtracting `vacationDays` from the current `remainingVacationDays`.
  - ✅ Perform an API call to set the new `remainingVacationDays` for the user.
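The steps above can be sketched roughly like this. The `api` object and its `getVacation`, `updateVacation` and `updateUser` methods are hypothetical stand-ins for whatever GraphQL calls your frontend makes, and counting both the first and the last day of the vacation is an assumption too.

```javascript
// Whole days between two ISO dates (YYYY-MM-DD), counting both endpoints.
// The inclusive count is an assumption for this sketch.
function countVacationDays(start, end) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.round((Date.parse(end) - Date.parse(start)) / msPerDay) + 1;
}

// `api` is a hypothetical client exposing getVacation, updateVacation, updateUser.
async function approveVacationNaively(api, vacationId) {
  // 1. Fetch the vacation together with its user.
  const vacation = await api.getVacation(vacationId);

  // 2. Work out how many days to deduct.
  const vacationDays = countVacationDays(vacation.start, vacation.end);

  // 3. First write: mark the vacation as accepted.
  await api.updateVacation({ id: vacationId, status: "ACCEPTED" });

  // 4. Second write: deduct the days. If this call fails, the first write
  //    has already happened and nothing rolls it back.
  await api.updateUser({
    id: vacation.user.id,
    remainingVacationDays: vacation.user.remainingVacationDays - vacationDays,
  });
}
```

Nothing ties step 3 and step 4 together: they are two independent requests, so the second can fail after the first has already been saved.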
The Problem with Separate Updates
Consider the following scenario:
- Vacation Request Submission:
  - John Doe submits a vacation request for a week.
  - The request is initially marked as `PENDING`.
- Vacation Approval Process:
  - An administrator reviews and approves John’s vacation request.
  - The vacation status is updated to `ACCEPTED`.
  - John’s remaining vacation days need to be reduced by the number of days requested.
The Risk of Inconsistency
What happens if the vacation status update succeeds, but the user update fails due to a network issue or some other error? You’ll end up with an `ACCEPTED` vacation request, but John’s remaining vacation days will not be correctly updated. This inconsistency can lead to significant issues in your application’s data integrity.
On the bright side, John might just become the luckiest man alive – enjoying his approved vacation without losing any vacation days! Imagine the envy in the office when everyone realizes John’s figured out the ultimate hack: a never-ending vacation with untouched vacation days. But alas, this might be good for John, but it’s a disaster for maintaining accurate records in your system.
Implementing this type of transaction logic on the frontend is fraught with pitfalls and is generally considered bad practice for several reasons:
- Network Latency and Reliability: frontend operations are susceptible to network issues. If a network error occurs between the two API calls, you might end up with an inconsistent state.
- Error Handling Complexity: handling errors and rollbacks in the frontend adds unnecessary complexity and increases the likelihood of bugs and inconsistent states.
Attempting to implement transaction logic on the frontend is untenable and hard to conceive. It's also a guaranteed way of losing respect from your peers and compromising the integrity of your application.
Too much talk, now show me the transaction.
Let’s create our transaction.
Different Paths to Implementing a Nested Mutation in a Transactional Way
To overcome the limitation of nested mutations in Amplify, you can explore several alternative approaches:
💡 Custom Lambda Function Resolver:
- Define your business logic in a custom AWS Lambda function. This function can handle complex transactions, ensuring that all related operations are executed atomically.
- When a mutation is triggered, the Lambda function can update multiple tables and handle errors appropriately to maintain data integrity.
💡 HTTP Resolver:
- Use an HTTP endpoint to manage the logic for nested mutations. This endpoint can be a microservice that processes the transaction and updates all necessary resources.
- This approach allows you to decouple the transaction logic from your frontend and leverage existing APIs or microservices.
💡 AppSync JavaScript or VTL Resolver:
- Utilize AppSync's JavaScript or VTL resolvers to write custom logic directly within your GraphQL API.
- These resolvers offer a powerful way to handle nested mutations and ensure that all operations are performed as a single transaction.
AWS AppSync uses VTL to translate GraphQL requests from clients into a request to your data source. Then it reverses the process to translate the data source response back into a GraphQL response. VTL is a logical template language that gives you the power to manipulate both the request and the response in the standard request/response flow of a web application.
And speaking of VTL (Velocity Template Language), let’s just say it’s not everybody’s cup of tea. VTL is the kind of template language that makes you question your life’s choices. It gives you the same feeling you get when trying to debug a regex pattern that was written in hieroglyphs. It does not matter if you are regex certified🤣.
AppSync JavaScript or VTL Resolver
Since we’re using AppSync, it makes the most sense to choose the third option: A JavaScript resolver. This approach allows us to implement our custom business logic directly in JavaScript, making it more manageable and familiar compared to other methods.
Don’t you worry, we’ll be running away from VTL as much as we can.
Step 1: Add our function
Creating a Lambda function in Amplify to handle transactions requires configuring permissions properly to ensure it has access to the necessary resources. Here’s a step-by-step guide on how to do this, with explanations to help you understand why each step is needed.
- Run the Amplify CLI Command
The `amplify add function` command starts the process of adding a new Lambda function to your Amplify project.
- Select Lambda Function
Choose “Lambda function” to create a new serverless function. Lambda functions can run backend logic in response to various events, such as HTTP requests or database updates.
- Name Your Function
Provide a descriptive name for your function. Naming it clearly helps identify its purpose.
- Choose the Runtime
Select “NodeJS” for the runtime environment since we’re using JavaScript. This determines the programming language used to write your function.
- Select the Function Template
Start with the “Hello World” template. This provides a basic function structure that you can build upon.
- Configure Advanced Settings
Choose “Yes” to configure additional settings that grant your function the necessary permissions.
- Grant Access to Other Resources
To perform transactions, your function needs access to specific resources in your project, like DynamoDB tables.
- Select Resource Categories
Use the space key to select both `api` and `storage`. This step ensures your function can interact with your AppSync API and DynamoDB storage.
- Specify Operations for API Access
Select “Mutation” (and optionally “Query”) so your function can perform data modifications via your GraphQL API.
- Select Specific Resources
Ensure your function has access to both the “Vacations” and “Users” models to read and update records. You might have more than two resources; select only the ones your function needs to access (principle of least privilege).
- Set Permissions for Operations
Give your function permission to read and update records in the “Vacations” table. Do the same for the “Users” resource.
- Final Configuration Steps
These final settings ensure the function is configured as needed without additional complexity for scheduling or environment variables.
By following these steps, you configure a Lambda function that can interact with your AppSync API and DynamoDB tables. This setup ensures your function has the necessary permissions to perform transactions, maintaining data integrity and consistency.
Your function will be created and saved at the following location:
amplify\backend\function
This is where you will write your function code.
When working with AWS Lambda and Node.js, especially on a runtime such as `Node.js 20`, it is crucial to understand the module resolution system. By default, Lambda functions created with Amplify are set up with an `index.js` file. However, if you want to use ES module syntax with `import` statements, you'll need to use a `.mjs` file extension instead of `.js`.
Function to contain the business logic
index.mjs
- Marshall function:
  - The `marshall` function takes a plain JavaScript object and converts it into a DynamoDB-compatible format.
- Transaction Items:
  - The `transactItems` array contains two operations that are part of the same transaction. Both operations must succeed for the transaction to be committed; otherwise, all operations are rolled back. Each item in this array must have exactly one of the following top-level properties:
    - 🔥 Put
      - ✅ Initiates a `PutItem` operation to write a new item. This structure specifies the primary key of the item to be written, the name of the table to write it in, an optional condition expression that must be satisfied for the write to succeed, a list of the item’s attributes, and a field indicating whether to retrieve the item’s attributes if the condition is not met.
    - 🔥 Update
      - ✅ Initiates an `UpdateItem` operation to update an existing item. This structure specifies the primary key of the item to be updated, the name of the table where it resides, an optional condition expression that must be satisfied for the update to succeed, an expression that defines one or more attributes to be updated, and a field indicating whether to retrieve the item’s attributes if the condition is not met.
    - 🔥 Delete
      - ✅ Initiates a `DeleteItem` operation to delete an existing item. This structure specifies the primary key of the item to be deleted, the name of the table where it resides, an optional condition expression that must be satisfied for the deletion to succeed, and a field indicating whether to retrieve the item’s attributes if the condition is not met.
    - 🔥 ConditionCheck
      - ✅ Applies a condition to an item that is not being modified by the transaction. This structure specifies the primary key of the item to be checked, the name of the table where it resides, a condition expression that must be satisfied for the transaction to succeed, and a field indicating whether to retrieve the item’s attributes if the condition is not met.
- TransactWriteItemsCommand: This command ensures that both updates are performed atomically, maintaining data integrity. If either operation fails, no changes are made to the database.
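To tie these pieces together, here is a minimal sketch of how the two `Update` items for the vacation example might be built. Treat everything specific in it as an assumption: the `buildApprovalTransaction` helper, the table names, the `id` key, and the attribute names are invented for illustration, and the condition expressions are an optional extra rather than something the transaction requires. To keep the sketch dependency-free, the attribute values are written by hand in DynamoDB's format (`{ S: ... }` for strings, `{ N: ... }` for numbers as strings), which is the shape `marshall` from `@aws-sdk/util-dynamodb` would produce for you.

```javascript
// Builds the TransactItems array for approving a vacation.
// Table names, key names and attribute names are assumptions for illustration.
function buildApprovalTransaction({ vacationTable, userTable, vacationId, userId, vacationDays }) {
  return [
    {
      // Operation 1: flip the vacation from PENDING to ACCEPTED.
      Update: {
        TableName: vacationTable,
        Key: { id: { S: vacationId } },
        UpdateExpression: "SET #status = :accepted",
        // Optional guard: only approve requests that are still pending.
        ConditionExpression: "#status = :pending",
        // "status" is a DynamoDB reserved word, so it needs a placeholder.
        ExpressionAttributeNames: { "#status": "status" },
        ExpressionAttributeValues: {
          ":accepted": { S: "ACCEPTED" },
          ":pending": { S: "PENDING" },
        },
      },
    },
    {
      // Operation 2: deduct the days from the user's balance.
      Update: {
        TableName: userTable,
        Key: { id: { S: userId } },
        UpdateExpression: "SET remainingVacationDays = remainingVacationDays - :days",
        // Optional guard: never let the balance go negative.
        ConditionExpression: "remainingVacationDays >= :days",
        ExpressionAttributeValues: { ":days": { N: String(vacationDays) } },
      },
    },
  ];
}

const transactItems = buildApprovalTransaction({
  vacationTable: "Vacation-example-dev", // made-up table name
  userTable: "User-example-dev", // made-up table name
  vacationId: "vacation-1",
  userId: "user-1",
  vacationDays: 5,
});

console.log(JSON.stringify(transactItems, null, 2));
```

Inside the Lambda handler, an array like this would be passed as `TransactItems` to a `TransactWriteItemsCommand` from `@aws-sdk/client-dynamodb` and sent with a `DynamoDBClient`. If either condition fails, DynamoDB rejects the whole transaction and neither item changes.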
Step 2: Define our Mutation type
We define a mutation field `UpdateVacationStatusAndDaysForUser`, with an input type `UpdateVacationStatusAndDaysForUserInput`. This structure allows us to encapsulate the parameters for updating vacation status and days in a structured manner. The `@function` directive links the mutation field to an AWS Lambda function, enabling the execution of custom business logic.
amplify/backend/api/app-name/schema.graphql
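From the frontend's point of view, calling this mutation could look roughly like the sketch below. The mutation and input type names come from this article, but the input fields (`vacationId`, `userId`, `vacationDays`) and the scalar return value are assumptions, so adjust them to whatever your schema actually declares.

```javascript
// Mutation document for the custom field. It assumes the field returns a
// scalar; if yours returns an object type, add a selection set.
const updateVacationStatusAndDaysForUser = /* GraphQL */ `
  mutation UpdateVacationStatusAndDaysForUser(
    $input: UpdateVacationStatusAndDaysForUserInput!
  ) {
    UpdateVacationStatusAndDaysForUser(input: $input)
  }
`;

// Wraps the parameters in the single `input` argument the mutation expects.
// The three input fields are hypothetical.
function buildApprovalVariables(vacationId, userId, vacationDays) {
  return { input: { vacationId, userId, vacationDays } };
}

const request = {
  query: updateVacationStatusAndDaysForUser,
  variables: buildApprovalVariables("vacation-1", "user-1", 5),
};

console.log(JSON.stringify(request.variables));
```

The resulting `query` and `variables` pair is what you would hand to your Amplify GraphQL client. The frontend makes one request, and the Lambda function behind it performs both updates in a single transaction.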
Once you’ve made all these changes, be sure to push your code using the Amplify CLI to deploy the updates to your AWS environment.
If you visit the AWS AppSync console and navigate to the Schema section, you will be able to see the custom `type` and `mutation` definitions we created. There, you will also notice the pipeline resolver attached to our mutation. This resolver coordinates the transaction logic we implemented, ensuring that our `UpdateVacationStatusAndDaysForUser` function executes correctly.
Tips
Logging
When developing and running AWS Lambda functions, logging is crucial for monitoring, debugging, and gaining insights into your application’s behavior. AWS Lambda automatically integrates with Amazon CloudWatch Logs, allowing you to capture and store log data. To view the logs of your function, follow these steps:
- Navigate to the AWS Lambda console.
- Select your function from the list.
- Go to the Monitoring tab.
- Click on View logs in CloudWatch.
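Anything your function writes with `console.log` ends up in those CloudWatch logs. One possible way to make the entries easier to search is to log one JSON object per line. The `logEvent` helper below is a hypothetical convenience, not part of any AWS library.

```javascript
// Writes one JSON object per log line so entries are easy to filter later.
function logEvent(level, message, details = {}) {
  const entry = { level, message, ...details, timestamp: new Date().toISOString() };
  console.log(JSON.stringify(entry));
  return entry;
}

// Example: record which vacation a transaction attempt was for.
logEvent("info", "transaction started", { vacationId: "vacation-1" });
logEvent("error", "transaction failed", { vacationId: "vacation-1", reason: "simulated" });
```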
Multiple environment naming
AWS Amplify manages different environments (e.g., dev, prod) by appending the environment name to various resources, including DynamoDB tables. This allows you to have multiple isolated environments without conflicts.
To find the actual names of your resources, inspect your function's environment variables, which give you a full list of what you have access to.
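A quick way to do that inspection from inside the function is to filter `process.env`. The sketch below assumes the table names are exposed in variables whose names end in `TABLE_NAME`; check the output against your own function before relying on that pattern. `findTableNames` is a hypothetical helper.

```javascript
// Collects every environment variable whose name ends in TABLE_NAME.
// The naming pattern is an assumption; verify it against your own function.
function findTableNames(env) {
  return Object.fromEntries(
    Object.entries(env).filter(([name]) => name.endsWith("TABLE_NAME"))
  );
}

// Inside the Lambda, this prints the table names available to the function.
console.log(findTableNames(process.env));
```

Reading the table names this way, instead of hard-coding them, keeps the same function code working across your dev and prod environments.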
Hey, I hope I've saved you tons of time! Now that you're a transaction pro, it's time to leave my site.