A Simple Family Budget Tracker with AWS SES, Lambda and DynamoDB

How do you keep a family budget under control? Many people find it hard to stick to a budget, especially now that most purchases are made with a credit card.

If most purchases are paid for by credit card, the only way to find out how much money is left or how much has been spent is to log in to the bank’s online or mobile banking, or to check the balance in a financial aggregator such as Mint. This is possible, but it requires discipline, and discipline is hard to maintain when you are not the only person paying with the card.

The obvious option is to use the bank’s API or to log in to its online banking programmatically with a headless browser. Unfortunately, API access is often a paid service, and logging in to online banking is problematic because of two-factor authentication. However, there is another option. Almost all banks today send an alert for each transaction, stating when, where and for how much it took place. This is exactly the information needed to maintain the budget. It remains to figure out how to process it.

Many banks can send alerts to your mobile phone and to your email. Email alerts can be processed automatically somewhere in the cloud, for example in AWS.

What do we need in AWS?

There are many services in AWS, but we need only three: SES to receive and send emails, Lambda to process them, and DynamoDB to store the result. A few more services tie these together: SNS, Kinesis and CloudWatch. This is not the only way to handle the messages: instead of Lambda you can use EC2, instead of DynamoDB you can store the data in RDS (MySQL, PostgreSQL, Oracle, …), or you can write a simple script in Perl with Berkeley DB on your own small server.

What does the processing look like overall? A transaction email arrives, we record the date, amount and place of payment in the database, and once a day we send an email with the balance for the current month. The full architecture is a bit more complicated and looks like this:

  • An email arrives at SES.
  • SES publishes the email to an SNS topic.
  • The ProcessCharge Lambda function is triggered by the arrival of the email via SNS, parses the email and writes the transaction data to the DynamoDB table Transactions.
  • The UpdateSummary Lambda function runs as a trigger after each write to the Transactions table and updates the current state of the budget in the Summary table.

Let’s consider these steps in detail.

Email receiving

Simple Email Service (SES) is a service for receiving and sending emails. When an email is received, you can specify which action should be performed: save the email to S3, call a Lambda function, publish the email to an SNS topic, and others. To receive emails, you need to bind your domain to SES, namely, point the domain’s MX record at the SES server.

When binding a domain to SES, verification is required. To do this, SES asks you to add some records (MX and TXT) to the DNS zone, and then checks for their presence. If the domain is hosted in Route 53, all of this is done automatically. Once the domain is verified, you can proceed to set up the rules for receiving mail.

Email processing

When a new email is published to the SNS topic, the Lambda function ProcessCharge is invoked. It needs to do two things: parse the email and save the data to the database.
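Before any parsing can happen, the email has to be pulled out of the SNS event that Lambda receives. The following is a minimal sketch of that step, not the original code. It assumes the SES rule uses the SNS action with UTF-8 (not Base64) encoding, so that the raw email arrives in the "content" field of the notification. The function name extract_email_text is hypothetical.

```python
import email
import json


def extract_email_text(event):
    """Return the plain-text body of the email carried by an SNS-triggered event."""
    # SNS hands the SES notification to Lambda as a JSON string.
    notification = json.loads(event["Records"][0]["Sns"]["Message"])
    # Assumption: the SES SNS action is configured with UTF-8 encoding,
    # so "content" holds the raw MIME email as text.
    message = email.message_from_string(notification["content"])
    if message.is_multipart():
        # Take the first text/plain part and skip HTML alternatives.
        for part in message.walk():
            if part.get_content_type() == "text/plain":
                return part.get_payload(decode=True).decode("utf-8", errors="replace")
        return ""
    return message.get_payload(decode=True).decode("utf-8", errors="replace")
```

The returned text is what a function such as parse_content() would then work on.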

The parse_content() method is responsible for parsing.

In it, we remove unnecessary characters from the email text and use a regular expression to check whether the email contains information about a transaction and, if so, break it into parts.
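As an illustration of this step, here is a small sketch. The alert wording in the pattern is invented for the example, because every bank phrases its alerts differently. The names CHARGE_RE and find_charge are hypothetical as well, so the pattern has to be adapted to the emails your bank actually sends.

```python
import re

# Hypothetical alert wording; replace it with a pattern for your bank's emails.
CHARGE_RE = re.compile(
    r"A charge of \$(?P<amount>[\d,]+\.\d{2}) at (?P<place>.+?) "
    r"has been authorized on (?P<date>.+? (?:EDT|EST))"
)


def find_charge(text):
    """Return the amount, place and date strings, or None if the email is not a charge alert."""
    # Collapse line breaks and repeated whitespace so wrapped lines still match.
    cleaned = " ".join(text.split())
    match = CHARGE_RE.search(cleaned)
    if match is None:
        return None
    return {
        "amount": match.group("amount").replace(",", ""),  # "1,234.56" -> "1234.56"
        "place": match.group("place"),
        "date": match.group("date"),
    }
```

Returning None for non-matching emails lets the caller skip statements, marketing mail and anything else that reaches the same address.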

Unfortunately, the standard Python library knows few time zone names, and EDT (Eastern Daylight Time) is not among them. Therefore, we replace EDT with the numerical offset -0400 and do the same for the standard time zone, EST (-0500). After that, we can parse the date and time of the transaction and convert it to the ISO 8601 format, which can be stored in DynamoDB as a string.
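The time zone substitution can be sketched as follows. The date layout in fmt is an assumption about how the bank writes dates, and the names TZ_OFFSETS and to_iso8601 are hypothetical.

```python
from datetime import datetime

# Numerical offsets for the two time zone names the alerts are assumed to use.
TZ_OFFSETS = {"EDT": "-0400", "EST": "-0500"}


def to_iso8601(date_text, fmt="%m/%d/%Y %I:%M:%S %p %z"):
    """Convert a date such as '08/15/2017 10:30:45 AM EDT' to an ISO 8601 string."""
    # strptime's %z understands numerical offsets but not the names EDT/EST.
    for name, offset in TZ_OFFSETS.items():
        date_text = date_text.replace(name, offset)
    return datetime.strptime(date_text, fmt).isoformat()
```

For example, to_iso8601("08/15/2017 10:30:45 AM EDT") returns "2017-08-15T10:30:45-04:00".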

The method returns a dictionary with the transaction amount, the store name and the date with time. This data is passed to the process_transaction method.

In it, we generate a unique transaction identifier and save the data to the Transactions table.
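A possible shape for this method is sketched below. It assumes that table is a boto3 DynamoDB Table resource (for example boto3.resource("dynamodb").Table("Transactions")) and that the attribute names id, amount, place and date are used. Neither the attribute names nor the choice of UUIDs for identifiers is confirmed by the original code.

```python
import uuid
from decimal import Decimal


def process_transaction(table, transaction):
    """Store one parsed transaction and return its generated identifier."""
    item = {
        "id": str(uuid.uuid4()),  # assumed key attribute; a random UUID is unique enough here
        "amount": Decimal(transaction["amount"]),  # boto3 requires Decimal, not float
        "place": transaction["place"],
        "date": transaction["date"],  # ISO 8601 string
    }
    table.put_item(Item=item)
    return item["id"]
```

Passing the table in as an argument keeps the function easy to test with a stand-in object.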

Budget update

How the budget is tracked deserves a more detailed description. We define several values:

  • budget – the size of the budget for the month;
  • total – the amount of spending per month;
  • available – the remaining balance (budget – total).

We want to know all these values at any time. This can be done in two ways:

  • Every time you need to know the state of the budget, all transactions are summed to get total, and then available = (budget – total).
  • Each time a new transaction is recorded, total is updated. When you need to know the state of the budget, it is enough to compute available = (budget – total).

Both approaches have pros and cons, and the choice depends heavily on the requirements and limitations of the system. The first approach is good because it does not denormalize the data: no total is stored separately from the transactions.

On the other hand, the sum must be calculated on every request. DynamoDB adds a limitation here: to calculate the sum of N transactions, you need to read N records, which consumes read capacity in proportion to N. This is not a very scalable solution, and it will cause difficulties (or high cost) even with several dozen transactions.

With the second approach, total is updated after each transaction and is always up to date, which avoids summing all transactions. You can implement it in different ways:

  • Update total after each transaction is recorded, in the same Lambda function ProcessCharge.
  • Update total in a trigger after adding a new item to the Transactions table.

Updating in a trigger is more practical, including in terms of concurrency, so we create the Lambda function UpdateSummary.

We are only interested in events that add items to the table; all others are ignored.
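The filtering can be sketched like this. It assumes the trigger is a DynamoDB stream whose view type includes the new image, and that the items carry date and amount attributes. The function name inserted_transactions and the attribute names are hypothetical.

```python
from decimal import Decimal


def inserted_transactions(event):
    """Return (date, amount) pairs for every item that was added to the table."""
    result = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # MODIFY and REMOVE events are ignored
        # Stream records use DynamoDB's typed JSON: {"S": ...} for strings, {"N": ...} for numbers.
        image = record["dynamodb"]["NewImage"]
        result.append((image["date"]["S"], Decimal(image["amount"]["N"])))
    return result
```

Each returned pair can then be handed to the code that works out the period and updates the total.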

In process_transaction(), we calculate the period to which the transaction relates, in the form year-month, and call the method that updates the total.

This code is not perfect. The reason is an interesting quirk of Python: it cannot parse a date/time whose time zone is in the -HH:MM format, even though that format conforms to ISO 8601 and was generated by Python itself (in the parse_content() method above). Before Python 3.7, the %z directive of strptime accepted only the -HHMM form.

Therefore, we simply replace the time zones we need with the -HHMM format that it understands.
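A sketch of the period calculation with this workaround is shown below. Instead of replacing specific time zones, this variation drops the colon from any trailing offset, which has the same effect for -04:00 and -05:00. The function name transaction_period is hypothetical.

```python
from datetime import datetime


def transaction_period(iso_date):
    """Return the 'YYYY-MM' period for an ISO 8601 timestamp such as '2017-08-15T10:30:45-04:00'."""
    # Turn a trailing "-04:00" into "-0400" so that %z can parse it on older Python versions.
    if len(iso_date) >= 6 and iso_date[-3] == ":" and iso_date[-6] in "+-":
        iso_date = iso_date[:-3] + iso_date[-2:]
    parsed = datetime.strptime(iso_date, "%Y-%m-%dT%H:%M:%S%z")
    return parsed.strftime("%Y-%m")
```

For example, transaction_period("2017-08-15T10:30:45-04:00") returns "2017-08".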

Total update:

In this method, we use load_summary() to load the summary for the current period, whose total we need to update. If the summary does not exist yet, we create it in create_summary(); if it exists, we update it in update_summary().

Since the summary can be updated from several threads, we use a strongly consistent read, which is more expensive but guarantees that we get the last written value.
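With boto3, a strongly consistent read is requested through the ConsistentRead parameter of get_item. A minimal sketch, assuming table is a boto3 Table resource for the Summary table with period as its key:

```python
def load_summary(table, period):
    """Return the summary item for the period, or None if it does not exist yet."""
    response = table.get_item(
        Key={"period": period},
        ConsistentRead=True,  # strongly consistent: reflects all earlier successful writes
    )
    # get_item omits the "Item" key when nothing is found.
    return response.get("Item")
```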

When creating a new summary, a conditional expression is used for the same reason (writes may come from multiple threads): ConditionExpression = ‘attribute_not_exists(period)’ saves the new summary only if it does not already exist.

Thus, if someone manages to create the summary in the interval between our attempt to load it in load_summary(), when it was not there yet, and our attempt to create it in create_summary(), our call to put_item() will end with an exception and the entire Lambda function will be restarted.
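The conditional write might look like the sketch below. The condition expression is the one quoted above, while the item layout, including a budget attribute stored next to total, is an assumption. With boto3, a failed condition surfaces as a botocore ClientError with the code ConditionalCheckFailedException, which this sketch deliberately does not catch so that the invocation fails and is retried.

```python
def create_summary(table, period, total, budget):
    """Create the summary for a period, failing if another writer created it first."""
    table.put_item(
        # Assumed item layout: period is the key, budget is stored next to total.
        Item={"period": period, "total": total, "budget": budget},
        # Write only if no item with this key exists yet.
        ConditionExpression="attribute_not_exists(period)",
    )
```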

The total value in the summary is updated inside DynamoDB, by an update expression.

Strictly speaking, this should be enough for a safe update, but we can act conservatively and add a condition: the write should happen only if the summary has not been updated in another thread and still contains the value that we read.

Since total is a reserved word in DynamoDB, you need to define an alias (an expression attribute name) in order to use it in DynamoDB expressions.
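Putting the last three points together, the update could be sketched as follows. The placeholder names #total, :amount and :expected are arbitrary, and the exact expressions in the original code may differ. The amounts should be Decimal values when a boto3 Table resource is used.

```python
def update_summary(table, period, expected_total, amount):
    """Add amount to the stored total, but only if the total is still expected_total."""
    table.update_item(
        Key={"period": period},
        # The addition is performed by DynamoDB itself, not in Lambda code.
        UpdateExpression="SET #total = #total + :amount",
        # Conservative guard: fail if another writer changed the total since we read it.
        ConditionExpression="#total = :expected",
        # "total" is a reserved word, so it is referenced through the alias #total.
        ExpressionAttributeNames={"#total": "total"},
        ExpressionAttributeValues={":amount": amount, ":expected": expected_total},
    )
```

If the condition fails, the call raises an exception, and retrying the whole function makes it reload the summary and try again with the fresh value.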

This completes the transaction processing and budget update process.

Sending budget notifications

The last part of the system is notification of the state of the budget. For some people, it is enough to receive a notification once a day. However, nothing prevents you from sending a notification after each transaction, or when spending or the balance crosses some threshold. The architecture for sending an email notification is quite simple and looks like this:

  1. A CloudWatch timer fires once a day and calls the DailyNotification Lambda function.
  2. DailyNotification loads data from the DynamoDB Summary table and calls SES to send an email.

First, we try to load the summary for the current period; if there is none, we stop. If there is one, we prepare and send the email.
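A sketch of this function is shown below. It assumes that table is a boto3 Table resource for Summary, that ses is a boto3 SES client (boto3.client("ses")), and that each summary item stores budget alongside total. The function name, the message wording and the parameters sender and recipients are hypothetical.

```python
def daily_notification(table, ses, period, sender, recipients):
    """Email the budget status for the period; return False if there is no summary."""
    summary = table.get_item(Key={"period": period}).get("Item")
    if summary is None:
        return False  # nothing recorded for this period yet
    # Assumption: the summary item holds both "budget" and "total".
    available = summary["budget"] - summary["total"]
    body = "Budget: {}\nSpent: {}\nAvailable: {}".format(
        summary["budget"], summary["total"], available
    )
    ses.send_email(
        Source=sender,  # must be an address or domain verified in SES
        Destination={"ToAddresses": recipients},
        Message={
            "Subject": {"Data": "Budget status for " + period},
            "Body": {"Text": {"Data": body}},
        },
    )
    return True
```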

Final thoughts

That’s all. Now the incoming email is processed and the budget is updated after each transaction, and an email with a budget status notification is sent once a day. If you have any questions, comments or corrections, feel free to share them.