Big Data, Bigger Testing: BeaconIQ’s Approach to Real-World QA

Yasmine Rigby

Building robust and scalable products requires thorough testing, especially when dealing with large datasets. At Hamilton Robson, we’re no strangers to this challenge, particularly with BeaconIQ, our office occupancy product, which processes over 1.6 million data requests per day across 4 continents and more than 7 time zones.

What is BeaconIQ?

BeaconIQ is developed and maintained jointly by our AV and software departments. It is used to monitor occupancy within our clients’ offices. Our solution uses a range of LoRaWAN IoT devices to collect and report occupancy data for an office, which we make available to clients through our bespoke dashboard. The software team built the dashboard to visualise the data with tools such as heatmaps, charts, and detailed views of meetings, providing insights that help our clients manage their workspaces effectively.

Leveraging AWS for our cloud infrastructure, BeaconIQ uses Amazon DynamoDB to manage high volumes of sensor data and AWS Lambda for processing, ensuring scalability and performance as we continue to expand the product globally. To put that scale into perspective: over the summer, the team celebrated BeaconIQ handling over 50 million IoT data requests in a single month, averaging around 1.6 million requests per day!

With such a massive influx of data, ensuring that BeaconIQ performs as expected is critical. However, replicating real-world conditions in QA is easier said than done. Accurate testing requires realistic datasets, but connecting directly to a production database for development is too big a risk: an accidental database read or write could impact the live systems. Instead, we clone a section of the production data, anonymise it, and pull it into our QA environment, where it is safe and easy to test against.
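The anonymisation step can be as simple as replacing identifying fields with a keyed hash, so records stay distinguishable (counts and joins still work) but unreadable. The sketch below is illustrative rather than our exact implementation; the field names and secret are placeholders:

```python
import hashlib
import hmac

# Hypothetical field names; the real sensor-data schema is not shown here.
SENSITIVE_FIELDS = {"tenant_id", "room_name"}

def anonymise_item(item: dict, secret: bytes) -> dict:
    """Replace identifying fields with a short keyed hash. The same input
    always maps to the same output, so relationships between records survive."""
    out = dict(item)
    for field in SENSITIVE_FIELDS:
        if field in out:
            digest = hmac.new(secret, str(out[field]).encode(), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]
    return out

item = {"tenant_id": "acme-ltd", "room_name": "Boardroom 1", "occupancy": 7}
safe = anonymise_item(item, secret=b"rotate-me")
```

Non-identifying values such as occupancy counts pass through untouched, which is exactly what makes the cloned data useful for realistic testing.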

How We Replicate Production Data in QA

To address these challenges, we’ve developed a secure and scalable solution for replicating production data into QA. Here’s how it works:

1. Configuring cross-account access to Amazon DynamoDB: We set up cross-account access to DynamoDB resources using AWS IAM identity-based policies. This grants the necessary permissions and establishes a trust relationship between our production and QA accounts, so our Lambda function in the QA account can read data from the production DynamoDB table even though the two live in different accounts. As a visual aid, below is a multi-account architecture diagram in which Account B represents our QA account and Account A our production account. It shows cross-account access between Lambda and DynamoDB, allowing our Lambda in QA to extract the data it needs from our DynamoDB table in production.

2. Data Extraction via AWS Lambda: The Lambda function in QA reads the required data from production; the policies and permissions configured above give it the access rights to do so. Those permissions are locked down to read-only, preventing any accidental edits, inserts, or deletes! The Lambda uses a DynamoDB client to read the data and stores it in an S3 bucket in our QA account, to be loaded later into our QA DynamoDB table.

3. Storage in S3: As described above, the data extracted from production is stored in an S3 bucket created in our QA account. This is where the anonymised data is held before being loaded into our database table.

4. Loading Data into QA DynamoDB: The replicated data is then loaded into our QA DynamoDB table, and the replicated stats appear in our QA dashboard environment, ready for testing.
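The extraction side of the steps above can be sketched as a QA-side Lambda in a few dozen lines of boto3. This is a minimal illustration rather than our production code: the role ARN, table, bucket, and key names are placeholders, and the final load into the QA table is omitted.

```python
import json

def items_to_jsonl(items: list[dict]) -> str:
    """Serialise extracted items as newline-delimited JSON, ready for S3."""
    return "\n".join(json.dumps(item, default=str) for item in items)

def replicate(role_arn: str, table_name: str, bucket: str, key: str) -> None:
    """Read a production table via an assumed cross-account role and stage
    the rows in a QA-side S3 bucket. Defined but not invoked here."""
    import boto3  # bundled with the AWS Lambda Python runtime

    # Assume the read-only role exposed by the production account.
    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName="qa-replication"
    )["Credentials"]

    # Scan the production table with the temporary credentials, following
    # LastEvaluatedKey so large tables are fully paginated.
    table = boto3.resource(
        "dynamodb",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    ).Table(table_name)
    items, resp = [], table.scan()
    items.extend(resp["Items"])
    while "LastEvaluatedKey" in resp:
        resp = table.scan(ExclusiveStartKey=resp["LastEvaluatedKey"])
        items.extend(resp["Items"])

    # Stage the extract in the QA account's S3 bucket.
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=items_to_jsonl(items).encode()
    )
```

Because the assumed role carries only read permissions, a bug in this function can at worst fail to copy data; it cannot touch production.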

Our cross-account access roles grant read-only access to production data, preventing accidental modifications or deletions. Although the data we handle isn’t highly sensitive, we still want to ensure both the security and the integrity of the data we replicate.
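That read-only lock-down amounts to an identity policy on the production-side role that grants only DynamoDB read actions, paired with a trust policy letting the QA account assume the role. A sketch, with placeholder account IDs, region, and table name:

```python
# Identity policy attached to the production-side role: read actions only,
# scoped to a single table. ARNs and account IDs below are placeholders.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "dynamodb:GetItem",
            "dynamodb:BatchGetItem",
            "dynamodb:Query",
            "dynamodb:Scan",
            "dynamodb:DescribeTable",
        ],
        "Resource": "arn:aws:dynamodb:eu-west-1:111111111111:table/SensorData",
    }],
}

# Trust policy on the same role, allowing the QA account to assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::222222222222:root"},
        "Action": "sts:AssumeRole",
    }],
}
```

Because no `PutItem`, `UpdateItem`, or `DeleteItem` action appears in the policy, AWS denies writes by default, regardless of what the QA-side code attempts.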

By using replicated production data in QA, we can:

  • Validate that new features and changes work seamlessly with large datasets.
  • Ensure the dashboard remains efficient when receiving huge payloads, delivering fast and reliable insights to clients.
  • Proactively identify and fix issues such as exceeding Lambda runtimes or oversized API requests before deployment.

Replicating production data into QA has allowed us to really fine-tune this product in its testing phases, ensuring the final product is optimised for performance and scalability before it is deployed to production.

Final Thoughts

For any team working with high-scale IoT or cloud solutions, investing in production data replication for QA is invaluable. It enables realistic testing, reduces risk, and ensures a smoother production deployment process. At Hamilton Robson, this approach has been crucial in refining BeaconiQ’s ability to handle millions of daily data requests while delivering insights quickly and efficiently.

BeaconiQ’s success is a testament to the importance of rigorous testing, thoughtful data replication, and leveraging cloud tools like AWS. For any team tackling similar challenges, we recommend taking the time to replicate production data securely in QA. It might take a little extra effort upfront, but it’s a game changer for catching issues before they reach production.

LET’S TALK.

Want to find out how the subject of this blog could help your business? 

Our blended team of experts goes above and beyond for our customers, no matter the challenge. Get in touch to find out how we can work together.