Arduino String Obfuscator
This project was done in partial fulfilment of CEG5105 Cyber Security for Computer Systems. We were allowed to pick any topic related to cyber security. As someone interested in embedded systems, I chose a project on securing them.
The Problem
It's surprisingly easy to read strings from an Arduino if you have access to the bootloader. Using the avrdude utility, you can dump the contents of flash to a firmware hex file. After converting the hex file to binary, you can use the Linux strings utility to look for readable ASCII strings!
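To illustrate why this works, here is a minimal sketch of the kind of scan the strings utility performs, assuming the firmware has already been dumped into a byte buffer. The function name extract_strings and its parameters are made up for this example; the minimum run length of 4 mirrors the default of GNU strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Collect every run of printable ASCII bytes that is at least min_len long,
// similar in spirit to what the strings utility does on a binary file.
// (extract_strings is a hypothetical helper, not part of any tool.)
std::vector<std::string> extract_strings(const std::uint8_t* data, std::size_t size,
                                         std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string run;
    for (std::size_t i = 0; i < size; ++i) {
        const std::uint8_t b = data[i];
        if (b >= 0x20 && b <= 0x7E) {            // printable ASCII range
            run.push_back(static_cast<char>(b));
        } else {
            if (run.size() >= min_len) found.push_back(run);
            run.clear();                          // non-printable byte ends the run
        }
    }
    if (run.size() >= min_len) found.push_back(run);  // run that reaches the end
    return found;
}
```

Any string literal compiled into the firmware shows up as one of these runs, surrounded by machine code that rarely forms long printable sequences.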
This is also demonstrated in the repository thomasbbrunner/arduino-reverse-engineering, where the author reverse-engineered the intent of an Arduino application. This is a security issue, especially if your code has plenty of debug outputs.
The Approach
There are a few ways to solve this. The easiest is to use #ifdef preprocessor directives so that deployment builds do not contain any debug outputs. However, some applications must have strings, for example if the application parses MQTT topics.
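A minimal sketch of the #ifdef approach is shown here. The flag DEBUG_BUILD, the macro DEBUG_PRINT and the function read_sensor are names invented for this example; on a real Arduino the debug branch would more likely forward to Serial.println.

```cpp
#include <cstdio>

// DEBUG_BUILD is a hypothetical flag; define it (e.g. with -DDEBUG_BUILD)
// for development builds only.
#ifdef DEBUG_BUILD
  // On an Arduino this could forward to Serial.println(msg) instead.
  #define DEBUG_PRINT(msg) std::puts(msg)
#else
  // Expands to nothing, so the string literal is removed by the preprocessor
  // and never reaches the firmware image.
  #define DEBUG_PRINT(msg) ((void)0)
#endif

int read_sensor() {
    DEBUG_PRINT("read_sensor: sampling ADC channel 0");
    return 42;  // placeholder value standing in for a real reading
}
```

Because the macro argument is discarded before compilation in deployment builds, the debug text cannot be recovered from a flash dump.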
One Time Pad
One possible way is to use a one-time pad.
- For each String message, create a char array of the same length, cipher
- XOR the String with the char array and store it
  new_msg = message ^ cipher
- When you need the string back, XOR with the char array again
  message = new_msg ^ cipher
This means that
- If you hex dump the firmware, you will see new_msg and cipher in the firmware
- You don't see message
However, this doubles the memory used by each string, and a sufficiently smart adversary can XOR any pair of equal-length character arrays and thus recover message. It makes reverse engineering harder, but it is not very secure.
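The XOR scheme described in this section can be sketched as follows. The topic "home/temp", the pad bytes and the function names are illustrative assumptions; in practice the pad would be generated randomly and new_msg computed offline before compiling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// XOR each byte of in with the matching byte of pad. Applying this twice
// with the same pad returns the original bytes.
void xor_bytes(const std::uint8_t* in, const std::uint8_t* pad,
               std::uint8_t* out, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        out[i] = static_cast<std::uint8_t>(in[i] ^ pad[i]);
    }
}

// cipher: pad of the same length as the message (illustrative values).
const std::uint8_t CIPHER[9]  = {0xA7, 0x3C, 0x5E, 0x91, 0x0B, 0xD4, 0x62, 0xF8, 0x19};
// new_msg: the bytes of "home/temp" XORed with CIPHER, computed offline.
const std::uint8_t NEW_MSG[9] = {0xCF, 0x53, 0x33, 0xF4, 0x24, 0xA0, 0x07, 0x95, 0x69};

// message = new_msg ^ cipher, rebuilt in RAM only when it is needed.
std::string recover_message() {
    std::uint8_t plain[sizeof(NEW_MSG)];
    xor_bytes(NEW_MSG, CIPHER, plain, sizeof(NEW_MSG));
    return std::string(reinterpret_cast<const char*>(plain), sizeof(plain));
}
```

Only CIPHER and NEW_MSG are stored in the firmware; the plaintext topic exists only in the value returned by recover_message().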
AES
Another option is to use AES encryption with a single 128-bit key for all messages. This means that the only extra data stored is the 128-bit (16-byte) key, although the AES routine itself also takes up flash and ciphertexts are padded to 16-byte blocks in most modes. Even though the key sits in the dumped flash, it is hard for the attacker to tell which bytes are the key.
To make it even more challenging, we could store the key inside a larger array and drop the unused bytes before decrypting a string. This would make it slightly harder for the adversary to work out how to parse the array before it can be used as the key.
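One way this could look is sketched here, assuming a simple layout where every third byte of the stored array is real key material and the rest is filler. The layout, the constant names and extract_key are all assumptions made for illustration; any other selection rule would work the same way.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t KEY_LEN = 16;                 // AES-128 key size in bytes
constexpr std::size_t STRIDE = 3;                   // hypothetical: every 3rd byte is key material
constexpr std::size_t STORED_LEN = KEY_LEN * STRIDE;

// Rebuild the key by keeping the bytes at positions 0, STRIDE, 2*STRIDE, ...
// and dropping the filler bytes in between.
std::array<std::uint8_t, KEY_LEN> extract_key(
        const std::array<std::uint8_t, STORED_LEN>& stored) {
    std::array<std::uint8_t, KEY_LEN> key{};
    for (std::size_t i = 0; i < KEY_LEN; ++i) {
        key[i] = stored[i * STRIDE];
    }
    return key;
}
```

An attacker who finds the 48-byte array still has to recover the selection rule from the code before the 16 key bytes can be used.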
We could also consider using hardware to improve security, for instance by storing the AES key in a secure element.
Conclusion
Even the AES method is not foolproof. An adversary could use Ghidra to analyse the dumped firmware and figure out which string is being used the most, or locate the decryption function. However, that is much more challenging than dumping the binary and extracting readable ASCII strings.
It's important to note that no system is infallible; it is only a question of the resources an attack needs. Hence, we need to make assumptions about what the adversary can or cannot do. By implementing string obfuscation, we effectively raise the resources needed to reverse engineer our embedded system.
This project hence demonstrates a relatively simple way to stop adversaries from reading the strings stored in an embedded system, which is the easiest way (in my humble opinion) of reverse engineering firmware. If I were working on a secure project, I would implement some CI/CD pipelines to ensure sensitive strings are not stored in plaintext.
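As a rough idea of what such a pipeline check could do, the sketch below searches a firmware image for strings that must never appear in plaintext. The function find_leaks and the way the firmware and the list of sensitive strings are supplied are assumptions for illustration; a real pipeline would read the built binary from disk and fail the build when the result is not empty.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Return every entry of sensitive that appears byte-for-byte in the firmware.
// (find_leaks is a hypothetical helper for a CI step, not an existing tool.)
std::vector<std::string> find_leaks(const std::vector<std::uint8_t>& firmware,
                                    const std::vector<std::string>& sensitive) {
    std::vector<std::string> leaks;
    for (const std::string& s : sensitive) {
        if (s.empty()) continue;  // an empty pattern would match everywhere
        auto it = std::search(
            firmware.begin(), firmware.end(), s.begin(), s.end(),
            [](std::uint8_t a, char b) { return a == static_cast<std::uint8_t>(b); });
        if (it != firmware.end()) leaks.push_back(s);
    }
    return leaks;
}
```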
