Malware has long used Unicode to hide or obfuscate URLs, filenames, scripts, and so on. The Right-to-Left Override character (e2 80 ae, U+202E) is a classic. This post shares a PoC in which a shellcode is hidden, encoded into a string in a Python script (this would probably work with other languages too), using invisible Unicode characters that most text editors do not display.
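The idea is quite simple. We will choose three "invisible" Unicode characters (shown as their UTF-8 bytes):
e2 80 8b (U+200B, zero width space) : bit 0
e2 80 8c (U+200C, zero width non-joiner) : bit 1
e2 80 8d (U+200D, zero width joiner) : delimiter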
With these three characters, a potentially malicious script can be encoded bit by bit:
(delimiter e2 80 8d) ..... encoded script (bit 0 as e2 80 8b, bit 1 as e2 80 8c) ..... (delimiter e2 80 8d)
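As an illustration of the layout above, here is a minimal sketch of the encoding step. It is separate from the script used for the PoC: the function name `encode_invisible`, the 8-bits-per-byte framing and the most-significant-bit-first order are assumptions, since the scheme only fixes which character stands for which bit.

```python
# Illustrative sketch only; names and bit order are assumptions.
ZERO = "\u200b"   # UTF-8: e2 80 8b, stands for bit 0
ONE = "\u200c"    # UTF-8: e2 80 8c, stands for bit 1
DELIM = "\u200d"  # UTF-8: e2 80 8d, marks start and end of the payload


def encode_invisible(payload: bytes) -> str:
    """Map every bit of payload to a zero-width character, wrapped in delimiters."""
    # Assumed framing: 8 bits per byte, most significant bit first.
    bits = "".join(f"{byte:08b}" for byte in payload)
    body = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return DELIM + body + DELIM


hidden = encode_invisible(b"hi")
carrier = "hello" + hidden + " world"

print(carrier)                        # most editors and terminals show: hello world
print(len(hidden))                    # 2 delimiters + 16 bit characters
print(carrier.encode("utf-8").hex())  # the e2808b / e2808c / e2808d runs are visible here
```

Each payload bit costs three bytes in UTF-8, so the carrier file grows by roughly 24 times the payload size.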
I have used this simple script to encode the malicious script:
Now we can embed these encoded "invisible" Unicode characters into a string. The following source code looks like a simple hello world:
However, if you download the file and open it with a hexadecimal editor, you can see all the encoded information that is part of the hello world string:
Most of the text editors that I tested did not display the Unicode characters: Visual Studio, Geany, Sublime, Notepad, browsers, etc.
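Since editors hide these characters, a small scanner is one way to make them visible without a hex editor. The following is a minimal sketch, not part of the PoC: the function name `find_invisible` is hypothetical, and flagging every character in the Unicode "Cf" (format) category is an assumption about what is worth reporting. That category covers the three characters used here as well as the Right-to-Left Override, but also legitimate uses such as joiners in emoji sequences.

```python
import unicodedata


def find_invisible(text: str):
    """Return (line, column, code point) for every Unicode format character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Cf" includes U+200B, U+200C, U+200D and U+202E.
            if unicodedata.category(ch) == "Cf":
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical sample: a hello world line with four hidden characters.
sample = 'print("hello\u200d\u200b\u200c\u200d world")\n'
for lineno, col, codepoint in find_invisible(sample):
    print(f"line {lineno}, column {col}: {codepoint}")
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to the function.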
The following script decodes a secondary, potentially malicious Python script from the invisible Unicode characters and executes it (the PoC script only executes calc):
And the following script decodes an x64 shellcode (the shellcode executes calc) from the invisible Unicode characters, loads it with VirtualAlloc+WriteProtectMemory, and calls CreateThread to execute it:
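The decoding half on its own can be sketched as follows. This is an illustration rather than the PoC script: the name `decode_invisible` is hypothetical, it assumes the same 8-bits-per-byte, most-significant-bit-first framing as an encoder would have to use, and it stops at recovering the bytes.

```python
# Illustrative sketch only; names and bit order are assumptions.
ZERO, ONE, DELIM = "\u200b", "\u200c", "\u200d"


def decode_invisible(carrier: str) -> bytes:
    """Recover the bytes hidden between the first two delimiters of carrier."""
    # str.index raises ValueError if a delimiter is missing.
    start = carrier.index(DELIM) + 1
    end = carrier.index(DELIM, start)
    # Keep only the two bit characters and turn them back into '0' / '1'.
    bits = "".join(
        "1" if ch == ONE else "0"
        for ch in carrier[start:end]
        if ch in (ZERO, ONE)
    )
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


# Hypothetical carrier hiding the two bytes b"hi" (0x68 0x69).
hidden_bits = "0110100001101001"
hidden = "".join(ONE if bit == "1" else ZERO for bit in hidden_bits)
carrier = "hello" + DELIM + hidden + DELIM + " world"

print(decode_invisible(carrier))
```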
The previous scripts are quite obvious and suspicious, but if the encoded malicious script and these lines are mixed into longer, more complicated source code, it would probably be harder to notice that the script contains malicious code. So be careful when you download your favorite exploits! ;)
I have not tested it, but this will probably work with other languages. Visual Studio, for example, doesn't show these characters in C source code.