Polymorphic x86/x64 Block API #13832

Merged: bwatters-r7 merged 10 commits into rapid7:6.x from feat/poly-blk-api on Jul 9, 2020

zeroSteiner
Contributor

@zeroSteiner zeroSteiner commented Jul 8, 2020

This PR adds the infrastructure needed to load and process polymorphic assembly stubs from external data files, and uses it to dynamically reorder the instructions of the Block API stub that powers all of the x86 and x64 native Windows payloads. The intention is to add polymorphic qualities to the payloads generated by Metasploit without either changing their size or requiring that they be self-modifying and thus reside in RWX memory, which in and of itself is often identified as malicious. Each time a payload is generated, the common Block API section is now randomized, which makes writing signatures to match it much more complicated. This is done independently of encoding, and therefore the benefits are present even without encoding.
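As a rough illustration of the reordering idea (not the code in this PR), instruction shuffling under ordering constraints can be pictured as a randomized topological sort. The function name, the sample instructions, and the constraint list below are all hypothetical, and the sketch assumes each constraint simply means "this instruction must come before that one".

```python
import random


def random_topological_order(nodes, edges, rng=random):
    """Return a random ordering of `nodes` that keeps every
    (before, after) pair in `edges` in that relative order."""
    successors = {n: [] for n in nodes}
    pending = {n: 0 for n in nodes}  # unmet "must come first" constraints
    for before, after in edges:
        successors[before].append(after)
        pending[after] += 1

    # Instructions with no unmet constraints may be emitted at any time.
    ready = [n for n in nodes if pending[n] == 0]
    order = []
    while ready:
        node = ready.pop(rng.randrange(len(ready)))
        order.append(node)
        for nxt in successors[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("constraint graph contains a cycle")
    return order


# Hypothetical basic block: the first two instructions are independent,
# so they may be emitted in either order.
block = ["xor eax, eax", "mov ebx, 0x10", "add eax, ebx", "push eax"]
constraints = [
    ("xor eax, eax", "add eax, ebx"),
    ("mov ebx, 0x10", "add eax, ebx"),
    ("add eax, ebx", "push eax"),
]
print(random_topological_order(block, constraints))
```

Picking uniformly among the ready instructions is the simplest choice, but it does not necessarily sample every valid ordering with equal probability.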

I also consolidated the x86 Block API stubs, removing the duplicates from util/exe.rb so that they reap the same benefits. Those duplicates, however, used a slightly modified version that allows for full 32-bit jumps, which makes the stub slightly larger. The x64 stub was left unmodified. If the size increase is unacceptable, I'll need to re-add the older version in something like block_api_small.x86.graphml. See the "x86 Small" column below for a metric comparison on this scenario.

Known Limitations

There are two known limitations. First, the only change made is to the ordering of instructions. New instructions are not added at this time in order to maintain the desirable small size; however, if size restrictions could be propagated down, this would be an easy area for improvement. Second, the instructions themselves are not altered, for example to use different registers for storage.

Even with these limitations in place, we still see:

| | x86 | x86 Small [1] | x64 |
|---|---|---|---|
| Total Instructions | 67 | 61 | 73 |
| Possible Permutations | 17,280 | 4,800 | 216,000 |
| Size | 140 bytes (up from 130) | 130 bytes (unchanged) | 200 bytes (unchanged) |

[1] These are metrics for the hypothetical smaller version of the x86 Block API. Due to the nature of how the algorithm works, the more optimized the assembly code is, the fewer opportunities there are to shuffle the instructions while retaining the original functionality.
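To make the relationship between constraints and permutation counts concrete, here is a small sketch that counts the valid orderings of a block by brute force over bitmasks. It assumes that the overall figure is the product of the per-block counts; the PR does not spell out how crimson-forge computes it, and `count_orderings` and the toy blocks are hypothetical.

```python
from functools import lru_cache
from math import prod


def count_orderings(n, edges):
    """Count orderings of instructions 0..n-1 that keep every
    (before, after) pair in `edges` in that relative order."""
    required = [0] * n  # required[i]: bitmask of instructions that must precede i
    for before, after in edges:
        required[after] |= 1 << before

    @lru_cache(maxsize=None)
    def ways(placed):
        if placed == (1 << n) - 1:
            return 1  # every instruction has been placed
        total = 0
        for i in range(n):
            bit = 1 << i
            already_placed = (placed & bit) != 0
            prerequisites_met = (required[i] & placed) == required[i]
            if not already_placed and prerequisites_met:
                total += ways(placed | bit)
        return total

    return ways(0)


# Three toy blocks of four instructions each (hypothetical constraints).
diamond = count_orderings(4, [(0, 2), (1, 2), (2, 3)])  # two independent loads
free = count_orderings(4, [])                           # no constraints at all
chain = count_orderings(4, [(0, 1), (1, 2), (2, 3)])    # fully serialized
total = prod([diamond, free, chain])
print(diamond, free, chain, total)  # 2 24 1 48
```

The fully serialized block contributes a factor of 1, which is why tightly optimized code leaves fewer orderings to choose from.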

Future Work

With the data loaded in the graph, there's an opportunity for future work to inject NOPs and generally deoptimize instructions. By splitting instructions into two and injecting them into the graph with the appropriate constraints, the build process would allow them to be reconstructed without necessarily being placed sequentially. This would require working more with the assembly source code and thus coupling the logic more tightly with the x86 and AMD64 architectures.

GraphML?

Data is loaded from a GraphML file that is precalculated so it is unnecessary for Metasploit to perform the binary analysis to determine the positional constraints for the instructions. I chose to use GraphML because:

  • The native data is structured as a graph of instruction nodes connected by edges representing constraints on their placement within their respective basic blocks.
  • The specification natively supports arbitrary attributes, allowing metadata such as an instruction's source code to be stored.
  • It is an easily parsed XML-based format (as opposed to something like GraphViz).
  • It's a standard with documentation that I don't have to write.

The new GraphML parser library I wrote follows the GraphML specification for all of the implemented elements. Hyperedges and ports are omitted from the current implementation because they are not necessary at this time.
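For readers unfamiliar with GraphML, the following sketch shows how little is needed to read nodes, edges, and node attributes from such a file. It is written in Python with the standard library and is not the Ruby parser added by this PR. The `key`, `graph`, `node`, `edge`, and `data` elements and the `attr.name` attribute come from the GraphML specification, while the sample document, the `source` attribute name, and `load_graph` are assumptions made for illustration and do not reflect the schema of the shipped data files.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical sample: three instructions, two ordering constraints.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="source" attr.type="string"/>
  <graph id="block0" edgedefault="directed">
    <node id="n0"><data key="d0">xor eax, eax</data></node>
    <node id="n1"><data key="d0">mov ebx, 0x10</data></node>
    <node id="n2"><data key="d0">add eax, ebx</data></node>
    <edge source="n0" target="n2"/>
    <edge source="n1" target="n2"/>
  </graph>
</graphml>
"""


def load_graph(text):
    """Return ({node_id: {attr_name: value}}, [(source_id, target_id), ...])."""
    root = ET.fromstring(text)
    # <key> elements map short ids such as "d0" to attribute names.
    names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    graph = root.find("g:graph", NS)
    nodes = {}
    for node in graph.findall("g:node", NS):
        attrs = {names[d.get("key")]: d.text for d in node.findall("g:data", NS)}
        nodes[node.get("id")] = attrs
    edges = [(e.get("source"), e.get("target")) for e in graph.findall("g:edge", NS)]
    return nodes, edges


nodes, edges = load_graph(SAMPLE)
for source, target in edges:
    print(nodes[source]["source"], "->", nodes[target]["source"])
```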

It's not realistic to create the necessary GraphML data files by hand. I used the tools/analysis/graph.py utility included within my crimson-forge project to generate it. The gist of this process was:

  1. Assemble the Block API source code from external/source/shellcode/windows/(x64|x86)/src/block/block_api.asm into a binary file (either use nasm or the tools/assembler.py tool in crimson-forge)
  2. Generate the graph data with tools/analysis/graph.py -a x86 block_api.bin graphml block_api.x86.graphml

For posterity, this is the build automation script I used:

GraphML Data Generation

Script

Note the source files in the home directory and the use of amd64 instead of x64.

#!/bin/bash
set -e
set -x

for arch in amd64 x86; do
  printf "\e[1;34m[*]\e[0m Building files for $arch\n"
  tools/assembler.py      -a $arch ~/block_api.$arch.asm ~/block_api.$arch.bin
  tools/analysis/graph.py -a $arch ~/block_api.$arch.bin graphml ~/block_api.$arch.graphml
  cp ~/block_api.$arch.graphml ~/Repositories/metasploit-framework/data/shellcode/
done

mv ~/Repositories/metasploit-framework/data/shellcode/block_api.amd64.graphml \
   ~/Repositories/metasploit-framework/data/shellcode/block_api.x64.graphml

Output

+ for arch in amd64 x86
+ printf '\e[1;34m[*]\e[0m Building files for amd64\n'
[*] Building files for amd64
+ tools/assembler.py -a amd64 ~/block_api.amd64.asm ~/block_api.amd64.bin
[*] raw output hash (SHA-256): a6a612c9352ae0f6a6b1933a791748971539ea4b383a7d77ca023b83aeb7fc54
+ tools/analysis/graph.py -a amd64 ~/block_api.amd64.bin graphml ~/block_api.amd64.graphml
[*] Crimson-Forge Engine: v0.4.0
[*] Architecture set as: amd64
[*] Input hash (SHA-256): a6a612c9352ae0f6a6b1933a791748971539ea4b383a7d77ca023b83aeb7fc54
[*] Using analysis profile: shellcode (auto-detected)
[*] Total blocks: 15
[*]     basic:    15
[*]     data:     0
[*] Total instructions: 73
[*] Possible permutations: 216,000
[*] Randomization potential score: 0.05049
[*] Completed in 13.000 seconds
+ cp ~/block_api.amd64.graphml ~/Repositories/metasploit-framework/data/shellcode/
+ for arch in amd64 x86
+ printf '\e[1;34m[*]\e[0m Building files for x86\n'
[*] Building files for x86
+ tools/assembler.py -a x86 ~/block_api.x86.asm ~/block_api.x86.bin
[*] raw output hash (SHA-256): f5c3467abfdb0747664b3f38d0120607cfaf8a875bfe8613cd45297dbeecc8f6
+ tools/analysis/graph.py -a x86 ~/block_api.x86.bin graphml ~/block_api.x86.graphml
[*] Crimson-Forge Engine: v0.4.0
[*] Architecture set as: x86
[*] Input hash (SHA-256): f5c3467abfdb0747664b3f38d0120607cfaf8a875bfe8613cd45297dbeecc8f6
[*] Using analysis profile: shellcode (auto-detected)
[!] Analysis failed while identifying tainted self-references
[!] Encountered address that does not correlate to a basic-block at 0xdfdfff88
[*] Total blocks: 14
[*]     basic:    14
[*]     data:     0
[*] Total instructions: 67
[*] Possible permutations: 17,280
[*] Randomization potential score: 0.04481
[*] Completed in 4.000 seconds
+ cp ~/block_api.x86.graphml ~/Repositories/metasploit-framework/data/shellcode/
+ mv ~/Repositories/metasploit-framework/data/shellcode/block_api.amd64.graphml ~/Repositories/metasploit-framework/data/shellcode/block_api.x64.graphml

This approach uses data files rather than the existing Rex::Polymorphic library (which Shikata Ga Nai uses) because the Rex library targets binary code as opposed to assembly source. When I looked into updating it to work with both x86 and x64 source code, the alterations seemed non-trivial. Furthermore, had the necessary alterations been made, we would then have needed to define each stub independently in Ruby in such a way that updating it would be more complicated than generating a new data file.

Assembly Source Files

Since I consolidated the Block API usage within Metasploit as part of this work, I updated the original asm files because they are now the authoritative source. This mostly altered the x86 version to be the larger Block API variant that works in more scenarios. One self-serving change I had to make was to update all numeric literals to their hex representation. This doesn't change anything for nasm or metasm; however, it was necessary because the keystone engine used by crimson-forge assumes that numeric literals are base-16 instead of decimal.
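The PR does not say how the literals were converted, so the following is only a sketch of one way such a rewrite could be automated. `hexify_line` is a hypothetical helper that rewrites bare decimal literals as 0x-prefixed hex while leaving comments, register names such as r8, and existing hex literals alone. It does not handle digits inside quoted strings.

```python
import re

# A run of digits that is not glued to an identifier, register name,
# or an existing 0x-prefixed literal.
_DECIMAL = re.compile(r"(?<![\w.$])\d+(?!\w)")


def hexify_line(line):
    """Rewrite bare decimal literals in one line of assembly as hex."""
    # Leave everything after the first ';' (the comment) untouched.
    code, sep, comment = line.partition(";")
    code = _DECIMAL.sub(lambda m: hex(int(m.group())), code)
    return code + sep + comment


for sample in ("  ror edi, 13 ; rotate right by 13",
               "mov r8, [rsi+60]",
               "add eax, 0x10"):
    print(hexify_line(sample))
```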

Testing

There are three primary things that need to be tested: x86 native Windows payloads, x64 native Windows payloads, and 32-bit EXEs generated using msfvenom / generate -f exe ...

  • Start msfconsole
    • use exploits/windows/smb/psexec
    • Set all the options appropriately
    • Run it with payloads for each of the x64 and x86 block_api, repeat multiple times
      • x64: set PAYLOAD windows/x64/meterpreter/reverse_tcp
      • x86: set PAYLOAD windows/meterpreter/reverse_tcp
    • Test the exe generation (x86 only), repeat multiple times
      • use payload/windows/meterpreter/reverse_tcp
      • Set the options appropriately
      • Create an EXE using generate -f exe -o /path/to/your/output.exe
      • Start a handler with to_handler
      • Run the EXE payload and get a shell

Thanks for reading!

This changes the x86 version to the (10 bytes) larger variant that can
handle full 32-bit jumps which is necessary for maximum compatibility
within the framework.

Additionally, numeric literals are expressed in hex for compatibility
with the keystone assembler allowing these files to be compatible with
external tools.
@zeroSteiner zeroSteiner added the library, payload, enhancement, and msf6 labels Jul 8, 2020
@bwatters-r7 bwatters-r7 self-assigned this Jul 8, 2020
@zeroSteiner
Contributor Author

Unit tests appear to be failing due to my use of Ruby's Array#filter method, which is unavailable in Ruby 2.5.6. I'll be able to update this tomorrow.

@wvu
Contributor

wvu commented Jul 9, 2020

Data is loaded from a GraphML file that is precalculated so it is unnecessary for Metasploit to perform the binary analysis to determine the positional constraints for the instructions.

Dude, YES.

Contributor

@wvu wvu left a comment

The changes look sensible to me. I was able to reference Crimson Forge easily. I have yet to test this, of course.

@bwatters-r7
Contributor

FWIW on testing..... I'm mostly hitting windows/[x64/]meterpreter/reverse_tcp as the payload. There was one failed test in each suite.... not terrifically concerned about it.
After this gets landed, the full testing will pick it up for all supported payloads.

venom/generate payloads

windows/meterpreter/reverse_tcp

(Not super concerned with that one failure....)
image

windows/x64/meterpreter/reverse_tcp

image

Working through psexec tests currently

@bwatters-r7
Contributor

psexec testing....

I need to redo tests after the latest push, but here's a working test....
The two Win8.1 failures are likely due to the fact these are not patched and still speak SMBv2/3 incorrectly.
image

@zeroSteiner
Contributor Author

The latest commit was strictly comment changes in the x64 source code, so nothing executable should have been changed. I think you can safely skip restarting all of the tests.

@bwatters-r7
Contributor

Just to round everything out, windows/x64/meterpreter/reverse_tcp:
image

@bwatters-r7 bwatters-r7 merged commit 24bf14b into rapid7:6.x Jul 9, 2020
@bwatters-r7
Contributor

bwatters-r7 commented Jul 9, 2020

Original Release Notes

This PR adds the infrastructure needed to load and process polymorphic assembly stubs from external data files, and uses it to dynamically reorder the instructions of the Block API stub that powers all of the x86 and x64 native Windows payloads. The intention is to add polymorphic qualities to the payloads generated by Metasploit without either changing their size or requiring that they be self-modifying and thus reside in RWX memory, which in and of itself is often identified as malicious. Each time a payload is generated, the common Block API section is now randomized, which makes writing signatures to match it much more complicated. This is done independently of encoding, and therefore the benefits are present even without encoding.

@OJ
Contributor

OJ commented Jul 10, 2020

This is awesome stuff Spencer. Love ya work mate 👍

@adfoster-r7 adfoster-r7 added the rn-enhancement label Aug 6, 2020
@pbarry-r7
Contributor

Release Notes

Improved Metasploit-generated x86 and x64 native Windows payloads by adding polymorphic qualities without changing the size or requiring that the payloads be self-modifying.

@zeroSteiner zeroSteiner deleted the feat/poly-blk-api branch February 23, 2021 17:11
Labels: enhancement, library, msf6, payload, rn-enhancement
8 participants