EVM bytecode to Graph using EtherSolve
Introduction
Smart contracts are self-executing programs that run on a blockchain, such as Ethereum. The most notable feature of a smart contract is that the program and all its transactions are immutable. Once written to the distributed blockchain, they cannot be updated, even in case of programming defects identified after deployment.
In this article, I will show how to transform EVM bytecode into a graph using EtherSolve, a tool created by F. Contro et al.
Challenges of extracting a CFG from EVM bytecode
The Ethereum blockchain provides direct access to the bytecode of deployed contracts. The bytecode is easy to parse, but because of some language design decisions, its semantics and control flow graph (CFG) are challenging to reconstruct.
To begin with, the JUMP destination is not an opcode argument. A jump opcode assumes that the destination address, computed dynamically by the preceding code, is already on the stack.
Secondly, there is no opcode for returning from a function. A return is implemented by pushing the return address onto the stack and later jumping to it.
Thirdly, the compiler eliminates functions. Function calls within the contract are replaced by jumps. A dispatcher at the smart contract entry point handles inter-contract function calls, determining which address to jump to based on the call’s specific arguments.
Finally, the smart contract constructor runs only once, during the contract’s initial deployment on the blockchain, and is then discarded. It performs the initial operations and deploys the runtime code on the blockchain. The blockchain stores no bytecode for the constructor.
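To make this concrete, here is a minimal sketch of a disassembler applied to the first 13 bytes of the runtime bytecode used later in this article. The opcode table is deliberately incomplete (it covers only the opcodes in this excerpt), and the function name `disassemble` is my own choice for illustration, not part of EtherSolve. Note that the PUSH opcodes carry an operand in the bytecode, while JUMPI carries none: its target 0x00af is whatever the preceding PUSH2 left on the stack.

```python
# Minimal opcode table: only the non-PUSH opcodes that occur in this excerpt.
OPCODES = {
    0x10: "LT",
    0x36: "CALLDATASIZE",
    0x52: "MSTORE",
    0x57: "JUMPI",
}


def disassemble(code: bytes) -> list:
    """Return one 'offset: MNEMONIC [operand]' line per instruction."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are the only opcodes followed by inline data.
            size = op - 0x5F
            operand = code[pc + 1 : pc + 1 + size]
            lines.append(f"{pc}: PUSH{size} 0x{operand.hex()}")
            pc += 1 + size
        else:
            name = OPCODES.get(op, f"UNKNOWN_0x{op:02x}")
            lines.append(f"{pc}: {name}")
            pc += 1
    return lines


# First 13 bytes of the runtime bytecode shown later in the article.
entry_block = bytes.fromhex("6080604052600436106100af57")
listing = disassemble(entry_block)
print("\n".join(listing))
```

The printed listing should match the instructions of block 0 in the .dot excerpt at the end of the article.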
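The call and return pattern can be illustrated with a toy interpreter. This is only a sketch: it supports four opcodes (STOP, PUSH1, JUMP, JUMPDEST), and the nine-byte program is hand-written for this example rather than taken from a real contract. The "caller" pushes a return address and a function address, then jumps; the "function" returns by executing a plain JUMP that consumes the return address left on the stack.

```python
def trace(code: bytes, max_steps: int = 100) -> list:
    """Execute a tiny EVM subset and return the offsets visited, in order."""
    stack = []
    visited = []
    pc = 0
    for _ in range(max_steps):
        op = code[pc]
        visited.append(pc)
        if op == 0x00:        # STOP
            break
        elif op == 0x60:      # PUSH1: the next byte is the operand
            stack.append(code[pc + 1])
            pc += 2
        elif op == 0x56:      # JUMP: the destination comes from the stack
            dest = stack.pop()
            if code[dest] != 0x5B:
                raise ValueError(f"invalid jump destination {dest}")
            pc = dest
        elif op == 0x5B:      # JUMPDEST: a marker, no effect
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return visited


# Hand-written program (hypothetical, for illustration only):
# 0: PUSH1 0x05   return address
# 2: PUSH1 0x07   function entry
# 4: JUMP         "call"
# 5: JUMPDEST     execution resumes here after the "return"
# 6: STOP
# 7: JUMPDEST     function body
# 8: JUMP         "return": jumps to the address the caller pushed
PROGRAM = bytes.fromhex("600560075b005b56".replace("075b", "07565b", 1))
visited = trace(PROGRAM)
print(visited)
```

Both the call at offset 4 and the return at offset 8 are the same JUMP opcode, which is why a CFG builder has to track stack contents to tell where each one leads.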
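As a rough sketch of what the dispatcher looks like at the byte level, the snippet below scans for the instruction sequence PUSH4 selector, EQ, PUSH2 destination, JUMPI. This pattern appears in the bytecode used later in this article, but it is an assumption that other compilers or compiler versions emit the same shape, and the helper name `find_dispatch_entries` is hypothetical. The input is a short fragment copied from that bytecode's dispatcher.

```python
def find_dispatch_entries(code: bytes) -> list:
    """Return (selector_hex, jump_target) pairs found in a dispatcher."""
    entries = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        # Look for: PUSH4 <4 bytes> EQ PUSH2 <2 bytes> JUMPI
        if (
            op == 0x63
            and pc + 9 < len(code)
            and code[pc + 5] == 0x14   # EQ
            and code[pc + 6] == 0x61   # PUSH2
            and code[pc + 9] == 0x57   # JUMPI
        ):
            selector = code[pc + 1 : pc + 5].hex()
            target = int.from_bytes(code[pc + 7 : pc + 9], "big")
            entries.append((selector, target))
        if 0x60 <= op <= 0x7F:
            # Skip PUSH data so operands are never misread as opcodes.
            pc += op - 0x5F
        pc += 1
    return entries


# Fragment of the dispatcher from the bytecode used in this article.
fragment = bytes.fromhex("806306fdde03146100b4578063095ea7b31461014457")
entries = find_dispatch_entries(fragment)
for selector, target in entries:
    print(f"selector 0x{selector} -> block at offset {target}")
```

In the ERC-20 interface, 0x06fdde03 and 0x095ea7b3 are the selectors of name() and approve(address,uint256), so each pair maps one externally callable function to the offset where its code begins.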
Generating CFG from EVM bytecode
The quickest way to generate a CFG from EVM bytecode is to use EtherSolve.jar. You can download it from the GitHub page of EtherSolve.
curl -O -L https://github.com/SeUniVr/EtherSolve/raw/main/artifact/EtherSolve.jar
After the download, make sure you have Java installed on your machine.
java --version
Now, we must put the bytecode of the smart contract in a file named bytecode.evm. You could use the following command.
echo "0x6080604052600436106100af576000357c0100000000000000000000000000000000000000000000000000000000900463ffffffff16806306fdde03146100b4578063095ea7b31461014457806318160ddd146101a957806323b872dd146101d457806327e235e314610259578063313ce567146102b05780635c658165146102e157806370a082311461035857806395d89b41146103af578063a9059cbb1461043f578063dd62ed3e146104a4575b600080fd5b3480156100c057600080fd5b506100c961051b565b6040518080602001828103825283818151815260200191508051906020019080838360005b838110156101095780820151818401526020810190506100ee565b50505050905090810190601f1680156101365780820380516001836020036101000a031916815260200191505b509250505060405180910390f35b34801561015057600080fd5b5061018f600480360381019080803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803590602001909291905050506105b9565b604051808215151515815260200191505060405180910390f35b3480156101b557600080fd5b506101be6106ab565b6040518082815260200191505060405180910390f35b3480156101e057600080fd5b5061023f600480360381019080803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803590602001909291905050506106b1565b604051808215151515815260200191505060405180910390f35b34801561026557600080fd5b5061029a600480360381019080803573ffffffffffffffffffffffffffffffffffffffff16906020019092919050505061094b565b6040518082815260200191505060405180910390f35b3480156102bc57600080fd5b506102c5610963565b604051808260ff1660ff16815260200191505060405180910390f35b3480156102ed57600080fd5b50610342600480360381019080803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803573ffffffffffffffffffffffffffffffffffffffff169060200190929190505050610976565b6040518082815260200191505060405180910390f35b34801561036457600080fd5b50610399600480360381019080803573ffffffffffffffffffffffffffffffffffffffff16906020019092919050505061099b565b6040518082815260200191505060405180910390f35b3480156103bb57600080fd5b506103c46109e4565b60405180806020018281038252838181518152602001915080519060200190
80838360005b838110156104045780820151818401526020810190506103e9565b50505050905090810190601f1680156104315780820380516001836020036101000a031916815260200191505b509250505060405180910390f35b34801561044b57600080fd5b5061048a600480360381019080803573ffffffffffffffffffffffffffffffffffffffff16906020019092919080359060200190929190505050610a82565b604051808215151515815260200191505060405180910390f35b3480156104b057600080fd5b50610505600480360381019080803573ffffffffffffffffffffffffffffffffffffffff169060200190929190803573ffffffffffffffffffffffffffffffffffffffff169060200190929190505050610bdb565b6040518082815260200191505060405180910390f35b60038054600181600116156101000203166002900480601f0160208091040260200160405190810160405280929190818152602001828054600181600116156101000203166002900480156105b15780601f10610586576101008083540402835291602001916105b1565b820191906000526020600020905b81548152906001019060200180831161059457829003601f168201915b505050505081565b600081600260003373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002060008573ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020819055508273ffffffffffffffffffffffffffffffffffffffff163373ffffffffffffffffffffffffffffffffffffffff167f8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925846040518082815260200191505060405180910390a36001905092915050565b60005481565b600080600260008673ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002060003373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002054905082600160008773ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002054101580156107825750828110155b151561078d57600080fd5b82600160008673ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160
00206000828254019250508190555082600160008773ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020600082825403925050819055507fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff8110156108da5782600260008773ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002060003373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020600082825403925050819055505b8373ffffffffffffffffffffffffffffffffffffffff168573ffffffffffffffffffffffffffffffffffffffff167fddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef856040518082815260200191505060405180910390a360019150509392505050565b60016020528060005260406000206000915090505481565b600460009054906101000a900460ff1681565b6002602052816000526040600020602052806000526040600020600091509150505481565b6000600160008373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020549050919050565b60058054600181600116156101000203166002900480601f016020809104026020016040519081016040528092919081815260200182805460018160011615610100020316600290048015610a7a5780601f10610a4f57610100808354040283529160200191610a7a565b820191906000526020600020905b815481529060010190602001808311610a5d57829003601f168201915b505050505081565b600081600160003373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000205410151515610ad257600080fd5b81600160003373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff1681526020019081526020016000206000828254039250508190555081600160008573ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020600082825401925050819055508273ffffffffffffffffffffffffffffffffffffffff163373ffffffffffffffffffffffffffffffffffffffff167fddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a116
28f55a4df523b3ef846040518082815260200191505060405180910390a36001905092915050565b6000600260008473ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff16815260200190815260200160002060008373ffffffffffffffffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600020549050929150505600a165627a7a7230582076ce5edca0cb79a34f44093c5ee1fcd5fadde8c358ba33d53c447da4f1d4e7ef0029" >> bytecode.evm
The following command generates the CFG from the bytecode in HTML format. The HTML file is created in the directory where you run the command, and you can open it in a browser. The -r option specifies that bytecode.evm contains runtime code, that is, the code that executes on the blockchain, without the constructor.
java -jar EtherSolve.jar -H -r bytecode.evm
If you want to create a .dot file instead, replace the -H option with -d in the command above.
java -jar EtherSolve.jar -d -r bytecode.evm
Final result
The CFG generated from the EVM bytecode specified above looks as follows. Below you can find an excerpt of the .dot file and of the svg file.
digraph G {
bgcolor=transparent rankdir=UD;
node [shape=box style=filled color=black fillcolor=white fontname=arial fontcolor=black];
1087 [label="1087: JUMPDEST\l1088: CALLVALUE\l1089: DUP1\l1090: ISZERO\l1091: PUSH2 0x044b\l1094: JUMPI\l" fillcolor=lemonchiffon ];
575 [label="575: JUMPDEST\l576: PUSH1 0x40\l578: MLOAD\l579: DUP1\l580: DUP3\l581: ISZERO\l582: ISZERO\l583: ISZERO\l584: ISZERO\l585: DUP2\l586: MSTORE\l587: PUSH1 0x20\l589: ADD\l590: SWAP2\l591: POP\l592: POP\l593: PUSH1 0x40\l595: MLOAD\l596: DUP1\l597: SWAP2\l598: SUB\l599: SWAP1\l600: RETURN\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
480 [label="480: JUMPDEST\l481: POP\l482: PUSH2 0x023f\l485: PUSH1 0x04\l487: DUP1\l488: CALLDATASIZE\l489: SUB\l490: DUP2\l491: ADD\l492: SWAP1\l493: DUP1\l494: DUP1\l495: CALLDATALOAD\l496: PUSH20 0xffffffffffffffffffffffffffffffffffffffff\l517: AND\l518: SWAP1\l519: PUSH1 0x20\l521: ADD\l522: SWAP1\l523: SWAP3\l524: SWAP2\l525: SWAP1\l526: DUP1\l527: CALLDATALOAD\l528: PUSH20 0xffffffffffffffffffffffffffffffffffffffff\l549: AND\l550: SWAP1\l551: PUSH1 0x20\l553: ADD\l554: SWAP1\l555: SWAP3\l556: SWAP2\l557: SWAP1\l558: DUP1\l559: CALLDATALOAD\l560: SWAP1\l561: PUSH1 0x20\l563: ADD\l564: SWAP1\l565: SWAP3\l566: SWAP2\l567: SWAP1\l568: POP\l569: POP\l570: POP\l571: PUSH2 0x06b1\l574: JUMP\l" fillcolor=lemonchiffon ];
0 [label="0: PUSH1 0x80\l2: PUSH1 0x40\l4: MSTORE\l5: PUSH1 0x04\l7: CALLDATASIZE\l8: LT\l9: PUSH2 0x00af\l12: JUMPI\l" fillcolor=lemonchiffon shape=Msquare fillcolor=gold ];
446 [label="446: JUMPDEST\l447: PUSH1 0x40\l449: MLOAD\l450: DUP1\l451: DUP3\l452: DUP2\l453: MSTORE\l454: PUSH1 0x20\l456: ADD\l457: SWAP2\l458: POP\l459: POP\l460: PUSH1 0x40\l462: MLOAD\l463: DUP1\l464: SWAP2\l465: SUB\l466: SWAP1\l467: RETURN\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
1196 [label="1196: PUSH1 0x00\l1198: DUP1\l1199: REVERT\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
1428 [label="1428: JUMPDEST\l1429: DUP2\l1430: SLOAD\l1431: DUP2\l1432: MSTORE\l1433: SWAP1\l1434: PUSH1 0x01\l1436: ADD\l1437: SWAP1\l1438: PUSH1 0x20\l1440: ADD\l1441: DUP1\l1442: DUP4\l1443: GT\l1444: PUSH2 0x0594\l1447: JUMPI\l" ];
433 [label="433: PUSH1 0x00\l435: DUP1\l436: REVERT\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
// ....
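For further processing, the node lines of the .dot file can be parsed back into basic blocks. The following is a minimal sketch that relies only on the node format visible in the excerpt above (a numeric id followed by a label whose instructions are separated by \l). The function name `parse_nodes` is my own, edge lines are not handled, and a dedicated DOT parser would be a more robust choice for real use.

```python
import re

# Two node lines copied from the .dot excerpt above.
DOT_EXCERPT = r'''
1196 [label="1196: PUSH1 0x00\l1198: DUP1\l1199: REVERT\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
433 [label="433: PUSH1 0x00\l435: DUP1\l436: REVERT\l" fillcolor=lemonchiffon shape=Msquare color=crimson ];
'''

# A node line starts with its numeric id, followed by a quoted label.
NODE_RE = re.compile(r'^(\d+) \[label="([^"]*)"', re.MULTILINE)


def parse_nodes(dot_text: str) -> dict:
    """Map each block id to a list of (offset, instruction) tuples."""
    blocks = {}
    for node_id, label in NODE_RE.findall(dot_text):
        instructions = []
        # Instructions are separated by the literal two characters "\l".
        for entry in label.split(r"\l"):
            if not entry:
                continue
            offset, _, text = entry.partition(": ")
            instructions.append((int(offset), text))
        blocks[int(node_id)] = instructions
    return blocks


blocks = parse_nodes(DOT_EXCERPT)
for block_id, instructions in blocks.items():
    print(block_id, instructions)
```

Each block then becomes a candidate graph node whose instruction list can be turned into features.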
Future work
I plan to use the CFG generated from EVM bytecode to develop a GNN model for detecting vulnerabilities in Ethereum smart contracts. Accurate extraction of the CFG from the bytecode is fundamental to the success of vulnerability detection.