make_cfg is what __main__.py calls, it takes the Abstract Syntax Tree, creates an ExprVisitor and returns a Control Flow Graph.
stmt_visitor.py and expr_visitor.py mirror the abstract grammar of Python. Statements can contain expressions, but not the other way around. This is why ExprVisitor inherits from StmtVisitor, which inherits from ast.NodeVisitor; from the standard library.
This is how ast.NodeVisitor works:
def visit(self, node):
"""Visit a node."""
method = 'visit_' + node.__class__.__name__
visitor = getattr(self, method, self.generic_visit)
return visitor(node)
So as you'll see, there is a visit_ function for almost every AST node type. We keep track of all the nodes while we visit by adding them to self.nodes, connecting them via ingoing and outgoing node attributes.
The two most illustrative functions are stmt_star_handler and expr_star_handler. expr_star_handler has not been merged to master so let's talk about stmt_star_handler.
Example code
if some_condition:
x = 5
This is the relevant part of the abstract grammar
If(expr test, stmt* body, stmt* orelse)
# Note: stmt* means any number of statements.
Upon visiting an if: statement we will enter visit_If in stmt_visitor.py. Since we know that the test is just one expression, we can just call self.visit() on it. The body could be an infinite number of statements, so we use the stmt_star_handler function.
stmt_star_handler returns a namedtuple (ConnectStatements) with the first statement, last_statements and break_statements of all of the statements that were in the body of the node. stmt_star_handler takes care of connecting each statement in the body to the next one.
We then connect the test node to the first node in the body (if some_condition -> x = 5) and return a namedtuple (ControlFlowNode) with the test, last_statements and break_statements.
def visit_If(self, node):
test = self.append_node(IfNode(
node.test,
node,
path=self.filenames[-1]
))
body_connect_stmts = self.stmt_star_handler(node.body)
# ...
test.connect(body_connect_stmts.first_statement)
if node.orelse:
# ...
else:
# if there is no orelse, test needs an edge to the next_node
body_connect_stmts.last_statements.append(test)
last_statements = remove_breaks(body_connect_stmts.last_statements)
return ControlFlowNode(
test,
last_statements,
break_statements=body_connect_stmts.break_statements
)
def stmt_star_handler(
self,
stmts
):
"""Handle stmt* expressions in an AST node.
Links all statements together in a list of statements.
Accounts for statements with multiple last nodes.
"""
break_nodes = list()
cfg_statements = list()
first_node = None
node_not_to_step_past = self.nodes[-1]
for stmt in stmts:
node = self.visit(stmt)
if isinstance(node, ControlFlowNode):
break_nodes.extend(node.break_statements)
elif isinstance(node, BreakNode):
break_nodes.append(node)
cfg_statements.append(node)
if not first_node:
if isinstance(node, ControlFlowNode):
first_node = node.test
else:
first_node = get_first_node(
node,
node_not_to_step_past
)
connect_nodes(cfg_statements)
if first_node:
first_statement = first_node
else:
first_statement = get_first_statement(cfg_statements[0])
last_statements = get_last_statements(cfg_statements)
return ConnectStatements(
first_statement=first_statement,
last_statements=last_statements,
break_statements=break_nodes
)
For more information on AST nodes, see the Green Tree Snakes documentation.