-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adds the stub resolver abi pass #1036
adds the stub resolver abi pass #1036
Conversation
The problem arises when we deal with libraries: even if a callee is in the same library, a call will be done through a plt entry first, meaning that we won't visit the target subroutine. Our solution is based on the function name: we would better to call the function itself, but not the corresponding plt entry, so since a function and a plt entry has the same name, then any call to a plt entry we substitute to a call to the function. Also, this PR fixes SP in the `primus-x86-loader` (again). The history. PR BinaryAnalysisPlatform#993 introduced a proper initialization of PLT entries, but unfortunately, introduced a bug as well. Briefly, we didn't want primus to dive deep into PLT entries because anyway, we would fail with unresolved call exception, but to that moment a SP would be changed several times. The solution was to raise an unresolved call exception as soon as possible, right after a primus machine entered an entry and restore SP by `unresolved-call` observation. Unfortunately, the implementation was wrong: we linked an unresolved call handler to the address of a jump in a plt entry, so actually we just call a handler instead, and no actual unresolved call happens, no observation made, and no steps to restore SP taken. This PR fixes it and now we just unlink all the addresses of plt entries, so we get a fair unresolved call exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I like the idea of unlinking PLT and relinking jumps to implementation. I am a little bit concerned that this is not a pass, like trivial-condition-form for example, and I am not quite happy that Primus run modifies the program. So we should consider moving this to a normal pass (in fact the abi pass), as what it does is called relocating a procedure that in general useful not only for Primus.
So, I'm suggesting to split the relocation part into an abi pass.
Concerning the rest of implementation, aka component Component. We will discuss it more in depth on the second round, I have suspicions that your initial problem evaluation has some flaws. I'm currently revisiting the old PRs.
@@ -3,7 +3,8 @@ open Bap_primus.Std | |||
include Self() | |||
|
|||
let init _ = | |||
Primus.Machine.add_component (module Primus_x86_loader.Component);; | |||
Primus.Machine.add_component (module Primus_x86_loader.Component); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a component named component?
That is not true :) Observations had happened by that time, therefore the code was correct. Weneed to reevaluate the problem and confirm that observations indeed do not happen and if they not identify why. My personal bet is that they do happen :) |
b5c5121
to
4824890
Compare
In BinaryAnalysisPlatform#1051 we begun inserting `start` and `exit` pseudonodes and introduced a small bug, when the graph is singleton and has a node with both indegree and outdegree equal to zero. For that graphs, instead of ``` digraph { start -> n -> exit } ``` we got ``` digraph { start -> n; start -> exit; } ``` since we didn't insert an edge from `n` to `exit`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The algorithm should do the following:
For each call that has a destination that is a stub, s.t. there is a function that is not a stub that we can unambiguously link with that stub, relink the destination to that function.
Keep in mind, that for each stub there should be at most one function that matches this stub. And for each non-stub there should be at most one stub that matches this function. In case of any ambiguity, i..e, a function that can satisfy more than one stubs should be removed from the linking process, the same a stub that satisfy more than one functions, shouldn't be relinked.
- write knowledge providers that will provide the
stub
attribute for known to you stubs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address the issues that will be relevant after you will rewrite this to enable the announced property that the rewriting preserves one-to-one correspondence between stubs and implementation:
In case if a non-bijective mapping is detected (a stub has more than one implementation candidates or implementation has more than one stubs) no redirection is established.
For that, implement the relinker using the unification algorithm: start by partitioning the set of program subroutines by the equivalence relation E in which two functions are equivalent if they share the same alias. Then refine this quotient set by removing all sets that do not have exactly two elements. Finally, leave only those pairs in which one and only one element is a stub. Those pairs will give the most general unifier.
let update prog = relink prog (find_links prog) | ||
|
||
let main proj = | ||
Plt.provide proj; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be removed from the abi pass. In other words, move this line to line 148.
open KB.Syntax | ||
|
||
type t = { | ||
aliases : tid option String.Map.t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type binding = Unique of tid | Ambiguous
instead of tid option
so that at the end you will have a tri-state binding - either None
, Some (Unique x)
, or Some Ambiguous
.
Maps with 'a option
as a value is very confusing, as you have the None
constructor overloaded to have two completely different meanings. And that in Some None
, None
meanys too many. Something is wrong in representing "too many" with None.
open KB.Syntax | ||
|
||
type t = { | ||
aliases : tid option String.Map.t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, rename aliases
to stubs
and symbols to reals
or impls
to reflect the actual meaning of both fields.
| Second _ -> jmp | ||
| First tid -> match Map.find links tid with | ||
| Some (Some tid') -> | ||
Jmp.reify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't create a new jmp term as we will lose as a result al the attributes of the old term. Map the old term using the with_alt
function. That is not exposed in the interface, so please expose it also (together with the with_dst
function).
Also, please add unit tests and function tests (with a .so that has functions with external and static linkage calling each other). Also, make sure that examples from draperlaboratory/cbat_tools#132 are linked correctly. |
|
||
(** [with_dst jmp d] updates jump's intra-procedural destination *) | ||
val with_dst : t -> dst option -> t | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you also add @since
attributes? since 2.0.0 for with_{kind,cond} and 2.10 for the latter two.
plugins/api/api/c/gnu.h
Outdated
@@ -1 +1,3 @@ | |||
int __libc_start_main(int (*main) (int, char **, char **), int, char **, void *auxv); | |||
|
|||
void __stack_chk_fail(void) __attribute__((noreturn)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it here?
Stub_resolver_tests.suite; | ||
] | ||
|
||
let load_plugins () = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete this
type t = { | ||
groups : groups; | ||
names : names; | ||
next : int; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side note, a trick that I usually employ is that instead of maintaining the counter, we can use Map.max_elt
instead of next
. But you can keep next
if you like.
] | ||
~expected:["siera", "papa"; | ||
"tango", "quebec"; | ||
"uniform", "romeo";]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need cases where symbols have many aliases, some intersecting aliases and some not, but where there is more than two cases, e.g.,
test "several intersections" [
stub, "a", ["a1"; "a2"; "a3"];
real, "a0", ["a"];
stub, "b", ["b1"; "b2"; "b3"];
real, "b0", ["b"; "c"];
real, "c", ["d"];
stub, "d", [];
] ~expect:[
"a", "a0";
]
and more tests like this.
Ok, it's good to go. Merge as soon as Travis greenlights! |
|} | ||
|
||
let () = Extension.declare @@ fun _ctxt -> | ||
Stream.observe Project.Info.data @@ Plt.provide; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it Project.Info.data
not Project.Info.code
? The .plt
entries are in the code segment and are executable, so marking something in the data section as valid code seems quite wrong.
@gitoleg, any comments?
The stub resolver pass redirects on the IR level calls to stub functions to the calls to real functions if such could be identified. A function is a stub if it has the stub attribute present in the knowledge base. So far, we add only a knowledge provider for ELF stubs (that looks into the .plt section) others will be added later. The redirection algorithm establishes a one-to-one mapping between a stub and its prospective redirection. In case if a non-bijective mapping is detected (a stub has more than one implementation candidates or implementation has more than one stubs) no redirection is established.