-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Underapproximating to_int
/from_int
#117
Conversation
b6da399
to
c0490a0
Compare
Impact on benchmarks containing
There is no (or minimal) impact on other benchmarks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. Just a few comments and ideas for a future work.
STRACE("str-conversion", tout << "failing NFA:" << std::endl << *aut_valid_part << std::endl;); | ||
// util::throw_error("cannot process to_int/from_int for automaton with infinite language"); | ||
res_precision = LenNodePrecision::UNDERAPPROX; | ||
if (max_length_of_words > 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a future work, we could propagate the LIA solver here and check, if the formula subst_var < 100
is valid. In that case we have a precise solution and we could set res_precision
to PRECISE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subst_var
is a string var, so we would have to check something else, probably you meant |subst_var| < 3
.
But I am not sure if it would work.
I was thinking about doing something a bit similar, returning multiple formulae (one underapproximation, one something like overapproximation), but it is quite complicated and it is not that clear how it should work.
Anyway, for stringfuzz
, the best thing would be to transform n = str.to_int(x)
on the input to n = str.to_int(x) & x ∈ 0*n
(and also do that for lenghts, i.e. n = str.len(x) & x ∈ .{n}
), I think that would solve everything.
So doing something with overapproximation or something similar would only be helpful for full_str_int
, and it is questionable how helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- ok, and what about the same checking for
from_int
? Is it possible at all in our setting? The replacement offrom_int
byto_int
is done in relevant_eh, right ? - Agree, there is a good hope that this can be done in seq rewriter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I do not understand the question. What do you mean by same checking? Also
from_int
is not replaced withto_int
, I do handle them in similar way during the LIA formula creation, but it is still important to know if I haveto_int
orfrom_int
(for handling invalid cases).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wrote it confusedly. I though that the current problem (mainly for full_str_int
) are constraints of the form w = from_int(x)
where x
is some integer variable. Because we do not know about the possible values of x
, we are not currently able to solve it, right? But if we find out that x
is always smaller than 100, we can enumerate it quite easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think I understand now. It can probably be somehow applied for it, I'll have to think about it.
Anyway, there are only 363 formulae with from_int
in full_str_int
, so I don't think it's the main problem for this benchmark right now.
You can see that the underapproximation, which is useful only for SAT cases (see how number of UNSAT cases did not change at all), already helped a lot with unknown
s, however, most of them moved to TO/MO, so I think we also need to focus on simplifying the formula somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me overall.
Can I merge this? |
Yes |
I am fine with merging, too. |
This PR is mostly for
stringfuzz
benchmark.It adds a possibility to generate underapproximating LIA formula for
to_int
/from_int
, so that if we end up with non-finite language for the string variable of these two predicates, we still generate "something": we check all the words up to some length (right now 3).It can lead to SAT in most of the cases.
For the UNSAT formulae, I added a new preprocessing step (
conversions_validity
), that can immediately tell that there are no valid inputs forto_int
(i.e., its result is-1
). This can also help in a lot of cases.It seems that what would really help (and probably would solve nearly everything in stringfuzz), is to get information about input equations such as
5 = str.to_int(x)
.We could then generate
x ∈ 0*5
which would in most cases lead to UNSAT.However, I did not find a way to get this information into noodler.