Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Underapproximating to_int/from_int #117

Merged
merged 18 commits into from
Feb 13, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
different limit
  • Loading branch information
jurajsic committed Feb 6, 2024
commit a2558ce1b25e6727b7adbf4f6fc40b8e8e8881a3
4 changes: 3 additions & 1 deletion src/smt/theory_str_noodler/decision_procedure.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -817,7 +817,9 @@ namespace smt::noodler {
STRACE("str-conversion", tout << "failing NFA:" << std::endl << aut_valid_part << std::endl;);
// util::throw_error("cannot process to_int/from_int for automaton with infinite language");
jurajsic marked this conversation as resolved.
Show resolved Hide resolved
is_underapproximation = true;
max_length_of_words = 3; // there are 10^max_length_if_words possible cases, we put limit so there is not MEMOUT
if (max_length_of_words > 3) {
vhavlena marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a future work, we could propagate the LIA solver here and check, if the formula subst_var < 100 is valid. In that case we have a precise solution and we could set res_precision to PRECISE.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subst_var is a string var, so we would have to check something else, probably you meant |subst_var| < 3.
But I am not sure if it would work.
I was thinking about doing something a bit similar, returning multiple formulae (one underapproximation, one something like overapproximation), but it is quite complicated and it is not that clear how it should work.

Anyway, for stringfuzz, the best thing would be to transform n = str.to_int(x) on the input to n = str.to_int(x) & x ∈ 0*n (and also do that for lenghts, i.e. n = str.len(x) & x ∈ .{n}), I think that would solve everything.
So doing something with overapproximation or something similar would only be helpful for full_str_int, and it is questionable how helpful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. ok, and what about the same checking for from_int? Is it possible at all in our setting? The replacement of from_int by to_int is done in relevant_eh, right ?
  2. Agree, there is a good hope that this can be done in seq rewriter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I do not understand the question. What do you mean by same checking? Also from_int is not replaced with to_int, I do handle them in similar way during the LIA formula creation, but it is still important to know if I have to_int or from_int (for handling invalid cases).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wrote it confusedly. I though that the current problem (mainly for full_str_int) are constraints of the form w = from_int(x) where x is some integer variable. Because we do not know about the possible values of x, we are not currently able to solve it, right? But if we find out that x is always smaller than 100, we can enumerate it quite easily.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think I understand now. It can probably be somehow applied for it, I'll have to think about it.

Anyway, there are only 363 formulae with from_int in full_str_int, so I don't think it's the main problem for this benchmark right now.
You can see that the underapproximation, which is useful only for SAT cases (see how number of UNSAT cases did not change at all), already helped a lot with unknowns, however, most of them moved to TO/MO, so I think we also need to focus on simplifying the formula somehow.

max_length_of_words = 3; // there are 10^max_length_if_words possible cases, we put limit so there is not MEMOUT
}
}

std::vector<std::vector<mata::Word>> new_cases;
Expand Down