Feat: Add Regex Support For Email Verifier #1

Open
wants to merge 8 commits into
base: main
from

Conversation

shreyas-londhe
Member

This PR adds regex matching for email body and headers.

@SoraSuegami SoraSuegami left a comment

Did you add any unit tests?

Comment on lines +41 to +54
let (is_public, dfa) = part;
let fwd = dense::DFA::from_bytes(&dfa.fwd).unwrap().0;
let rev = dense::DFA::from_bytes(&dfa.bwd).unwrap().0;

let re = Regex::builder().build_from_dfas(fwd, rev);
let matches: Vec<Match> = re.find_iter(email_body).collect();

if !matches.is_empty() {
    regex_verified = true;
    if *is_public {
        let substring = email_body[matches[0].start()..matches[0].end()].to_string();
        regex_matches.push(substring);
    }
}

@shreyas-londhe
Could you explain why these steps are enough to verify the regex?
Specifically, what are the roles of fwd and rev here?

Member Author

@shreyas-londhe shreyas-londhe Dec 20, 2024

There is a bug in the regex matching that I'm fixing; the fix will be reflected in #2.

> Specifically, what are the roles of fwd and rev here?

fwd and rev are the forward and reverse DFAs. They are built and serialized outside the circuit and fed to the circuit as bytes. Inside the circuit we just deserialize them and construct the Regex, which is the most efficient option for matching.

Reference: https://docs.rs/regex-automata/latest/regex_automata/dfa/regex/struct.Builder.html#method.build_from_dfas
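To make the two roles concrete: in a DFA-based regex of this kind, the forward DFA scans left to right and reports where a match ends, and the reverse DFA then scans backward from that end to find where the match starts. The following is a minimal, std-only sketch of that idea with two hand-written toy DFAs for the pattern `a+b`. It is not the code from this PR and does not use regex-automata; the function names and state numbering are assumptions made for illustration, and the forward scan stops at the first accepting state, which is enough for this particular pattern.

```rust
// Toy forward DFA for an unanchored search of `a+b`.
// States: 0 = nothing useful seen, 1 = inside `a+`, 2 = match.
fn fwd_step(state: u8, byte: u8) -> u8 {
    match (state, byte) {
        (0, b'a') | (1, b'a') => 1,
        (1, b'b') => 2,
        _ => 0, // restart the unanchored search
    }
}

// Toy reverse DFA for the reversed pattern `ba+`, anchored at the match end.
// States: 0 = start, 1 = saw `b`, 2 = match; `None` is the dead state.
fn rev_step(state: u8, byte: u8) -> Option<u8> {
    match (state, byte) {
        (0, b'b') => Some(1),
        (1, b'a') | (2, b'a') => Some(2),
        _ => None,
    }
}

// Forward pass: return the offset just past the end of the first match.
fn find_end(haystack: &[u8]) -> Option<usize> {
    let mut state = 0;
    for (i, &byte) in haystack.iter().enumerate() {
        state = fwd_step(state, byte);
        if state == 2 {
            return Some(i + 1);
        }
    }
    None
}

// Reverse pass: walk backward from `end` and keep the earliest offset
// at which the reverse DFA is still in its match state.
fn find_start(haystack: &[u8], end: usize) -> Option<usize> {
    let mut state = 0;
    let mut start = None;
    for i in (0..end).rev() {
        match rev_step(state, haystack[i]) {
            Some(next) => state = next,
            None => break,
        }
        if state == 2 {
            start = Some(i);
        }
    }
    start
}

// Combine both passes into a (start, end) span, end exclusive.
fn find(haystack: &[u8]) -> Option<(usize, usize)> {
    let end = find_end(haystack)?;
    let start = find_start(haystack, end)?;
    Some((start, end))
}
```

With this sketch, `find(b"xxaaab yy")` yields the span `(2, 6)`, which covers `aaab`. A span like this is what the snippet under review slices out of `email_body` with `matches[0].start()..matches[0].end()`.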

#[derive(Debug, Serialize, Deserialize)]
pub struct PublicKey {
    pub key: Vec<u8>,
    pub key_type: String,

What is the role of key_type here?

Member Author

The key_type is a string that determines whether the public key is an RSA public key or an Ed25519 public key.
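As a rough illustration of how such a string could select the verification path, here is a minimal sketch that parses it into an enum. The enum `KeyKind`, the function `parse_key_type`, and the accepted spellings `"rsa"` and `"ed25519"` are all assumptions for this example; the PR excerpt does not show which values key_type actually takes.

```rust
// Hypothetical enum; the PR itself stores the key type as a String.
#[derive(Debug, PartialEq)]
enum KeyKind {
    Rsa,
    Ed25519,
}

// Map the key_type string to a key kind. The accepted spellings are assumed.
fn parse_key_type(key_type: &str) -> Result<KeyKind, String> {
    match key_type.to_ascii_lowercase().as_str() {
        "rsa" => Ok(KeyKind::Rsa),
        "ed25519" => Ok(KeyKind::Ed25519),
        other => Err(format!("unsupported key type: {other}")),
    }
}
```

Parsing once at the boundary means an unknown key type is rejected explicitly, before any signature check is attempted.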

@SoraSuegami

This is a comment for a future PR.
Although I don't know of a standard way to define proved programs for risc0, I think it would be better to separate the common functions into library functions that third-party proved programs can use, a design similar to the templates in our circom circuits.

@shreyas-londhe
Member Author

@SoraSuegami I understand your concern, and yes, I have thought about this.

The reason you don't see a library implementation of the core code is that the current zkVM implementations support different crates for precompiles, which makes it difficult to manage the dependencies properly. Once the zkVM implementations become more stable, I'm planning to add an abstraction so that we write the Rust once and the different zkVM implementations are built automatically.

Something like this - https://github.com/MatteoMer/any-zkvm
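One possible shape for such an abstraction, sketched under assumptions: the shared logic is written once against a small I/O trait, and each zkVM backend supplies its own implementation. The trait `GuestIo`, the function `run_verifier`, and the `MockIo` backend are hypothetical names invented for this example; they are not APIs of risc0, any other zkVM, or the linked repository.

```rust
// Hypothetical guest I/O abstraction; each zkVM backend would implement it.
trait GuestIo {
    // Read the private input (for example, the raw email bytes).
    fn read_input(&mut self) -> Vec<u8>;
    // Commit public output to the proof's journal.
    fn commit(&mut self, output: &[u8]);
}

// Shared library logic, written once and reusable by third-party programs.
fn run_verifier<E: GuestIo>(env: &mut E, check: fn(&[u8]) -> bool) {
    let email = env.read_input();
    let ok = check(&email);
    env.commit(&[ok as u8]);
}

// In-memory backend, useful for unit tests outside any zkVM.
struct MockIo {
    input: Vec<u8>,
    journal: Vec<u8>,
}

impl GuestIo for MockIo {
    fn read_input(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.input)
    }

    fn commit(&mut self, output: &[u8]) {
        self.journal.extend_from_slice(output);
    }
}
```

Because the backend sits behind a trait, the core checks can be exercised with `MockIo` in ordinary unit tests, which also speaks to the earlier question about test coverage.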


2 participants