Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove if len(match) != 2 check in detectors #2746

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rgmz
Copy link
Contributor

@rgmz rgmz commented Apr 25, 2024

Description:

I believe the intention behind this code was to make sure that a given pattern matched, but it doesn't actually do this. A while ago, I explained in a PR comment how the if len(match) != 2 check doesn't necessarily do what it appears to, and how it can actually silently break detectors.

In short, FindAllStringSubmatch only returns results that match the given pattern. The number of match groups in a specific match is static, meaning that the pattern (fo.)bar(.?) will always have 3 groups even if the group after bar is empty. If the number of match groups in a pattern is changed and the if statement isn't, it will cause matches to be silently discarded.

https://go.dev/play/p/TXYh1bQItaO

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@rgmz rgmz requested a review from a team as a code owner April 25, 2024 12:40
@bugbaba
Copy link

bugbaba commented Apr 26, 2024

Even I was curious as to know when and how many times does this if len(match) != 2 is equating to true. So I added a print statement for every instance of it for all the detectors in the first week of Jan this year.

I have only seen this message for jira token detector few times for domainPat and emailPat. But I haven't done the root cause analysis for the same.

image

@rgmz
Copy link
Contributor Author

rgmz commented Apr 26, 2024

The JIRA email is slightly different because it's checking the length of the split and not the match groups. I don't really understand it's purpose, though.

email = strings.Split(email[0], " ")
if len(email) != 2 {
continue
}

@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from 9959218 to 50f3e98 Compare April 30, 2024 16:53
@zricethezav
Copy link
Collaborator

@rgmz I believe this check is to prevent index errors like the one demonstrated here https://go.dev/play/p/MS8AyzGb1uq. Rather than removing the check, it would be better if we change the condition to if len(match) < 2 to ensure at least one capture group is present in the regex.

@rgmz
Copy link
Contributor Author

rgmz commented May 1, 2024

That still has the problem of silently ignoring legitimate errors. I believe that logic errors should explicitly fail, rather than quietly continuing.

@zricethezav
Copy link
Collaborator

That still has the problem of silently ignoring legitimate errors. I believe that logic errors should explicitly fail, rather than quietly continuing.

I disagree. It's less of a logic error and more of a incorrectly written regex issue. A single index error on a detector shouldn't cause the whole scan to fail imo.

@rgmz
Copy link
Contributor Author

rgmz commented May 1, 2024

Some patterns are precise and don't have any capture groups, so you'd only have one match (the entire matching string).

A single index error on a detector shouldn't cause the whole scan to fail imo.

This error means that the detector can never succeed and will miss valid secrets. It would be easy to detect in the build/test process, so a user should never actually ecounter this issue.

@zricethezav
Copy link
Collaborator

Some patterns are precise and don't have any capture groups, so you'd only have one match (the entire matching string).

Then the check could be safely removed if present.

For me to feel comfortable merging this blanket removal would require a test that ensures we would not get an index out of range error.

It would be easy to detect in the build/test process, so a user should never actually ecounter this issue.

I agree with this

@rgmz
Copy link
Contributor Author

rgmz commented May 1, 2024

For me to feel comfortable merging this blanket removal would require a test that ensures we would not get an index out of range error.

Ideally, each detector would have unit tests but only a few do at the moment.

The TestX_FromChunk case should cover this, however, it isn't clear what triggers those cases. They don't seem to run with test:
https://github.com/trufflesecurity/trufflehog/actions/runs/8912166932/job/24475012246

@rgmz rgmz force-pushed the chore/detector-len-check branch from 50f3e98 to 79a1bd9 Compare May 7, 2024 12:47
@rgmz rgmz force-pushed the chore/detector-len-check branch from 79a1bd9 to f2c38e0 Compare May 22, 2024 21:03
@rgmz rgmz force-pushed the chore/detector-len-check branch from f2c38e0 to c5833ad Compare June 5, 2024 00:41
@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from 1639822 to 2ad95b4 Compare June 17, 2024 23:11
@rgmz rgmz force-pushed the chore/detector-len-check branch from 2ad95b4 to 4e475a7 Compare June 21, 2024 02:44
@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from 2896043 to 672de68 Compare July 1, 2024 18:35
@rgmz rgmz mentioned this pull request Oct 4, 2024
2 tasks
@rgmz rgmz force-pushed the chore/detector-len-check branch from 672de68 to 69f7148 Compare November 3, 2024 14:20
@rgmz rgmz requested a review from a team as a code owner November 3, 2024 14:20
@rgmz rgmz force-pushed the chore/detector-len-check branch 3 times, most recently from b45f22f to 31cb8dc Compare November 11, 2024 19:18
@rgmz rgmz force-pushed the chore/detector-len-check branch from 31cb8dc to 2217aab Compare November 21, 2024 00:29
@rgmz rgmz requested a review from a team as a code owner November 21, 2024 00:29
@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from fbf049c to e054c80 Compare November 21, 2024 02:57
@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from 9baaec4 to 1d43308 Compare November 27, 2024 00:23
@rgmz rgmz force-pushed the chore/detector-len-check branch from 1d43308 to 887a9bf Compare December 2, 2024 13:29
@rgmz rgmz force-pushed the chore/detector-len-check branch from 887a9bf to 382c516 Compare December 12, 2024 22:07
@rgmz
Copy link
Contributor Author

rgmz commented Dec 12, 2024

For me to feel comfortable merging this blanket removal would require a test that ensures we would not get an index out of range error.

@zricethezav Are we able to merge this now that ever detector has unit tests? #3771 was the last of the batch.

@rgmz rgmz force-pushed the chore/detector-len-check branch 2 times, most recently from bc20ee5 to b88c84a Compare December 16, 2024 14:48
@rgmz rgmz force-pushed the chore/detector-len-check branch from b88c84a to ea9cd69 Compare December 21, 2024 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants