-
-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: TIME variables support #1223
Conversation
6ebf213
to
d3b9c23
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1223 +/- ##
========================================
Coverage 81.66% 81.67%
========================================
Files 168 168
Lines 9655 9757 +102
========================================
+ Hits 7885 7969 +84
- Misses 1519 1537 +18
Partials 251 251
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
tx.variables.time.Set(timestamp.Format(time.TimeOnly)) | ||
tx.variables.timeDay.Set(strconv.Itoa(timestamp.Day())) | ||
tx.variables.timeEpoch.Set(strconv.FormatInt(timestamp.Unix(), 10)) | ||
tx.variables.timeHour.Set(strconv.Itoa(timestamp.Hour())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am inclined to say we can come up with a map for this instead of doing the conversion since these values are deterministic. Still worth to write a benchmark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you propose to use a TIME_MAP
variable with keys such as hour
, day
, month
, year
, and so on? Did I get this right?
If so, this approach would further widen the compatibility gap between ModSecurity and Coraza, whereas the main goal of this PR is to reduce it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I wasn't clear enough. I was advocating for having a map in memory e.g.
var hourConversion map[int]string = map[int]string{1: "1", 2: "2", ...}
instead of doing the strconv.Itoa
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of benchmarking, I run BenchmarkTransactionCreation - because all the values are set in the constructor.
Before changes:
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkTransactionCreation
BenchmarkTransactionCreation-8 53013 21910 ns/op
PASS
After changes:
BenchmarkTransactionCreation
BenchmarkTransactionCreation-8 51748 22462 ns/op
PASS
However, the results vary widely from run to run. Am I missing something? Maybe, you could explain in more detail, what should I benchmark?
@@ -1360,6 +1378,18 @@ func makeTransaction(t testing.TB) *Transaction { | |||
return tx | |||
} | |||
|
|||
func makeTransactionTimestamped(t testing.TB) *Transaction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder by testing.TB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same interface as in makeTransaction
and makeTransactionMultipart
to ensure uniformity.
First of all, thanks a lot @geoolekom for taking the time to put up this PR, we truly appreciate users that are willing to contribute. That said, I am not fully convinced on the need of this. I believe the case you point out is more like a logger concern as we are only interpolating the variable to display the timestamp. Interpolating a variable isn't cheap, nor it is allocating all the variables we allocated and converted by default from int to string. Notice native environments are OK with clock calls but other environments like Wasm don't get it that simple. I would say if we could come up with a solid use case where the time is part of the rule I am in to land the variables (only those showing value for the rule) but if the concern is timestamp in the logs we should definitively address that in the logger. What do you think? |
Thanks so much for your detailed feedback @jcchavezs! I really appreciate you taking the time to review this PR and share your thoughts. Let me clarify the rationale behind these changes and explore possible solutions.
If the benchmarks show a real hit to performance, I’ve got a few ideas to make things more flexible:
Let me know what you think of these ideas. |
Thanks @geoolekom! I think most of the points raised here make sense. Let me do a couple of experiments as what I have in mind is that this a good opportunity for trying lazy collections (i.e. collections that are filled on demand) given that a transaction will be read in a sync fashion. |
internal/corazawaf/transaction.go
Outdated
tx.variables.timeMin.Set(strconv.Itoa(timestamp.Minute())) | ||
tx.variables.timeSec.Set(strconv.Itoa(timestamp.Second())) | ||
tx.variables.timeWday.Set(strconv.Itoa(int(timestamp.Weekday()))) | ||
tx.variables.timeMon.Set(strconv.Itoa(int(timestamp.Month()) - 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having TIME_MON
in the range 0-11
instead of 1-12
feels very counterintuitive to me.
ModSecurity behaviour is not consistent, with v2
ranging 1-12
and v3
ranging 0-11
. I opened this issue about it: owasp-modsecurity/ModSecurity#3305.
I'm pushing for a unified behaviour between all the engines and I feel that the natural one would be 1-12
range.
Note: If we are going to implement it as 1-12
also Coraza Docs have to be updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I agree. Should I update this PR right now, or wait until Modsecurity issue is closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might take long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I can update in advance, right? Or should we stick with modsec-v3 compatibility and use 0-11 range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would not be compliant with one of the two current ModSecurity implementations anyway 😄.
There is already an open PR owasp-modsecurity/ModSecurity#3306 that aligns v3 to use 1-12
range. So I believe that we have pretty much agreed that 1-12
range is the one we should stick with. Let's go with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
d3b9c23
to
22be149
Compare
@jcchavezs Real-life examples of rules where Track time for filescanWe use a set of rules that stores the SecRule &TX:lua_present "@eq 0" "id:33363,nolog,auditLog,block,chain,severity:2,phase:2,t:none,msg:'IM360 WAF: Attempt to upload malware||Scan duration:%{TX.py_scan_duration}||Sizes:%{FILES_SIZES}||Combined size:%{FILES_COMBINED_SIZE}||User:%{SCRIPT_USERNAME}||Filename:%{FILES}||Scanned:%{FILES_TMPNAMES}||SC:%{SCRIPT_FILENAME}||WPU:%{TX.wp_user}||RSV:6.52||T:APACHE||',tag:'service_i360'"
SecRule FILES_TMPNAMES "@inspectFile /opt/alt/python35/share/imunify360/scripts/modsec_scan.py" "t:none,setvar:TX.py_scan_duration=%{DURATION},setvar:TX.py_scan_duration=-%{TX.py_scan_start}" Dynamic IP Tracking and RBL ValidationThis example tracks client IPs and combines them with timestamps for uniqueness, then identifies risky actions by matching request filenames against a known list. It verifies risky actions through an external RBL lookup to confirm malicious or undesirable behavior and then blocks requests based on positive RBL matches.. SecAction "id:33368,phase:2,pass,nolog,severity:5,setvar:tx.rbl_ip=%{TIME_HOUR}-%{TIME_MIN}.%{tx.remote_addr},initcol:ip=%{tx.remote_addr}"
SecRule REQUEST_FILENAME "@pmFromFile risky-actions.list" "id:33315,phase:2,block,severity:2,nolog,auditlog,t:none,msg:'IM360 WAF: RBL block risky actions||RSV:6.52||T:APACHE||MV:%{MATCHED_VAR}',chain,setvar:tx.rbl_perf=1,tag:'service_i360'"
SecRule TX:RBL_IP "@rbl risky-actions.v2.rbl.imunify.com." "t:none,chain"
SecRule TX:RBL_IP "!@rbl nxdomain.v2.rbl.imunify.com." "t:none" |
22be149
to
3e0131e
Compare
@jcchavezs Could you please elaborate, what could we do to get this feature merged? |
I wasn't meant to approve but to comment.
I worked a PoC on lazy collections in geoolekom#1 |
@@ -242,6 +242,7 @@ func (w *WAF) newTransaction(opts Options) *Transaction { | |||
tx.variables.duration.Set("0") | |||
tx.variables.highestSeverity.Set("0") | |||
tx.variables.uniqueID.Set(tx.id) | |||
tx.setTimeVariables() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't better to just remove this call and generate the value for each variable at access time instead? Kind of being more an init than set.
The lazy collections idea is generic and will work. I would like to see if we add extra memory/benchmarks.
Another option is to have specific time collections that receive the timestamp, and the function to apply when setting and only evaluate when getting. I guess in the end is more or less the same?
We merged this PR as it was and with a couple of performance improvements in #1242 because lazy collections wasn't bringing enough improvements for this particular use case that it did not feel right to land such a concept for low benefit. |
Make sure that you've checked the boxes below before you submit PR:
Thanks for your contribution ❤️
This pull request addresses issue #1220 by introducing support for the following time-related variables in Coraza WAF:
These variables enhance Coraza's functionality by enabling detailed logging and time-based rule creation, improving compatibility with ModSecurity. Additionally, this PR includes tests to ensure the correct implementation and behavior of these new variables.