This is a feature request. Please allow more step -a options to specify the number of records back to be referenced.
Currently, the slwin sliding window averages option requires a _m_n suffix to specify how many “m” records back and “n” records forward to reference.
It would be incredibly helpful if at least shift_lag and a few other step options accepted a similar suffix. For example, mlr step -a shift_lag_12 -f Sales would reference 12 records back and create a field such as Sales_12, holding the sales value from 12 months earlier.
Other options, such as shift_lead, delta and ratio, would clearly benefit from this possibility as well.
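To make the requested semantics concrete, here is a minimal Python sketch of what an n-record lag would compute. This is only an illustration, not Miller code: the function name `shift_lag`, the output field name `Sales_2`, and the `-` placeholder for records with no predecessor are assumptions on my part.

```python
def shift_lag(records, field, n, fill="-"):
    """Return copies of the records with an extra field holding the value
    of `field` from n records earlier (hypothetical helper, not Miller's API)."""
    out = []
    for i, rec in enumerate(records):
        new = dict(rec)  # copy so the input records are left untouched
        # The first n records have nothing to look back to, so use the filler.
        new[f"{field}_{n}"] = records[i - n][field] if i >= n else fill
        out.append(new)
    return out


rows = [{"Sales": v} for v in (10, 20, 30, 40)]
lagged = shift_lag(rows, "Sales", 2)
for row in lagged:
    print(row)
```

With n = 12 on monthly data, each record would carry the value from the same month one year earlier, which is the comparison I need.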
My use case is for database-query post-processing, analysis and preparation for machine learning. I frequently use sequenced or time-series data and need to reference and analyze current values vs the same attributes lagged specific periods of time.
I’m in the process of migrating and automating all my post-processing using Miller, which has been fantastic (thanks!).
I currently achieve the multiple-lag reference by then-chaining the shift option, as in mlr step -a shift -f Sales then step -a shift -f Sales_shift then…., also renaming the fields and deleting unnecessary ones. This seems to work but is rather lengthy and unfriendly.
Thanks so much.
@AndyXuma awesome!! I knew when working on slwin that I was creating (within the code) some more general opportunities -- and I hoped there would be demand for them. I'm happy to hear that there is!! :)