-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simplifyFragmentData - unusual aggregation behavior #70
Comments
This issue cropped up on my machine this week. I was in the middle of something so I just fixed it by installing the "agb" branch with the above mentioned fix. Is anyone else having this issue? |
Thanks for reporting this. The default behavior of reshape2::dcast has always bothered me: there are few cases where a warning + change in behavior is a good idea. Do you have some examples that I could check? I'll use this as the basis for the first soilDB test suite element (884a234). I suspect that the problem may be related to duplication with respect to |
It is not clear which pedons are causing this issue, but you are correct that it seems to be some sort of an issue with how the pedon is specified or how the records are uniquely identified. If you query all NASIS pedons in CA630 using fetchNASIS_pedons() you will notice that it defaults to length as the aggregation function which causes the total_frags_pct and total_frags_pct_nopf to calculate incorrectly. |
If I had a smaller set of pedons that I could reproduce the error with, I would. But the smaller selected sets I have been working on are summing correctly. |
There were several issues that may have been contributing to the unusual and hard to replicate data:
These were all accounted for in e2b237d. This commit comes with a slight change in behavior: records with NA This commit addresses an issue in #43: |
After some more testing, I no longer see the warning message from Reminder to future self: don't trust |
On Julie's machine, rock fragment volume sums within classes returned by simplifyFragmentData() were not correct.
Stepping through the code... the data in NASIS were correctly populated, and the code to sort fragment records into size classes and sum within classes is working correctly.
The problem arises when going from long->wide (1 record per horizon with frags) via the reshape2::dcast() call.
A warning printed to the console about the default aggregation (length) function being used led me to try using
fun.aggregate=sum
in the dcast() call, which solved the problem. However, presumably, the sums have already been calculated via:Has something changed in reshape2? Has the default aggregation always been length??
I have been unable to replicate the error condition on my machine. Instead of correct volumetric sums, integer values between 1 and 5 were being returned after conversion from long->wide format. Could these numbers be the lengths (number of records contributing) to the "sum" for each class rather than the sum itself?
The text was updated successfully, but these errors were encountered: