-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Outdoor legacy classifications and generic classification API #103
base: main
Are you sure you want to change the base?
Conversation
Hooked up to imports, maintianing same api, built required dictionaries and functions. Placeholder data used for handicap thresholds. Checks for minimum distance/number of arrows not yet applied.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #103 +/- ##
==========================================
- Coverage 97.81% 96.41% -1.40%
==========================================
Files 30 32 +2
Lines 1742 1815 +73
==========================================
+ Hits 1704 1750 +46
- Misses 38 65 +27
|
As promised, I think fundamentally the fact that everything going into these classification functions aside from a score is a string, is a serious issue. Both in terms of what has to happen internally in the code to process these strings: #Theres lots of string processing and mangling going on to clean up the arguments
/archeryutils/classifications/agb_old_field_classifications.py:
178: if age_group.lower().replace(" ", "") in ("adult", "50+", "under21"):
179 age_group = "Adult"
180: elif re.compile("under(18|16|15|14|12)").match(age_group.lower().replace(" ", "")):
181 age_group = "Under 18"
190 if (
191: bowstyle.lower().replace(" ", "") in ("compound", "recurve")
192 and "wa_field_24_red_" not in roundname
193 ):
194 return "unclassified"
195 if (
196: bowstyle.lower().replace(" ", "")
197 in (
198 "barebow",
... And also for the end user as they are trying to figure out how to use the functions # you are utterly on your own when you are trying to find the right round name or category names...
In [6]: import archeryutils.classifications as cf
In [7]: cf.agb_outdoor_classification_scores('short warwick', 'recurve', 'womens', 'u18') #wrong round, wrong gender, wrong age class...
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[7], line 1
----> 1 cf.agb_outdoor_classification_scores('short warwick', 'recurve', 'womens', 'u18')
File ./archeryutils/classifications/agb_outdoor_classifications.py:543, in agb_outdoor_classification_scores(roundname, bowstyle, gender, age_group)
540 bowstyle = "Compound"
542 groupname = cls_funcs.get_groupname(bowstyle, gender, age_group)
--> 543 group_data = agb_outdoor_classifications[groupname]
545 # Get scores required on this round for each classification
546 class_scores = [
547 hc.score_for_round(
548 group_data["class_HC"][i],
(...)
553 for i in range(len(group_data["classes"]))
554 ]
KeyError: 'u18_womens_recurve' Finally you end up with the frustating situation that if you have already constructed a round object from scratch, or even loaded one from the pre-assigned rounds. You can't actually use it to work out classifications, even though the classification functions will use a fully instantiated round object to do thier own work. from archeryutils import load_rounds
rnd = load_rounds.AGB_indoor.portsmouth
cf.agb_indoor_classification_scores(rnd, 'compound', 'male', 'under 14')
# TypeError: Unhasable type: 'Round' It is, I think, a bit too far beyond the scope of what I want to get done in this request. But is something I'm keeping in mind as I work on it. |
@TomHall2020 a lot to unpack, so will try and place comments here, but please remind me if I miss something. As you already raise, this should remain a PR for reviving old outdoor classifications, with other amendments made through other pull requests please. These could include:
Other points:
For the linting issues you are seeing I have submitted a patch in #106 which you can rebase off of once merged. |
In the spirit of seperating things out I'll do a quick PR for that once the other small ones are merged in, and then I can rebase the next steps for this on top of everything else. Should be a trivial change as its just an implementation detail, no changes to docs/testing required.
And using the same convention where the package is imported into test files as well.
Huh, I hadn't really clocked that was the case but right you are. Example Ianseo results page. Having said that I don't know if most people will realise that there is that distinction, and at least at WA events it is also very common to see the abbreviated
At a minimum it needs to be documented, at the moment you can only find out from the souce code itself, and due to the way the data dictionaries are constructed even then the only place that actually contains all of the supported age category strings in a single file is the json data file. Having said all that it seems a bit silly to put too much effort into deciding which nomenclature is the most correct because I think its clear that the best thing to do is to support common aliases at the point of use, so long as the input is unambigous. We aren't really here to catch users out. I'm wondering if this is a use case for an Enum? That would be more self documenting and allows autocompletion. I'm fairly sure with the right use of StrEnum we could make it so that providing the old string values or valid aliases that we pre-define also still passes so it remains backwards compatible. I'll open an issue on the topic of supporting aliases, and list some suggestions as to which ones could be included. Then any implementation details could follow on from that.
Yeah, I did guess enforcing supported rounds per system was the reason. Passing round.name works fine, but on principle of least surprise it still seems frustrating to have to parse out my roundname from my perfectly functional round object just so it can be looked up again. I'd argue that given the fix of looking up the rounds name is so simple, why not just allow the functions to accept a round and we do that check as well? Could be abstracted to a utility function. #inside xx_classification_scores()
rnd = get_check_round_by_name(roundname, ALL_XX_ROUNDS)
...
def get_check_round_by_name(rnd, accepted_rounds)
if isinstance(rnd, Round) and rnd.name in accepted_rounds
return rnd
else:
return accepted_rounds[rnd] Naming feels a bit clunky but the principle is there. |
D'oh, once again caught out by linter versioning. I actually hadn't updated my venv with the changes to main since my last contributions. |
On mentioning the linting, I'm pre-empting making the generic classification functions and related to the talk about string arugments to the underlying methods, we are going to get complaints from the linters about having too many function arguments when implementing those, and to be fair they do seem right. An example will be clearer: # current example from agb_outdoors:
def calculate_agb_outdoor_classification(
score: float,
roundname: str,
bowstyle: str,
gender: str,
age_group: str,
) -> str:
...
# to make generic we now also need to take
# the handicap system (old or new)
# and discipline/format (indoor/outdoor/field)
# as parameters
def calculate_classification(
score: float,
roundname: str,
bowstyle: str,
gender: str,
age_group: str,
handicap_system: str,
discipline: str
) -> str:
...
# ERROR: PLR0913 Too many arguments in function definition (7 > 5) I agree it is too many, they are nearly all strings so type safety is pretty useless, and would be pretty unwieldy to use:
This ordering one especially bites me because while it matches the order the composite keys that Accepting preformed group names such as
Perhaps one way to cheat the linter would be to take those as a (named)tuple rather than seperate arugments: >>> calculate_classification(1200, ('recurve', 'female', '50+'), 'AGB', 'outdoor')
# smallest change from existing format, just need to enclose the category arguments in parentheses
>>> category = Category(gender='female', bowstyle='recurve', age='50+')
>>> calculate_classification(1200, category, 'AGB', 'outdoor')
# now we can also encapsulate the category information and pass arguements out of order by keyword I was remembering how we handled the parsing of units as tuples in the target/pass/round functionality. Food for thought. |
@TomHall2020 I have just added the handicap thresholds for the various categories. I pushed them direct to this branch (couldn't get GitHub to play nice with opening a PR from my repo to your fork) so you'll want to pull to get these locally before any further changes to avoid conflicts. More generally if you rebase off of the latest main on my repo that should hopefully resolve the conflicts and linting issues. For tests there is a version of the old classification tables available here: https://southleedsarchers.org.uk/wp-content/uploads/2022/03/AGB-Shooting-Admin-Procedures-2020.pdf |
First steps towards implementing the pre-2023 AGB outdoor classifications into archery utils.
Generally I have started by just trying to do the same style of implementation as the other 5 modules in the classifications package. I went for an explict style of declaring the categories (bowstyle/gender/age) and requirements rather than a looped implict method as in the new style, as that fits better with how its done for the old indoor and field requirements, and also it was easier to construct from scratch and visualise what they are.
I've not dug up a definitieve reference to the old rules of shooting yet but from memory and a bit of pouring over some old classification tables I've worked on the following basis:
Checklist
Plan is to get this part at least mostly working, and then get to work on the generic API for classifications, but we can begin the discussions on that while this is being done. There are also a few minor inconsistencies and tweaks to improve the existing classifications code that I'd like to work on while I've got the whole lot in the operating theatre so to speak...
itertools.product
, will reduce indentation and make easier to read without any changes to the underlying code"unclassified"
for scores below lowest threshold, where all other modules return"UC"
'adult', 'under 18'
but'50+'
? would make more sense to takeover 50
, but this also points to a deeper issue I'll get to laterold_agb_field_classification_scores
andold_agb_field_classifications
are inconsistent in anming convention with the rest of the package, suggest chaning instead toagb_old_field_classification_scores
etc. (this really got me for a while where I was trying the same parameters in different function by just editing the function name)class_funcs
in test suite when in source code the very similar namecls_funcs
is used to import classification_utils - suggest changing this tocf
similar to importingarcheryutils.handicaps as hc
More generally beyond this it comes to looking at the generic api, which I
thinkhope will be quite simple to implement along similar lines to how we did in #78, with generic functions along the lines of:That then just dispatches to the appropriate module. Currently each module publicly exposes its
calculate_xx_classification
method and axx_classification_scores
method which is required to power the calculation, though the return type with its fill value of-9999
isn't super friendly to work with. Both of these are also moved up into the overall classifications package namespace along with the module itself and I would plan to leave them there for backwards compatibility.Theres also a dictionary of the same name as the module which is essentially the compiled raw data needed by the functions, which we take great pains to make sure can never be recreated by deleting the private function that generatea it, something I can't quite work out a reason for. It's definitely bit me before earlier on when I first tried to work on the module and was trying to understand the code. I remember using some of the other private functions in the new outdoor classification module which mutate the underlying data in place and then being unable to regenerate the original data.
A side effect of that is that the namespace is really verbose when trying to get around in an interactive shell or do imports, especailly as tab completion will happily jump you into the modules quite easily where you have to repeat yourself over again to find the function you wanted. Even though they are promoted up to the package namespace I find myself accidentally doing this surprisingly often, and also getting confused by the module and then the data dictionary within that module having the same name.
I propose while adding the generic access function, we could simplify the names of the functions within each module, and just re-alias them to their current form when importing them up to the package. Such that each module would contain a
make_classification_dict/data/requirements
function, acalculate_classification
and aclassification_scores
(orscore_thresholds
function. This would also make life a bit easier when doing the implementation of the generic functions, as we could just match the correct module with the user provided system and discipline, and then call that modules identically named functions to do the calculations.At any rate thats a decent brain dump of where my thinking is.