Add precision classification metric #2293
Conversation
improve dummy classification input. reformat precision and add test with dummy data.
… module to lib.rs make precision a classification metric.
+1 confusion matrix
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #2293      +/-   ##
==========================================
- Coverage   82.93%   82.90%   -0.03%
==========================================
  Files         815      818       +3
  Lines      105344   105603     +259
==========================================
+ Hits        87371    87555     +184
- Misses      17973    18048      +75
```

☔ View full report in Codecov by Sentry.
Ha, I was just discussing this with another user on Discord last week 😄 Thanks for the PR, I'll try to look into it today!
Ok, just took some time to review! Thanks again for contributing 🙏
Overall, the numerical implementation looks correct! 🙂
But I have some comments regarding the implementation.
Also, I would refactor the tests to split them into smaller individual units. I see that you wanted to reuse as much code as possible, which is great, but in this case it also makes the tests complicated.
Each functionality should be isolated in the tests. That makes it easier to 1) understand what is going on (and what is being tested) and 2) pinpoint which test failed in the event of a regression.
With these changes, the `ClassificationType` enum iter can probably be eliminated so that the binary, multiclass and multilabel cases are separated into different test cases.
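To illustrate the suggested split, here is a minimal sketch (the helper and expected values are illustrative, not the crate's actual API): one focused test per classification type instead of iterating over a `ClassificationType` enum.

```rust
#[cfg(test)]
mod tests {
    // Hypothetical helper: precision from raw confusion counts.
    fn precision(true_positives: f64, false_positives: f64) -> f64 {
        true_positives / (true_positives + false_positives)
    }

    #[test]
    fn binary_precision_on_fixed_input() {
        // predictions [1, 0, 1, 1] vs targets [1, 0, 0, 1] -> TP = 2, FP = 1
        assert!((precision(2.0, 1.0) - 2.0 / 3.0).abs() < 1e-9);
    }

    #[test]
    fn multiclass_precision_on_fixed_input() {
        // Each class is checked in isolation so a failure points at one case.
        assert!((precision(3.0, 0.0) - 1.0).abs() < 1e-9);
    }
}
```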
…tion input, clarify descriptions, remove dead code, rename some objects
Hey, sorry it took me some time. I made some changes and continued the discussion on some of your comments. Lemme know what you think :)
No worries 🙂 Thanks for addressing my comments, I'll take a look at the changes today!
Great stuff! Especially with the tests, much easier to understand what is being tested now 😄
I have some minor comments, but once these are addressed I think we should be good to merge!
…approx lib and use tensordata asserts, move aggregate and average functions to ConfusionStats implementation
Hey, sorry, I've added an extra change that I realized was missing. With just thresholds one couldn't use the metric correctly for multiclass, so I added `top_k` as a mutually exclusive option so it makes sense.
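For context, a rough sketch of the two decision rules being made mutually exclusive here (plain Rust, illustrative only, not the PR's actual code): a threshold binarizes each score independently, while `top_k` marks the k highest-scoring classes as predicted.

```rust
// Threshold rule: every class whose score reaches the threshold is predicted.
fn predictions_from_threshold(scores: &[f64], threshold: f64) -> Vec<bool> {
    scores.iter().map(|&s| s >= threshold).collect()
}

// Top-k rule: only the k highest-scoring classes are predicted.
fn predictions_from_top_k(scores: &[f64], k: usize) -> Vec<bool> {
    let mut indices: Vec<usize> = (0..scores.len()).collect();
    // Sort class indices by descending score and keep the first k.
    indices.sort_by(|&a, &b| scores[b].partial_cmp(&scores[a]).unwrap());
    let mut predicted = vec![false; scores.len()];
    for &i in indices.iter().take(k) {
        predicted[i] = true;
    }
    predicted
}
```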
No problem, will try to make time to review this today or tomorrow!
Thanks for addressing the changes!
Just a couple of comments regarding some stuff that has moved or been added in the latest commits, but we're almost there 😄
```rust
/// Sample x Class non-thresholded normalized predictions.
pub predictions: Tensor<B, 2>,
/// Sample x Class one-hot encoded target.
pub targets: Tensor<B, 2, Bool>,
```
With your changes in #2413 we should be able to accept targets that are not one-hot encoded (e.g., `[1, 0, 2, 2, 0]` instead of `[[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1], [1, 0, 0]]`) and have it configurable for the metric.
But this can be done in a follow-up PR 🙂
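As a rough illustration of that conversion (plain Rust, not the tensor-based implementation from #2413), class-index targets would be expanded to one-hot rows like so:

```rust
// Expand class-index targets like [1, 0, 2, 2, 0] into one-hot rows.
fn one_hot(targets: &[usize], num_classes: usize) -> Vec<Vec<bool>> {
    targets
        .iter()
        .map(|&class| {
            let mut row = vec![false; num_classes];
            row[class] = true;
            row
        })
        .collect()
}

fn main() {
    // [1, 0, 2] -> [[false, true, false], [true, false, false], [false, false, true]]
    println!("{:?}", one_hot(&[1, 0, 2], 3));
}
```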
That's true. I've been looking at the existing code in `burn-train/src/learner/classification`, and I think it would be easier to just use the adaptor to convert binary/multiclass/multilabel outputs to the general one-hot encoded metrics, no?
Yeah, my comment was not meant to say that this is where it should be handled specifically 😄
Yeah of course, I just wanted to get your thoughts on what I had thought to do. Now we have the `ClassificationOutput` and `MultiLabelClassificationOutput` adapted for `ClassificationInput`. It works, but I'm still not super happy about it, since the user is then able to use, for example, `BinaryPrecisionMetric` with a `MultiLabelClassificationOutput`. My idea for the future would be to have separate Inputs and Outputs for each of the classification types such that this would not be possible and would complain at compile time. Thoughts?
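A rough sketch of that compile-time separation idea (all names hypothetical, not the PR's code): each classification type gets its own input struct, so pairing a binary metric with a multilabel output becomes a type error.

```rust
// Distinct input types per classification kind (fields elided for brevity).
struct BinaryInput { /* predictions and targets, shape (batch, 1) */ }
struct MultilabelInput { /* predictions and targets, shape (batch, classes) */ }

struct BinaryPrecisionMetric;

impl BinaryPrecisionMetric {
    // Only BinaryInput is accepted; passing a MultilabelInput fails to compile.
    fn update(&mut self, _input: &BinaryInput) { /* ... */ }
}
```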
I understand where you're coming from! But I don't see a straightforward way to do this: we would have to have a different implementation for the binary, multiclass and multilabel precision metrics, because the input is an associated type of the `Metric` trait.
For now I think it's fine.
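A simplified sketch of the constraint being described (not burn's exact trait definition): because the input is an associated type, a given metric type can be tied to only one input type, so covering all three cases means three separate metric types.

```rust
trait Metric {
    type Input;
    fn update(&mut self, input: &Self::Input);
}

struct BinaryInput;
struct PrecisionMetric;

impl Metric for PrecisionMetric {
    // The input type is fixed here; a second `impl Metric for PrecisionMetric`
    // with a different Input is not allowed.
    type Input = BinaryInput;
    fn update(&mut self, _input: &BinaryInput) { /* ... */ }
}
```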
Yeah, I couldn't think of a way around it but maybe it'll come to me while working on other metrics. 🤞
… make dummy data more predictable and add tests for top_k > 1
…case num_class = 1, reformat dummy data, make use of derive(new) for metric init.
Fyi @antimora, not all changes have been addressed, so further review will be pending until then. The metric should handle the different configurations for binary, multiclass and multilabel (previously suggested separate structs …)
…th class_reduction as default and new setter implementation, move NonZerousize boundary to confusion_stats
Hey hey, sorry it took me some time. I was trying to find a way to tie the metric to the type of Output chosen when implementing the TrainStep for the Model, but I wasn't successful, so I just went with what I pushed now. I also found an error in the `one_hot` function that we pushed before. Lemme know what you think :)
I think the changes are good overall!
Just some minor comments.
Also, what about splitting the metric to have `BinaryPrecision`, `MulticlassPrecision` and `MultilabelPrecision` instead of users having to fiddle with the right `top_k` and `threshold` parameters?
…mplementation, deal with classification output with 1 class, make macro average default, expose ClassReduction type and split precision implementations by classification type
# Conflicts:
#	burn-book/src/building-blocks/metric.md
#	crates/burn-train/Cargo.toml
I've separated the Precision metric by classification type. There is a lot of repeated code, but I couldn't find a better way to do it; got any advice? Lemme know what you think 🚀
I made some changes to reduce the repeated code. The `PrecisionMetric` now has a config and can only be created from one of:

- `PrecisionMetric::binary(threshold)`
- `PrecisionMetric::multiclass(top_k)`
- `PrecisionMetric::multilabel(threshold)`
I think this reduces the friction points we previously discussed.
See my comment also for the adaptor.
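A minimal sketch of that constructor pattern (simplified; field and variant names are illustrative, not the PR's actual code): the config is private, so a `PrecisionMetric` can only be built through one of the three associated functions.

```rust
// Private config enum: users never construct it directly.
enum PrecisionConfig {
    Binary { threshold: f64 },
    Multiclass { top_k: usize },
    Multilabel { threshold: f64 },
}

pub struct PrecisionMetric {
    config: PrecisionConfig,
}

impl PrecisionMetric {
    pub fn binary(threshold: f64) -> Self {
        Self { config: PrecisionConfig::Binary { threshold } }
    }
    pub fn multiclass(top_k: usize) -> Self {
        Self { config: PrecisionConfig::Multiclass { top_k } }
    }
    pub fn multilabel(threshold: f64) -> Self {
        Self { config: PrecisionConfig::Multilabel { threshold } }
    }
}
```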
```rust
    PrecisionInput::new(
        self.output.clone(),
        self.targets.clone().unsqueeze_dim(1).bool(),
    )
}
```
Shouldn't this also transform the targets with `.one_hot(...)` but force the num classes to 2 (assuming binary classification)?
Hum, I think that would be covered by the case above, since then `num_classes == 2`. This, I think, is slightly different from binary classification (I'm thinking, for example, of classifying binary: spam email vs not spam email, and multiclass with 2 labels: trees or bushes). Finally, I don't think it would work, as output and targets should have the same shape.
Spam vs not spam (i.e., positive vs negative) is still considered binary classification. I think what you are talking about is just that one uses sigmoid to model the positive output (so there is only one score) vs softmax where you have a score for each. But the targets are still represented as 0 or 1.
I'm not sure what `self.targets.clone().unsqueeze_dim(1).bool()` is supposed to represent 🤔 I could be missing something though
Sure, at the end of the day they are all the same thing. The point I was trying to make is that in binary classification tasks I expect the output of the model to be (batch_size x 1), while in multiclass I would expect (batch_size x 2); thus the targets for the first would be transformed from (batch_size) -> (batch_size, 1) and for the second from (batch_size) -> (batch_size, 2), so they match the shapes of the outputs. Still, this is a preference; if you think it doesn't make sense, we can just assert that the second dim of the output is not less than 2.
Ahhh ok, nvm. I thought this would lead to some issues because the targets are expected to be one-hot encoded, but in reality this is not entirely true for the binary case with a single scalar output and target. The operations performed will still be valid.
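For the record, the shape alignment discussed above amounts to something like this (plain Rust, illustrative only, not the tensor code): binary targets of shape (batch_size) are expanded to (batch_size, 1) to match a single-score-per-sample output.

```rust
// (batch_size) -> (batch_size, 1): each scalar target becomes a one-element row,
// mirroring what unsqueeze_dim(1) does on the target tensor.
fn unsqueeze_targets(targets: Vec<bool>) -> Vec<[bool; 1]> {
    targets.into_iter().map(|t| [t]).collect()
}
```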
Yeah, thank you for the changes, it looks much better! I like the config solution. Just added `pub` to the `PrecisionMetric`; all else seems fine to me. Also, just out of curiosity, why isn't it possible to use the general `ClassificationInput`?
Good catch on the public 😅

> Also, just out of curiosity, why isn't it possible to use the general `ClassificationInput`?

I simply renamed it since it only applies to the precision metric for now.
Should be good to merge, just one comment left 🙂
Checklist

- [x] Confirmed that the `run-checks all` script has been executed.

Related Issues/PRs
#544
Changes
Hello, so I wanted to add precision to the metrics and ended up realizing that implementing a confusion matrix would help not only with precision but also with other classification metrics. So I implemented it as well, along with some code to handle classification averaging (micro, macro) and thresholds.
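For readers following along, the micro/macro averaging mentioned here works roughly like this (a plain-Rust sketch with illustrative names, not the PR's implementation): micro averaging pools the per-class confusion counts before dividing, while macro averaging averages the per-class precision scores.

```rust
struct ClassCounts {
    true_positives: f64,
    false_positives: f64,
}

// Micro: sum the counts over all classes, then compute one precision.
fn micro_precision(counts: &[ClassCounts]) -> f64 {
    let tp: f64 = counts.iter().map(|c| c.true_positives).sum();
    let fp: f64 = counts.iter().map(|c| c.false_positives).sum();
    tp / (tp + fp)
}

// Macro: compute precision per class, then take the unweighted mean.
fn macro_precision(counts: &[ClassCounts]) -> f64 {
    let total: f64 = counts
        .iter()
        .map(|c| c.true_positives / (c.true_positives + c.false_positives))
        .sum();
    total / counts.len() as f64
}
```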
I also sneaked in some code to generate dummy data for testing the metrics. I'm not sure it is decent enough, but I'm open to suggestions.
I do realize that it is a long PR; if necessary, I can split it into different ones.
I'm new to Rust dev and come from Python, so lemme know if I need to adjust some patterns.
Testing
As said above, I added a dummy classification data generator and tested the confusion matrix and the precision metric's methods with it.
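A tiny sketch of what such a generator can look like (illustrative, not the PR's actual helper): fixed scores and targets chosen so the expected metric values can be derived by hand.

```rust
// Deterministic dummy data: with a 0.5 threshold the predictions are
// [1, 1, 0, 1], giving TP = 2, FP = 1, FN = 1, i.e. precision = 2/3.
fn dummy_binary_data() -> (Vec<f64>, Vec<bool>) {
    let predictions = vec![0.9, 0.8, 0.3, 0.6];
    let targets = vec![true, false, true, true];
    (predictions, targets)
}
```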