In this assignment I will use two variants of supervised learning to try to determine the species of penguins from a set of body measurements. The dataset used is "Palmer penguins" (file palmer_penguins.csv).
The figure below shows a grid of scatterplots for the four numerical measurements in the dataset: bill_length_mm, bill_depth_mm, flipper_length_mm, and body_mass_g.
The plots show all pairwise combinations of these four measurements. The diagonal shows a histogram of each individual measurement. Notice how some combinations give point clouds where the three classes (penguin species) overlap heavily, while other combinations separate the classes much better.
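
As a rough sketch of how such a figure is organised, the snippet below builds the panel layout for the four measurements: a histogram on the diagonal and a scatterplot everywhere else. The helper names `pairplot_layout` and `draw_pairplot` are hypothetical, and the class column name `species` and the use of pandas and seaborn for plotting are assumptions, not necessarily how the figure in this assignment was produced.

```python
# The four numerical columns named in the text.
NUMERIC_COLUMNS = [
    "bill_length_mm",
    "bill_depth_mm",
    "flipper_length_mm",
    "body_mass_g",
]


def pairplot_layout(columns):
    """Describe each panel of a pair plot as a tuple.

    Diagonal panels are ("hist", column); off-diagonal panels are
    ("scatter", x_column, y_column), with the row giving the y-axis
    and the grid column giving the x-axis.
    """
    grid = []
    for y in columns:
        row = []
        for x in columns:
            if x == y:
                row.append(("hist", x))
            else:
                row.append(("scatter", x, y))
        grid.append(row)
    return grid


def draw_pairplot(csv_path="palmer_penguins.csv", class_column="species"):
    """Draw the pair plot (assumes pandas and seaborn are installed).

    The column name "species" is an assumption about the CSV file.
    """
    import pandas as pd
    import seaborn as sns

    # Drop rows with missing values in the columns we plot.
    df = pd.read_csv(csv_path).dropna(subset=NUMERIC_COLUMNS + [class_column])
    # hue colours the points by class; diag_kind="hist" puts histograms
    # on the diagonal.
    return sns.pairplot(df, vars=NUMERIC_COLUMNS, hue=class_column, diag_kind="hist")


layout = pairplot_layout(NUMERIC_COLUMNS)
```

With four measurements the grid has 16 panels: 4 histograms and 12 scatterplots. Each pair of measurements appears twice with the axes swapped, so there are 6 distinct pairs to compare when looking for good class separation.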