kallge_rcic

Recursion Cellular Image Classification

Xiaochen(Raynard) Zhang & Nelson Hong Ming Tseng

rcic image

The code and notebooks for Recursion Cellular Image Classification. It's my first Kaggle medal (88/866, bronze).

The task is to classify siRNA images, each with 6 channels stored as 6 separate PNG files.

  • We started our training from this starter kernel. It comes with EfficientNet, ResNet, DenseNet, and a data preprocessing pipeline ready to use.
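As a rough illustration of the six-channel layout described above, here is a minimal sketch that lists the six per-channel files for one imaging site and merges six grayscale planes into a single H x W x 6 structure. The directory layout, the file-name pattern, and the helper names `channel_paths` and `stack_channels` are assumptions made for illustration; they are not taken from the starter kernel. In practice the planes would be loaded with an image library and stacked as arrays.

```python
from pathlib import PurePosixPath

N_CHANNELS = 6  # from the task description: 6 channels, one PNG each


def channel_paths(root, experiment, plate, well, site):
    """Return the six per-channel PNG paths for one imaging site.

    The folder layout and file-name pattern are assumptions for
    illustration; adjust them to match the actual dataset.
    """
    folder = PurePosixPath(root) / experiment / f"Plate{plate}"
    return [
        str(folder / f"{well}_s{site}_w{ch}.png")
        for ch in range(1, N_CHANNELS + 1)
    ]


def stack_channels(planes):
    """Merge six H x W grayscale planes into one H x W x 6 nested list."""
    if len(planes) != N_CHANNELS:
        raise ValueError(f"expected {N_CHANNELS} planes, got {len(planes)}")
    height, width = len(planes[0]), len(planes[0][0])
    return [
        [[plane[y][x] for plane in planes] for x in range(width)]
        for y in range(height)
    ]
```

Calling `channel_paths` once per site gives the inputs for both sites of a well, which matters because each site is a separate set of six files.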

Why should we read and study the data carefully?

These facts pretty much define how reckless we were this time. It's a miracle we even got the medal...

  • We discovered that there are 2 sites 3 days after we joined. Before that we only used half the data.
  • We discovered the plate leak almost too late. Even though we enforced the 227-hot encoding at the end, I only learned that "within each plate, there is only 1 siRNA of its class in the plate" after the competition was over.
  • We only became aware of the "control" images after the competition was over.

Ensemble

Plate Leak

  • We found the plate leak 2 days before the competition closed. I kicked myself for the sloppiness of not browsing the "discussion" forum often enough.
  • This notebook explores the plate leak and allocates/saves the 4 groups of siRNA.
  • In the final ensemble notebook we apply the leak info to the model prediction output. This leak improved our Public LB score by at least 0.1.
  • On the eve of the closing (UTC+8:00), I experimented with learning from conv activations, but there was too little time and nothing came of it.
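One way the leak could be applied to the model prediction output is sketched below. This is a minimal illustration under stated assumptions, not the code from the ensemble notebook: it assumes the siRNA groups are already known as sets of class indices, picks for each plate the group that receives the most total predicted probability, and then restricts each sample's argmax to that group. The function name `apply_plate_leak` and its arguments are hypothetical.

```python
from collections import defaultdict


def apply_plate_leak(probs, plates, groups):
    """Restrict each sample's prediction to the siRNA group of its plate.

    probs  : list of per-sample class-probability lists
    plates : list of plate ids, one per sample
    groups : list of sets of class indices (assumed known in advance)
    Returns a list of predicted class indices, one per sample.
    """
    # Collect the sample indices that belong to each plate.
    by_plate = defaultdict(list)
    for i, plate in enumerate(plates):
        by_plate[plate].append(i)

    preds = [None] * len(probs)
    for idxs in by_plate.values():
        # Choose the group with the most probability mass on this plate.
        def mass(group):
            return sum(probs[i][c] for i in idxs for c in group)

        best = max(groups, key=mass)
        # Take the argmax over the classes of the chosen group only.
        for i in idxs:
            preds[i] = max(sorted(best), key=lambda c: probs[i][c])
    return preds


# Toy example: 4 classes split into 2 groups, 2 plates.
toy_probs = [
    [0.10, 0.50, 0.40, 0.00],  # plate A
    [0.30, 0.20, 0.45, 0.05],  # plate A, unrestricted argmax is class 2
    [0.40, 0.00, 0.35, 0.25],  # plate B, unrestricted argmax is class 0
]
toy_preds = apply_plate_leak(toy_probs, ["A", "A", "B"], [{0, 1}, {2, 3}])
```

This sketch does not enforce the stronger constraint that each siRNA appears only once per plate; that would require a one-to-one assignment step on top of the masking.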