Philip Graemer — Deep Learning for Biomedical Microscopy

About

I study how modern vision architectures learn, transfer, and fail when moved from natural images into biomedical microscopy, a domain where the standard assumptions of computer vision break down in informative ways.

My current work centres on a controlled benchmark comparing CNNs and Vision Transformers for single-cell classification on the LIVECell dataset, where I found that previously reported CNN superiority is an artefact of unmatched pretraining, and that standard optimisation recipes like layer-wise learning rate decay can unexpectedly hurt under domain shift.

I'm now building on these findings through domain-adaptive self-supervised learning (PhaseDINO), knowledge distillation, and cross-modality robustness testing.

More broadly, I'm interested in the questions that sit beneath the benchmarks: what makes one representation transfer better than another? Why do some training recipes fail outside their native domain? How do inductive biases interact with pretraining regime?

My path to these questions came through analytical philosophy, psychology, and neuroscience before arriving at machine learning, I've always been drawn to how systems form knowledge and why they sometimes get it wrong.

Selected results

From a controlled comparison of eleven architectures on LIVECell 8-class single-cell classification, with matched pretraining conditions across all models:

Architecture	Macro F1	Accuracy	Note
EfficientNet-B5	92.25%	93.53%	Best single model
EVA-02 (ViT-B/14)	90.51%	92.26%	Best teacher for distillation
EVA-02 → EN-B0	91.01%	92.67%	Cross-architecture distillation
EN-B5 → EN-B0	91.01%	92.18%	Same-family distillation
3×EVA-02 council → EN-B0	92.05%	93.42%	ViT council distillation. Best distillation
2xEVA-02 + EN-B5 council → EN-B0	91.41%	92.86%	Mixed council distillation
ViT-B/16	89.93%	91.94%
ViT-S/8	90.61	92.20%	Matches EVA-02 at ¼ parameters, better than the larger ViT-B/16 due to smaller patches/finer tokenisation

Layer-wise learning rate decay (LLRD), widely used for fine-tuning pretrained transformers, consistently hurts performance under ImageNet to LIVECell domain shift, a finding with implications for how standard NLP-derived training recipes are adopted in biomedical vision.

Gallery

Impressions from the bench, the microscope, and the training loop.

Raw data → Segmentation → Classification pipeline

LIVECell pipeline: phase-contrast images → instance segmentation → single-cell classification

Phase-contrast microscopy of MG-63 cells

MG-63 osteosarcoma cells, phase-contrast 20×

Instance segmentation mask, each colour is a single cell

Autofluorescence microscopy of amphipod copulatory claw

Amphipod copulatory claw (gnathopod), capturing autofluorescence, 200 µm scale bar

24-well plate with ALP and Alizarin Red staining

24-well plate: Alcian blue, oil red, ALP, Alizarin Red. Assessing differentiation of hMSCs.

Confusion matrices: Custom CNN vs EVA-02 with class-balanced loading

Custom CNN vs pretrained class-balanced EVA-02

3D volume rendering of a whole amphipod, multi-channel fluorescence

Dense instance segmentation with 574 cells

574 segmented instances in a single field of view

Publications

Pretraining Inverts the CNN–ViT Ranking for Single-Cell Classification: A Controlled Benchmark on LIVECell

Philip Graemer, Giuseppe Di Caprio

Manuscript in preparation · targeting arXiv 2026

arXiv link coming soon

Code

livecell-classification-benchmark

Controlled benchmark comparing CNN and Vision Transformer architectures for single-cell classification on LIVECell. Eleven architectures, pretrained and scratch conditions, multi-seed reproducibility.

PyTorch timm Mixed precision A100

livecell-distillation

Cross-architecture knowledge distillation. Temperature-scaled soft labels, combined hard/soft loss, council distillation. Didactic codebase with extensive inline documentation.

PyTorch timm Knowledge distillation

segmentation-crop-checker

Validation tool for single-cell crops extracted from segmentation masks. Flags multi-cell crops, boundary truncation, size outliers, and mask coverage issues.

Python Quality control

Contact

Email philip.graemer@strath.ac.uk

Lab Di Caprio Lab

LinkedIn Philip Graemer

ORCID Profile