SegCLR
Segmentation-guided Contrastive Learning of Representations (SegCLR) is a method for learning rich embedding representations of cellular morphology and ultrastructure. See the blog post and the updated preprint, which fully describes the method, for more details. Embeddings for two large-scale cortical datasets, one from human temporal cortex (explore in Neuroglancer) and one from mouse visual cortex (explore), are publicly released on Google Cloud Storage.
The open-source release of SegCLR code in the connectomics repo is complete, aside from minor updates and bugfixes. The API may still undergo some changes.
Colab notebooks demonstrating how to use the code are available here:
- Run a pretrained SegCLR embedding model from TensorFlow 2 to predict embeddings for an arbitrary data cutout using TensorStore. This notebook shows how to instantiate a SegCLR model, load weights from a pretrained model, and run inference (a minimal inference sketch follows this list). Pretrained SegCLR models can also be loaded from TensorFlow 1.
- Train a SegCLR embedding model. This notebook shows how to read positive pair tf.train.Examples from a TFRecord table, load the corresponding EM and segmentation data blocks, preprocess and batch them, and use them to train a SegCLR embedding model (a TFRecord-reading sketch follows this list). By default the demo notebook connects to a Google Colab instance with an NVIDIA T4 GPU, but large-scale training or fine-tuning should use a larger GPU cluster.
- Access precomputed SegCLR embeddings from public CSV ZIP releases for h01 (human cortex) and MICrONS (mouse cortex). This notebook shows how to read the data remotely and parse it, and demonstrates dimensionality reduction to inspect embedding clusters, similar to paper figure 4 (a loading sketch follows this list).
- Run a pretrained SegCLR subcompartment classifier. This notebook shows how to load a pretrained subcompartment classifier model and run it on embeddings for a test cell (as in paper figure 2).
- Train a cell type classifier with out-of-distribution (OOD) detection. This notebook shows how to load ground truth cell type labels for the mouse cortex dataset and train a lightweight cell type classifier on top of SegCLR embeddings from scratch (as in paper figure 3). In this demo, the classifier is trained on glial cell types, while the neuron types are only used for evaluation, so the classifier must learn to reject the OOD neuron types. We do this by training a classifier with calibrated uncertainty estimates via SNGP (SNGP paper, as in paper figure 5); a simplified classifier sketch follows this list.
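
The sketch below illustrates the general shape of the embedding-inference workflow from the first notebook: open an EM volume with TensorStore, read a cutout, and run it through a loaded TF2 model. The volume path, cutout coordinates, normalization, and model path are placeholders rather than the notebook's actual values; the SegCLR repo provides its own model-loading utilities.

```python
# Sketch only: generic TF2 inference over a TensorStore cutout.
import numpy as np
import tensorflow as tf
import tensorstore as ts

# Open an EM volume in neuroglancer_precomputed format (placeholder path).
em = ts.open({
    'driver': 'neuroglancer_precomputed',
    'kvstore': 'gs://example-bucket/em-volume/',
}).result()

# Read a small cutout (placeholder coordinates); .read().result() yields a NumPy array.
cutout = em[2000:2129, 3000:3129, 500:533].read().result()
cutout = cutout.astype(np.float32) / 255.0  # illustrative normalization

# Load a pretrained embedding model saved as a TF2 SavedModel (placeholder path).
model = tf.keras.models.load_model('/path/to/segclr_savedmodel')

# Add a batch dimension and compute the embedding for this cutout.
embedding = model(cutout[np.newaxis, ...])
print(embedding.shape)
```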
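For the training notebook, a minimal sketch of reading positive-pair tf.train.Examples from TFRecord files with tf.data might look like the following. The file pattern and feature keys are assumptions for illustration; the real feature spec is defined by the SegCLR training code.

```python
# Sketch only: stream positive-pair tf.train.Examples from TFRecords with tf.data.
import tensorflow as tf

files = tf.data.Dataset.list_files('gs://example-bucket/pairs/*.tfrecord')
raw = tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)

# Hypothetical feature spec: each Example stores two nearby locations on the
# same segment (the positive pair) plus the segment ID.
feature_spec = {
    'center_a': tf.io.FixedLenFeature([3], tf.int64),
    'center_b': tf.io.FixedLenFeature([3], tf.int64),
    'segment_id': tf.io.FixedLenFeature([1], tf.int64),
}

def parse(serialized):
  return tf.io.parse_single_example(serialized, feature_spec)

pairs = (raw
         .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
         .shuffle(10_000)
         .batch(64)
         .prefetch(tf.data.AUTOTUNE))

# Each batch provides the coordinates needed to fetch and preprocess the
# corresponding EM/segmentation blocks before the contrastive training step.
for batch in pairs.take(1):
  print(batch['center_a'].shape, batch['center_b'].shape)
```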
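For the precomputed-embeddings notebook, a rough sketch of loading one CSV ZIP shard and projecting the embeddings for inspection could look like this. The file name, column layout, and embedding dimensionality are assumptions; the demo notebook documents the actual release format and GCS paths.

```python
# Sketch only: parse one CSV ZIP shard of precomputed embeddings and project to 2D.
import io
import zipfile

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# One downloaded shard of the release (placeholder file name).
with open('embeddings_shard_0000.zip', 'rb') as f:
  archive = zipfile.ZipFile(io.BytesIO(f.read()))

# Assumed layout: headerless CSV rows of segment ID, x, y, z, then embedding values.
frames = [pd.read_csv(archive.open(name), header=None) for name in archive.namelist()]
table = pd.concat(frames, ignore_index=True)

segment_ids = table.iloc[:, 0].to_numpy()
embeddings = table.iloc[:, 4:].to_numpy(dtype=np.float32)

# 2D projection to inspect cluster structure (the paper figures use UMAP).
projection = PCA(n_components=2).fit_transform(embeddings)
print(segment_ids.shape, projection.shape)
```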
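For the cell type classifier notebook, the sketch below trains a small classifier on embeddings and rejects low-confidence predictions as OOD by thresholding the maximum softmax probability. This is a deliberately simplified stand-in for the notebook's SNGP-based calibrated uncertainty; the data arrays, dimensions, and threshold are placeholders.

```python
# Sketch only: lightweight classifier on embeddings with softmax-confidence OOD rejection.
import numpy as np
import tensorflow as tf

EMB_DIM = 64        # assumed embedding dimensionality
NUM_CLASSES = 6     # placeholder for the in-distribution (glial) classes

# Placeholder training data standing in for labeled glial embeddings.
train_x = np.random.randn(1000, EMB_DIM).astype(np.float32)
train_y = np.random.randint(0, NUM_CLASSES, size=1000)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(EMB_DIM,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_x, train_y, epochs=5, batch_size=32, verbose=0)

# At evaluation time, low-confidence predictions (e.g. on neuron embeddings
# never seen during training) are rejected as out-of-distribution.
test_x = np.random.randn(10, EMB_DIM).astype(np.float32)
probs = model.predict(test_x, verbose=0)
confidence = probs.max(axis=1)
predictions = np.where(confidence > 0.8, probs.argmax(axis=1), -1)  # -1 marks OOD
print(predictions)
```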