Domain Adaptation

BioMedia has published a dataset of CT scans with labeled vertebrae centers. However, these CT scans are visually very different from the scans of the KSA or the Covid-19 dataset.

Links from Thilo:

  • https://arxiv.org/pdf/1702.05374.pdf
  • https://dl.acm.org/doi/abs/10.1145/3400066
  • https://arxiv.org/abs/1802.03601
  • https://ieeexplore.ieee.org/document/8920338
  • https://arxiv.org/pdf/1911.02685.pdf
  • https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a

Unsupervised Approaches

  • Domain adversarial discriminative approaches: learn to produce data with a statistical distribution similar to that of the training samples via adversarial learning schemes. Besides the segmentation network, an additional discriminator (as in GANs) is used as a source-target domain classifier to reach domain invariance and to reduce the bias towards the source domain.
    • Feature Adversarial Adaptation: perform the adversarial adaptation on the extracted features
    • Output Adversarial Adaptation: perform the adversarial adaptation in the low-dimensional output space
    • Personal impression: Could work with the features and is fairly simple to implement. For example, an autoencoder could be pre-trained while a discriminator tries to distinguish the domains of its embeddings; the segmentation network could then be trained on these embeddings, which are sufficient to reconstruct the image.
    • Drawback?: "Lack of semantic awareness from the domain critic network: Even when the critic manages to grasp a clear expression of marginal distributions, thus effectively leading to a global statistical alignment, category-level joint-distributions necessarily remain unknown to the domain discriminator, as it is not provided with semantic labels when discriminating feature representations. A side effect to this semantic-unaware adaptation is that features can be placed close to class boundaries, increasing the chances of incorrect classification. Furthermore, target representations may be incorrectly transferred to a semantic category different from the actual one in the domain invariant adapted space (negative transfer), as decision boundaries are ignored in the adaptation process."
    • Approaches:
      • Autoencoder, use adversarial classifier on embeddings -> Embeddings of different domains should be in the same feature space (not distinguishable) and contain enough information to recreate the original image
      • Li, Y.; Yuan, L.; Vasconcelos, N. Bidirectional Learning for Domain Adaptation of Semantic Segmentation. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
      • Huang, H.; Huang, Q.; Krähenbühl, P. Domain Transfer Through Deep Activation Matching. Proc. of European Conference on Computer Vision (ECCV), 2018.
      • Hoffman, J.; Tzeng, E.; Park, T.; Zhu, J.Y.; Isola, P.; Saenko, K.; Efros, A.; Darrell, T. CyCADA: Cycle-Consistent Adversarial Domain Adaptation. Proc. of the International Conference on Machine Learning (ICML), 2018.
      • Chen, Y.C.; Lin, Y.Y.; Yang, M.H.; Huang, J.B. CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
      • Luo, Y.; Liu, P.; Guan, T.; Yu, J.; Yang, Y. Significance-Aware Information Bottleneck for Domain Adaptive Semantic Segmentation. Proc. of International Conference on Computer Vision (ICCV), 2019.
      • Du, L.; Tan, J.; Yang, H.; Feng, J.; Xue, X.; Zheng, Q.; Ye, X.; Zhang, X. SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation. Proc. of International Conference on Computer Vision (ICCV), 2019.
      • Zhu, X.; Zhou, H.; Yang, C.; Shi, J.; Lin, D. Penalizing top performers: Conservative loss for semantic segmentation adaptation. Proc. of European Conference on Computer Vision (ECCV), 2018, pp. 568–583.
      • Murez, Z.; Kolouri, S.; Kriegman, D.J.; Ramamoorthi, R.; Kim, K. Image to Image Translation for Domain Adaptation. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      • Sankaranarayanan, S.; Balaji, Y.; Jain, A.; Nam Lim, S.; Chellappa, R. Learning from synthetic data: Addressing domain shift for semantic segmentation. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3752–3761.
  • Generative-based approaches: use generative networks to translate data between domains in order to produce a target-like training set from source data (i.e. image-to-image translation). The idea is to transfer visual attributes from the target domain to the source one, while preserving source semantic information.
  • Classifier discrepancy approaches: resort to multiple dense classifiers on top of a single encoder to capture less adapted target representations and, in turn, encourage an improved alignment of cross-domain features far from decision boundaries via an adversarial-like strategy.
    • "Saito et al. propose an Adversarial Dropout Regularization (ADR) approach for UDA to provide cross-domain feature alignment away from decision boundaries. To do so, they completely revisit the original domain adversarial scheme, by providing the task-specific dense classifier (i.e., the encoder) with a discriminative role. In particular, by means of dropout, the classifier is perturbed in order to get two distinct predictions over the same encoder output. Since the prediction variability is subject to an inverse relationship with the proximity to decision boundaries, the feature extractor is forced to produce representations far from those boundaries by minimizing the discrepancy of the two output probability maps. At the same time, the classifier has to maximize its output variation, in order to boost its capability to detect less-adapted features. In this redesigned adversarial scheme, the dense classifier is trained to be sensitive to semantic variations of target features, as to capture all the information stored in its neurons, which in turn are encouraged to be as diverse as possible from each other by the adversarial dropout maximization. On the other hand, the encoder is focused on providing categorical certainty to extracted target features, since removing task-unrelated cues weakens the possibility to achieve dissimilar predictions from the same latent representations."
    • Interesting approaches
      • Saito, K.; Watanabe, K.; Ushiku, Y.; Harada, T. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3723–3732.
      • Lee, S.; Kim, D.; Kim, N.; Jeong, S.G. Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation. Proc. of International Conference on Computer Vision (ICCV), 2019, pp. 91–100.
  • Self-training approaches: produce pseudo-labels based on the current model estimate (typically using a confidence estimation scheme to select the most reliable predictions) to automatically guide the learning process
    • Personal impression: Probably works for classes such as cats and dogs that are well separated from the background, but could cause problems with vertebrae, which look very similar to each other, and with the imprecise ground truth (vertebra size is estimated). Keyword: "catastrophic error propagation"
  • Entropy minimization methods: aim at minimizing the entropy of target output probability maps to mimic the over-confident behavior of source predictions, thus promoting well-clustered target feature representations
    • The principle behind minimizing target entropy for domain adaptation follows the observation that source predictions are likely to show more confidence, which translates into low-entropy probability outputs. On the contrary, the segmentation network is likely to display more uncertain behavior on target-distributed samples: target prediction entropy maps tend to be quite unstable overall, with the noise pattern typically not confined to the semantic boundaries. Thus, forcing the segmentation network to mimic the over-confident source behavior on the target domain as well should effectively reduce the accuracy gap between domains.
  • Curriculum learning approaches: tackle one or more easy tasks first, in order to infer some necessary properties about the target domain (e.g., global label distributions) and then train the segmentation network such that the predictions in the target domain follow those inferred properties.
    • Personal impression: Probably works badly, especially since the individual vertebrae are statistically too similar
  • Multi-tasking methods: solve multiple tasks simultaneously to improve the extraction of invariant feature representations.
    • No suitable related tasks identified so far
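
The feature-level adversarial idea above can be boiled down to a domain discriminator plus a gradient reversal step: the discriminator descends its own gradient, while the feature extractor descends the *negated* gradient so that the learned features become indistinguishable across domains. A minimal NumPy sketch, assuming a single logistic-regression discriminator on flat feature vectors (all names here are hypothetical, not from any specific paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def domain_disc_grads(feat, w, domain_label):
    """Binary cross-entropy gradients of a logistic domain discriminator.

    feat: feature vector produced by the encoder.
    w: discriminator weights.
    domain_label: 1 = source, 0 = target.
    Returns (grad wrt discriminator weights, grad wrt the feature).
    """
    p = sigmoid(feat @ w)   # predicted probability of "source"
    err = p - domain_label  # dBCE/dlogit
    grad_w = err * feat     # used to *train* the discriminator
    grad_feat = err * w     # gradient flowing back into the encoder
    return grad_w, grad_feat

def reversed_feature_grad(grad_feat, lam=1.0):
    """Gradient reversal: the encoder descends the negated, scaled
    discriminator gradient, pushing features towards domain invariance."""
    return -lam * grad_feat
```

In a full training loop the segmentation loss would be added on source samples, so the encoder balances task accuracy against domain confusion via the scaling factor `lam`.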
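
The entropy minimization objective described above is just the mean Shannon entropy of the per-pixel softmax output, minimized on (unlabeled) target images. A minimal NumPy sketch, assuming the network output is a `(C, H, W)` probability map:

```python
import numpy as np

def entropy_map(probs, eps=1e-8):
    """Per-pixel Shannon entropy of a softmax output.

    probs: array of shape (C, H, W), class probabilities per pixel.
    """
    return -np.sum(probs * np.log(probs + eps), axis=0)

def entropy_loss(probs):
    """Mean entropy over all pixels; minimized on target images so the
    network mimics the over-confident (low-entropy) source behavior."""
    return entropy_map(probs).mean()
```

Confident (peaked) predictions yield entropy near 0, while a uniform output over C classes yields the maximum value log(C), which is what the loss drives down on the target domain.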

For more details, see https://arxiv.org/pdf/2005.10876.pdf
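
The classifier-discrepancy objective (as in the Maximum Classifier Discrepancy line of work listed above) reduces to a simple distance between the softmax maps of two classifiers that share one encoder: the classifiers maximize it on target data to expose less-adapted features, while the encoder minimizes it. A minimal NumPy sketch of that discrepancy term (the L1 form is one common choice, not the only one):

```python
import numpy as np

def discrepancy(p1, p2):
    """Mean absolute difference between two classifiers' softmax outputs.

    p1, p2: probability maps of shape (C, H, W) from two dense classifiers
    on top of a shared feature extractor. In the adversarial scheme, the
    classifiers maximize this on target samples; the encoder minimizes it,
    pushing target features away from the decision boundaries.
    """
    return np.mean(np.abs(p1 - p2))
```

The discrepancy is zero exactly when both classifiers agree everywhere, which is the fixed point the encoder is driven towards.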

Supervised Approaches

Work in progress…