From 1d06d87e95d5bf67e3e1c4a255122863abc72a81 Mon Sep 17 00:00:00 2001 From: Lupin1998 <1070535169@qq.com> Date: Thu, 27 Apr 2023 22:44:44 +0000 Subject: [PATCH] fix docs and links --- README.md | 8 +++--- .../classification/imagenet/automix/README.md | 3 +-- .../imagenet/context_cluster/README.md | 15 ++++++----- .../imagenet/convnext/README.md | 2 +- .../classification/imagenet/deit/README.md | 2 +- .../classification/imagenet/lit_v2/README.md | 9 +++---- .../classification/imagenet/samix/README.md | 2 +- .../imagenet/swin_transformer/README.md | 4 +-- .../imagenet/uniformer/README.md | 14 +++++----- docs/en/awesome_selfsup/MIM.md | 3 +++ docs/en/model_zoos/Mixup_sup.md | 2 +- docs/en/model_zoos/Model_Zoo_sup.md | 4 +-- docs/en/tutorials/ssl_detection.md | 26 +++++++++---------- openmixup/models/utils/grad_weight.py | 3 +-- 14 files changed, 50 insertions(+), 47 deletions(-) diff --git a/README.md b/README.md index 3499a2b1..ff1981e1 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ The main branch works with **PyTorch 1.8** (required by some self-supervised met Major Features - **Modular Design.** OpenMixup follows a similar code architecture of OpenMMLab projects, which decompose the framework into various components, and users can easily build a customized model by combining different modules. OpenMixup is also transplantable to OpenMMLab projects (e.g., [MMSelfSup](https://github.com/open-mmlab/mmselfsup)). + OpenMixup follows a similar code architecture of OpenMMLab projects, which decomposes the framework into various components, and users can easily build a customized model by combining different modules. OpenMixup is also transplantable to OpenMMLab projects (e.g., [MMPreTrain](https://github.com/open-mmlab/mmpretrain)). - **All in One.** OpenMixup provides popular backbones, mixup methods, semi-supervised, and self-supervised algorithms. Users can perform image classification (CNN & Transformer) and self-supervised pre-training (contrastive and autoregressive) under the same framework. @@ -246,8 +246,8 @@ This project is released under the [Apache 2.0 license](LICENSE). See `LICENSE` ## Acknowledgement -- OpenMixup is an open-source project for mixup methods created by researchers in **CAIRI AI Lab**. We encourage researchers interested in visual representation learning and mixup methods to contribute to OpenMixup! -- This repo borrows the architecture design and part of the code from [MMSelfSup](https://github.com/open-mmlab/mmselfsup) and [MMClassification](https://github.com/open-mmlab/mmclassification). +- OpenMixup is an open-source project for mixup methods and visual representation learning created by researchers in **CAIRI AI Lab**. We encourage researchers interested in backbone architectures, mixup augmentations, and self-supervised learning methods to contribute to OpenMixup! +- This project borrows the architecture design and part of the code from [MMPreTrain](https://github.com/open-mmlab/mmpretrain) and the official implementations of supported algorithms.

(back to top)

@@ -269,7 +269,7 @@ If you find this project useful in your research, please consider star `OpenMixu ## Contributors and Contact -For help, new features, or reporting bugs associated with OpenMixup, please open a [GitHub issue](https://github.com/Westlake-AI/openmixup/issues) and [pull request](https://github.com/Westlake-AI/openmixup/pulls) with the tag "help wanted" or "enhancement". For now, the direct contributors include: Siyuan Li ([@Lupin1998](https://github.com/Lupin1998)), Zedong Wang ([@Jacky1128](https://github.com/Jacky1128)), and Zicheng Liu ([@pone7](https://github.com/pone7)). We thank all public contributors and contributors from MMSelfSup and MMClassification! +For help, new features, or reporting bugs associated with OpenMixup, please open a [GitHub issue](https://github.com/Westlake-AI/openmixup/issues) and [pull request](https://github.com/Westlake-AI/openmixup/pulls) with the tag "help wanted" or "enhancement". For now, the direct contributors include: Siyuan Li ([@Lupin1998](https://github.com/Lupin1998)), Zedong Wang ([@Jacky1128](https://github.com/Jacky1128)), and Zicheng Liu ([@pone7](https://github.com/pone7)). We thank all public contributors and contributors from MMPreTrain (MMSelfSup and MMClassification)! This repo is currently maintained by: diff --git a/configs/classification/imagenet/automix/README.md b/configs/classification/imagenet/automix/README.md index 092af744..f6544ed9 100644 --- a/configs/classification/imagenet/automix/README.md +++ b/configs/classification/imagenet/automix/README.md @@ -30,8 +30,7 @@ Data mixing augmentation have proved to be effective in improving the generaliza | Swin-T | AutoMix | 224x224 | 28.29 | 300 | 81.80 | [config](./swin/swin_t_l2_a2_near_lam_cat_switch0_8_8x128_ep300.py) | model / log | | ConvNeXt-T | AutoMix | 224x224 | 28.59 | 300 | 82.28 | [config](./convnext/convnext_t_l2_a2_near_lam_cat_switch0_8_8x128_accu4_ep300.py) | model / log | - -We will update configs and models for AutoMix soon. Please refer to [Model Zoo](https://github.com/Westlake-AI/openmixup/tree/main/docs/en/model_zoos/Model_Zoo_sup.md) for image classification results. +We will update configs and models (ResNets, ViTs, Swin-T, and ConvNeXt-T) for AutoMix soon (please contact us if you want the models right now). Please refer to [Model Zoo](https://github.com/Westlake-AI/openmixup/tree/main/docs/en/model_zoos/Model_Zoo_sup.md) for image classification results. 
## Citation diff --git a/configs/classification/imagenet/context_cluster/README.md b/configs/classification/imagenet/context_cluster/README.md index 125fb7e3..2259aad4 100644 --- a/configs/classification/imagenet/context_cluster/README.md +++ b/configs/classification/imagenet/context_cluster/README.md @@ -18,12 +18,15 @@ This page is based on the [official repo](https://github.com/ma-xu/Context-Clust | Model | Params(M) | Flops(G) | Top-1 (%) | Throughputs | Config | Download | | :---: | :-------: | :------: | :-------: | :---------: | :----: | :------: | -| ContextCluster-tiny\* | 5.3 | 1.0 | 71.8 | 518.4 | [config](coc_tiny_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1Q_6W3xKMX63aQOBaqiwX5y1fCj4hVOIA?usp=sharing) | -| ContextCluster-tiny_plain\* | 5.3 | 1.0 | 72.9 | - | [config](coc_tiny_plain_8xb256_ep300.py) | [model](https://web.northeastern.edu/smilelab/xuma/ContextCluster/checkpoints/coc_tiny_plain/coc_tiny_plain.pth.tar) | -| ContextCluster-small\* | 5.3 | 1.0 | 71.8 | 518.4 | [config](coc_small_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1WSmnbSgy1I1HOTTTAQgOKEzXSvd3Kmh-?usp=sharing) | -| ContextCluster-medium\* | 5.3 | 1.0 | 71.8 | 518.4 | [config](coc_medium_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1sPxnEHb2AHDD9bCQh6MA0I_-7EBrvlT5?usp=sharing) | - -We follow the original training setting provided by the [official repo](https://github.com/ma-xu/Context-Cluster). *Models with * are converted from the [official repo](https://github.com/ma-xu/Context-Cluster).* +| ContextCluster-tiny\* | 5.6 | 1.10 | 71.8 | 518.4 | [config](coc_tiny_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1Q_6W3xKMX63aQOBaqiwX5y1fCj4hVOIA?usp=sharing) | +| ContextCluster-tiny_plain\* (w/o region partition) | 5.6 | 1.10 | 72.9 | - | [config](coc_tiny_plain_8xb256_ep300.py) | [model](https://web.northeastern.edu/smilelab/xuma/ContextCluster/checkpoints/coc_tiny_plain/coc_tiny_plain.pth.tar) | +| ContextCluster-small\* | 14.7 | 2.78 | 77.5 | 513.0 | [config](coc_small_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1WSmnbSgy1I1HOTTTAQgOKEzXSvd3Kmh-?usp=sharing) | +| ContextCluster-medium\* | 29.3 | 5.90 | 81.0 | 325.2 | [config](coc_medium_8xb256_ep300.py) | [model](https://drive.google.com/drive/folders/1sPxnEHb2AHDD9bCQh6MA0I_-7EBrvlT5?usp=sharing) | +| ContextCluster-tiny | 5.6 | 1.10 | 72.7 | 518.4 | [config](coc_tiny_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/coc_tiny_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/) | +| ContextCluster-tiny_plain (w/o region partition) | 5.6 | 1.10 | 73.2 | - | [config](coc_tiny_plain_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/coc_tiny_plain_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/coc_tiny_plain_8xb256_ep300.log.json) | +| ContextCluster-small | 14.7 | 2.78 | 77.7 | 513.0 | [config](coc_small_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/coc_small_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/coc_small_8xb256_ep300.log.json) | + +We follow the original training setting provided by the [official repo](https://github.com/ma-xu/Context-Cluster) to reproduce better performance of ContextCluster variants. 
*Models with * are converted from the [official repo](https://github.com/ma-xu/Context-Cluster).* ## Citation diff --git a/configs/classification/imagenet/convnext/README.md b/configs/classification/imagenet/convnext/README.md index fd310f8f..1c2391b6 100644 --- a/configs/classification/imagenet/convnext/README.md +++ b/configs/classification/imagenet/convnext/README.md @@ -18,7 +18,7 @@ This page is based on documents in [MMClassification](https://github.com/open-mm | Model | Pretrain | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download | | :-----------: | :----------: | :-------: | :------: | :-------: | :-------: | :-------------------------------------------------------------------: | :---------------------------------------------------------------------: | -| ConvNeXt-T | From scratch | 28.59 | 4.46 | 82.16 | 95.91 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/convnext/convnext_tiny_8xb256_accu2_fp16_ep300.py) | model | log | +| ConvNeXt-T | From scratch | 28.59 | 4.46 | 82.16 | 95.91 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/convnext/convnext_tiny_8xb256_accu2_fp16_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/convnext_small_8xb128_accu4_fp16_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/convnext_small_8xb128_accu4_fp16_ep300.log.json) | | ConvNeXt-T\* | From scratch | 28.59 | 4.46 | 82.05 | 95.86 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/convnext/convnext_tiny_8xb256_accu2_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-tiny_3rdparty_32xb128_in1k_20220124-18abde00.pth) | | ConvNeXt-S\* | From scratch | 50.22 | 8.69 | 83.13 | 96.44 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/convnext/convnext_small_8xb128_accu4_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-small_3rdparty_32xb128_in1k_20220124-d39b5192.pth) | | ConvNeXt-B\* | From scratch | 88.59 | 15.36 | 83.85 | 96.74 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/convnext/convnext_base_8xb128_accu4_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/convnext/convnext-base_3rdparty_32xb128_in1k_20220124-d0915162.pth) | diff --git a/configs/classification/imagenet/deit/README.md b/configs/classification/imagenet/deit/README.md index 3a0c364b..cd84c379 100644 --- a/configs/classification/imagenet/deit/README.md +++ b/configs/classification/imagenet/deit/README.md @@ -20,7 +20,7 @@ This page is based on documents in [MMClassification](https://github.com/open-mm | :-------: | :----------: | :-------: | :------: | :-------: | :-------: | :-------------------------------------------------------------------: | :---------------------------------------------------------------------: | | DeiT-T | From scratch | 5.72 | 1.08 | 73.56 | 91.16 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_tiny_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_pt-4xb256_in1k_20220218-13b382a0.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny_pt-4xb256_in1k_20220218-13b382a0.log.json) | | DeiT-T\* | From scratch | 5.72 | 1.08 | 72.20 | 91.10 | 
[config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_tiny_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-tiny-distilled_3rdparty_pt-4xb256_in1k_20211216-c429839a.pth) | -| DeiT-S | From scratch | 22.05 | 4.24 | 79.93 | 95.14 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_small_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_pt-4xb256_in1k_20220218-9425b9bb.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/deit/deit-small_pt-4xb256_in1k_20220218-9425b9bb.log.json) | +| DeiT-S | From scratch | 22.05 | 4.24 | 79.93 | 95.14 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_small_8xb128_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/deit_small_8xb128_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/deit_small_8xb128_ep300.log.json) | | DeiT-S\* | From scratch | 22.05 | 4.24 | 79.90 | 95.10 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_small_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-small-distilled_3rdparty_pt-4xb256_in1k_20211216-4de1d725.pth) | | DeiT-B | From scratch | 86.57 | 16.86 | 81.82 | 95.57 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_base_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_pt-16xb64_in1k_20220216-db63c16c.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_pt-16xb64_in1k_20220216-db63c16c.log.json) | | DeiT-B\* | From scratch | 86.57 | 16.86 | 81.80 | 95.60 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/deit/deit_base_8xb128_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/deit/deit-base_3rdparty_pt-16xb64_in1k_20211124-6f40c188.pth) | diff --git a/configs/classification/imagenet/lit_v2/README.md b/configs/classification/imagenet/lit_v2/README.md index 5f4920a2..8ba2f82c 100644 --- a/configs/classification/imagenet/lit_v2/README.md +++ b/configs/classification/imagenet/lit_v2/README.md @@ -16,11 +16,10 @@ Vision Transformers (ViTs) have triggered the most recent and significant breakt | Model | Pretrain | resolution | Params(M) | Flops(G) | Throughput (imgs/s) | Top-1 (%) | Config | Download | | :-------: | :----------: | :--------: | :-------: | :------: | :-------: | :-------: | :-----------------------------------------------------------------: | :-------------------------------------------------------------------: | -| LITv2-S | From scratch | 224x224 | 28 | 3.7 | 1,471 | 81.7 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_small_8xb128_cos_fp16_ep300.py) | model / log | -| LITv2-S\* | From scratch | 224x224 | 28 | 3.7 | 1,471 | 82.0 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_small_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_s.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m_log.txt) | -| LITv2-M\* | From scratch | 224x224 | 49 | 7.5 | 812 | 83.3 | 
[config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_medium_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m_log.txt) | -| LITv2-B\* | From scratch | 224x224 | 87 | 13.2 | 602 | 84.7 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_base_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_b.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_b_log.txt) | - +| LITv2-S | From scratch | 224x224 | 28 | 3.7 | 1,471 | 81.7 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_small_8xb128_cos_fp16_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/lit_v2_small_8xb128_cos_fp16_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/lit_v2_small_8xb128_cos_fp16_ep300.log.json) | +| LITv2-S\* | From scratch | 224x224 | 28 | 3.7 | 1,471 | 82.0 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_small_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_s.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m_log.txt) | +| LITv2-M\* | From scratch | 224x224 | 49 | 7.5 | 812 | 83.3 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_medium_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_m_log.txt) | +| LITv2-B\* | From scratch | 224x224 | 87 | 13.2 | 602 | 84.7 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/lit_v2/lit_v2_base_8xb128_cos_fp16_ep300.py) | [model](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_b.pth) / [log](https://github.com/ziplab/LITv2/releases/download/v1.0/litv2_b_log.txt) | We follow the original training setting provided by the [official repo](https://github.com/ziplab/LITv2) and throughput is averaged over 30 runs. *Note that models with \* are converted from the [official repo](https://github.com/ziplab/LITv2).* We reproduce LITv2-S training 300 epochs. diff --git a/configs/classification/imagenet/samix/README.md b/configs/classification/imagenet/samix/README.md index 612556d0..91b7788b 100644 --- a/configs/classification/imagenet/samix/README.md +++ b/configs/classification/imagenet/samix/README.md @@ -26,7 +26,7 @@ Mixup is a popular data-dependent augmentation technique for deep neural network | ResNet-101 | SAMix | 224x224 | 42.51 | 300 | 80.98 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/samix/basic/r101_l2_a2_bili_val_dp01_mul_mb_mlr1e_3_bb_mlr0_4xb64.py) | model / log | | ResNeXt-101 | SAMix | 224x224 | 44.18 | 100 | 80.89 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/samix/basic/rx101_l2_a2_bili_val_dp01_mul_mb_mlr1e_3_bb_mlr0_4xb64.py) | model / log | -We will update configs and models for SAMix soon. Please refer to [Model Zoo](https://github.com/Westlake-AI/openmixup/tree/main/docs/en/model_zoos/Model_Zoo_sup.md) for image classification results. 
+We will update configs and models (ResNets, ViTs, Swin-T, and ConvNeXt-T) for SAMix soon (please contact us if you want the models right now). Please refer to [Model Zoo](https://github.com/Westlake-AI/openmixup/tree/main/docs/en/model_zoos/Model_Zoo_sup.md) for image classification results. ## Citation diff --git a/configs/classification/imagenet/swin_transformer/README.md b/configs/classification/imagenet/swin_transformer/README.md index 07602022..7802d171 100644 --- a/configs/classification/imagenet/swin_transformer/README.md +++ b/configs/classification/imagenet/swin_transformer/README.md @@ -1,6 +1,6 @@ # Swin Transformer -> [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/pdf/2103.14030.pdf) +> [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030) ## Abstract @@ -32,7 +32,7 @@ The pre-trained models on ImageNet-21k are used to fine-tune, and therefore don' | Swin-T (A3) | From scratch | 160x160 | 28.29 | 4.36 | 77.70 | 93.70 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_tiny_rsb_a3_sz160_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_tiny_rsb_a3_sz160_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_tiny_rsb_a3_sz160_8xb256_ep300.log.json) | | Swin-S (A3) | From scratch | 160x160 | 49.61 | 8.52 | 80.20 | 95.12 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_small_rsb_a3_sz160_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_small_rsb_a3_sz160_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_small_rsb_a3_sz160_8xb256_ep300.log.json) | | Swin-B (A3) | From scratch | 160x160 | 87.77 | 15.14 | 80.53 | 95.41 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_base_rsb_a3_sz160_8xb256_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_base_rsb_a3_sz160_8xb256_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/rsb-a3-weights/swin_base_rsb_a3_sz160_8xb256_ep300.log.json) | -| Swin-T | From scratch | 224x224 | 28.29 | 4.36 | 81.18 | 95.61 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_tiny_8xb128_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925-66df6be6.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_tiny_224_b16x64_300e_imagenet_20210616_090925.log.json) | +| Swin-T | From scratch | 224x224 | 28.29 | 4.36 | 81.18 | 95.61 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_tiny_8xb128_fp16_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/swin_tiny_8xb128_fp16_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/swin_tiny_8xb128_fp16_ep300.log.json) | | Swin-S | From scratch | 224x224 | 49.61 | 8.52 | 83.02 | 96.29 | 
[config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_small_8xb128_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219-7f9d988b.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_small_224_b16x64_300e_imagenet_20210615_110219.log.json) | | Swin-B | From scratch | 224x224 | 87.77 | 15.14 | 83.36 | 96.44 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_base_8xb128_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742-93230b0d.pth) \| [log](https://download.openmmlab.com/mmclassification/v0/swin-transformer/swin_base_224_b16x64_300e_imagenet_20210616_190742.log.json) | | Swin-S\* | From scratch | 224x224 | 49.61 | 8.52 | 83.21 | 96.25 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/swin_transformer/swin_small_8xb128_fp16_ep300.py) | [model](https://download.openmmlab.com/mmclassification/v0/swin-transformer/convert/swin_small_patch4_window7_224-cc7a01c9.pth) | diff --git a/configs/classification/imagenet/uniformer/README.md b/configs/classification/imagenet/uniformer/README.md index aa7ee6b2..2828e471 100644 --- a/configs/classification/imagenet/uniformer/README.md +++ b/configs/classification/imagenet/uniformer/README.md @@ -16,13 +16,13 @@ It is a challenging task to learn discriminative representation from images and | Model | Pretrain | resolution | Params(M) | Flops(G) | Top-1 (%) | Config | Download | | :---------------: | :----------: | :--------: | :-------: | :------: | :-------: | :-----------------------------------------------------------------: | :-------------------------------------------------------------------: | -| UniFormer-T | From scratch | 224x224 | 5.55 | 0.88 | 78.02 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | model / log | -| UniFormer-S | From scratch | 224x224 | 21.5 | 3.44 | 82.56 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | model / log | -| UniFormer-S\* | From scratch | 224x224 | 21.5 | 3.44 | 82.90 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | -| UniFormer-Sā€ \* | From scratch | 224x224 | 24.0 | 4.21 | 83.40 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | -| UniFormer-B\* | From scratch | 224x224 | 49.8 | 8.27 | 83.90 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | -| UniFormer-S+TL\* | From scratch | 224x224 | 21.5 | 3.44 | 83.40 | 
[config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | -| UniFormer-B+TL\* | From scratch | 224x224 | 49.8 | 8.27 | 85.10 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | +| UniFormer-T | From scratch | 224x224 | 5.55 | 0.88 | 78.02 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_tiny_8xb128_ep300.py) | [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/uniformer_tiny_8xb128_ep300.log.json) | +| UniFormer-S | From scratch | 224x224 | 21.5 | 3.44 | 82.29 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_small_8xb128_fp16_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/uniformer_small_8xb128_fp16_ep300.pth) \| [log](https://github.com/Westlake-AI/openmixup/releases/download/open-in1k-weights/uniformer_small_8xb128_fp16_ep300.log.json) | +| UniFormer-S\* | From scratch | 224x224 | 21.5 | 3.44 | 82.90 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_small_8xb128_fp16_ep300.py) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | +| UniFormer-Sā€ \* | From scratch | 224x224 | 24.0 | 4.21 | 83.40 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_small_plus_8xb128_ep300.py) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | +| UniFormer-B\* | From scratch | 224x224 | 49.8 | 8.27 | 83.90 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_base_8xb128_ep300.py) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | +| UniFormer-S+TL\* | From scratch | 224x224 | 21.5 | 3.44 | 83.40 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_small_8xb128_fp16_ep300.py) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | +| UniFormer-B+TL\* | From scratch | 224x224 | 49.8 | 8.27 | 85.10 | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/uniformer/uniformer_base_8xb128_ep300.py) | [model](https://drive.google.com/file/d/1-uepH3Q3BhTmWU6HK-sGAGQC_MpfIiPD/view?usp=sharing) | [log](https://drive.google.com/file/d/10ThKb9YOpCiHW8HL10dRuZ0lQSPJidO7/view?usp=sharing) | We follow the original training setting provided by the [official repo](https://github.com/Sense-X/UniFormer). 
*Note that models with \* are converted from [the official repo](https://github.com/Sense-X/UniFormer/blob/main/image_classification), ā€  denotes using `ConvStem` in UniFormer. TL denotes training the model with [Token Labeling](https://arxiv.org/abs/2104.10858) as [LV-ViT](https://github.com/zihangJiang/TokenLabeling).* We reproduce the performances of UniFormer-T and UniFormer-S training 300 epochs, and UniFormer-T is designed according to VAN-T. diff --git a/docs/en/awesome_selfsup/MIM.md b/docs/en/awesome_selfsup/MIM.md index 19c8275b..118f1020 100644 --- a/docs/en/awesome_selfsup/MIM.md +++ b/docs/en/awesome_selfsup/MIM.md @@ -152,6 +152,9 @@ The list of awesome MIM methods is summarized in chronological order and is on u * **EVA**: Yuxin Fang, Wen Wang, Binhui Xie, Quan Sun, Ledell Wu, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao. - EVA: Exploring the Limits of Masked Visual Representation Learning at Scale. [[CVPR'2023](https://arxiv.org/abs/2211.07636)] [[code](https://github.com/baaivision/EVA)]

+* **CAE**: Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang. + - Context Autoencoder for Self-Supervised Representation Learning. [[ArXiv'2022](https://arxiv.org/abs/2202.03026)] [[code](https://github.com/lxtGH/CAE)] +

* **CAE.V2**: Xinyu Zhang, Jiahui Chen, Junkun Yuan, Qiang Chen, Jian Wang, Xiaodi Wang, Shumin Han, Xiaokang Chen, Jimin Pi, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang. - CAE v2: Context Autoencoder with CLIP Target. [[ArXiv'2022](https://arxiv.org/abs/2211.09799)]

diff --git a/docs/en/model_zoos/Mixup_sup.md b/docs/en/model_zoos/Mixup_sup.md index f33c7a15..c5af5862 100644 --- a/docs/en/model_zoos/Mixup_sup.md +++ b/docs/en/model_zoos/Mixup_sup.md @@ -2,7 +2,7 @@ **OpenMixup provides mixup benchmarks on supervised learning on various tasks. Config files and experiment results are available, and pre-trained models and training logs are updating. Moreover, more advanced mixup variants will be supported in the future. Issues and PRs are welcome!** -Now, we have supported 13 popular mixup methods! Notice that * denotes open-source arXiv pre-prints reproduced by us, and šŸ“– denotes original results reproduced by official implementations. We modified the original AttentiveMix by using pre-trained R-18 (or R-50) and sampling $\lambda$ from $\Beta(\alpha,8)$ as *AttentiveMix+*, which yields better performances. +Now, we have supported 13 popular mixup methods! Notice that * denotes open-source arXiv pre-prints reproduced by us, and šŸ“– denotes original results reproduced by official implementations. We modified the original AttentiveMix by using pre-trained R-18 (or R-50) and sampling $\lambda$ from $\Beta(\alpha,8)$ as *AttentiveMix+*, which yields better performances. We will update pretrained models and logs soon (please contact us if you want the models right now). **Note** diff --git a/docs/en/model_zoos/Model_Zoo_sup.md b/docs/en/model_zoos/Model_Zoo_sup.md index d4f729c3..9a56b1e8 100644 --- a/docs/en/model_zoos/Model_Zoo_sup.md +++ b/docs/en/model_zoos/Model_Zoo_sup.md @@ -48,7 +48,7 @@ ## ImageNet -ImageNet has multiple versions, but the most commonly used one is [ILSVRC 2012](http://www.image-net.org/challenges/LSVRC/2012/). We summarize image classification results of the official settings. +ImageNet has multiple versions, but the most commonly used one is [ILSVRC 2012](http://www.image-net.org/challenges/LSVRC/2012/). We summarize image classification results of the official settings. You can download model files from [OpenMMLab](https://github.com/open-mmlab/mmpretrain) or [OpenMixup](https://github.com/Westlake-AI/openmixup/releases/tag/open-in1k-weights). | Model | Params(M) | Flops(G) | Top-1 (%) | Top-5 (%) | Config | Download | |:---:|:---:|:---:|:---:|:---:|:---:|:---:| @@ -225,7 +225,7 @@ ImageNet has multiple versions, but the most commonly used one is [ILSVRC 2012]( | MogaNet-L | 82.5 | 15.9 | 84.6 | - | [config](https://github.com/Westlake-AI/openmixup/tree/main/configs/classification/imagenet/moganet/moga_large_ema_sz224_8xb64_accu2_ep300.py) | [model](https://github.com/Westlake-AI/openmixup/releases/download/moganet-in1k-weights/moga_large_ema_sz224_8xb64_accu2_ep300.pth) / [log](https://github.com/Westlake-AI/openmixup/releases/download/moganet-in1k-weights/moga_large_ema_sz224_8xb64_accu2_ep300.log.json) | -We also provide fast training results using [RSB A3](https://arxiv.org/abs/2110.00476) setting on [ILSVRC 2012](http://www.image-net.org/challenges/LSVRC/2012/). You can download all files from [**Baidu Cloud** (ss3j)](https://pan.baidu.com/s/1WgUn0zOtrmN2GBZ3aFjONw?pwd=ss3j). +We also provide fast training results using [RSB A3](https://arxiv.org/abs/2110.00476) setting on [ILSVRC 2012](http://www.image-net.org/challenges/LSVRC/2012/). You can download all files from [GitHub](https://github.com/Westlake-AI/openmixup/releases/tag/rsb-a3-weights) / [**Baidu Cloud** (ss3j)](https://pan.baidu.com/s/1WgUn0zOtrmN2GBZ3aFjONw?pwd=ss3j). 
| Model | Date | Train / test size | Params(M) | Top-1 (\%) | Top-5 (\%) | Config | Download | |---|:---:|:---:|:---:|:---:|:---:|:---:|:---:| diff --git a/docs/en/tutorials/ssl_detection.md b/docs/en/tutorials/ssl_detection.md index 6ac7ce67..ad33c345 100644 --- a/docs/en/tutorials/ssl_detection.md +++ b/docs/en/tutorials/ssl_detection.md @@ -1,10 +1,12 @@ # Detection - [Detection](#detection) - - [Train](#train) - - [MMDetection](#mmdetection) - - [Detectron2](#detectron2) - - [Test](#test) + - [MMDetection](#mmdetection) + - [Evaluation](#evaluation) + - [Detectron2](#detectron2) + - [Evaluation](#evaluation-1) + +## MMDetection Here, we prefer to use MMDetection to do the detection task. First, make sure you have installed [MIM](https://github.com/open-mmlab/mim), which is also a project of OpenMMLab. @@ -17,11 +19,9 @@ It is very easy to install the package. Besides, please refer to MMDet for [installation](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/get_started.md) and [data preparation](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/1_exist_data_model.md) -## Train - -### MMDetection +### Evaluation -After installation MMDet, you can run MMDetection with simple command. We provide scripts for the stage-4 only (`C4`) and `FPN` setting of object detection models. +After installing MMDet, you can run MMDetection with a simple command. We provide scripts for the stage-4 only (`C4`) and `FPN` setting of object detection models. ```shell # distributed version @@ -51,7 +51,7 @@ configs/benchmarks/mmdetection/coco/mask-rcnn_r50-c4_ms-1x_coco.py \ https://download.openmmlab.com/mmselfsup/1.x/byol/byol_resnet50_16xb256-coslr-200e_in1k/byol_resnet50_16xb256-coslr-200e_in1k_20220825-de817331.pth 8 ``` -### Detectron2 +## Detectron2 If you want to do detection task with [detectron2](https://github.com/facebookresearch/detectron2), we also provide some config files. Please refer to [INSTALL.md](https://github.com/facebookresearch/detectron2/blob/main/INSTALL.md) for installation and follow the [directory structure](https://github.com/facebookresearch/detectron2/tree/main/datasets) to prepare your datasets required by detectron2. @@ -65,16 +65,16 @@ bash run.sh ${DET_CFG} ${OUTPUT_FILE}

(back to top)

-## Test +### Evaluation -After training, you can also run the command below to test your model. +After training, you can also run the command below with 8 GPUs (per node) to test your model. ```shell # distributed version -bash benchmarks/mmdetection/mim_dist_test.sh ${CONFIG} ${CHECKPOINT} ${GPUS} +bash benchmarks/mmdetection/mim_dist_test.sh ${CONFIG} ${CHECKPOINT} # slurm version -bash benchmarks/mmdetection/mim_slurm_test.sh ${PARTITION} ${CONFIG} ${CHECKPOINT} +bash benchmarks/mmdetection/slurm_run.sh ${PARTITION} ${CONFIG} ${CHECKPOINT} ``` Remarks: diff --git a/openmixup/models/utils/grad_weight.py b/openmixup/models/utils/grad_weight.py index 4cd24b34..c30c5eb3 100644 --- a/openmixup/models/utils/grad_weight.py +++ b/openmixup/models/utils/grad_weight.py @@ -1,7 +1,6 @@ import torch import torch.nn.functional as F import torch.nn as nn -from torch._six import inf from mmcv.cnn.bricks.conv_module import ConvModule from mmcv.cnn import kaiming_init, normal_init @@ -68,7 +67,7 @@ def get_grad_norm(parameters, norm_type=2): if len(parameters) == 0: return torch.tensor(0.) total_norm = 0 - if norm_type == inf: + if norm_type == torch.inf: total_norm = max(p.grad.detach().abs().max() for p in parameters) else: for p in parameters:
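
For reference, a minimal usage sketch of the patched `get_grad_norm` helper follows. The import path and the toy model are illustrative assumptions (the helper lives in `openmixup/models/utils/grad_weight.py` but may not be re-exported elsewhere), and the sketch presumes a PyTorch build that exposes `torch.inf`, as the patched comparison does.

```python
# Hypothetical usage sketch; the import path below is assumed, not documented.
import torch
import torch.nn as nn

from openmixup.models.utils.grad_weight import get_grad_norm

model = nn.Linear(8, 2)
loss = model(torch.randn(4, 8)).sum()
loss.backward()

# Collect only parameters that actually received gradients.
params = [p for p in model.parameters() if p.grad is not None]

# Default L2 norm over all parameter gradients.
l2_norm = get_grad_norm(params, norm_type=2)

# Max-norm branch: the patch compares norm_type against torch.inf instead of the
# removed torch._six.inf; float("inf") compares equal, so either spelling selects it.
max_norm = get_grad_norm(params, norm_type=torch.inf)

print(float(l2_norm), float(max_norm))
```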