Research on compressing BERT with low-rank factorization and knowledge distillation
svd bert tucker-decomposition low-rank-factorizations teacher-model teacher-prediction student-model
Updated Nov 18, 2022 - Python