QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding

BASH Lab, Worcester Polytechnic Institute
INTERSPEECH, 2025
*Indicates Equal Contribution

Abstract

Spoken Language Understanding (SLU) systems must balance performance and efficiency, particularly in resource-constrained environments. Existing methods apply distillation and quantization separately, leading to suboptimal compression as distillation ignores quantization constraints. We propose QUADS, a unified framework that optimizes both through multi-stage training with a pre-tuned model, enhancing adaptability to low-bit regimes while maintaining accuracy. QUADS achieves 71.13% accuracy on SLURP and 99.20% on FSC, with only minor degradations of up to 5.56% compared to state-of-the-art models. Additionally, it reduces computational complexity by 60–73× (GMACs) and model size by 83–700×, demonstrating strong robustness under extreme quantization. These results establish QUADS as a highly efficient solution for real-world, resource-constrained SLU applications.

Index Terms: Quantization, knowledge distillation, multi-stage training, speech-language understanding.

Architecture

Schematic overview of QUADS, a two-phase framework for efficient model training. In the distillation phase, the student model Φ learns from the teacher model Ω via a combined loss strategy. In the quantization phase, the student model's weights WΦ are compressed using a codebook: weights are grouped into clusters and refined with objectives that balance centroid alignment and cross-network consistency.
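
The sketch below illustrates the two-phase idea described above in PyTorch: a combined hard-label/soft-label distillation loss, followed by codebook quantization of the student's weights with a cross-network consistency check against the teacher. The toy models, loss weights, cluster count, and helper names are illustrative assumptions, not the exact losses or architectures used by QUADS.

        # Minimal two-phase sketch (distillation, then codebook quantization).
        # All hyperparameters and module choices here are assumptions for illustration.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, T=2.0):
            """Combined loss: hard-label cross-entropy plus temperature-softened teacher KL (assumed form)."""
            ce = F.cross_entropy(student_logits, labels)
            kd = F.kl_div(
                F.log_softmax(student_logits / T, dim=-1),
                F.softmax(teacher_logits / T, dim=-1),
                reduction="batchmean",
            ) * (T * T)
            return alpha * ce + (1 - alpha) * kd

        def build_codebook(weight, n_clusters=16, iters=10):
            """Group a weight tensor's entries into clusters via simple k-means (the codebook)."""
            w = weight.detach().flatten().unsqueeze(1)            # (N, 1)
            centroids = w[torch.randperm(w.shape[0])[:n_clusters]].clone()
            for _ in range(iters):
                assign = torch.cdist(w, centroids).argmin(dim=1)  # nearest centroid per weight
                for k in range(n_clusters):
                    mask = assign == k
                    if mask.any():
                        centroids[k] = w[mask].mean(dim=0)
            return centroids.squeeze(1), assign

        def quantize_with_codebook(weight, centroids, assign):
            """Replace each weight by its cluster centroid (centroid alignment)."""
            return centroids[assign].view_as(weight)

        # --- Toy usage ---------------------------------------------------------
        torch.manual_seed(0)
        teacher = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 10))
        student = nn.Sequential(nn.Linear(40, 16), nn.ReLU(), nn.Linear(16, 10))
        opt = torch.optim.Adam(student.parameters(), lr=1e-3)

        x = torch.randn(8, 40)          # stand-in for acoustic features
        y = torch.randint(0, 10, (8,))  # stand-in for intent labels

        # Phase 1: distillation -- the student learns from the frozen teacher.
        with torch.no_grad():
            t_logits = teacher(x)
        loss = distillation_loss(student(x), t_logits, y)
        opt.zero_grad(); loss.backward(); opt.step()

        # Phase 2: quantization -- cluster student weights into a codebook, then
        # check cross-network consistency between the quantized student and teacher.
        with torch.no_grad():
            for m in student:
                if isinstance(m, nn.Linear):
                    centroids, assign = build_codebook(m.weight, n_clusters=16)
                    m.weight.copy_(quantize_with_codebook(m.weight, centroids, assign))
            consistency = F.mse_loss(student(x), t_logits)
        print(f"distill loss {loss.item():.3f} | post-quantization consistency {consistency.item():.3f}")

In the full framework these phases are coupled through multi-stage training rather than run once in sequence; the sketch only shows the shape of each objective.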

Results

Comparison of QUADS and Prior Methods on the SLURP and FSC Datasets. We report accuracy and F1-score for model performance, alongside GMACs and model size, to evaluate efficiency.

BibTeX


        @article{biswas2025quads,
          title={QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding},
          author={Biswas, Subrata and Khan, Mohammad Nur Hossain and Islam, Bashima},
          journal={arXiv preprint arXiv:2505.14723},
          year={2025}
        }