Multi-scale Interpretation Model for Convolutional Neural Networks: Building Trust based on Hierarchical Interpretation

University of British Columbia
Model visualization

The MINT model extracts class-discriminative, interpretable knowledge by leveraging both fine-scale and coarse-scale perturbations across shallow and deep network layers. By integrating perturbation-based and gradient-based mechanisms without requiring architectural changes or retraining, MINT generates multi-scale representations that explain deep-layer attention and shallow-layer features simultaneously. This novel approach utilizes a linear sparse optimization problem, governed by an inter-scale constraint regularization, to ensure coarse and fine-scale information cooperate effectively. Ultimately, MINT offers a more robust explanation of a network's hierarchical structure than methods relying solely on visual input, providing deeper insight into how CNNs process information.
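The sparse optimization described above can be illustrated with a minimal numpy sketch. This is our own stand-in formulation, not the paper's exact objective: we assume binary perturbation masks over fine-scale segments (`Zf`) and coarse-scale regions (`Zc`), a hypothetical assignment matrix `A` mapping coarse regions to their fine segments, and an inter-scale consistency penalty coupling the two weight vectors, solved with plain ISTA.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fit_mint_surrogate(Zf, Zc, y, A, lam=0.1, gamma=0.5, iters=500):
    """Fit sparse fine-scale (wf) and coarse-scale (wc) perturbation weights,
    coupled by an inter-scale term, via ISTA.

    Illustrative objective (our assumption, not the paper's formulation):
        0.5*||Zf@wf + Zc@wc - y||^2
        + 0.5*gamma*||wf - A@wc||^2          # inter-scale consistency
        + lam*(||wf||_1 + ||wc||_1)          # sparsity
    """
    nf, nc = Zf.shape[1], Zc.shape[1]
    wf, wc = np.zeros(nf), np.zeros(nc)
    # Conservative step size from a Lipschitz bound on the smooth part.
    L = np.linalg.norm(np.hstack([Zf, Zc]), 2) ** 2 \
        + gamma * (1.0 + np.linalg.norm(A, 2) ** 2)
    lr = 1.0 / L
    for _ in range(iters):
        r = Zf @ wf + Zc @ wc - y            # data-fit residual
        d = wf - A @ wc                      # inter-scale discrepancy
        wf = soft_threshold(wf - lr * (Zf.T @ r + gamma * d), lr * lam)
        wc = soft_threshold(wc - lr * (Zc.T @ r - gamma * A.T @ d), lr * lam)
    return wf, wc
```

With perturbation responses `y` collected from the network under masked inputs, the recovered sparse weights indicate which fine segments and coarse regions are class-discriminative, while the coupling term keeps the two scales consistent.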

Abstract

With the rapid development of deep learning models, their performance on various tasks has improved; meanwhile, their increasingly intricate architectures make them difficult to interpret. To tackle this challenge, model interpretability is essential and has been investigated in a wide range of applications. For end users, interpretability helps build trust in deployed machine learning models. For practitioners, it plays a critical role in model explanation, validation, and improvement toward a faithful model. In this paper, we propose a novel Multi-scale INTerpretation (MINT) model for convolutional neural networks that combines perturbation-based and gradient-based interpretation approaches. It learns class-discriminative, interpretable knowledge from multi-scale perturbations of feature information in different layers of deep networks. The proposed MINT model provides coarse-scale and fine-scale interpretations for the attention in deep layers and for specific features in shallow layers, respectively. Experimental results show that the MINT model presents a class-discriminative interpretation of the network's decisions and explains the significance of the hierarchical network structure.
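The gradient-based side of the abstract, explaining attention in deep layers, can be sketched in a Grad-CAM-style form. This is a generic illustration under our own assumptions, not the paper's exact mechanism: given a deep layer's activations and the gradients of a class score with respect to them, channels are weighted by their pooled gradients and combined into a coarse attention map.

```python
import numpy as np

def attention_map(activations, gradients):
    """Grad-CAM-style coarse attention (illustrative, not the paper's exact
    formulation).

    activations: (H, W, K) feature maps from a deep layer.
    gradients:   (H, W, K) gradients of the class score w.r.t. those maps.
    Returns an (H, W) map in [0, 1].
    """
    alpha = gradients.mean(axis=(0, 1))                       # per-channel weight
    cam = np.maximum((activations * alpha).sum(axis=2), 0.0)  # weighted sum + ReLU
    if cam.max() > 0:
        cam /= cam.max()                                      # normalize to [0, 1]
    return cam
```

In MINT's multi-scale view, a map like this would supply the coarse-scale, deep-layer interpretation, complementing the fine-scale evidence recovered from shallow-layer perturbations.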

BibTeX

@ARTICLE{8653995,
  author={Cui, Xinrui and Wang, Dan and Wang, Z. Jane},
  journal={IEEE Transactions on Multimedia}, 
  title={Multi-Scale Interpretation Model for Convolutional Neural Networks: Building Trust Based on Hierarchical Interpretation}, 
  year={2019},
  volume={21},
  number={9},
  pages={2263-2276},
  keywords={Visualization;Computational modeling;Analytical models;Feature extraction;Perturbation methods;Image segmentation;Heating systems;Model interpretability;multi-scale interpretation;convolutional neural networks;model-agnostic},
  doi={10.1109/TMM.2019.2902099}
}