Model Compression and Acceleration

Continuously updated

Posted on Jan.15, 2020

Parameter Pruning and Sharing

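Pruning: define a criterion for parameter importance, score the network's parameters against it, and remove the redundant ones.
Sharing: use structured matrices or clustering to map the network's parameters onto a smaller set of shared values.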
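
As a rough illustration of the pruning idea, the sketch below uses absolute magnitude as the importance criterion. That choice is an assumption made for the example (the text above does not name a specific criterion), and `magnitude_prune` is a hypothetical helper, not part of any library.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by increasing absolute value: least important first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

In practice the zeroed weights only save memory or time if they are stored in a sparse format or removed structurally (whole channels or filters).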

Parameter Quantization

Quantize network parameters from 32-bit full-precision floating point to a lower bit width.
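
One simple way to picture this is uniform affine quantization to 8 bits. The sketch below is only one possible scheme, chosen for illustration. `quantize` and `dequantize` are hypothetical helper names, and real toolchains add calibration, per-channel scales, and so on.

```python
def quantize(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] with a scale and an offset."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    # Guard against a constant tensor, where hi == lo.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Recover approximate float values from the integer codes."""
    return [lo + scale * x for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, lo = quantize(weights)
restored = dequantize(q, scale, lo)
# The rounding error per weight is at most half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_err)
```

Each weight is then stored in 1 byte instead of 4, at the cost of an error bounded by `scale / 2`.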

Low-Rank Decomposition

Factorize high-dimensional weight matrices or tensors into products of smaller, lower-dimensional factors.
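
To make the saving concrete, the sketch below assumes an m x n weight matrix W replaced by the product of an m x r factor U and an r x n factor V. The shapes and helper names are illustrative assumptions; how U and V are obtained (for example by truncated SVD) is left out.

```python
def dense_params(m, n):
    """Parameters in a full m x n weight matrix."""
    return m * n


def low_rank_params(m, n, r):
    """Parameters in an m x r factor plus an r x n factor."""
    return r * (m + n)


def matmul(a, b):
    """Plain-Python matrix product for small lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


m, n, r = 1024, 1024, 64
full = dense_params(m, n)            # 1048576
factored = low_rank_params(m, n, r)  # 131072
ratio = full / factored              # 8.0

# Tiny rank-1 check: applying U then V gives the same result as applying W = U V.
U = [[1.0], [2.0], [3.0]]   # 3 x 1
V = [[4.0, 5.0, 6.0]]       # 1 x 3
W = matmul(U, V)            # 3 x 3, rank 1
x = [[1.0, 0.0, -1.0]]
y_full = matmul(x, W)
y_fact = matmul(matmul(x, U), V)
print(ratio, y_full, y_fact)
```

The smaller the rank r relative to m and n, the larger the saving, but a trained W is rarely exactly low rank, so the factorization is usually an approximation followed by fine-tuning.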

Compact Networks

Design new lightweight networks at three levels: convolution kernels, special layers, and overall network architecture.
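
A well-known kernel-level example is the depthwise separable convolution used by MobileNet-style networks. It is picked here purely as an illustration and is not named in the text above. The sketch compares parameter counts, ignoring biases; the function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """A k x k convolution mixing all input channels into each output channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """A k x k depthwise conv (one filter per input channel) plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


k, c_in, c_out = 3, 128, 256
standard = standard_conv_params(k, c_in, c_out)          # 294912
separable = depthwise_separable_params(k, c_in, c_out)   # 33920
print(standard, separable, round(standard / separable, 2))
```

For a 3 x 3 kernel the reduction approaches 9x as the number of output channels grows.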

Knowledge Distillation

Distill the knowledge of a larger teacher model into a smaller student model.
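
A minimal sketch of the soft-target loss commonly used for distillation: the KL divergence between temperature-softened teacher and student distributions, scaled by T squared. The temperature value and function names are assumptions for illustration. In training this term is usually combined with the ordinary cross-entropy on the true labels.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """T^2 * KL(teacher || student) on temperature-softened probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


teacher = [5.0, 1.0, -2.0]
student = [2.0, 1.0, 0.0]
loss = distillation_loss(student, teacher)
same = distillation_loss(teacher, teacher)
print(loss, same)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge.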

Papers

Awesome-Pruning and Related Links

Industry Projects

Lightweight DNN Engines/APIs

CoCoPIE

cocopie.ai

Tencent-ncnn
Tencent-pocketFlow
Alibaba-MNN
Nvidia-tensorRT

TensorRT

facebook-Glow
intel-neon

New hot topic: Neural Architecture Search (NAS)

Open-Source Model Analysis Tools

  Model Analyzer

    sksq96/pytorch-summary | [Pytorch]

    Lyken17/pytorch-OpCounter | [Pytorch] ✔ preferred

    sovrasov/flops-counter.pytorch | [Pytorch]

How to install

Through PyPi
      pip install thop

Using GitHub (always latest)
      pip install --upgrade git+https://github.com/Lyken17/pytorch-OpCounter.git

How to use

Basic usage
    
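    import os

    import torch
    from torchvision.models import resnet50
    from thop import profile
    from thop import clever_format

    # Restrict PyTorch to the first GPU; set this before any CUDA call
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    # Fall back to the CPU when no GPU is available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = resnet50().to(device)
    input = torch.randn(1, 3, 240, 240).to(device)
    # profile() returns multiply-accumulate operations (MACs) and the parameter count
    macs, params = profile(model, inputs=(input, ))
    # Convert the raw numbers into human-readable strings
    macs, params = clever_format([macs, params], "%.3f")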

    print('{:<30}  {:<8}'.format('Computational complexity: ', macs))
    print('{:<30}  {:<8}'.format('Number of parameters: ', params))

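Notes:
       1. FLOPS (all uppercase) stands for floating-point operations per second. It measures hardware performance.
          FLOPs (lowercase s) stands for floating-point operations, i.e. a count of operations. It measures the complexity of an algorithm or model.

       2. model size = 4*params: with 32-bit floats each parameter takes 4 bytes, so the model size in bytes is about 4 times the number of parameters.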
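
The rule in note 2 can be checked with a few lines of arithmetic. The sketch below assumes 4 bytes per parameter (float32) and uses the parameter count commonly reported for torchvision's resnet50. Treat that number as an assumption, since thop prints its own count for the model actually profiled.

```python
def model_size_bytes(num_params, bytes_per_param=4):
    """Approximate size of the stored weights; 4 bytes per parameter for float32."""
    return num_params * bytes_per_param


def human_readable(num_bytes):
    """Format a byte count in megabytes (1 MB = 10**6 bytes)."""
    return "{:.1f} MB".format(num_bytes / 1e6)


resnet50_params = 25_557_032  # assumed parameter count for torchvision's resnet50
size_fp32 = model_size_bytes(resnet50_params)                     # float32 weights
size_int8 = model_size_bytes(resnet50_params, bytes_per_param=1)  # 8-bit quantized weights
print(human_readable(size_fp32))  # 102.2 MB
print(human_readable(size_int8))  # 25.6 MB
```

A saved checkpoint can be slightly larger because it also holds buffers such as batch-norm running statistics and serialization metadata.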

Github Links