A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang,
T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient
convolutional neural networks for mobile vision applications,” arXiv
preprint arXiv:1704.04861, 2017.
 M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen, “Inverted
residuals and linear bottlenecks: Mobile networks for classification,
detection and segmentation,” in IEEE Conference on Computer Vision
and Pattern Recognition (CVPR 2018).
 M. Tan, B. Chen, R. Pang, V. Vasudevan, and Q. V. Le, “Mnasnet:
Platform-aware neural architecture search for mobile,” arXiv preprint
 J. H. Lee, S. Ha, S. Choi, W. Lee, and S. Lee, “Quantization for rapid
deployment of deep neural networks,” arXiv preprint arXiv:1810.05488,
2018.
 B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam,
and D. Kalenichenko, “Quantization and training of neural networks for
efficient integer-arithmetic only inference,” in Conference on Computer
Vision and Pattern Recognition (CVPR 2018).
 G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural
network,” arXiv preprint arXiv:1503.02531, 2015.
 A. Mishra and D. Marr, “Apprentice: Using knowledge distillation
techniques to improve low-precision network accuracy,” arXiv preprint
 A. Mishra, E. Nurvitadhi, J. J. Cook, and D. Marr, “Wrpn: Wide
reduced-precision networks,” arXiv preprint arXiv:1709.01134, 2017.
 https://developer.nvidia.com/tensorrt, NVIDIA TensorRT platform, 2018.
 M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.
Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow,
A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser,
M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray,
C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar,
P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden,
M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “Tensorflow: Large-scale
machine learning on heterogeneous distributed systems,” arXiv preprint
 M. Courbariaux, Y. Bengio, and J. David, “Training deep neural
networks with low precision multiplications,” in International Conference
on Learning Representations (ICLR 2015).
 I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio,
“Binarized neural networks,” in Advances in Neural Information
Processing Systems (NIPS 2016), pp. 4107–4115.
 M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, “Xnor-net:
Imagenet classification using binary convolutional neural networks,” in
European Conference on Computer Vision (ECCV 2016), Springer, pp.
 S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou, “Dorefa-net:
Training low bitwidth convolutional neural networks with low bitwidth
gradients,” arXiv preprint arXiv:1606.06160, 2016.
 Y. Bengio, N. Leonard, and A. C. Courville, “Estimating or propagating
gradients through stochastic neurons for conditional computation,” arXiv
preprint arXiv:1308.3432, 2013.
 M. D. McDonnell, “Training wide residual networks for deployment
using a single bit for each weight,” in International Conference on
Learning Representations (ICLR 2018).
 S. Zhu, X. Dong, and H. Su, “Binary ensemble neural network: More bits
per network or more networks per bit?” arXiv preprint arXiv:1806.07550,
 C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes,
A. Mendelson, and A. M. Bronstein, “Nice: Noise injection and
clamping estimation for neural network quantization,” arXiv preprint
 O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma,
Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and
L. Fei-Fei, “Imagenet large scale visual recognition challenge,” arXiv
preprint arXiv:1409.0575, 2014.
 S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep
network training by reducing internal covariate shift,” in International
Conference on Machine Learning (ICML 2015).
 D. P. Kingma and J. L. Ba, “Adam: A method for stochastic
optimization,” in International Conference on Learning Representations
 T. Sheng, C. Feng, S. Zhuo, X. Zhang, L. Shen, and M. Aleksic,
“A quantization-friendly separable convolution for mobilenets,” arXiv
preprint arXiv:1803.08607, 2018.
models.md, Image classification (Quantized Models).