Architectural and mathematical foundations for improving neural networks for image classification
UDC: 519.8
Publication Language: Ukrainian
Stuc. intelekt. 2022; 27(1):245-258
Abstract: The article proposes ways to solve the problem of structural synthesis of neural network architectures, based on non-traditional approaches to their analytical formalization and on the application of new data processing operations. Using the CIFAR10 dataset as an example, it demonstrates that image classification accuracy can be improved within an alternative architecture that combines an expanding input segment with the narrowing structure of trained neural networks. Results obtained by expanding the input image with Resizing and Conv2DTranspose layers placed at the input of trained neural networks indicate that this approach is effective for such classification problems on CIFAR10. With the Xception neural network, the average classification accuracy over the 10 classes of CIFAR10 images reaches 97.3%. A side effect of pre-scaling the images is that classification accuracy evens out across the different classes, which allows such a change in size to be regarded as a form of data augmentation. To develop this approach further, tensor-matrix methods are introduced for formalizing the description of neural networks on the basis of the penetrating face product of matrices and its block modifications. On this basis, a number of new convolution and max-pooling operations are proposed, along with feeding the input of the narrowing segment not only a symmetrically enlarged image but also variants of it obtained from a generalized penetrating product. These variants include images expanded by rows of pixels (horizontally) or by columns (vertically), as well as combinations of several different symmetric image expansions, following the principle used to construct the pyramid segment of the PSPNet neural network.
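The three kinds of input expansion named in the abstract (symmetric, by rows only, by columns only) can be illustrated with a minimal NumPy sketch. It is not the article's implementation: plain nearest-neighbour pixel repetition stands in for the Resizing and Conv2DTranspose layers, and the helper name `expand`, the factor of 2, and the random test image are all assumptions made for illustration.

```python
import numpy as np


def expand(image, row_factor=1, col_factor=1):
    """Nearest-neighbour expansion of an H x W x C image (hypothetical helper).

    row_factor repeats each row of pixels (the image gets taller),
    col_factor repeats each column of pixels (the image gets wider).
    """
    out = np.repeat(image, row_factor, axis=0)
    out = np.repeat(out, col_factor, axis=1)
    return out


# A random stand-in for one CIFAR10-sized image (32 x 32 pixels, 3 channels).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

symmetric = expand(image, row_factor=2, col_factor=2)  # both axes enlarged
rows_only = expand(image, row_factor=2)                # rows repeated only
cols_only = expand(image, col_factor=2)                # columns repeated only

print(symmetric.shape, rows_only.shape, cols_only.shape)
# (64, 64, 3) (64, 32, 3) (32, 64, 3)
```

In a trained model the enlarged tensor would then be passed to the narrowing network; which axis the article calls "horizontal" or "vertical" expansion is not fixed by the abstract, so the sketch only distinguishes the two axes.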
Keywords: neural network, tensor-matrix theory, penetrating face product of matrices.
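As a rough illustration of the penetrating product of matrices mentioned in the abstract, the sketch below assumes the commonly cited definition: a p x g matrix A is multiplied element-wise with every p x g block of a block matrix B = [B_1 | B_2 | ... | B_n], and the results are placed side by side. The function name `penetrating_face_product` and the toy values are hypothetical, and the reading of per-block sums as filter responses is an interpretation, not a statement of the article's exact formulation.

```python
import numpy as np


def penetrating_face_product(a, blocks):
    """Return [a * B_1 | a * B_2 | ... | a * B_n] (assumed definition).

    a      : array of shape (p, g)
    blocks : array of shape (n, p, g), one p x g block per leading index
    """
    a = np.asarray(a)
    blocks = np.asarray(blocks)
    if blocks.ndim != 3 or blocks.shape[1:] != a.shape:
        raise ValueError("every block must have the same shape as a")
    # Element-wise (Hadamard) product with each block, concatenated by columns.
    return np.hstack([a * b for b in blocks])


a = np.array([[1, 2],
              [3, 4]])
blocks = np.stack([np.ones((2, 2), dtype=int),
                   2 * np.ones((2, 2), dtype=int)])

result = penetrating_face_product(a, blocks)
print(result)
# [[1 2 2 4]
#  [3 4 6 8]]

# Summing the entries of each block gives one number per block; if a is an
# image patch and the blocks are filters, these sums are the filter responses.
responses = (a * blocks).sum(axis=(1, 2))
print(responses)
# [10 20]
```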