Title
Effects of BP Algorithm-based Activation Functions on Neural Network Convergence
Authors  Junguo Hu, Lili Xu, Xin Wang, Xiaojun Xu, Guangyun Su
English Abstract
Activation functions map data in artificial neural network computation. In applications, the choice of activation function and of its gradient and translation factors is directly related to the convergence of the network, and these parameters are usually determined by trial and error. In this work, the Cauchy distribution (Cauchy), Laplace distribution (Laplace), and Gaussian error function (Erf) were used as new activation functions for the back-propagation (BP) algorithm, and their effects were compared with those of the sigmoid function (Logsig), hyperbolic tangent function (Tansig), and normal distribution function (Normal). The XOR problem was used in simulation experiments to evaluate the effects of these six activation functions on network convergence and to determine their optimal gradient and translation factors. The results show that the gradient factor and the initial weights significantly affect convergence. The optimal gradient factors for Laplace, Erf-Logsig, Tansig-Logsig, Logsig, and Normal were 0.5, 0.5, 4, 2, and 1, respectively, and the best intervals were [0.5, 1], [0.5, 2], [2, 6], [1, 4], and [1, 2], respectively. Using the optimal gradient factors, the order of convergence speed was Laplace, Erf-Logsig, Tansig-Logsig, Logsig, and Normal. The functions Logsig (gradient factor = 2), Tansig-Logsig (gradient factor = 4), Normal (translation factor = 0, gradient factor = 1), Erf-Logsig (gradient factor = 0.5), and Laplace (translation factor = 0, gradient factor = 0.5) were less sensitive to the initial weights, so their convergence performance was less affected by them. As the gradient of the activation-function curve increased, the convergence speed of the networks showed an accelerating trend. These conclusions can serve as a reference for selecting activation functions for BP algorithm-based feedforward neural networks.
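For orientation, the Python sketch below shows one plausible way to parameterize such distribution-based activations with a gradient factor a and a translation factor c, and to train a small feedforward network on the XOR problem with plain BP. The exact functional forms, the 2-2-1 architecture, learning rate, weight-initialization range, and convergence threshold are illustrative assumptions, not the settings reported in the paper.

# A minimal sketch (assumptions, not the paper's exact formulation): sigmoid-shaped
# activations built from distribution CDFs, each with a gradient factor `a`
# (steepness) and, where applicable, a translation factor `c`, plus a plain
# BP loop on the XOR problem with a 2-2-1 feedforward network.
import numpy as np
from math import erf, pi

_erf = np.vectorize(erf)  # elementwise error function without a SciPy dependency

def logsig(x, a=1.0):
    # Sigmoid (Logsig) with gradient factor a
    return 1.0 / (1.0 + np.exp(-a * x))

def tansig(x, a=1.0):
    # Hyperbolic tangent (Tansig) with gradient factor a
    return np.tanh(a * x)

def erf_act(x, a=1.0):
    # Gaussian error function squashed to (0, 1)
    return 0.5 * (1.0 + _erf(a * x))

def normal_cdf(x, a=1.0, c=0.0):
    # Normal (Gaussian) CDF with gradient factor a and translation factor c
    return 0.5 * (1.0 + _erf(a * (x - c) / np.sqrt(2.0)))

def laplace_cdf(x, a=1.0, c=0.0):
    # Laplace CDF with gradient factor a and translation factor c
    z = np.clip(a * (x - c), -50.0, 50.0)
    return np.where(z < 0.0, 0.5 * np.exp(z), 1.0 - 0.5 * np.exp(-z))

def cauchy_cdf(x, a=1.0, c=0.0):
    # Cauchy CDF with gradient factor a and translation factor c
    return 0.5 + np.arctan(a * (x - c)) / pi

def train_xor(act, epochs=20000, lr=0.5, w_range=0.5, seed=0, **params):
    # Train a 2-2-1 network with plain BP on XOR; the activation derivative is
    # approximated by a central difference so one loop works for every candidate.
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    T = np.array([[0.], [1.], [1.], [0.]])
    W1 = rng.uniform(-w_range, w_range, (2, 2)); b1 = np.zeros((1, 2))
    W2 = rng.uniform(-w_range, w_range, (2, 1)); b2 = np.zeros((1, 1))
    dact = lambda z: (act(z + 1e-5, **params) - act(z - 1e-5, **params)) / 2e-5
    for epoch in range(epochs):
        z1 = X @ W1 + b1; h = act(z1, **params)   # hidden layer
        z2 = h @ W2 + b2; y = act(z2, **params)   # output layer
        err = y - T
        if np.mean(err ** 2) < 1e-3:              # convergence criterion (assumed)
            break
        d2 = err * dact(z2)                       # output-layer delta
        d1 = (d2 @ W2.T) * dact(z1)               # hidden-layer delta
        W2 -= lr * h.T @ d2; b2 -= lr * d2.sum(0, keepdims=True)
        W1 -= lr * X.T @ d1; b1 -= lr * d1.sum(0, keepdims=True)
    return epoch + 1, float(np.mean(err ** 2))

if __name__ == "__main__":
    for name, fn, kw in [("Logsig  (a=2)",   logsig,      {"a": 2.0}),
                         ("Laplace (a=0.5)", laplace_cdf, {"a": 0.5, "c": 0.0}),
                         ("Normal  (a=1)",   normal_cdf,  {"a": 1.0, "c": 0.0})]:
        epochs_used, mse = train_xor(fn, **kw)
        print(f"{name}: stopped after {epochs_used} epochs, MSE = {mse:.4f}")

Rerunning the loop over several seeds would mimic the paper's point that initial weights influence whether and how fast each activation converges.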
Pages  76-85
Keywords  activation functions; back-propagation (BP) algorithm; convergence; gradient factor; initial weights
Journal  電腦學刊 (Journal of Computers)
Issue  February 2018 (Vol. 29, No. 1)
DOI  10.3966/199115992018022901007