Neural Networks in Bioprocessing and Chemical Engineering

by D. R. Baughman and Y. A. Liu

Elsevier Reference Monographs, 1996


2 Fundamental and Practical Aspects of Neural Computing


This chapter introduces neural networks. We first discuss what makes up a neural network, then move on to the fundamental and practical aspects of neural computing, including network training (learning). We next illustrate how to develop a neural network using a commercial software package on a personal computer. Finally, we introduce a number of special neural networks that find significant applications in bioprocessing and chemical engineering.

2.1 Introduction to Neural Computing


This section provides an introduction to neural computing. Part of our discussion has been adapted and updated from Quantrille and Liu (1991, pp. 446–466). We begin with the basic network component, the node or processing element. After describing the structure of a node, we move on first to the topology of neural networks, i.e., how nodes are interconnected, then to network training and the different ways neural networks can “learn.”

A Components of a Node


The foundation of a neural network is the neuron, or node (sometimes called neurode), shown in Figure 2.1. In many scientific and engineering applications, this node is frequently called a processing element, although we use “node” throughout this text. The nodes perform most of the calculations in the neural network.


Figure 2.1 The anatomy of the jth node, which transfers the inputs ai to the output bj through weight factors wij and a transfer function f(xj). Tj is the internal threshold for node j.

1 Inputs and Outputs

The inputs to the jth node are represented as an input vector, a, with components ai (i = 1 to n). The node manipulates these inputs, or activities, to give the output, bj, which can then form part of the input to other nodes.

2 Weight Factors

What determines the output from a node? Certainly, the component values of input vector a have an effect. However, some additional factors also affect the output bj. One is the weight factor, wij, for the ith input, ai, corresponding to the jth node. Every input is multiplied by its corresponding weight factor, and the node uses this weighted input to perform further calculations. For example, let us consider node j = 6. The first input into the node is a1. Multiplying this input by the corresponding weight factor gives w16a1.

Weight factors can have either an inhibitory or an excitatory effect. If we adjust wij such that wijai is positive (and preferably large), we tend to excite the node. If wijai is negative, it inhibits the node. Finally, if wijai is very small in magnitude relative to other signals, the input signal ai will have little or no effect.

3 Internal Thresholds

The next important factor governing the output from a node is the internal threshold. The internal threshold for the jth node, denoted Tj, controls activation of that node. The node calculates all its wij ai’s, sums the terms together, and then calculates the total activation, xj, by subtracting the internal threshold value:

x_j = \sum_{i=1}^{n} (w_{ij} a_i) - T_j \qquad (2.1)

If Tj is large and positive, the node has a high internal threshold, which inhibits node-firing. Conversely, if Tj is zero (or negative, in some cases), the node has a low internal threshold, which excites node-firing.

Some, but not necessarily all, nodes have an internal threshold. If no internal threshold is specified, we assume Tj to be zero.
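
As a concrete illustration of equation (2.1), here is a minimal Python sketch; the input, weight, and threshold values are hypothetical:

```python
import numpy as np

# Hypothetical inputs, weights, and threshold for the jth node (n = 3).
a = np.array([0.5, -1.0, 2.0])     # input vector a
w_j = np.array([0.8, 0.2, -0.4])   # weight factors w_ij for this node
T_j = 0.1                          # internal threshold T_j

# Equation (2.1): total activation = weighted sum of inputs minus threshold.
x_j = np.dot(w_j, a) - T_j
print(x_j)  # 0.8*0.5 + 0.2*(-1.0) + (-0.4)*2.0 - 0.1 = -0.7 (up to rounding)
```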

4 Transfer Functions

The final factor governing a node’s output is the transfer function. Once the node calculates the dot product of the weight vector wj = [w1j, w2j,…, wnj]T with the input vector a, and subtracts the threshold Tj (as described above), it passes this result to a transfer function, f( ). Thus, the complete node calculation is:

b_j = f(\mathbf{w}_j \cdot \mathbf{a} - T_j) = f\left( \sum_{i=1}^{n} (w_{ij} a_i) - T_j \right) \qquad (2.2)

This calculation, then, is a function of the difference between the weighted total input and the internal threshold.
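
Continuing the sketch above, the complete node calculation of equation (2.2) simply wraps the activation in a transfer function; the specific choices for f( ) are discussed next:

```python
import numpy as np

def node_output(a, w_j, T_j, f):
    """Complete node calculation, equation (2.2): b_j = f(w_j . a - T_j)."""
    return f(np.dot(w_j, a) - T_j)

# With the identity function standing in for f, this reduces to the
# activation x_j computed in the previous sketch.
b_j = node_output(np.array([0.5, -1.0, 2.0]),
                  np.array([0.8, 0.2, -0.4]), 0.1, lambda x: x)  # -0.7
```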

What functional form do we choose for f( )? We could choose whatever we want: square root, log, e^x, and so on. Mathematicians and computer scientists, however, have found the sigmoid (S-shaped) function particularly useful. A typical sigmoid function, shown in Figure 2.2, is:


Figure 2.2 A sigmoid (S-shaped) function.

f(x) = \frac{1}{1 + e^{-x}} \qquad (2.3)

This function is monotonically increasing, with limiting values of 0 (as x → –∞) and 1 (as x → +∞). All sigmoid functions have upper and lower limiting values. Because of these limiting values, sigmoid functions are called threshold functions. At very low input values, the threshold-function output is zero. At very high input values, the output value is one.

The sigmoid function in general yields fairly well-behaved neural networks. With these functions, the inhibitory and excitatory effects of the weight factors are straightforward (i.e., wij < 0 is inhibitory, and wij > 0 is excitatory). Moreover, sigmoid functions are continuous and monotonic, and remain finite even as x approaches ±∞. Because they are monotonic, they also provide for more efficient network training. We frequently move down the slope of the curve in training, and the sigmoid functions have slopes that are well-behaved as a function of x (this topic is discussed under gradient-descent learning in Section 2.2D).
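
The well-behaved slope is easy to verify: differentiating equation (2.3) gives a derivative expressible in terms of the function itself,

f'(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}} = f(x)\,[1 - f(x)]

so the slope never exceeds 1/4 (its value at x = 0) and decays smoothly to zero as x → ±∞. The identity f′ = f(1 − f) also makes the gradient inexpensive to compute during training.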

Another useful transfer function is the hyperbolic tangent, with limiting values of −1 and +1 (Figure 2.3):


Figure 2.3 A hyperbolic tangent transfer function.

f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (2.4)

Hyperbolic tangent transfer functions also typically produce well-behaved networks.

Another useful transfer function is the radial basis function. The Gaussian transfer function is the most commonly used “radially symmetric” function (see Section 3.2 for further discussion), and its equation is:

f(x) = \exp\left( -\frac{x^{2}}{2} \right) \qquad (2.5)

Figure 2.4 shows a graphical representation of a Gaussian transfer function. The function has maximum response, f(x) = 1, when the input is x = 0, and the response decreases to f(x) = 0 as the input approaches x = ±∞. This type of response pattern makes the radial basis function very advantageous for certain types of networks, such as the classification networks described in Chapter 3.


Figure 2.4 A Gaussian transfer function.
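
The three transfer functions of equations (2.3) through (2.5) are one-liners in code; a minimal sketch (NumPy assumed) that also spot-checks their limiting values:

```python
import numpy as np

def sigmoid(x):
    """Equation (2.3): limiting values 0 (x -> -inf) and 1 (x -> +inf)."""
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    """Equation (2.4): limiting values -1 and +1."""
    return np.tanh(x)

def gaussian(x):
    """Equation (2.5): maximum response f(0) = 1, decaying to 0 as x -> +/-inf."""
    return np.exp(-x**2 / 2.0)

# Spot-check the limiting behavior described in the text.
for x in (-10.0, 0.0, 10.0):
    print(x, sigmoid(x), hyperbolic_tangent(x), gaussian(x))
```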

5 Summary of Node Anatomy

Figure 2.5 summarizes the basic features of a node. As seen, a node has an n-dimensional input vector, a; an internal threshold value, Tj; and n weight factors (w1j,…, wij,…, wnj), one for each input. If the weighted input is large enough, the node becomes active and performs a calculation based on the difference between the weighted input value and the internal threshold value. Typically, a sigmoid or hyperbolic tangent function is used for f(x), since it is a well-behaved threshold function and provides for more rapid training.


Figure 2.5 Summary of node anatomy.
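
To tie Figure 2.5 together, the full node can be expressed as a small class; this is an illustrative sketch in the spirit of the figure, not code from the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class Node:
    """One processing element: n weight factors, an internal threshold,
    and a transfer function, as summarized in Figure 2.5."""

    def __init__(self, weights, threshold=0.0, transfer=sigmoid):
        self.w = np.asarray(weights)  # w_1j, ..., w_nj, one per input
        self.T = threshold            # internal threshold T_j (zero if unspecified)
        self.f = transfer             # transfer function f( )

    def output(self, a):
        # Equations (2.1)-(2.2): weighted sum, minus threshold, through f.
        return self.f(np.dot(self.w, a) - self.T)

node = Node([0.8, 0.2, -0.4], threshold=0.1)
b_j = node.output(np.array([0.5, -1.0, 2.0]))  # sigmoid(-0.7), about 0.33
```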

B Topology of a Neural Network


The topology of a neural network refers to how its nodes are interconnected. Figure 2.6 shows a very common topology. We form these topologies, or architectures, by organizing the nodes into layers, connecting them, and weighting the interconnections. The network shown has three layers (input, hidden, and output), and each node’s output feeds into all nodes in the subsequent layer.


Figure 2.6 A neural network with one hidden layer.
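
For a fully connected topology like that of Figure 2.6, each layer’s outputs can be computed with one matrix-vector product; a minimal sketch with hypothetical layer sizes and random weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 4, 3, 2             # hypothetical layer sizes
W_hid = rng.normal(size=(n_hid, n_in))   # weights: input layer -> hidden layer
T_hid = np.zeros(n_hid)                  # hidden-layer thresholds
W_out = rng.normal(size=(n_out, n_hid))  # weights: hidden layer -> output layer
T_out = np.zeros(n_out)                  # output-layer thresholds

def forward(a):
    """Every node's output feeds into all nodes of the next layer, so each
    layer is one equation-(2.2) calculation applied node by node."""
    hidden = sigmoid(W_hid @ a - T_hid)
    return sigmoid(W_out @ hidden - T_out)

print(forward(np.array([1.0, 0.5, -0.5, 2.0])))
```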

1 Inhibitory or Excitatory Connections

As mentioned, weight factors can either inhibit or excite the node. If the weight is positive, it will excite the node, increasing its activation. If the weight is negative, it will inhibit the node, decreasing activation. If the weighted signal is highly inhibitory, it may lower the input below the threshold level and thus shut the node down.

2 Connection Options

We have three options for connecting nodes to one another, as shown in Figure 2.7. In intralayer connections, the outputs from a node feed into other nodes in the same layer. In interlayer connections, the outputs from a node in one layer feed into nodes in another layer. Finally, in recurrent connections, the output from a node feeds into...