[Article] Multi-Layer Perceptron (MLP) Neural Networks

In the previous article, I showed how the multi-layer Perceptron (MLP) neural network can be used to solve problems that are not linearly separable. In a brief example, I showed how the weights of the network were calculated by hand. This article presents the training algorithm known as back propagation, which will be used to generate the weights automatically.

Back propagation has become the most widely used algorithm for training neural networks, and it has been studied by the artificial intelligence research community since the 1970s. It is used in tools for building neural networks, such as Matlab.

The principle behind the back propagation algorithm is relatively easy to understand, although the underlying mathematics may seem somewhat complex. The steps of the algorithm are:

1. Initialize the weights of the neural network with small random values.
2. Present a pattern (data item) to the input layer of the network.
3. Propagate the input pattern through the hidden layers of the network to compute the output of its activation function.
4. Use the difference between the desired output and the output of the activation function to compute the activation error of the network.
5. Adjust the weights of the output neuron to reduce the activation error for the presented input pattern.
6. Propagate the error value back to each neuron of the hidden layer, in proportion to its contribution to the activation error of the network.
7. Adjust the weights of the hidden-layer neurons, using the propagated error, to reduce their contribution to the error computed for the presented input pattern.
8. Repeat steps 2 to 7 for each input pattern in the data set presented to the network.
9. Repeat step 8 until the network is trained.
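
The steps above can be sketched in a few lines of plain Python. This is only a minimal sketch, not the implementation presented later in this article: the names TinyMLP, forward, train_step and train, the standard sigmoid (steepness 1), the initial weight range and the learning rate of 0.5 are all assumptions chosen for illustration. The sketch computes the hidden-layer errors before changing any weight, and it stops after a fixed number of epochs instead of checking whether the network is trained.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Hypothetical network with one hidden layer and one output neuron."""

    def __init__(self, n_inputs=2, n_hidden=2, seed=0):
        rnd = random.Random(seed)
        # Step 1: small random weights; the last entry of each row is the bias.
        self.w_hidden = [[rnd.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_output = [rnd.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Steps 2-3: present the pattern and propagate it forward.
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        output = sigmoid(sum(w * v for w, v in zip(self.w_output, hb)))
        return hidden, output

    def train_step(self, x, target, rate=0.5):
        hidden, output = self.forward(x)
        # Step 4: activation error of the network.
        error_o = target - output
        delta_o = error_o * output * (1.0 - output)
        # Step 6: error fed back to each hidden neuron.
        errors_h = [delta_o * self.w_output[j] for j in range(len(hidden))]
        # Step 5: adjust the output neuron's weights (and bias).
        hb = hidden + [1.0]
        for j in range(len(hb)):
            self.w_output[j] += rate * delta_o * hb[j]
        # Step 7: adjust the hidden neurons' weights (and biases).
        xb = list(x) + [1.0]
        for j, h in enumerate(hidden):
            delta_h = errors_h[j] * h * (1.0 - h)
            for i in range(len(xb)):
                self.w_hidden[j][i] += rate * delta_h * xb[i]
        return error_o

    def train(self, patterns, epochs=100, rate=0.5):
        # Steps 8-9: visit every pattern, then repeat for a fixed number of epochs.
        for _ in range(epochs):
            for x, target in patterns:
                self.train_step(x, target, rate)


net = TinyMLP()
net.train([((0.10, 0.03), 0), ((0.21, 0.57), 1)], epochs=10)
print(net.forward((0.21, 0.57))[1])
```

Each call to train_step nudges the weights only slightly, so a single pattern never dominates the solution.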

Note that the input patterns are presented to the network in turn, and each one adjusts the weights only slightly before the network moves on to the next pattern. If we let the network correct the error for one pattern perfectly before moving on to the next, it would lose the ability to generalize, that is, to find a solution that satisfies the whole input data set.

The key trick of the training algorithm happens in step 6, where it determines how much error must be propagated back (hence the name back propagation) to each neuron of the hidden layers. Once the error value has been computed, training can continue much as in the single-layer perceptron algorithm.

To illustrate how the error value is calculated, consider the neural network below:

So, if we use these variables:

output_o = Response of the activation function of the output neuron.
error_o = Error computed at the neuron of the output layer.
error_h = Error at the neuron of the hidden layer.
weight_ho = Weight connecting the hidden-layer neuron to the output-layer neuron.

The error propagated back to the hidden-layer neuron is calculated by the formula:

error_h = error_o * Derivative(output_o) * weight_ho
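
As a worked example of the formula above, the snippet below plugs in a few numbers. The values and the helper names sigmoid_derivative and hidden_error are made up for illustration, and Derivative(output_o) is assumed to be the sigmoid derivative written in terms of the neuron's output.

```python
def sigmoid_derivative(output):
    # Derivative of the sigmoid expressed through its own output value.
    return output * (1.0 - output)


def hidden_error(error_o, output_o, weight_ho):
    # error_h = error_o * Derivative(output_o) * weight_ho
    return error_o * sigmoid_derivative(output_o) * weight_ho


# Made-up values, chosen only to make the arithmetic easy to follow.
error_h = hidden_error(error_o=0.4, output_o=0.7, weight_ho=0.5)
print(round(error_h, 6))  # 0.4 * (0.7 * 0.3) * 0.5 = 0.042
```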

For a detailed explanation of how the derivative of the sigmoid activation function is obtained, see the article on activation functions.
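
Without repeating that derivation, the result can be checked numerically. The sketch below assumes the standard sigmoid with steepness 1 and compares the closed form output * (1 - output) with a central finite difference; the helper names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative_from_output(output):
    # Closed form of the derivative, written in terms of the sigmoid's output.
    return output * (1.0 - output)


def numeric_derivative(f, x, h=1e-6):
    # Central finite difference approximation of f'(x).
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (-2.0, 0.0, 1.5):
    analytic = sigmoid_derivative_from_output(sigmoid(x))
    numeric = numeric_derivative(sigmoid, x)
    print(x, analytic, numeric)
```

The two columns should agree to several decimal places, which is a quick sanity check before using the closed form inside a training loop.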

Unlike the single-layer Perceptron, training a multi-layer Perceptron (MLP) with the back propagation algorithm does not guarantee a correct final solution, even when one exists. This happens because training can get stuck in a local minimum of the error. There are several strategies for avoiding this problem, which will be discussed in detail in a future article. For now, simply restarting the training is sufficient for small neural networks.
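
The effect of a local minimum, and why restarting helps, can be seen on a much simpler problem than an MLP. The sketch below is only an analogy: it runs gradient descent on a made-up one-dimensional function with two valleys and restarts from a new random point until the deeper valley is found. The function, the threshold of -3.0 and all names are assumptions for illustration.

```python
import random


def f(x):
    # Toy "error surface" with a local minimum near x = 1.13
    # and a global minimum near x = -1.30.
    return x ** 4 - 3 * x ** 2 + x


def f_prime(x):
    return 4 * x ** 3 - 6 * x + 1


def descend(x, rate=0.01, steps=1000):
    # Plain gradient descent: it only moves downhill from where it starts.
    for _ in range(steps):
        x -= rate * f_prime(x)
    return x


def descend_with_restarts(target=-3.0, max_restarts=50, seed=0):
    rnd = random.Random(seed)
    best = None
    for attempt in range(1, max_restarts + 1):
        x = descend(rnd.uniform(-2.0, 2.0))
        if best is None or f(x) < f(best):
            best = x
        if f(best) <= target:
            break  # good enough: stop restarting
    return best, attempt


best, attempts = descend_with_restarts()
print("x = %.3f, f(x) = %.3f, attempts = %d" % (best, f(best), attempts))
```

A start on the right-hand side of the hump always ends in the shallow valley, however long the descent runs; only a new starting point can escape it, which is the same reason the network is reinitialized with fresh random weights.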

This article uses the same classification problem proposed in the previous article.

The input patterns are represented by the points (red and blue) plotted in the graph above.

Below is a Python implementation of an MLP network, still without optimizations.

The code for the weight that connects the neurons:


##################################################
#                                                #
#  Copyright 2009 -Marcel Pinheiro Caraciolo-    #
#  Email: caraciol@gmail.com                     #
#                                                #
#  -- Simple Weight of the network               #
#  -- Version: 0.1  - 13/01/2009                 #
##################################################


#Snippet Weight

class Weight:
 
   #Class Constructor
   #This method initializes all properties of this class.
   def __init__(self):
       #Neuron related to this weight.
       self.input = None
       #The value of the weight.
       self.value = None

The code for a neuron of the network:


##################################################
#                                                #
#  Copyright 2009 -Marcel Pinheiro Caraciolo-    #
#  Email: caraciol@gmail.com                     #
#                                                #
#  -- Simple Neuron of the network               #
#  -- Version: 0.1  - 13/01/2009                 #
##################################################

import math
from Weight import Weight

#Snippet Neuron

class Neuron:
 
   #Class Constructor
   #This method initializes all properties of this class.
   #@param layer: (optional) The layer whose neurons feed this neuron.
   #@param random: (optional) A random number generator (random.Random).
   def __init__(self, *pargs):

     #Set of weights to inputs
     self._weights = None
     #Sum of inputs
     self._input = 0.0
     #Steepness of sigmoid curve
     self._lambda = 6
     #Bias value.
     self._bias = 0.0
     #Sum of error
     self._error  = 0.0
     #Learning rate.
     self._learningRate = 0.5
     #Preset value of neuron.
     self._output = None
   
     if pargs:
         #Each hidden neuron must be full-connected with all input neurons.
         inputs,rnd = pargs
         self._weights = []
         for input in inputs.getLayer():
             #New weight for each neuron.
             w = Weight()
             w.input = input
             #Initializes with a random number.
             w.value = rnd.random()
             self._weights.append(w)

   #Set the output of the neuron.
   def setOutput(self,value):
       self._output = value
 
   #Linear combination implementation of the perceptron.
   def activate(self):
       self._input = 0.0
       #Calculates the input of the hidden neuron that receives the output from
       #the input neuron.
       for w in self._weights:
           self._input += w.value * w.input.output()
         
   #Activation function of the perceptron.
   def output(self):
       if self._output is not None:
           return self._output
       return 1 / (1 + math.exp(-self._lambda * (self._input + self._bias)))
 
   #Calculates the error (feedback).
   def errorFeedback(self, input):
       weight = None
       for w in self._weights:
           if w.input == input:
               weight = w
               break
       return self._error * self._derivative() * weight.value
 
   #The derivative of activation function.
   def _derivative(self):
       return self.output() *  (1  - self.output())
 
   #Adjust the weights connected to the neuron.
   def adjustWeights(self,value):
       self._error = value
       for w in self._weights:
           w.value += self._error * self._derivative() * self._learningRate * w.input.output()
       self._bias += self._error * self._derivative() * self._learningRate

The neurons are organized in layers. Below is the abstraction of a layer in Python:


##################################################
#                                                #
#  Copyright 2009 -Marcel Pinheiro Caraciolo-    #
#  Email: caraciol@gmail.com                     #
#                                                #
#  -- Simple Layer of the network                #
#  -- Version: 0.1  - 13/01/2009                 #
##################################################


#Snippet Layer

from Neuron import Neuron

class Layer:
 
   #Class Constructor
   #This method initializes all properties of this class.
   #@param size: The number of Neurons that composes the layer.
   #@param layer: The Input layer that will be connected to the Hidden Neuron.
   #@param random: A random number generator (random.Random).
   def __init__(self,size, *pargs):
     
       self._base = []
     
       if pargs:
           #The hidden layer is full-connected with the input layer.
           #So, for each neuron, all input neurons are passed as parameter.
           lyer,rnd = pargs
           for i in range(size):
               self._base.append(Neuron(lyer,rnd))  
       else:
           for i in range(size):
               self._base.append(Neuron())
 
 
   #Get the collection
   #@return:  base (list with all neurons in the layer)
   def getLayer(self):
       return self._base

And the code for the whole neural network, with layers of neurons interconnected by weights:


##################################################
#                                                #
#  Copyright 2009 -Marcel Pinheiro Caraciolo-    #
#  Email: caraciol@gmail.com                     #
#                                                #
#  -- Multi Layer Perceptron Neural Net          #
#  -- Version: 0.1  - 12/01/2009                 #
##################################################


#Snippet MLP

import random
import math
from Layer import Layer
from Neuron import Neuron


"""
   This is a Multi-Layer Perceptron class
   with BackPropagation algorithm implemented.
"""
class MLP:
 
     
   #Class Constructor
   #This method initializes all properties of this class.
   #@param iterations: The number of iterations.
   #@param architecture: (numberOfHiddenNeurons,numberOfInputNeurons)
   #@param patterns: Collection of training patterns
   def __init__(self,patterns,iterations=5000, architecture=(2,1)):
     
       #Number of hidden neurons, Number of input Neurons
       self._hiddenDims,self._inputDims = architecture
       #Current training iteration
       self._iteration = 0
       #Maximum number of iterations before restart training
       self._iterations = iterations
       #Set of hidden neurons
       self._hiddenLayer = None
       #Set of input neurons
       self._inputLayer = None
       #List of training patterns
       self._patterns = list(patterns)
       #Output neuron
       self._output = None
       #Random number generator
       self._rnd = random.Random()
     
       #Initialize the network
       self._initialize()
 
 
   #Initialize the network based on its architecture.
   def _initialize(self):
       #First Layer (2 neurons representing x and y)
       self._inputLayer = Layer(self._inputDims)
       #Hidden Layer (2 neurons representing h1 and h2)
       self._hiddenLayer = Layer(self._hiddenDims, self._inputLayer, self._rnd)
       #Initialize the  output neuron (o).
       self._output = Neuron(self._hiddenLayer,self._rnd)
       #Reset the iterations.
       self._iteration = 0
       print("Network Initialized...")
 
   #Adjust the network weights.  
   def _adjustWeights(self, delta):
       #Adjust the weights from the output neuron.
       self._output.adjustWeights(delta)
       #Adjust the weights from the hidden neurons (retro-propagation).
       for neuron in self._hiddenLayer.getLayer():
           #Output neuron connected to the respective neuron.
           neuron.adjustWeights(self._output.errorFeedback(neuron))
 
 
   #Propagates the perceptron and calculate the output.
   def _activate(self, pattern):
       for i in range(len(pattern[0])):
           #for each input neuron set the respective input (x,y).
           self._inputLayer.getLayer()[i].setOutput(pattern[0][i])        
       
       for neuron in self._hiddenLayer.getLayer():
           #Propagate through the network (hidden neurons).
           neuron.activate()
     
       #Calculate the output from the output neuron  
       self._output.activate()
     
       #Calculates the output of the network (output neuron)
       return self._output.output()

   #Do the training of the perceptron network.
   def train(self):
      error = 1.0
      while error > 0.1:
          error = 0.0
          for pattern in self._patterns:
              #Calculates the error.
              delta = pattern[1] - self._activate(pattern)
              #Adjust the weights of the network
              self._adjustWeights(delta)
              #Evaluates the global error
              error += math.pow(delta,2)
          print("Iteration %d Error: %f" % (self._iteration, error))
          self._iteration += 1
          #Restarts the training/Initialization.
          if self._iteration > self._iterations:
              self._initialize()

   #Test the network after trained.
   def execute(self,pattern):
       return self._activate(pattern)
 
   #Generates evenly spaced floats from start to stop (inclusive).
   def arange(self,start,stop=None,step=None):
       if stop is None:
           stop = float(start)
           start = 0.0
       if step is None:
           step = 1.0
       cur = float(start)
       while cur <= stop:
           yield cur
           cur += step

Run the program above using:


       
#main Logic

#Load sample input patterns
inputs = [
[(0.10, 0.03), 0], [(0.11, 0.11), 0],[(0.11, 0.82), 0],
[(0.13, 0.17), 0],[(0.20, 0.81), 0],
[(0.21, 0.57), 1],[(0.25, 0.52), 1],[(0.26, 0.48), 1],
[(0.28, 0.17), 1],[(0.28, 0.45), 1],[(0.37, 0.28), 1],
[(0.41, 0.92), 0], [(0.43, 0.04), 1], [(0.44, 0.55), 1],
[(0.47, 0.84), 0], [(0.50, 0.36), 1], [(0.51, 0.96), 0],
[(0.56, 0.62), 1], [(0.65, 0.01), 1], [(0.67, 0.50), 1],
[(0.73, 0.05), 1], [(0.73, 0.90), 0], [(0.73, 0.99), 0],
[(0.78, 0.01), 1], [(0.83, 0.62), 0], [(0.86, 0.42), 1],
[(0.86, 0.91), 0], [(0.89, 0.12), 1], [(0.95, 0.15), 1],
[(0.98, 0.73), 0] ]        

#Initializes the Multi-Layer Perceptron
mlp = MLP(inputs,iterations=2000, architecture=(2,2))
#Train the network with the training inputs.
mlp.train()
#Display network generalization
print("")
print("X, Y, Output")
for pattern in inputs:
   #Calculate output.
   result =  mlp.execute(pattern[:-1])
   print("%f %f %s" % (pattern[0][0],pattern[0][1],result))

When the application runs, it uses the back propagation algorithm to train the neural network, restarting whenever it gets stuck in a local minimum of the error. Once the network is fully trained, it can be tested with new, unseen inputs (points on the graph) to observe its ability to generalize, that is, to classify such points correctly after learning.

The code is easy to modify to add more dimensions (this graph has 2 dimensions, but it could have 3) and to increase the number of neurons in the hidden layer, by changing the two values of the architecture argument. Do not forget, of course, to add the extra columns to the set of inputs presented to the network. In this way, the network can be used to solve more complex problems.
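
As a sketch of what such a change might look like, the snippet below builds a small three-dimensional data set in the same [(inputs...), label] format as the sample data and checks that it matches the chosen architecture. The values, the architecture (4, 3) and the helper check_patterns are hypothetical, and the MLP call is left commented out because it needs the classes from this article.

```python
# Hypothetical 3-D patterns in the same [(inputs...), label] format
# used by the article's sample data (the values are made up).
patterns_3d = [
    [(0.10, 0.03, 0.50), 0],
    [(0.21, 0.57, 0.20), 1],
    [(0.73, 0.90, 0.80), 0],
    [(0.86, 0.42, 0.10), 1],
]

# (numberOfHiddenNeurons, numberOfInputNeurons), as in the MLP constructor.
architecture = (4, 3)


def check_patterns(patterns, architecture):
    # Hypothetical helper: every pattern must supply one value per input neuron.
    _, input_dims = architecture
    return all(len(inputs) == input_dims for inputs, _ in patterns)


print(check_patterns(patterns_3d, architecture))  # True

# With the MLP class from this article available, training would then be:
# mlp = MLP(patterns_3d, iterations=2000, architecture=architecture)
# mlp.train()
```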

Download the code here.

Applications of this neural network may be covered in future articles, showing the power of these networks in solving real-world problems. Stay tuned!

Artificial Intelligence in Motion
