Code explained: Building a deep neural network with an arbitrary number of layers in Python

Author (pen name): 读芯术

2019-11-25

Editor's recommendation:
This article explains how to build a deep neural network with an arbitrary number of layers. The network can be applied to supervised learning problems of binary classification.
The article originally appeared on Baidu and was edited and recommended by Alice of 火龙果软件.

ͼ1 Éñ¾­ÍøÂç¹¹ÔìµÄÀý×Ó£¨·ûºÅ˵Ã÷£ºÉϱê[l]±íʾÓëµÚl²ã£»Éϱ꣨i£©±íʾµÚi¸öÀý×Ó£»Ï±êi±íʾʸÁ¿µÚiÏ

Single-layer neural network

Figure 2: Example of a single-layer neural network

The neuron model first computes a linear function (z = Wx + b) and then applies an activation function. In general, the output of a neuron is a = g(Wx + b), where g is the activation function (sigmoid, tanh, ReLU, ...).
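
As a minimal, self-contained illustration of this neuron model (the numbers below are made up for the example), a single sigmoid neuron can be computed as follows:

import numpy as np

x = np.array([[0.5], [1.2], [3.0]])   # input vector with 3 features
W = np.array([[0.1, -0.4, 0.25]])     # weights of one neuron, shape (1, 3)
b = np.array([[0.05]])                # bias, shape (1, 1)

z = np.dot(W, x) + b                  # linear part: z = Wx + b
a = 1 / (1 + np.exp(-z))              # sigmoid activation: a = g(z)
print(a)                              # a single value between 0 and 1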

Dataset

Suppose we have a large database of weather records, for example temperature, humidity, air pressure, and rainfall rate.

Problem statement:

A training set m_train, labeled (1) if it rained and (0) if it did not.

A test set m_test, labeled with whether or not it rained.

Each weather example contains x1 = temperature, x2 = humidity, x3 = air pressure.

A common preprocessing step in machine learning is to center and standardize the dataset, which means subtracting the mean of the whole numpy array from each example and then dividing each example by the standard deviation of the whole numpy array.

ͨÓ÷½·¨£¨½¨Á¢²¿·ÖËã·¨£©

ʹÓÃÉî¶ÈѧϰÀ´½¨ÔìÄ£ÐÍ

1. ¶¨ÒåÄ£Ð͹¹Ô죨ÀýÈ磬Êý¾ÝµÄÊäÈëÌØÕ÷£©

2. ³õʼ»¯²ÎÊý²¢¶¨Ò峬²ÎÊý

µü´ú´ÎÊý

ÔÚÉñ¾­ÍøÂçÖеÄL²ãµÄ²ãÊý

Òþ²Ø²ã´óС

ѧϰÂʦÁ

3. µü´úÑ­»·

ÕýÏò´«²¥£¨¼ÆËãµçÁ÷ËðºÄ£©

¼ÆËã³É±¾º¯Êý

·´Ïò´«²¥£¨¼ÆËãµçÁ÷ËðºÄ£©

Éý¼¶²ÎÊý£¨Ê¹Óñ³¾°²ÎÊýºÍÌݶȣ©

4. ʹÓÃѵÁ·²ÎÊýÀ´Ô¤²â±êÇ©

³õʼ»¯

¸üÉî²ã´ÎµÄL-²ãÉñ¾­ÍøÂçµÄ³õʼ»¯¸üΪ¸´ÔÓ£¬ÒòΪÓиü¶àµÄÈ¨ÖØ¾ØÕóºÍÆ«ÖÃÏòÁ¿¡£Ï±íչʾÁ˲»Í¬½á¹¹µÄ¸÷Öֲ㼶¡£

±í1 L²ãµÄÈ¨ÖØ¾ØÕów¡¢Æ«ÖÃÏòÁ¿bºÍ¼¤»îº¯Êýz

±í2 ʾÀý¼Ü¹¹ÖеÄÉñ¾­ÍøÂçÈ¨ÖØ¾ØÕów¡¢Æ«ÖÃÏòÁ¿bºÍ¼¤»îº¯Êýz

±í2°ïÖúÎÒÃÇΪͼ1ÖеÄʾÀýÉñ¾­ÍøÂç¼Ü¹¹µÄ¾ØÕó×¼±¸ÁËÕýÈ·µÄά¶È¡£

import numpy as np
import matplotlib.pyplot as plt

nn_architecture = [
    {"layer_size": 4, "activation": "none"},  # input layer
    {"layer_size": 5, "activation": "relu"},
    {"layer_size": 4, "activation": "relu"},
    {"layer_size": 3, "activation": "relu"},
    {"layer_size": 1, "activation": "sigmoid"}
]

def initialize_parameters(nn_architecture, seed=3):
    np.random.seed(seed)
    # python dictionary containing our parameters "W1", "b1", ..., "WL", "bL"
    parameters = {}
    number_of_layers = len(nn_architecture)

    for l in range(1, number_of_layers):
        parameters['W' + str(l)] = np.random.randn(
            nn_architecture[l]["layer_size"],
            nn_architecture[l-1]["layer_size"]
            ) * 0.01
        parameters['b' + str(l)] = np.zeros((nn_architecture[l]["layer_size"], 1))

    return parameters

Snippet 1: Parameter initialization
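
As a quick sanity check (not part of the original snippet), the shapes produced by initialize_parameters can be printed and compared against Table 2:

params = initialize_parameters(nn_architecture)
for l in range(1, len(nn_architecture)):
    print('W' + str(l), params['W' + str(l)].shape, 'b' + str(l), params['b' + str(l)].shape)
# Expected output: W1 (5, 4), b1 (5, 1); W2 (4, 5), b2 (4, 1); W3 (3, 4), b3 (3, 1); W4 (1, 3), b4 (1, 1)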

ʹÓÃÐ¡Ëæ»úÊý³õʼ»¯²ÎÊýÊÇÒ»ÖÖ¼òµ¥µÄ·½·¨£¬µ«Í¬Ê±Ò²±£Ö¤Ëã·¨µÄÆðʼֵ×ã¹»ºÃ¡£

¼Çס£º

¡¤ ²»Í¬µÄ³õʼ»¯¹¤¾ß£¬ÀýÈçZero,Random, He or Xavier£¬¶¼»áµ¼Ö²»Í¬µÄ½á¹û¡£

¡¤ Ëæ»ú³õʼ»¯Äܹ»È·±£²»Í¬µÄÒþ²Øµ¥Ôª¿ÉÒÔѧϰ²»Í¬µÄ¶«Î÷£¨³õʼ»¯ËùÓÐÈ¨ÖØÎªÁã»áµ¼Ö£¬ËùÓвã´ÎµÄËùÓиÐÖª»ú¶¼½«Ñ§Ï°ÏàͬµÄ¶«Î÷£©¡£

¡¤ ²»Òª³õʼ»¯ÎªÌ«´óµÄÖµ¡£

Activation functions

Activation functions add non-linearity to the neural network. The example below uses sigmoid and ReLU.

Sigmoid outputs a value between 0 and 1, which makes it a good choice for binary classification: an output below 0.5 can be classified as 0, and an output above 0.5 as 1.

def sigmoid(Z):
    S = 1 / (1 + np.exp(-Z))
    return S

def relu(Z):
    R = np.maximum(0, Z)
    return R

def sigmoid_backward(dA, Z):
    S = sigmoid(Z)
    dS = S * (1 - S)
    return dA * dS

def relu_backward(dA, Z):
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

Snippet 2: Sigmoid and ReLU activation functions and their derivatives

Snippet 2 shows vectorized implementations of the activation functions and their derivatives. This code will be used in the further computations.
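
For example (a small check that is not in the original text), relu_backward passes the incoming gradient through only where Z is positive:

Z = np.array([[-1.0, 2.0, 0.5]])
dA = np.array([[1.0, 1.0, 1.0]])
print(relu(Z))               # [[0.  2.  0.5]]
print(relu_backward(dA, Z))  # [[0. 1. 1.]]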

Forward propagation

During forward propagation, the forward function for layer l needs to know which activation function that layer uses (sigmoid, tanh, ReLU, etc.). The output of the previous layer becomes the input of the current layer: z is computed first, and then the chosen activation function is applied.

Figure 3: Forward propagation of the neural network

The linear forward module (vectorized over all examples) computes the following equation:

·½³Ìʽ1 ÏßÐÔÕýÏòº¯Êý

def L_model_forward(X, parameters, nn_architecture):
    forward_cache = {}
    A = X
    # store the input as "A0" so that backpropagation can access it
    # as the previous-layer activation of the first layer
    forward_cache['A0'] = X
    number_of_layers = len(nn_architecture)

    for l in range(1, number_of_layers):
        A_prev = A
        W = parameters['W' + str(l)]
        b = parameters['b' + str(l)]
        activation = nn_architecture[l]["activation"]
        Z, A = linear_activation_forward(A_prev, W, b, activation)
        forward_cache['Z' + str(l)] = Z
        forward_cache['A' + str(l)] = A

    AL = A
    return AL, forward_cache

def linear_activation_forward(A_prev, W, b, activation):
    if activation == "sigmoid":
        Z = linear_forward(A_prev, W, b)
        A = sigmoid(Z)
    elif activation == "relu":
        Z = linear_forward(A_prev, W, b)
        A = relu(Z)
    return Z, A

def linear_forward(A, W, b):
    Z = np.dot(W, A) + b
    return Z

Snippet 3: Forward propagation module

A "cache" (a Python dictionary that holds the A and Z values computed for each layer) is used to pass variables from forward propagation to the corresponding backward propagation step; it contains the values needed to compute derivatives during backpropagation.

Ëðʧº¯Êý

ΪÁ˹ܳÌѧϰ¹ý³Ì£¬ÐèÒª¼ÆËã´ú¼Ûº¯ÊýµÄÖµ¡£ÏÂÃæµÄ¹«Ê½ÓÃÓÚ¼ÆËã³É±¾¡£

·½³Ìʽ2 ½»²æìسɱ¾

def compute_cost(AL, Y):
    m = Y.shape[1]
    # Compute loss from AL and Y
    logprobs = np.multiply(np.log(AL), Y) + np.multiply(1 - Y, np.log(1 - AL))
    # cross-entropy cost
    cost = - np.sum(logprobs) / m
    cost = np.squeeze(cost)
    return cost

Snippet 4: Computing the cost function

Backward propagation

Backpropagation is used to compute the gradient of the loss function with respect to the parameters. The algorithm applies the "chain rule" known from differential calculus recursively.

The formulas used in the backpropagation computation:

·½³Ìʽ3 ·´Ïò´«²¥¼ÆË㹫ʽ

The chain rule is the formula for computing the derivative of a composite function, that is, a function applied to the result of another function.
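
Written out (a standard identity, added here only for reference), the chain rule for a function f applied to the output of a function g is:

\frac{d}{dx} f\big(g(x)\big) = f'\big(g(x)\big) \, g'(x)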

·½³Ìʽ4 Á´¹æÔòʾÀý

¡°Á´¹æÔò¡±ÔÚ¼ÆËãËðʧʱʮ·ÖÖØÒª£¨ÒÔ·½³Ìʽ5ΪÀý£©¡£

·½³Ìʽ5 Ëðʧº¯Êý£¨º¬Ìæ»»Êý¾Ý£©¼°ÆäÏà¶ÔÓÚµÚÒ»È¨ÖØµÄµ¼Êý

Éñ¾­ÍøÂçÄ£ÐÍ·´Ïò´«²¥µÄµÚÒ»²½ÊǼÆËã×îºóÒ»²ãËðʧº¯ÊýÏà¶ÔÓÚzµÄµ¼Êý¡£·½³Ìʽ6ÓÉÁ½²¿·Ö×é³É£º·½³Ìʽ2Ëðʧº¯ÊýµÄµ¼Êý£¨¹ØÓÚ¼¤»îº¯Êý£©ºÍ¼¤»îº¯Êý¡°sigmoid¡±¹ØÓÚ×îºóÒ»²ãZµÄµ¼Êý¡£

·½³Ìʽ6 ´Ó4²ã¶ÔzµÄËðʧº¯Êýµ¼Êý

·½³Ìʽ6µÄ½á¹û¿ÉÓÃÓÚ¼ÆËã·½³Ìʽ3µÄµ¼Êý¡£

·½³Ìʽ7 Ëðʧº¯ÊýÏà¶ÔÓÚ3²ãµÄµ¼Êý

ÔÚ½øÒ»²½¼ÆËãÖУ¬Ê¹ÓÃÁËÓëµÚÈý²ã¼¤»îº¯ÊýÓйصÄËðʧº¯ÊýµÄµ¼Êý£¨·½³Ìʽ7£©¡£

·½³Ìʽ8 µÚÈý²ãµÄµ¼Êý

·½³Ìʽ7µÄ½á¹ûºÍµÚÈý²ã»î»¯º¯Êý¡°relu¡±µÄµ¼ÊýÓÃÓÚ¼ÆËã·½³Ìʽ8µÄµ¼Êý£¨Ëðʧº¯ÊýÏà¶ÔÓÚzµÄµ¼Êý£©¡£È»ºó£¬ÎÒÃǶԷ½³Ìʽ3½øÐÐÁ˼ÆËã¡£

ÎÒÃǶԷ½³Ì9ºÍ10×öÁËÀàËÆµÄ¼ÆËã¡£

·½³Ìʽ9 µÚ¶þ²ãµÄµ¼Êý

·½³Ìʽ10 µÚÒ»²ãµÄµ¼Êý

Overall idea

The derivative of the loss function with respect to z of layer l helps to compute the derivative of the loss function with respect to the activation of layer l-1 (the previous layer); that result is then combined with the derivative of the activation function of the previous layer.

ͼ4 Éñ¾­ÍøÂçµÄ·´Ïò´«²¥

def L_model_backward(AL, Y, parameters, forward_cache, nn_architecture):
    grads = {}
    number_of_layers = len(nn_architecture)
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)  # after this line, Y has the same shape as AL

    # Initializing the backpropagation
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    dA_prev = dAL

    for l in reversed(range(1, number_of_layers)):
        dA_curr = dA_prev
        activation = nn_architecture[l]["activation"]
        W_curr = parameters['W' + str(l)]
        Z_curr = forward_cache['Z' + str(l)]
        A_prev = forward_cache['A' + str(l-1)]
        dA_prev, dW_curr, db_curr = linear_activation_backward(dA_curr, Z_curr, A_prev, W_curr, activation)
        grads["dW" + str(l)] = dW_curr
        grads["db" + str(l)] = db_curr

    return grads

def linear_activation_backward(dA, Z, A_prev, W, activation):
    if activation == "relu":
        dZ = relu_backward(dA, Z)
        dA_prev, dW, db = linear_backward(dZ, A_prev, W)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, Z)
        dA_prev, dW, db = linear_backward(dZ, A_prev, W)
    return dA_prev, dW, db

def linear_backward(dZ, A_prev, W):
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db

Snippet 5: Backward propagation module

Updating the parameters

The goal of this function is to update the model's parameters using gradient descent.

def update_parameters(parameters, grads, learning_rate):
    # one weight matrix and one bias vector are stored per layer
    L = len(parameters) // 2
    for l in range(1, L + 1):
        parameters["W" + str(l)] = parameters["W" + str(l)] - learning_rate * grads["dW" + str(l)]
        parameters["b" + str(l)] = parameters["b" + str(l)] - learning_rate * grads["db" + str(l)]
    return parameters

Snippet 6: Updating the parameter values with gradient descent

Full model

The complete implementation of the neural network model consists of the methods provided in the snippets above.

def L_layer_model(X, Y, nn_architecture, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    np.random.seed(1)
    # keep track of cost
    costs = []

    # Parameters initialization.
    parameters = initialize_parameters(nn_architecture)

    # Loop (gradient descent)
    for i in range(0, num_iterations):
        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        AL, forward_cache = L_model_forward(X, parameters, nn_architecture)
        # Compute cost.
        cost = compute_cost(AL, Y)
        # Backward propagation.
        grads = L_model_backward(AL, Y, parameters, forward_cache, nn_architecture)
        # Update parameters.
        parameters = update_parameters(parameters, grads, learning_rate)
        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate = " + str(learning_rate))
    plt.show()

    return parameters

Snippet 7: The complete neural network model

To make predictions, you only need to run the forward propagation model with the learned weights on a series of test data.
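
A hypothetical end-to-end run might look like the sketch below; the synthetic data, the 0.5 threshold and the predict helper are illustrative assumptions rather than part of the original snippets:

# made-up data: 4 input features (to match nn_architecture) and a binary label
np.random.seed(0)
X_train = np.random.randn(4, 300)
Y_train = (X_train[0:1, :] + X_train[1:2, :] > 0).astype(float)   # shape (1, 300)

parameters = L_layer_model(X_train, Y_train, nn_architecture,
                           learning_rate=0.0075, num_iterations=3000, print_cost=True)

def predict(X, parameters, nn_architecture):
    # run forward propagation with the trained weights, then apply the 0.5 threshold
    AL, _ = L_model_forward(X, parameters, nn_architecture)
    return (AL > 0.5).astype(int)

X_test = np.random.randn(4, 50)
predictions = predict(X_test, parameters, nn_architecture)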

You can modify nn_architecture in Snippet 1 to build a neural network with a different number of layers and different hidden layer sizes. Additionally, make sure to correctly implement the activation functions and their derivatives (Snippet 2). The implemented functions can then be used in the linear_activation_forward method in Snippet 3 and the linear_activation_backward method in Snippet 5.

Further improvements

If the training dataset is not big enough, you may face the "overfitting" problem: the learned network does not generalize to new examples it has never seen. You can use regularization methods such as L2 regularization (which consists of appropriately modifying the cost function, as sketched below) or dropout (which randomly shuts down some neurons in each iteration).

ÎÒÃÇʹÓÃÌݶÈϽµÀ´¸üвÎÊýºÍ×îС»¯³É±¾¡£Äã¿ÉÒÔѧϰ¸ü¶à¸ß¼¶ÓÅ»¯·½·¨£¬ÕâЩ·½·¨¿ÉÒÔ¼Ó¿ìѧϰËÙ¶È£¬ÉõÖÁ¿ÉÒÔΪ³É±¾º¯ÊýÌṩ¸üºÃµÄ×îÖÕ¼ÛÖµ£¬ÀýÈ磺

¡¤ СÅúÁ¿ÌݶÈϽµ

¡¤ ¶¯Á¦

¡¤ AdamÓÅ»¯Æ÷
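
As a rough sketch of the first item, mini-batch gradient descent runs the same forward/backward/update steps on random slices of the training data instead of the whole set; batch_size is a hypothetical hyperparameter and the helper below is illustrative only:

def random_mini_batches(X, Y, batch_size=64, seed=0):
    np.random.seed(seed)
    m = X.shape[1]
    permutation = np.random.permutation(m)   # shuffle the examples
    X_shuffled = X[:, permutation]
    Y_shuffled = Y[:, permutation]
    mini_batches = []
    for k in range(0, m, batch_size):
        mini_batches.append((X_shuffled[:, k:k + batch_size],
                             Y_shuffled[:, k:k + batch_size]))
    return mini_batches

Each (mini_batch_X, mini_batch_Y) pair would then replace X and Y inside the training loop of Snippet 7.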

   