Introduction
Natural language processing (NLP) is the technology of getting computers to process human language: drawing on a body of techniques and theory, NLP lets computers analyze and understand the content of human communication. Traditional NLP methods relied heavily on knowledge from linguistics itself, whereas deep learning, a form of representation learning, is now widely applied to machine translation, question answering, text classification, sentiment analysis, information extraction, sequence labeling, syntactic parsing, and more.
At the end of 2013 Google released the word2vec tool, which represents each word as a vector, turning text into numbers that can be applied effectively to text analysis. In 2016 Google open-sourced a model for automatic text summarization together with its TensorFlow code. In 2016 and 2017 Google released and then upgraded the language-processing framework SyntaxNet, raising recognition accuracy by 25% and bringing text segmentation and morphological analysis to 40 languages. In 2017 Google open-sourced tf-seq2seq, a general-purpose encoder/decoder framework for automatic translation. This article works with the TensorFlow platform to cover TensorFlow's word-embedding model (Vector Representations of Words), language prediction with RNN and LSTM models, and TensorFlow's automatic translation model.
The Mathematics of Word2Vec in Brief
We hand natural language over to machine learning, but machines cannot understand human language directly, so the first task is to turn language into mathematics. In 1986 Hinton proposed the Distributed Representation approach, in which training maps every word in a language to a fixed-length vector. Together these vectors form a word-vector space, and each vector can be viewed as a point in that space, so the similarity of words can be judged from the distance between them. The same idea extends to sentences, documents, and Chinese word segmentation.
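As a quick illustration (with hypothetical vectors invented for this example, not learned ones), similarity in the embedding space is typically measured with cosine similarity:

import numpy as np

# Hypothetical 4-dimensional embeddings for two related words.
king = np.array([0.8, 0.3, 0.1, 0.5])
queen = np.array([0.7, 0.4, 0.2, 0.5])

# Cosine similarity: values near 1.0 mean the words are close in meaning.
cosine = king.dot(queen) / (np.linalg.norm(king) * np.linalg.norm(queen))
print(cosine)  # ~0.98 for these made-up vectors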
Word2Vec uses two models: the CBOW model (Continuous Bag-of-Words Model) and the Skip-gram model (Continuous Skip-gram Model). As illustrated below, both are three-layer neural networks consisting of an input layer, a projection layer, and an output layer.
[Figure: the three-layer architectures of the CBOW and Skip-gram models]

The probability of the target word w_t given its context h is modeled with a softmax over the vocabulary V:

$$P(w_t \mid h) = \mathrm{softmax}\big(\mathrm{score}(w_t, h)\big) = \frac{\exp\{\mathrm{score}(w_t, h)\}}{\sum_{w' \in V} \exp\{\mathrm{score}(w', h)\}}$$
Here score(w_t, h) is the probability score that the prediction is w_t given the context h. The objective above can be rewritten as a log-likelihood to be maximized:
$$J_{\mathrm{ML}} = \log P(w_t \mid h) = \mathrm{score}(w_t, h) - \log \sum_{w' \in V} \exp\{\mathrm{score}(w', h)\}$$
Solving this probability model is computationally very expensive: because of the normalization sum above, every training step of the neural network must compute a probability score for every word of the vocabulary in the current context, as illustrated below.

[Figure: a full probabilistic model must normalize over the entire vocabulary at every training step]
When word2vec is used for feature learning, however, the full probability model does not need to be computed. The CBOW and skip-gram models instead make the prediction with binary logistic regression. As shown in the CBOW model figure below, to speed up training and improve the quality of the embeddings, random negative sampling (Negative Sampling) is usually applied: the noise words w1, w2, w3, ..., wk are the selected negative samples.
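For reference, the negative-sampling objective optimized in the TensorFlow word2vec tutorial (reconstructed from that tutorial; Q_theta(D=1 | w, h) is the logistic-regression probability that word w is a true sample in context h, and k is the number of noise words) is:

$$J_{\mathrm{NEG}} = \log Q_\theta(D=1 \mid w_t, h) + k \, \mathbb{E}_{\tilde{w} \sim P_{\mathrm{noise}}}\big[\log Q_\theta(D=0 \mid \tilde{w}, h)\big]$$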
[Figure: CBOW model with negative sampling; the noise words w1 ... wk serve as the negative samples]
TensorFlow Synonym Model
This chapter shows how to find near-synonyms with the TensorFlow word2vec model. The input is a large body of English text and the output is each word's closest neighbors. For example, after learning from the text, the words closest in meaning to five are: four, three, seven, eight, six, two, zero, nine. Training on the corpus for 100,000 iterations, with the network's loss falling to about 4.6, yields the related words shown below:
[Figure: nearest neighbors learned for sample words after 100,000 training iterations]
The TensorFlow word2vec API is used as follows.
Build the embedding variable, where vocabulary_size is the size of the dictionary and embedding_size is the embedding dimension:

embeddings = tf.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
Define the weights and biases of the logistic regression used for negative sampling:

nce_weights = tf.Variable(tf.truncated_normal([vocabulary_size, embedding_size], stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
Define the placeholders through which the training data is fed in:

train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
Look up the embedding vectors corresponding to the training inputs:

embed = tf.nn.embedding_lookup(embeddings, train_inputs)
Compute the loss value with the negative-sampling method:

loss = tf.reduce_mean(tf.nn.nce_loss(weights=nce_weights, biases=nce_biases, labels=train_labels, inputs=embed, num_sampled=num_sampled, num_classes=vocabulary_size))
Define the optimization step, minimizing the loss with stochastic gradient descent:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss)
Run the model training through a TensorFlow session:

for inputs, labels in generate_batch(...):
    feed_dict = {train_inputs: inputs, train_labels: labels}
    _, cur_loss = session.run([optimizer, loss], feed_dict=feed_dict)
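After training, related words can be read directly off the learned embeddings. The sketch below follows the evaluation step of the TensorFlow word2vec tutorial; valid_dataset is an assumed tensor holding the word ids to probe. The embeddings are L2-normalized so that a matrix product gives cosine similarities, and the highest-scoring columns are the nearest words:

# L2-normalize the embeddings so dot products equal cosine similarity.
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
normalized_embeddings = embeddings / norm
# Compare a few probe words (valid_dataset: word ids) against all words.
valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)
similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)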
TensorFlow Language Prediction Model
This chapter reviews how RNNs and LSTMs work and uses them to train a language model, that is: given a sequence of words, predict the most likely next word. For example, given the 3-word LSTM input sequence [had, a, general], what is the next word? See the figure below:
[Figure: an LSTM predicting the word that follows the input sequence "had, a, general"]
How RNNs Work
A recurrent neural network (RNN) is a class of neural networks for processing sequential data. The difference from a convolutional network is that convolutional networks suit grid-structured data (such as images), while recurrent networks suit sequential data. For example, to predict the next word in a sentence you generally need the preceding words, because the words in a sentence are not independent of one another. An RNN is called recurrent because the current output of a sequence also depends on the outputs that came before. Concretely, the network remembers earlier information and applies it when computing the current output: the hidden-layer nodes are no longer unconnected but connected to each other, and the hidden layer's input includes not only the output of the input layer but also the hidden layer's own output from the previous time step, as shown below:
[Figure: an RNN unrolled through time; each hidden state feeds into the next time step]
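In the simplest formulation (standard textbook notation, not specific to this article's figure), the hidden state and output at time step t are

$$h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h), \qquad y_t = \mathrm{softmax}(W_{hy} h_t + b_y)$$

where h_{t-1} carries the memory of everything the network has seen so far.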
How LSTMs Work
RNNs have a problem: during backpropagation the gradients decay exponentially, so after propagating through many stages they tend to vanish, and long-range dependencies cannot be handled. Although an RNN can in theory process sequences of arbitrary length, in practice it struggles with sequences longer than about 10 steps. To solve the vanishing-gradient problem, the Long Short-Term Memory module was proposed: gates switch memory on and off along the sequence, so that when the error is backpropagated from the output layer it can be retained by the module's memory cell. This is how an LSTM remembers information over longer spans of time. A typical LSTM module is shown below:
[Figure: a typical LSTM memory block, with input, forget, and output gates around a memory cell]
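In a standard formulation (the exact parameterization in the original figure may differ; sigma is the logistic sigmoid and the dot operator below denotes element-wise multiplication), the input gate i_t and forget gate f_t decide what enters and what is kept in the memory cell c_t:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)$$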
The output gate, like the input gate, produces a vector of values between 0 and 1 that controls what the memory cell sends to the output layer, as in the formula below:
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o), \qquad h_t = o_t \odot \tanh(c_t)$$
Working together, the three gates let an LSTM memory block store and retrieve long-term information. For example, as long as the input gate stays closed, the memory cell's contents will not be overwritten by inputs at later time steps.
Building a Word Prediction Model with TensorFlow
Start by downloading the PTB model data; the dataset contains roughly 10,000 distinct words, with the infrequent ones specially marked.
The sample dataset then needs preprocessing: every word is labeled with an integer, i.e., a dictionary index is built, as follows.
Read the training data:

data = _read_words(filename)
# Sort the words by frequency of occurrence.
counter = collections.Counter(data)
count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
# Build the dictionary and its index.
words, _ = list(zip(*count_pairs))
word_to_id = dict(zip(words, range(len(words))))
Read the training words and convert them into a sequence of word indices:

data = _read_words(filename)
data = [word_to_id[word] for word in data if word in word_to_id]
Generate the training data and labels, where epoch_size is the number of training iterations in this epoch and num_steps is the LSTM sequence length; the label y is simply x shifted one word to the right:

i = tf.train.range_input_producer(epoch_size, shuffle=False).dequeue()
x = tf.strided_slice(data, [0, i * num_steps], [batch_size, (i + 1) * num_steps])
x.set_shape([batch_size, num_steps])
y = tf.strided_slice(data, [0, i * num_steps + 1], [batch_size, (i + 1) * num_steps + 1])
y.set_shape([batch_size, num_steps])
Build the LSTM cell, where size is the number of hidden units:

lstm_cell = tf.contrib.rnn.BasicLSTMCell(size, forget_bias=0.0, state_is_tuple=True)
In training mode, to make training more robust, wrap the cell in a dropout layer:

attn_cell = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=config.keep_prob)
Define a multi-layer RNN according to the configured number of layers:

cell = tf.contrib.rnn.MultiRNNCell([attn_cell for _ in range(config.num_layers)], state_is_tuple=True)
Define the word embeddings according to the vocabulary size:

embedding = tf.get_variable("embedding", [vocab_size, size], dtype=data_type())
Look up the embedding vector for each word index, as shown in the figure below: the index selects the word's one-hot encoding, and the corresponding row of the weight matrix then directly gives the output values, i.e., the word's embedding vector.

inputs = tf.nn.embedding_lookup(embedding, input_.input_data)
[Figure: embedding lookup as a one-hot vector multiplied by the weight matrix]
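As a minimal sketch (with hypothetical sizes, independent of the PTB code), the lookup is plain row selection, which gives the same result as multiplying a one-hot vector by the weight matrix:

import numpy as np

vocab_size, size = 10, 4
embedding = np.random.rand(vocab_size, size)  # the weight matrix
word_ids = np.array([3, 7, 2])                # a batch of word indices
vectors = embedding[word_ids]                 # row selection: the embeddings
one_hot = np.eye(vocab_size)[word_ids]        # equivalent one-hot formulation
assert np.allclose(vectors, one_hot @ embedding)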
Define the RNN, where state is the LSTM cell's state and cell_output is the LSTM cell's output:

outputs = []
state = self._initial_state
for time_step in range(num_steps):
    if time_step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, time_step, :], state)
    outputs.append(cell_output)
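The loss in the next step consumes logits over the vocabulary, which the snippets above never define. In the PTB tutorial they come from stacking the per-step outputs and projecting them onto the vocabulary (a sketch following that code):

output = tf.reshape(tf.concat(outputs, 1), [-1, size])
softmax_w = tf.get_variable("softmax_w", [size, vocab_size], dtype=data_type())
softmax_b = tf.get_variable("softmax_b", [vocab_size], dtype=data_type())
logits = tf.matmul(output, softmax_w) + softmax_b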
Define the training loss, as in the formula below:

$$\mathrm{loss} = -\frac{1}{N} \sum_{i=1}^{N} \ln p_{\mathrm{target}_i}$$
Compute the loss value:

loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
    [logits],
    [tf.reshape(input_.targets, [-1])],
    [tf.ones([batch_size * num_steps], dtype=data_type())])
¶¨ÒåÌݶȼ°ÓÅ»¯²Ù×÷
cost
= tf.reduce_sum(loss) / batch_size
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost,
tvars), config.max_grad_norm)
optimizer = tf.train.GradientDescentOptimizer(self._lr) |
Compute the per-word perplexity, which is e raised to the loss:

perplexity = np.exp(costs / iters)
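Here costs accumulates the per-batch cost and iters counts the time steps, so the computation matches the standard definition

$$\mathrm{perplexity} = e^{\mathrm{loss}} = \exp\Big(-\frac{1}{N} \sum_{i=1}^{N} \ln p_{\mathrm{target}_i}\Big)$$

and a lower perplexity means the model is less "surprised" by the test words.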
TensorFlow Language Translation Model
This section explains how to build an RNN/LSTM language translation model with TensorFlow. A basic sequence-to-sequence model consists of two RNNs: one RNN encodes the input sequence, and the other generates the output sequence. The basic architecture is shown below:
[Figure: basic encoder-decoder sequence-to-sequence architecture]
ÉÏͼÖеÄÿ¸ö·½¿ò±íʾRNNÖеÄÒ»¸öCell¡£ÔÚÉÏͼµÄÄ£ÐÍÖУ¬Ã¿¸öÊäÈë»á±»±àÂë³É¹Ì¶¨³¤¶ÈµÄ״̬ÏòÁ¿£¬È»ºó´«µÝ¸ø½âÂëÆ÷¡£2014Ä꣬BahdanauÔÚÂÛÎÄ¡°Neural
Machine Translation by Jointly Learning to Align and
Translate¡±ÖÐÒýÈëÁËAttention»úÖÆ¡£Attention»úÖÆÔÊÐí½âÂëÆ÷ÔÚÿһ²½Êä³öʱ²ÎÓëµ½ÔÎĵIJ»Í¬²¿·Ö£¬ÈÃÄ£Ð͸ù¾ÝÊäÈëµÄ¾ä×ÓÒÔ¼°ÒѾ²úÉúµÄÄÚÈÝÀ´Ó°Ïì·Òë½á¹û¡£Ò»¸ö¼ÓÈëattention»úÖÆµÄ¶à²ãLSTM
sequence-to-sequenceÍøÂç½á¹¹ÈçÏÂͼËùʾ£º
[Figure: multi-layer LSTM sequence-to-sequence network with attention]
TensorFlow wraps this sequence-to-sequence model in function APIs that can be called directly, so a basic translation model takes only a few hundred lines of code. The tf.nn.seq2seq module implements five seq2seq functions:
basic_rnn_seq2seq: inputs and outputs are both embeddings; the encoder and decoder use the same RNN cell but do not share weight parameters;
tied_rnn_seq2seq: same as basic_rnn_seq2seq, but the encoder and decoder share weight parameters;
embedding_rnn_seq2seq: same as basic_rnn_seq2seq, but inputs and outputs are word ids; the function internally creates separate embedding matrices for the encoder and the decoder;
embedding_tied_rnn_seq2seq: same as tied_rnn_seq2seq, but inputs and outputs are word ids; the function internally creates separate embedding matrices for the encoder and the decoder;
embedding_attention_seq2seq: same as embedding_rnn_seq2seq, but with the attention mechanism added.
The embedding_rnn_seq2seq interface is used as follows:
encoder_inputs: the encoder's inputs
decoder_inputs: the decoder's inputs
cell: an RNN_Cell instance
num_encoder_symbols, num_decoder_symbols: the encoder and decoder vocabulary sizes
embedding_size: the dimensionality of the word embeddings
output_projection: the projection matrix and bias used when projecting the decoder's output vectors onto the vocabulary space
feed_previous: if True, only the first decoder input symbol is used, and every later decoder input depends on the previous step's output
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    embedding_size, output_projection=None,
    feed_previous=False)
TensorFlow provides an official English-to-French translation example based on corpus data from the statmt website, chiefly giga-fren.release2.fixed.en (English corpus, 3.6 GB) and giga-fren.release2.fixed.fr (French corpus, 4.3 GB). The example's code is structured as follows:
seq2seq_model.py: the TensorFlow seq2seq model; it uses embedding_attention_seq2seq to create the seq2seq model.
data_utils.py: preprocesses the corpus data, builds the vocabulary from it, and then uses the vocabulary to convert the sentences to be translated into training sequences of word ids, as shown below:
[Figure: data_utils.py converting sentences into sequences of word ids]
translate.py: the main entry point; runs the training of the translation model.
Run the model training:

python translate.py --data_dir [your_data_directory] --train_dir [checkpoints_directory] --en_vocab_size=40000 --fr_vocab_size=40000
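After training converges, the same script can be switched to interactive decoding with the --decode flag (as provided in the official example; directories as above):

python translate.py --decode --data_dir [your_data_directory] --train_dir [checkpoints_directory]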
Summary
With the steady release of new TensorFlow versions and the continual addition of new models, TensorFlow has become a mainstream deep learning platform. This article surveyed TensorFlow's models and applications in natural language processing: first the mathematics of Word2Vec and how to learn word embeddings with TensorFlow; then a review of RNN and LSTM fundamentals and TensorFlow's language prediction model; and finally a walkthrough of TensorFlow's sequence-to-sequence machine translation API and its official example.