Natural language processing is one of the core problems of artificial intelligence research. Recently, MetaMind, the deep learning company whose acquisition by Salesforce has been announced, published an article on its official website that takes an in-depth look at LSTMs and bag-of-words models in natural language processing.
»úÆ÷ѧϰ¡¢Éî¶ÈѧϰºÍ¸ü¹ãÒåÉϵÄÈ˹¤ÖÇÄܵÄÐËÆðÊǺÁÎÞÒÉÎʵ쬶øÇÒÆäÒѾ¶Ô¼ÆËã»ú¿ÆÑ§ÁìÓò²úÉú¾Þ´óµÄÓ°Ïì¡£Äã¿ÉÄÜÒѾÌý˵¹ý£¬Ä¿Ç°Éî¶ÈѧϰÒѾÔÚͼÏñʶ±ðºÍΧÆåµÈºÜ¶àÈÎÎñÉÏʵÏÖÁ˶ÔÈËÀàµÄ³¬Ô½¡£
Éî¶ÈѧϰÉçÇøÄ¿Ç°½«×ÔÈ»ÓïÑÔ´¦Àí(NLP)¿´×÷ÊÇÏÂÒ»¸öÑо¿ºÍÓ¦ÓõÄÇ°ÑØ¡£
Éî¶ÈѧϰµÄÒ»´óÓÅÊÆÊÇÆä½ø²½ÍùÍùÊǷdz£Í¨Óõġ£±ÈÈç˵£¬Ê¹Éî¶ÈѧϰÔÚÒ»¸öÁìÓòÓÐЧµÄ¼¼ÊõÍùÍù²»ÐèҪ̫¶àÐ޸ľÍÄÜÇ¨ÒÆµ½ÁíÒ»¸öÁìÓò¡£¸ü¾ßÌå¶øÑÔ£¬ÎªÍ¼ÏñºÍÓïÒôʶ±ðËù¿ª·¢µÄ¹¹½¨´ó¹æÄ£¡¢¸ß¼ÆËã³É±¾µÄÉî¶ÈѧϰģÐ͵ķ½·¨Ò²Äܱ»ÓÃÓÚ×ÔÈ»ÓïÑÔ´¦Àí¡£×î½üµÄ×îÏȽøµÄ·Òëϵͳ¾ÍÊÇÆäÖÐÒ»Àý£¬¸ÃϵͳµÄ±íÏÖ³¬Ô½ÁËËùÓÐÒÔÍùµÄϵͳ£¬µ«ËùÐèµÄ¼ÆËã»úÄÜÁ¦Ò²Òª¶àµÃ¶à¡£ÕâÑùµÄ¸ßÒªÇóµÄϵͳÄܹ»ÔÚÕæÊµÊÀ½çÊý¾ÝÖз¢ÏÖżȻ³öÏֵķdz£¸´ÔÓµÄģʽ£¬µ«ÕâÒ²ÈúܶàÈ˽«ÕâÑùµÄ´ó¹æÄ£Ä£ÐÍÓÃÔÚ¸÷ÖÖ¸÷ÑùµÄÈÎÎñÉÏ¡£ÕâÓÖ´øÀ´ÁËÒ»¸öÎÊÌ⣺
ÊÇ·ñËùÓеÄÈÎÎñ¶¼¾ßÓÐÐèÒªÕâÖÖÄ£ÐͲÅÄÜ´¦ÀíµÄ¸´ÔÓ¶È?
Let's take a look inside a two-layer multilayer perceptron (MLP) trained on top of bag-of-words embeddings for sentiment analysis:

Inside a two-layer MLP trained on top of bag-of-words embeddings for sentiment analysis
Inside a simple deep learning system known as a bag-of-words, which classifies sentences as positive or negative. The plot is a T-SNE of the last hidden layer of a 2-layer MLP on top of a bag-of-words. Each data point corresponds to a sentence, and the colors correspond to the predictions of the deep learning system and the true targets. The solid boxes mark different semantic content in the sentences. You can explore them later through an interactive chart.
The solid boxes in the figure above provide some important insights. Real-world data comes in different degrees of difficulty: some sentences are easily classified, while others contain complex semantic structures. For the easily classified sentences, a high-capacity system may not be necessary; perhaps a far simpler model could do the same job. This blog post explores whether that is the case, and shows that we can in fact often get away with simple models.
I. Deep Learning on Text
Most deep learning methods require floating-point numbers as input, and if you have not worked with text before, you may wonder:
How do I go from a piece of text to deep learning?
With text, the core problem is how to represent an arbitrarily large amount of information given the length of the material. A popular approach is to tokenize the text into words, sub-words, or even characters. Each word can then be converted into a floating-point vector via well-studied methods such as word2vec or GloVe. This provides meaningful representations of words by exploiting the implicit relationships between different words.

Take a word, convert it into a high-dimensional embedding (say, 300 dimensions), and then use PCA or T-SNE (popular dimensionality-reduction tools; reducing to 2 dimensions in this case), and you can find interesting relationships between words. For example, in the figure above you can see that the distance between uncle and aunt is roughly equal to the distance between man and woman (from Mikolov et al., 2013).
ͨ¹ýʹÓà tokenization ºÍ word2vec ·½·¨£¬ÎÒÃÇ¿ÉÒÔ½«Ò»¶ÎÎı¾×ª»»Îª´ÊµÄ¸¡µã±íʾµÄÒ»¸öÐòÁС£
Now, what can we do with a sequence of word representations?
II. Bag-of-Words
Let's now look at the bag-of-words (BoW), perhaps the simplest machine learning algorithm of them all!

Take some word representations (the gray boxes at the bottom) and combine them by summing or averaging into a common representation (blue box) that contains some information from every word. Here, the common representation is used to predict whether a sentence is positive or negative (red box).
Simply take the mean of the words along each feature dimension. It turns out that simply averaging the word embeddings (even though this completely ignores the order of the sentence) is enough to achieve good results in many simple practical cases, and also provides a strong baseline when combined with deep neural networks (explained later).
Moreover, taking the mean is computationally cheap and reduces a sentence to a fixed-size vector.
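A minimal sketch of this averaging, assuming nothing beyond NumPy; the sentence lengths and embedding size are arbitrary:

```python
import numpy as np

# Bag-of-words pooling: average the word embeddings over the sequence
# axis, producing one fixed-size vector per sentence regardless of its
# length. Word order is discarded entirely.
def bag_of_words(word_vectors: np.ndarray) -> np.ndarray:
    """word_vectors: (num_words, embed_dim) -> (embed_dim,) mean vector."""
    return word_vectors.mean(axis=0)

short = np.random.default_rng(1).normal(size=(3, 300))   # 3-word sentence
long = np.random.default_rng(2).normal(size=(40, 300))   # 40-word sentence
print(bag_of_words(short).shape, bag_of_words(long).shape)  # both (300,)
```

Note that both sentences end up as 300-dimensional vectors, which is what lets a downstream classifier consume variable-length text.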
III. Recurrent Neural Networks
Some sentences require high accuracy or depend on sentence structure. Using a bag-of-words for these problems may not cut it. Instead, you might consider the awesome recurrent neural network (RNN).

At each timestep (from left to right), an input (e.g. a word) is fed to the RNN (gray box) and integrated with the previous internal memory (blue box). The RNN then performs some computation, producing a new internal memory (blue box) that represents all previously seen units (e.g. all previous words). The RNN should now contain sentence-level information, allowing it to better predict whether the sentence is positive or negative (red box).
Each word embedding is fed in order to a recurrent neural network, which can store previously seen information and combine it with new words. When the RNN is driven by well-known memory cells such as the long short-term memory (LSTM) or the gated recurrent unit (GRU), it can remember what has happened in sentences containing a great many words! (Because of the success of the LSTM, an RNN with LSTM memory cells is often simply called an LSTM.) The biggest of these models stack such a structure eight times.

Both represent recurrent neural networks with LSTM cells. They also apply a few tricks of the trade, such as skip connections between LSTM layers and a method known as attention. Also note that the green LSTMs point in the opposite direction. When combined with a normal LSTM, this is called a bidirectional LSTM, because it gains information from both directions of the data sequence. For more information, see Stephen Merity's blog post (covered in the Synced article dissecting, layer by layer, the neural network architecture behind Google's machine translation breakthrough) (source: Wu et al., 2016).
However, compared to the simple bag-of-words model, the LSTM is much more computationally expensive, and requires experienced deep learning engineers and high-performance computing hardware to implement and support.
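To make the recurrence concrete, here is a minimal single LSTM cell written from the standard LSTM equations in NumPy. This is an illustrative sketch (random weights, toy sizes), not the stacked or bidirectional production architectures discussed above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """One step of the standard LSTM recurrence, for illustration."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One weight set per gate: input (i), forget (f), output (o),
        # and candidate memory (g).
        self.W = {g: rng.normal(0, 0.1, (hidden_dim, input_dim)) for g in "ifog"}
        self.U = {g: rng.normal(0, 0.1, (hidden_dim, hidden_dim)) for g in "ifog"}
        self.b = {g: np.zeros(hidden_dim) for g in "ifog"}

    def step(self, x, h, c):
        z = {g: self.W[g] @ x + self.U[g] @ h + self.b[g] for g in "ifog"}
        i, f, o = sigmoid(z["i"]), sigmoid(z["f"]), sigmoid(z["o"])
        g = np.tanh(z["g"])
        c_new = f * c + i * g          # update the internal memory
        h_new = o * np.tanh(c_new)     # expose part of the memory
        return h_new, c_new

cell = LSTMCell(input_dim=4, hidden_dim=8)
h = c = np.zeros(8)
for word_vector in np.random.default_rng(3).normal(size=(5, 4)):  # 5 words
    h, c = cell.step(word_vector, h, c)
print(h.shape)  # (8,) -- a fixed-size summary of everything read so far
```

Reading the loop makes the cost argument visible: unlike the one-shot mean of the bag-of-words, the LSTM performs several matrix multiplications per word, sequentially.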
IV. Example: Sentiment Analysis
Sentiment analysis is a type of document classification task that quantifies the polarity of a subjective passage. Given a sentence, a model assesses whether its sentiment is positive, negative, or neutral.
Want to find angry customers on Twitter before things escalate? Then sentiment analysis may be just what you're looking for!
An excellent dataset for this purpose (which we will use below) is the Stanford Sentiment Treebank (SST):
https://nlp.stanford.edu/sentiment/treebank.html
ÎÒÃÇÒѾ¹«¿ªÁËÒ»¸ö PyTorch µÄÊý¾Ý¼ÓÔØÆ÷£º
https://github.com/pytorch/text
The SST not only classifies sentences (positive, negative), but also provides grammatical subphrases for each sentence. In our system, however, we do not use any of the tree information.
The original SST consists of 5 classes: very positive, positive, neutral, negative, very negative. We consider the simpler binary classification task, where positive is combined with very positive, negative with very negative, and neutral is dropped.
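The 5-to-2 class mapping can be sketched as follows; the label strings are illustrative rather than the dataset's actual field names:

```python
# Collapse the 5-way SST labels to the binary task described above:
# very positive/positive -> positive, very negative/negative -> negative,
# and neutral sentences are discarded.
def binarize(label):
    """Return 'positive', 'negative', or None (neutral is dropped)."""
    if label in ("very positive", "positive"):
        return "positive"
    if label in ("very negative", "negative"):
        return "negative"
    return None

dataset = ["positive", "neutral", "very negative", "very positive"]
binary = [b for b in map(binarize, dataset) if b is not None]
print(binary)  # ['positive', 'negative', 'positive']
```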
We will give a brief and technical description of our model architectures. The point is not exactly how they are constructed, but that the computationally cheap model reaches 82% validation accuracy and takes 10 ms for a batch of size 64, while the expensive LSTM architecture reaches a validation accuracy of 88% but needs 87 ms to process the same amount of work (the best models land at around 88-90% accuracy).

Below, the green boxes represent the word embeddings, initialized with GloVe, followed by taking the mean of the words (bag-of-words) and a 2-layer MLP with dropout.

Below, the teal boxes represent the word embeddings, initialized with GloVe. No gradients are tracked through the word embeddings. We use a bidirectional RNN with LSTM cells and, similarly to the bag-of-words, we use its hidden states to extract the mean and the max, followed by a 2-layer MLP with dropout.
V. The Low-Cost Skim Reader
On some tasks, algorithms can perform at near-human-level accuracy, but getting there may blow your server budget. You also know that an LSTM trained on real-world data is not always needed; the cheap bag-of-words (BoW) may be just fine.
Of course, the order-agnostic bag-of-words will misclassify plenty of negated sentences. Switching entirely to an inferior BoW would drop our overall performance and make it sound a lot less convincing. So the question becomes:
Can we learn to separate the "easy" from the "difficult" sentences?
And, to save time, can we do it with a low-cost model?
VI. Exploring the Internals
A popular way to explore deep learning models is to understand how each sentence is represented in the hidden layers. However, because hidden layers are often high-dimensional, we can use algorithms such as T-SNE to reduce them to 2 dimensions, allowing us to plot them for human inspection.


The two images above are screenshots of interactive plots in the original article. In the interactive versions, you can move the cursor, zoom, and hover over data points to inspect them. The plots show the last hidden layer of the bag-of-words. Hovering over any data point shows the sentence it represents, colored according to its label.
The Predictions tab: a comparison of the model's predictions against the actual labels. The center of a data point represents its prediction (blue for positive, red for negative), and the surrounding line represents the actual label. This lets us see when the system is right and when it is wrong.
The Probabilities tab: we plot the probability of the predicted class in the output layer, which represents the confidence the model has in its prediction. Hovering over a data point also shows that point's probability, colored by the model's prediction. Note that because the task is binary classification, the probabilities start at 0.5; the minimum confidence in this case is 50/50.
T-SNE plots are prone to plenty of over-interpretation, but they may give you a sense of a few trends.
VII. Reading the T-SNE
The sentences fall into clusters, and the clusters constitute different semantic types.
Some clusters have a simple form, with high confidence and accuracy.
Other clusters are more scattered, with lower accuracy and confidence.
Sentences with both positive and negative components are difficult.
Now let's look at a similar plot for the LSTM:

The two images above are screenshots of interactive plots in the original article, where you can move the cursor, zoom, and hover over data points to inspect them. The setup is similar to the interactive bag-of-words plot; go explore the internals of the LSTM!
We may consider many of these observations valid for the LSTM as well. However, the LSTM has relatively few examples with low confidence, and sentences with co-occurring positive and negative components are less challenging for the LSTM than they are for the bag-of-words.
It seems that the bag-of-words can cluster sentences and use its probability to identify whether it is likely to give a correct prediction for the sentences in a given cluster. From these observations, a reasonable hypothesis is: more confident answers are more correct.
ΪÁËÑо¿Õâ¸ö¼ÙÉ裬ÎÒÃÇ¿ÉÒÔ¿´¿´¸ÅÂÊãÐÖµ(probability thresholds)¡£
VIII. Probability Thresholding
The bag-of-words and the LSTM are both trained to provide a probability for each class, which we can use as a measure of certainty. What does this mean? If the bag-of-words returns a 1, it is very confident in its prediction. Typically, when predicting, we take the class with the highest probability provided by our model. In this binary classification case (positive or negative), the probability must then exceed 0.5 (otherwise we would predict the opposite class). But a low probability for the predicted class may suggest that the model is in doubt. For example, if a model predicts a positive probability of 0.51 and a negative probability of 0.49, the conclusion that the sentence is positive is not very credible. By "thresholding" we mean comparing the predicted probability to a value and assessing whether or not to use it. For example, we could decide to use all sentences whose probability exceeds 0.7. Or we could look at how the 0.5-0.55 interval affects prediction confidence, which is exactly what the plot below investigates.
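A small sketch of this thresholding, with made-up probabilities: keep only the predictions whose winning-class probability clears the cutoff and measure accuracy on that confident subset.

```python
import numpy as np

# Threshold predictions by the probability of the winning class and
# report accuracy on the confident subset. All numbers here are
# synthetic, purely for illustration.
def accuracy_above_threshold(probs, labels, threshold):
    """probs: (n, 2) class probabilities; labels: (n,) in {0, 1}."""
    confidence = probs.max(axis=1)          # in [0.5, 1.0] for two classes
    keep = confidence >= threshold
    if not keep.any():
        return None, 0
    correct = probs.argmax(axis=1)[keep] == labels[keep]
    return correct.mean(), int(keep.sum())

probs = np.array([[0.51, 0.49], [0.95, 0.05], [0.20, 0.80], [0.60, 0.40]])
labels = np.array([1, 0, 1, 0])
acc, n = accuracy_above_threshold(probs, labels, threshold=0.7)
print(acc, n)  # 1.0 2 -- perfect on the 2 confident predictions
```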


In the threshold plot, the height of a bar corresponds to the accuracy of the data points lying between two thresholds, and the line represents the accuracy when using all data points above a given threshold. In the data amount plot, the height of a bar corresponds to the amount of data residing between two thresholds, and the line is the data accumulated above each threshold bin.
From the bag-of-words plots, you may find that increasing the probability threshold also increases performance. That this is not apparent in the LSTM plots seems normal, as the LSTM overfits the training set and only provides high-confidence answers.
Use the BoW on the easy examples and the raw LSTM on the difficult ones.
Thus, simply using the output probability can tell us when a sentence is easy and when it needs guidance from a stronger system, such as the powerful LSTM.
ÎÒÃÇʹÓøÅÂÊãÐÖµ´´½¨ÁËÒ»ÖÖ¡¸¸ÅÂʲßÂÔ¡¹(probability strategy)£¬´Ó¶ø¿ÉΪ´Ê´üϵͳµÄ¸ÅÂÊÉèÖÃãÐÖµ£¬²¢ÔÚËùÓÐûÓдﵽãÐÖµµÄÊý¾ÝµãÉÏʹÓÃ
LSTM¡£ÕâÑù×öΪÎÒÃÇÌṩÁËÓÃÓÚ´Ê´üµÄÄÇô¶àµÄÊý¾Ý(ÔÚãÐÖµÖ®Éϵľä×Ó)ºÍһϵÁÐÊý¾Ýµã£¬ÆäÖÐÎÒÃÇҪôѡÔñ
BoW(ÔÚãÐÖµÖ®ÉÏ)£¬ÒªÃ´Ñ¡Ôñ LSTM(ÔÚãÐÖµÖ®ÏÂ)£¬ÎÒÃÇ¿ÉÒÔÓô˷¢ÏÖÒ»¸ö¾«¶ÈºÍ¼ÆËã³É±¾¡£½Ó×ÅÎÒÃÇ»á»ñµÃ
BoW ºÍ LSTM Ö®¼äµÄÒ»¸ö´Ó 0.0(½öʹÓà LSTM)µ½ 1.0(½öʹÓà BoW)µÄ±ÈÂÊ£¬²¢¿É½è´Ë¼ÆË㾫¶ÈºÍ¼ÆËãʱ¼ä¡£
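The strategy itself can be sketched in a few lines; the probabilities and predictions below are synthetic stand-ins:

```python
import numpy as np

# "Probability strategy": answer with the BoW when its confidence clears
# the threshold, and fall back to the LSTM otherwise.
def probability_strategy(bow_probs, lstm_preds, labels, threshold):
    """Returns (accuracy, fraction of sentences handled by the BoW)."""
    confident = bow_probs.max(axis=1) >= threshold
    preds = np.where(confident, bow_probs.argmax(axis=1), lstm_preds)
    return (preds == labels).mean(), confident.mean()

bow_probs = np.array([[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.6, 0.4]])
lstm_preds = np.array([0, 1, 1, 1])   # what the LSTM would answer
labels = np.array([0, 1, 1, 0])
acc, bow_fraction = probability_strategy(bow_probs, lstm_preds, labels, 0.7)
print(acc, bow_fraction)  # 0.75 0.5
```

Sweeping `threshold` traces out exactly the accuracy-versus-cost curve that the following sections analyze.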
IX. Baseline
To construct a baseline, we consider the ratio between the two models. For example, using the BoW on 0.1 of the data corresponds to 0.9 times the accuracy of the LSTM plus 0.1 times the accuracy of the BoW. The purpose is to get a baseline with no guided strategy, where the choice of using the BoW or the LSTM on a sentence is assigned at random. However, using a strategy comes with a cost: we must first process all sentences through the BoW model to determine whether a sentence should go to the BoW or to the LSTM. In the case where no sentence reaches the probability threshold, we would be running an extra model for no good reason. To account for this, we compute the cost of the strategy versus the ratio as follows.

where C represents the cost and p represents the proportion of the data used by the BoW.
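The formula itself appears as an image in the original post, so its exact form here is an assumption: given that every sentence must first pass through the BoW, a natural reading is C = C_BoW + (1 − p) · C_LSTM, sketched below with the per-batch timings quoted earlier (10 ms for the BoW, 87 ms for the LSTM).

```python
# Hedged reconstruction of the strategy cost (the formula is an image in
# the original, so this exact form is an assumption from the surrounding
# text): every sentence pays the BoW cost, and only the fraction (1 - p)
# not handled by the BoW also pays the LSTM cost.
C_BOW, C_LSTM = 10.0, 87.0  # ms per batch of 64, as quoted in the article

def strategy_cost(p):
    """p: proportion of the data answered by the BoW alone."""
    return C_BOW + (1.0 - p) * C_LSTM

print(strategy_cost(1.0))  # 10.0 -- BoW only
print(strategy_cost(0.0))  # 97.0 -- every sentence also runs the LSTM
```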

The figure above shows results on the validation set, comparing the accuracy and speed of different combination ratios of the BoW and the LSTM. The leftmost data point corresponds to using only the LSTM, the rightmost to using only the BoW, and those in between to combinations of the two. The blue line represents the combination of BoW and LSTM without a guided strategy, while the red line depicts using the BoW probability as a strategy to guide which proportion of sentences goes to which system. Note that the largest time savings exceed 90%, as this amounts to using only the BoW. Interestingly, we find that thresholding the BoW is significantly better than having no guided strategy.
We then measured the mean of the curve, which we call the Speed Under the Curve (SUC), shown in the table below.

Above are the results of the strategy of discretely choosing between the BoW and the LSTM on the validation set. Each model is computed ten times with different seeds. The results in the table are the means of the SUC. The probability strategy is compared with the Ratio baseline.
X. Learning When to Skim and When to Read
Knowing when to switch between two different models is not enough; we want to build a more general system that learns to switch between all kinds of models. Such a system would help us deal with more complicated behaviors.
Can we learn, in a supervised fashion, when reading is strictly better than skimming?
Here "reading" means using the LSTM, which goes over the sentence from left to right and stores a memory at each step, while "skimming" means using the BoW model. When operating on the probabilities from the bag-of-words model, we make our decision based on an invariant: that when the bag-of-words system is in doubt, the more powerful LSTM does better. But is that always the case?

The "confusion matrix" of when the bag-of-words and the LSTM are each correct or wrong on a sentence, similar to the confusion categories of the earlier T-SNE plots of the BoW and the LSTM.
In fact, it turns out that this invariant holds for only 12% of the sentences, while in 6% of the sentences both the bag-of-words and the LSTM are wrong. In that case, we have no reason to run the LSTM at all and might as well save time by using only the bag-of-words.
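The bookkeeping behind such a correctness matrix is simple; the predictions below are synthetic, purely to show the tallying:

```python
import numpy as np

# For each sentence, tally whether the BoW and the LSTM are each right
# or wrong, giving the four cells of the correctness "confusion matrix".
def correctness_matrix(bow_preds, lstm_preds, labels):
    bow_ok = bow_preds == labels
    lstm_ok = lstm_preds == labels
    return {
        "both correct": np.mean(bow_ok & lstm_ok),
        "only lstm correct": np.mean(~bow_ok & lstm_ok),
        "only bow correct": np.mean(bow_ok & ~lstm_ok),
        "both wrong": np.mean(~bow_ok & ~lstm_ok),
    }

labels = np.array([0, 1, 1, 0, 1])
bow = np.array([0, 0, 1, 1, 0])
lstm = np.array([0, 1, 1, 1, 0])
print(correctness_matrix(bow, lstm, labels))
```

The "only lstm correct" cell is the one the invariant above bets on, and the "both wrong" cell is where running the LSTM is pure waste.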
XI. Learning to Skim: the Setup
So we should not always use the LSTM when the BoW is in doubt. Can we make our bag-of-words model understand when the LSTM might also be wrong, so that we preserve precious computational resources?
Let's look at the T-SNE plots again, but now with the confusion matrix between the BoW and the LSTM plotted on top. We hope to find relationships between the elements of the confusion matrix, especially when the BoW is wrong.


From the comparison plots, we find that it is easy to tell when the BoW is correct and when it is in doubt. However, there is no clear relationship between the BoW hidden states and when the LSTM might be right or wrong.
1. Can we learn this relationship?
ÁíÍ⣬ÒòΪ¸ÅÂʲßÂÔÒÀÀµÓÚ¶þÔª¾ö²ß²¢ÒªÇó¸ÅÂÊ£¬ÆäÊÇÓкܴóµÄÏÞÖÆÐԵġ£Ïà·´£¬ÎÒÃÇÌá³öÁËÒ»¸ö»ùÓÚÉñ¾ÍøÂçµÄ¿ÉѵÁ·¾ö²ßÍøÂç(decision
network)¡£Èç¹ûÎÒÃDz鿴»ìÏý¾ØÕó(confusion matrix)£¬ÄÇôÎÒÃǾÍÄÜʹÓÃÕâЩÐÅϢΪ¼à¶½¾ö²ßÍøÂçÉú³É±êÇ©¡£Òò´Ë£¬ÎÒÃǾÍÄÜÔÚ
LSTM ÕýÈ·ÇÒ BoW ´íÎóµÄÇé¿öÏÂʹÓà LSTM¡£
To generate the dataset, we need a set of sentences with the true labels and the underlying predictions of the bag-of-words and the LSTM. However, during training the LSTM often reaches over 99% training accuracy, clearly overfitting the training set. To avoid this, we split the training set into a model training set (80% of the training data) and a decision training set (the remaining 20%) that the models have not seen before. Afterwards, we fine-tune the models with the remaining 20%, hoping that the decision network will generalize to this new, unseen but closely related dataset, and that the system improves as a result.

The bag-of-words and the LSTM are initially trained on the "Model train" set (80% of the training data); these models are then used to generate labels for the decision network, after which the models are trained on the full dataset. The validation set is used throughout.
To build the decision network, we tap into the last hidden layer of our cheap bag-of-words system (the same layer used to generate the T-SNE plots). We stack a two-layer MLP on top of the bag-of-words trained on the model training set. We found that if we do not follow this recipe, the decision network cannot pick up on the tendencies of the BoW model and does not generalize well.
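A forward-pass sketch of such a decision head, assuming a 64-unit BoW hidden layer, a ReLU activation, and untrained random weights (all assumptions; the real network is trained with the labels described above):

```python
import numpy as np

# Sketch of the decision network: a two-layer MLP stacked on the
# (frozen) last hidden layer of the BoW model, emitting one score each
# for "use the BoW" and "use the LSTM". Sizes are illustrative.
rng = np.random.default_rng(5)
HIDDEN, MLP_HIDDEN = 64, 32

W1, b1 = rng.normal(0, 0.1, (MLP_HIDDEN, HIDDEN)), np.zeros(MLP_HIDDEN)
W2, b2 = rng.normal(0, 0.1, (2, MLP_HIDDEN)), np.zeros(2)

def decide(bow_hidden):
    """bow_hidden: last BoW hidden layer for one sentence -> 0 or 1."""
    h = np.maximum(0.0, W1 @ bow_hidden + b1)   # ReLU layer (assumed)
    logits = W2 @ h + b2                        # [score_bow, score_lstm]
    return int(np.argmax(logits))               # 0 = BoW, 1 = LSTM

choice = decide(rng.normal(size=HIDDEN))
print(choice)  # 0 or 1
```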

The bottom bars represent the layers of the bag-of-words system, without dropout. A two-layer MLP is added on top, with one class for choosing the bag-of-words versus the superior LSTM.
The classes chosen by the decision network on the validation set (based on the models trained on the model training set) are then applied to the very related, but fully trained, models. Why apply it to models trained on the full training set? Because the models trained on the model training set are usually inferior and therefore have lower accuracy. The decision network is trained with early stopping, based on maximizing the SUC on the validation set.
2. How does the decision network perform?
Let's start by looking at the predictions of the decision network.

The data points are the same as in the earlier T-SNE plot of the bag-of-words model. Green points represent sentences predicted with the bag-of-words, yellow points with the LSTM.
×¢Ò⣺ÕâÓжà½üËÆ´Ê´üµÄ¸ÅÂʽØÖ¹(probability cutoff)¡£ÈÃÎÒÃÇ¿´¿´¾ö²ßÍøÂç×îºóÒ»¸öÒþ²Ø²ãµÄ
T-SNE ÊÇ·ñÄܹ»ÕæµÄ¾Û¼¯Ò»Ð©¹ØÓÚ LSTM ʲôʱºòÕýÈ·»ò´íÎóµÄÐÅÏ¢¡£

3. How does the network make our decisions?
Let's start with the predictions of the decision network.
The data points are based on the sentence representations in the last hidden state of the decision network, drawn from validation sentences. The colors are the same as in the previous comparison plots.
The decision network appears to pick up on the clusters from the hidden states of the bag-of-words. However, it does not seem to understand when the LSTM might be wrong (which would require separating the yellow clusters from the red ones).

The purple curve represents the newly introduced decision network on the validation set. Notice how the decision network reaches a solution close to, but slightly different from, the probability threshold. Judging from the time-accuracy curve, the advantage of the decision network is not obvious.

Performance of the BoW and the LSTM on the test and validation sets. The SUC is based on the mean of the accuracy-versus-speed curve. Each model is computed ten times with different seeds. The results in the table are the means of the SUC. The standard deviations are based on the differences from the ratio.
From the prediction plots, the amounts of data, the accuracies, and the SUC scores, we can infer that the decision network is good at understanding when the BoW is right and when it is not. Moreover, it allows us to build a more general system that taps into the hidden states of a deep learning model. However, it also shows that it is very difficult for the decision network to understand the behavior of a system it has no access to, such as the more complex LSTM.
XII. Discussion
We now finally understand the true strength of the LSTM: it can reach near-human-level performance on text, and getting there does not require training on anywhere near real-world amounts of data. We can train a bag-of-words model for understanding the simple sentences, which saves a lot of computational resources with only a minimal loss of performance for the overall system (depending on how high the bag-of-words threshold is set).
This approach is related to model averaging, which is usually performed with similar models that all have high confidence. But with an adjustable confidence threshold on the bag-of-words, and with no need to run the LSTM above it, we get to choose for ourselves how to trade off computation time against accuracy and tune the parameters accordingly. We believe this method will be very helpful for deep learning developers looking to save computational resources without sacrificing performance.
The article contains several interactive plots; interested readers can view them on the original web page. The author is MetaMind research scientist Alexander Rosenberg Johansen. A paper on this research will reportedly be published on arXiv soon.