Editor's note:
This article covers an introduction to artificial intelligence and deep learning, image fundamentals, deep learning fundamentals, the basic mathematics of deep learning, and understanding artificial neural networks.
It originally appeared on the Cloud+ Community (云+社区) and was edited and recommended by Anna at 火龙果软件.
Introduction to Artificial Intelligence and Deep Learning
AI vs. ML vs. DL
Artificial Intelligence (AI)
The effort to automate intellectual tasks normally performed by humans
Machine Learning (ML)
Lets systems improve automatically from data without being explicitly programmed
Deep Learning (DL)
A specific subfield of machine learning
Focuses on learning successive layers of increasingly meaningful representations
Artificial neural networks
First investigated in the 1950s; took off in the 1980s
Not true models of the brain
Loosely inspired by neurobiology research
How "deep" is deep learning?
Deep learning is a rebranding of artificial neural networks with more than two layers
"Deep" does not refer to any deeper understanding achieved by the approach
It stands for the idea of successive layers of representations
Deep learning frameworks

Our deep learning technology stack

GPUs and CUDA
GPU (Graphics Processing Unit):
Hundreds of relatively simple cores
Thousands of concurrent hardware threads
Maximizes floating-point throughput
CUDA (Compute Unified Device Architecture)
A parallel programming model that significantly improves computing performance by exploiting the GPU
cuDNN (CUDA Deep Neural Network library)
A GPU-accelerated library of neural network primitives
It provides highly optimized implementations of convolution, pooling, normalization, and activation layers
Setting up a deep learning environment
Install Anaconda3-5.2.0
Install the DL frameworks
$ conda install -c conda-forge tensorflow
$ conda install -c conda-forge keras
Start the Jupyter server
$ jupyter notebook
Fundamentals of Image Processing
Pixels
Pixels are the raw building blocks of an image; every image consists of a set of pixels
A pixel is the "color" or "intensity" of light appearing at a given location in the image
An image with a resolution of 1024x768 is 1024 pixels wide and 768 pixels tall
Grayscale vs. color
In a grayscale image each pixel is a scalar value between 0 and 255, where zero corresponds to black and 255 to white; values in between are varying shades of gray
A pixel in the RGB color space is represented by a list of three values: one for red, one for green, and one for blue
Input images are preprocessed by mean subtraction or scaling, which requires converting the image to a floating-point data type
Image representation
We can think of an RGB image as three independent matrices of width W and height H, one per RGB component
A given pixel in an RGB image is a list of three integers in the range [0, 255]: one value for red, a second for green, and a final one for blue
An RGB image can be stored in a 3D NumPy multidimensional array with shape (height, width, depth).
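A minimal NumPy sketch of this layout (image size and pixel values are assumed for illustration):

import numpy as np

# A hypothetical 768x1024 RGB image: shape is (height, width, depth)
image = np.zeros((768, 1024, 3), dtype=np.uint8)

# Set the pixel at row 10, column 20 to pure red: (R, G, B) = (255, 0, 0)
image[10, 20] = [255, 0, 0]

print(image.shape)    # (768, 1024, 3)
print(image[10, 20])  # [255   0   0]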
Image classification
Image classification is the task of assigning an image a label from a predefined set of categories
Since all a computer sees is a large matrix of pixels, how do we represent an image in a form a computer can reason about?
Apply feature extraction: take an input image, run an algorithm over it, and obtain a feature vector that quantifies its content
Our goal is to apply deep learning algorithms to discover underlying patterns in a collection of images, so that we can correctly classify images the algorithm has not yet encountered
Fundamentals of Deep Learning
Data processing
Vectorization
Convert raw data into tensors
Normalization
Bring all feature values into the same range, with a standard deviation of 1 and a mean of 0 (see the sketch after this list)
Feature engineering
Make algorithms work better by applying human knowledge to the data before modeling
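A minimal NumPy sketch of normalization to zero mean and unit standard deviation (the feature matrix is an assumed toy):

import numpy as np

# Toy feature matrix: 4 samples, 2 features on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 500.0]])

# Normalize each feature (column) to mean 0 and standard deviation 1
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_norm.mean(axis=0))  # ~[0. 0.]
print(X_norm.std(axis=0))   # [1. 1.]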
What is a model?
The function generated when an ML algorithm is trained on a training dataset
For example: find the values of w and b such that f(x) = wx + b closely fits the data points.
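A minimal sketch of such a fit; np.polyfit with degree 1 returns the best-fit w and b (the data points are assumed toys):

import numpy as np

# Assumed toy data generated from roughly y = 2x + 1 with a little noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Least-squares fit of f(x) = wx + b to the data points
w, b = np.polyfit(x, y, 1)
print(w, b)  # close to 2 and 1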

Model weights
Model weights = parameters = kernels
The learnable part of the model, learned from the training data
Before learning begins, these parameters are initialized with random values
They are then adjusted toward the values that produce the best output
Model hyperparameters:
Variables set by hand before the model parameters are actually optimized
e.g. learning rate, batch size, number of epochs
Optimizer
Determines how the network weights are updated by dynamically adjusting the learning rate
Popular optimizers:
SGD (stochastic gradient descent)
RMSProp (root mean square propagation)
Momentum, Adam (adaptive moment estimation)
Loss Function
Also known as the error function or cost function
Quantifies how close the model's predictions are to the ground truth by accumulating the errors over the whole dataset and averaging them
The goal is to find the combination of weights that minimizes the loss function
Widely used loss functions (see the sketch after this list):
Cross-entropy (log loss)
Mean squared error (MSE)
Mean absolute error (MAE)
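A minimal NumPy sketch of these three losses (binary cross-entropy stands in for the cross-entropy family; labels and predictions are assumed):

import numpy as np

y_true = np.array([1.0, 0.0, 1.0, 1.0])  # assumed ground-truth labels
y_pred = np.array([0.9, 0.2, 0.7, 0.6])  # assumed model predictions

mse = np.mean((y_true - y_pred) ** 2)     # mean squared error
mae = np.mean(np.abs(y_true - y_pred))    # mean absolute error
# binary cross-entropy (log loss)
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(mse, mae, bce)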
Weight Updates

Learning rate
Controls how much we adjust the network weights with respect to the loss gradient
A coefficient that scales how strongly each weight moves in response to the loss gradient
It determines how much of the gradient is used in the algorithm's next step, as the sketch below shows.
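The standard update rule as a tiny sketch (the numbers are assumed):

# Gradient-descent weight update: new_weight = old_weight - learning_rate * gradient
weight = 0.8          # assumed current weight
gradient = 0.5        # assumed gradient of the loss with respect to this weight
learning_rate = 0.01  # hyperparameter set before training

weight = weight - learning_rate * gradient
print(weight)  # 0.795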

Metrics
Researchers use metrics to judge the model's performance on a validation set after each epoch
Classification metrics (see the sketch after this list):
Accuracy
Precision
Recall
Regression metrics:
Mean absolute error
Mean squared error
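A minimal sketch of the three classification metrics, computed from assumed confusion-matrix counts:

# tp, fp, fn, tn are assumed example counts of true/false positives/negatives
tp, fp, fn, tn = 8, 2, 1, 9

accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction of correct predictions
precision = tp / (tp + fp)  # of the predicted positives, how many are real
recall = tp / (tp + fn)     # of the real positives, how many were found

print(accuracy, precision, recall)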
The Basic Mathematics of Deep Learning
Linear algebra: matrix operations
Calculus: differentiation, gradient descent
Statistics: probability
Tensors
A tensor is:
A container for numerical data
A multidimensional array
The fundamental data structure of machine learning

Tensor Operations
Element-wise product
Multiplies each element of one tensor by the corresponding element of another tensor

Broadcasting
The smaller tensor is broadcast (replicated) to match the shape of the larger tensor
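A minimal NumPy sketch covering both operations above (array values assumed):

import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# Element-wise (Hadamard) product: multiplies corresponding entries
print(a * b)  # [[ 10  40]
              #  [ 90 160]]

# Broadcasting: the 1-D tensor is stretched across a's rows
c = np.array([10, 100])
print(a * c)  # [[ 10 200]
              #  [ 30 400]]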

Dot Product Example
The number of columns of the first matrix must equal the number of rows of the second matrix
A(m, n) x B(n, k) = C(m, k)
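A minimal NumPy sketch of the shape rule:

import numpy as np

A = np.ones((2, 3))  # m=2, n=3
B = np.ones((3, 4))  # n=3, k=4

C = np.dot(A, B)     # valid because A's columns (3) match B's rows (3)
print(C.shape)       # (2, 4), i.e. (m, k)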

Derivative vs. Gradient
The derivative measures the "rate of change" of a scalar-valued function
The gradient is the derivative of a tensor operation: the multidimensional generalization of the derivative
The derivative is defined for functions of a single variable, while the gradient applies to functions of several variables.


ÌݶÈϽµ
ÌݶÈϽµ·¨ÊÇÕÒµ½Ò»¸öº¯ÊýµÄ¾Ö²¿×îСֵ
³É±¾º¯Êý¸æËßÎÒÃÇÀëÈ«¾Ö×îСֵÓжàÔ¶
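A minimal sketch of gradient descent finding the minimum of f(x) = (x - 3)^2, whose derivative is 2(x - 3):

# Repeatedly step against the gradient until x settles at the minimum
x = 0.0              # assumed starting point
learning_rate = 0.1

for _ in range(100):
    grad = 2 * (x - 3)        # derivative of (x - 3)^2
    x -= learning_rate * grad

print(x)  # approaches 3.0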

Artificial Neural Networks
Many simple units work in parallel without a centralized control unit
An ANN consists of an input layer, an output layer, and hidden layers
In a feedforward neural network, the units in each layer are fully connected to all units in the adjacent layers.

Neurons
A neuron is a mathematical function and the basic processing element of an artificial neural network
Each neuron in an ANN computes a weighted sum over all of its inputs, adds a constant bias, and passes the result through a nonlinear activation function
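A minimal sketch of a single neuron (inputs, weights, and bias are assumed values; ReLU stands in for the activation):

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Weighted sum of inputs, plus bias, passed through the activation
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, 0.2])
bias = 0.1

output = relu(np.dot(weights, inputs) + bias)
print(output)  # 0.4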

Layers
A layer is a data-processing module that takes tensors as input and produces tensors as output.
Hidden layers are the layers that cannot be observed from outside the network
Most layers have weights, though some have none
Each layer can have a different number of units
Connection weights and biases
Weights are coefficients that amplify the input signal or dampen it, suppressing noise between network units.
By shrinking some weights and enlarging others, the network makes some features significant and minimizes others, thereby learning which results are useful.
The bias is a scalar-valued input that ensures that at least a few units per layer are activated, regardless of signal strength
In essence, an ANN aims to optimize its weights and biases so as to minimize the error
Activation Functions
When a nonzero value passes from one unit to another, that unit is said to be activated
The activation function transforms the combination of inputs, weights, and bias from one layer into the input of the next
It introduces nonlinearity into the network's modeling capacity
Popular activation functions
Sigmoid
Tanh
ReLU
Leaky ReLU
Softmax
SoftPlus
°¸ÀýÑо¿£ºÊÖдÊý×Ö·ÖÀà
LeCunµÈÔÚ1980Äê´ú×é×°µÄMNISTÊý¾Ý¼¯
70,000¸ö¾ßÓÐ28 x 28ÏñËØÕý·½ÐεÄÑù±¾Í¼Ïñ
¶àÀà·ÖÀàÎÊÌâ
ÊäÈë²ã
ʹÓÃ28 x 28 = 784ÏñËØ×÷ΪǰÀ¡ÍøÂçÊäÈë²ãÖеÄÊäÈëÏòÁ¿
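A minimal sketch of this flattening step; it assumes the standalone Keras installed above and its standard MNIST loader:

from keras.datasets import mnist

# Load MNIST: 60,000 training and 10,000 test images, each 28 x 28 pixels
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28 x 28 image into a 784-dimensional vector and
# scale pixel intensities from [0, 255] down to [0, 1]
x_train = x_train.reshape((60000, 28 * 28)).astype("float32") / 255.0
print(x_train.shape)  # (60000, 784)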

Sigmoid Functions
The logistic function maps any value into [0, 1], which lets it be interpreted as the probability of an output
Every time the gradient signal flows through a sigmoid gate, its magnitude is multiplied by at most 0.25 (the maximum of the sigmoid's derivative)

Softmax Functions
Softmax returns a probability distribution over mutually exclusive output classes
The resulting vector clearly highlights the maximum element, which is pushed close to 1, while the ordering of the elements is preserved
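A minimal sketch of softmax (the raw scores are assumed; subtracting the max is a standard numerical-stability trick):

import numpy as np

def softmax(z):
    # Exponentiate and normalize so the result sums to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # assumed raw outputs (logits)
print(softmax(scores))  # about [0.66 0.24 0.10]; the ordering is preserved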

Output layer
Softmax is used on the output layer to convert the outputs into probability-like values

Classification Loss Function
Categorical cross-entropy measures the performance of a multiclass classification model

Here Y' is the true target and Y is the model's predicted output; the output layer is activated by a sigmoid
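As a minimal sketch, categorical cross-entropy for a single one-hot sample (the values are assumed):

import numpy as np

# y_true is the one-hot ground truth, y_pred the predicted probabilities
y_true = np.array([0.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.8, 0.1])

loss = -np.sum(y_true * np.log(y_pred))
print(loss)  # -log(0.8), about 0.22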
SGD, Batch and Epoch
Stochastic gradient descent (SGD) computes the gradient and updates the weight matrix on each individual training sample
Per-sample SGD makes each update fast, but gives up the efficiency of vectorization; using the entire dataset at once is vectorized but slow.
Instead of computing the gradient on the whole dataset or on a single sample, we usually evaluate the gradient on a mini-batch (16, 32, or 64 samples) and then update the weight matrix
An epoch is one forward pass plus one backward pass over all of the training samples (a schematic loop follows)
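A schematic mini-batch training loop under these definitions (the data and batch size are assumed toys; the gradient step is elided):

import numpy as np

X = np.random.randn(1000, 10)  # 1000 training samples, 10 features each
batch_size = 32
epochs = 3

for epoch in range(epochs):    # one epoch = one full pass over the data
    np.random.shuffle(X)       # reshuffle the samples each epoch
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]  # one mini-batch of <= 32 samples
        # compute the gradient on this mini-batch, then update the weights here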