Getting Started Series: Building a Machine Learning Classifier in Python with Scikit-learn
 
 2018-11-15
 
Editor's recommendation:

This article comes from segmentfault. It explains in detail how to build a machine learning classifier in Python.

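Introduction

Machine learning is a field of research spanning computer science, artificial intelligence, and statistics. Its focus is on training algorithms to learn patterns and make predictions from data. Machine learning is especially valuable because it lets us use computers to automate decision-making processes.

In this tutorial, you will implement a simple machine learning algorithm in Python using Scikit-learn, a machine learning toolkit for Python. Using a database of breast cancer tumor information, you will use a Naive Bayes (NB) classifier to predict whether a tumor is malignant or benign.

By the end of this tutorial, you will know how to build your own machine learning model in Python.

Prerequisites

To complete this tutorial, you will need:

A local Python 3 programming environment.

Jupyter Notebook installed in a virtualenv. Jupyter Notebooks are very useful when running machine learning experiments: you can run short blocks of code and see the results quickly, which makes it easy to test and debug your code.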

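Step 1 - Importing Scikit-learn

Let's begin by installing the Python module Scikit-learn, one of the best and most thoroughly documented machine learning libraries for Python.

To begin our coding project, first activate our Python 3 programming environment. Make sure you are in the directory where the environment is located, then run the following command: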

$ . my_env/bin/activate

With our programming environment activated, check whether the Scikit-learn module is already installed:

(my_env) $ python -c "import sklearn"

If sklearn is installed, this command completes with no error. If it is not installed, you will see the following error message:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named 'sklearn'

The error message indicates that sklearn is not installed, so download the library using pip:

(my_env) $ pip install scikit-learn[alldeps]
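The same check can also be made from inside Python instead of reading a traceback. The sketch below uses only the standard library. The helper name module_available is an illustrative assumption and is not part of Scikit-learn.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# "sklearn" is the import name of the scikit-learn package.
if module_available("sklearn"):
    import sklearn
    print("scikit-learn version:", sklearn.__version__)
else:
    print("scikit-learn is not installed in this environment")
```

Running this in the first Notebook cell is one way to confirm that the Notebook kernel uses the same environment in which the library was installed.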

Once the installation completes, launch Jupyter Notebook:

(my_env) $ jupyter notebook

In Jupyter, create a new Python Notebook called ML Tutorial. In the first cell of the Notebook, import the sklearn module:

ML Tutorial

import sklearn

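Your Notebook should look like the following figure:

[Figure: Notebook]

Now that we have imported sklearn in our Notebook, we can begin working with the dataset for our machine learning model.

Step 2 - Importing Scikit-learn's Dataset

The dataset we will be working with in this tutorial is the Breast Cancer Wisconsin Diagnostic Database. The dataset includes various information about breast cancer tumors, as well as classification labels of malignant or benign. It has 569 instances, one per tumor, and includes information on 30 attributes, or features, such as the radius, texture, smoothness, and area of a tumor.

Using this dataset, we will build a machine learning model that uses tumor information to predict whether a tumor is malignant or benign.

Scikit-learn comes installed with various datasets that we can load into Python, and the dataset we want is included. Import and load the dataset: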

ML Tutorial

...

from sklearn.datasets import load_breast_cancer

# Load dataset
data = load_breast_cancer()

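The data variable represents a Python object that works like a dictionary. The important dictionary keys to consider are the classification label names (target_names), the actual labels (target), the attribute/feature names (feature_names), and the attributes (data).

Attributes are a critical part of any classifier. They capture important characteristics about the nature of the data. Given that the label we are trying to predict is malignant versus benign tumor, possible useful attributes include the size, radius, and texture of the tumor.

Create new variables for each important set of information and assign the data: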

ML Tutorial

...

# Organize our data
label_names = data['target_names']
labels = data['target']
feature_names = data['feature_names']
features = data['data']
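As a quick sanity check on the dataset description, the shapes of these arrays can be inspected. This is a minimal sketch that reloads the dataset so it runs on its own. The printed shapes follow from the 569 instances and 30 features mentioned earlier.

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# The returned object behaves like a dictionary, so its keys can be listed.
print(sorted(data.keys()))

label_names = data['target_names']
labels = data['target']
feature_names = data['feature_names']
features = data['data']

# One row per tumor and one column per feature.
print(features.shape)      # (569, 30)
print(labels.shape)        # (569,)
print(len(feature_names))  # 30
print(len(label_names))    # 2
```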

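We now have an array for each set of information. To get a better understanding of our dataset, let's take a look at our data by printing our class labels, the label of the first data instance, the name of the first feature, and the feature values of the first data instance: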

ML Tutorial

...

# Look at our data
print(label_names)
print(labels[0])
print(feature_names[0])
print(features[0])

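If you run the code, you will see the following results:

[Figure: output]

As the output shows, our class names are malignant and benign, which are mapped to the binary values 0 and 1, where 0 represents malignant tumors and 1 represents benign tumors. Our first data instance is therefore a malignant tumor whose mean radius is 1.79900000e+01.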
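Because the labels are the integers 0 and 1, they can be used directly as indices into the array of class names. The following sketch shows that mapping and also counts how many tumors belong to each class. The variable names first_label_name and class_counts are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
label_names = data['target_names']
labels = data['target']

# Integer labels index directly into the array of class names.
first_label_name = str(label_names[labels[0]])
print(first_label_name)  # malignant

# Count how many tumors fall into each class.
counts = np.bincount(labels)
class_counts = {str(name): int(n) for name, n in zip(label_names, counts)}
print(class_counts)  # {'malignant': 212, 'benign': 357}
```

The classes are not perfectly balanced, which is worth keeping in mind when interpreting an accuracy figure later on.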

Now that we have our data loaded, we can use it to build our machine learning classifier.

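Step 3 - Organizing Data into Sets

To evaluate how well a classifier performs, you should always test the model on unseen data. Therefore, before building a model, split your data into two parts: a training set and a test set.

You use the training set to train and evaluate the model during the development stage. You then use the trained model to make predictions on the unseen test set. This approach gives you a sense of the model's performance and robustness.

Fortunately, sklearn has a function called train_test_split() that divides your data into these sets. Import the function and then use it to split the data: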

ML Tutorial

...

from sklearn.model_selection import train_test_split

# Split our data
train, test, train_labels, test_labels = train_test_split(features,
                                                          labels,
                                                          test_size=0.33,
                                                          random_state=42)

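The function randomly splits the data using the test_size parameter. In this example, we now have a test set (test) that represents 33% of the original dataset. The remaining data (train) then makes up the training data. We also have the respective labels for both the train and test variables, i.e. train_labels and test_labels. Setting random_state=42 makes the split reproducible from run to run.

We can now move on to training our first model.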
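To make the idea of a random split concrete, here is a minimal pure-Python sketch of shuffling row indices and cutting them into two groups. It is an illustration only and not how sklearn implements train_test_split(). The function name simple_train_test_split and the toy data are assumptions made for this example.

```python
import math
import random


def simple_train_test_split(rows, labels, test_size=0.33, seed=42):
    """Shuffle row indices, then cut them into a test part and a train part."""
    n = len(rows)
    n_test = math.ceil(n * test_size)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


# Toy data: ten rows with two features each, and alternating labels.
toy_rows = [[i, i * 2] for i in range(10)]
toy_labels = [i % 2 for i in range(10)]

toy_train, toy_test, toy_train_labels, toy_test_labels = simple_train_test_split(
    toy_rows, toy_labels)
print(len(toy_train), len(toy_test))  # 6 4
```

Each row keeps its own label because the same shuffled indices are applied to both the rows and the labels.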

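Step 4 - Building and Evaluating the Model

There are many models for machine learning, and each has its own strengths and weaknesses. In this tutorial, we will focus on a simple algorithm that usually performs well in binary classification tasks, namely Naive Bayes (NB).

First, import the GaussianNB class from the sklearn.naive_bayes module. Then initialize the model with GaussianNB(), and train it by fitting it to the data using gnb.fit():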

ML Tutorial

...

from sklearn.naive_bayes import GaussianNB

# Initialize our classifier
gnb = GaussianNB()

# Train our classifier
model = gnb.fit(train, train_labels)
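To see what fitting produced, the attributes of the fitted classifier can be inspected. The sketch below repeats the earlier steps so that it runs on its own. It relies on the fitted attributes classes_, class_prior_ and theta_ of GaussianNB, and on the fact that fit() returns the estimator itself, so model and gnb refer to the same object.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = load_breast_cancer()
train, test, train_labels, test_labels = train_test_split(
    data['data'], data['target'], test_size=0.33, random_state=42)

gnb = GaussianNB()
model = gnb.fit(train, train_labels)

# fit() returns the estimator itself.
print(model is gnb)  # True

# Class labels seen during training.
print(gnb.classes_)

# Fraction of training instances in each class (the class priors).
print(gnb.class_prior_)

# Per-class mean of every feature: one row per class, one column per feature.
print(gnb.theta_.shape)  # (2, 30)
```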

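After we train the model, we can use the trained model to make predictions on our test set, which we do using the predict() function. The predict() function returns an array of predictions, one for each data instance in the test set. We can then print our predictions to get a sense of what the model determined.

Use the predict() function with the test set and print the results:

ML Tutorial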

...

# Make predictions
preds = gnb.predict(test)
print(preds)

ÔËÐдúÂ룬Äú½«¿´µ½ÒÔϽá¹û£º

Ô¤²âÊä³ö½á¹û

ÕýÈçÄúÔÚJupyter NotebookÊä³öÖп´µ½µÄ£¬¸Ãpredict()º¯Êý·µ»ØÁËÒ»¸ö0sºÍ1s Êý×飬ËüÃÇ´ú±íÁËÎÒÃǶÔÖ×ÁöÀàµÄÔ¤²âÖµ£¨¶ñÐÔÓëÁ¼ÐÔ£©¡£
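An array of 0s and 1s is hard to read at a glance. As a minimal sketch, the predictions can be turned back into class names by indexing label_names, and predict_proba() can be used to see how confident the model is for each instance. The block repeats the earlier steps so that it runs on its own.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = load_breast_cancer()
label_names = data['target_names']
train, test, train_labels, test_labels = train_test_split(
    data['data'], data['target'], test_size=0.33, random_state=42)

gnb = GaussianNB()
gnb.fit(train, train_labels)

# Predicted class index (0 or 1) for every instance in the test set.
preds = gnb.predict(test)

# Map the predicted indices back to their class names.
pred_names = label_names[preds]
print(pred_names[:5])

# Estimated probability of each class, one row per test instance.
probs = gnb.predict_proba(test)
print(np.round(probs[:5], 3))
```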

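Now that we have our predictions, let's evaluate how well our classifier is performing.

Step 5 - Evaluating the Model's Accuracy

Using the array of true class labels, we can evaluate the accuracy of our model's predicted values by comparing the two arrays (test_labels vs. preds). We will use the sklearn function accuracy_score() to determine the accuracy of our machine learning classifier.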

ML Tutorial

...

from sklearn.metrics import accuracy_score

# Evaluate accuracy
print(accuracy_score(test_labels, preds))

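You will see the following result:

[Figure: accuracy result]

As you see in the output, the NB classifier is 94.15% accurate. This means that the classifier correctly predicts whether a tumor is malignant or benign for 94.15% of the tumors in the test set. These results suggest that our feature set of 30 attributes is a good indicator of tumor class.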
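Accuracy is simply the fraction of predictions that match the true labels, so it can be recomputed by hand as a check. The sketch below does that and also prints a confusion matrix using sklearn's confusion_matrix(), which shows which kinds of mistakes were made. The confusion matrix is an addition for illustration and was not part of the steps above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = load_breast_cancer()
train, test, train_labels, test_labels = train_test_split(
    data['data'], data['target'], test_size=0.33, random_state=42)

gnb = GaussianNB()
gnb.fit(train, train_labels)
preds = gnb.predict(test)

# Accuracy by hand: the share of predictions equal to the true label.
manual_accuracy = float(np.mean(preds == test_labels))
library_accuracy = accuracy_score(test_labels, preds)
print(manual_accuracy, library_accuracy)

# Rows are true classes, columns are predicted classes (0 = malignant, 1 = benign).
cm = confusion_matrix(test_labels, preds)
print(cm)
```

The off-diagonal entries of the matrix are the misclassified tumors. For a medical task, the two kinds of error are usually not equally costly, which a single accuracy number does not show.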

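You have successfully built your first machine learning classifier. Let's reorganize the code by placing all import statements at the top of the Notebook or script. The final version of the code should look like this:

ML Tutorial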

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()

# Organize our data
label_names = data['target_names']
labels = data['target']
feature_names = data['feature_names']
features = data['data']

# Look at our data
print(label_names)
print('Class label = ', labels[0])
print(feature_names)
print(features[0])

# Split our data
train, test, train_labels, test_labels = train_test_split(features,
                                                          labels,
                                                          test_size=0.33,
                                                          random_state=42)

# Initialize our classifier
gnb = GaussianNB()

# Train our classifier
model = gnb.fit(train, train_labels)

# Make predictions
preds = gnb.predict(test)
print(preds)

# Evaluate accuracy
print(accuracy_score(test_labels, preds))

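Now you can continue to work with your code to see if you can make your classifier perform even better. You could experiment with different subsets of features, or even try completely different algorithms.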
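As one possible starting point for such experiments, the sketch below compares three setups on the same split: GaussianNB on all features, GaussianNB on only the features whose names start with "mean ", and a scaled logistic regression. The choice of subset, the alternative algorithm, and the helper name evaluate are assumptions made for illustration, not recommendations from the tutorial.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
features, labels = data['data'], data['target']
feature_names = data['feature_names']

# Feature subset: keep only the columns whose name starts with "mean ".
mean_columns = [i for i, name in enumerate(feature_names)
                if str(name).startswith('mean ')]
subset = features[:, mean_columns]


def evaluate(model, X, y):
    """Train on the same 67/33 split as before and return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=42)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


results = {
    'GaussianNB, all features': evaluate(GaussianNB(), features, labels),
    'GaussianNB, mean features only': evaluate(GaussianNB(), subset, labels),
    'LogisticRegression (scaled), all features': evaluate(
        make_pipeline(StandardScaler(), LogisticRegression()), features, labels),
}

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.4f}")
```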

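Conclusion

In this tutorial, you learned how to build a machine learning classifier in Python. Now you can load data, organize data, train, predict, and evaluate machine learning classifiers in Python using Scikit-learn. The steps in this tutorial should help you facilitate the process of working with your own data in Python.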

   