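Editor's note:
This article surveys the current state of the automated feature engineering modules in AutoML and introduces the technical approaches used in AutoML. We hope it is helpful for your studies.
This article comes from cnblogs (博客园) and was edited and recommended by Alice of 火龙果软件.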
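1. Introduction
In my view, machine learning is evolving toward greater ease of use, a lower technical barrier, and more agile, lower-cost development, and the rise of AutoML and AutoDL is the clearest evidence of this. I therefore spent some time studying the AutoML field and summarizing the technical approaches it uses.
As is well known, a complete machine learning project can be summarized in the following four steps.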

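Of these, feature engineering (feature extraction) is often the step that matters most for model performance. Feature engineering and hyperparameter tuning also tend to be the most time-consuming parts of machine learning. As a result, many models are moved from the experimental stage to production too early because of time constraints, and so end up suboptimal.
Automated machine learning (AutoML) frameworks aim to lighten the load on algorithm engineers, so that they spend less time on feature engineering and hyperparameter tuning and more time experimenting with model design.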

This article surveys the current state of the automated feature engineering modules in AutoML. The mainstream open-source AutoML packages are listed below.

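2. What is automated feature engineering?
Automated feature engineering automatically creates candidate features from a dataset and then selects a number of the best ones for training.
3. Automated feature engineering toolkits
3.1 Featuretools
Featuretools uses an algorithm called Deep Feature Synthesis (DFS), which traverses the relationship paths described by the schema of a relational database. As DFS traverses these paths, it generates synthesized features by applying operations (including sum, mean, and count) to the data. For example, it can apply a sum operation to the list of transactions for a given client_id and aggregate those transactions into a single column. Although this is only a depth-one operation, the algorithm can traverse to much deeper features. The greatest strengths of Featuretools are its reliability and its ability to handle information leakage; it can also be used to process time series data.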
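Example:
Suppose there are three tables: clients, loans, and payments.
clients: basic information about the customers of a credit union. Each client has exactly one row in this dataframe.

loans: loans made to the clients. Each loan has its own single row in this dataframe, but a client may have several loans.

payments: payments made on the loans. Each payment has exactly one row, but each loan has multiple payments.

Features are built with each client_id as the unit of analysis:
The traditional approach to feature engineering is to process the required features with Pandas, for example extracting the month and taking the logarithm of the income, as in the table below.

In addition, new features can be obtained by joining with the loans table (the mean loan amount per client, the maximum loan amount, and so on).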
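As a rough illustration of the manual approach described above, the following sketch builds these kinds of features with pandas. The tiny tables and the column names income and loan_amount are assumptions made for the example; they are not taken from the original data.

```python
import numpy as np
import pandas as pd

# Toy tables; the values and the income / loan_amount columns are made up for illustration.
clients = pd.DataFrame({
    "client_id": [1, 2],
    "joined": pd.to_datetime(["2015-03-14", "2016-11-02"]),
    "income": [1000.0, 100.0],
})
loans = pd.DataFrame({
    "loan_id": [10, 11, 12],
    "client_id": [1, 1, 2],
    "loan_amount": [500.0, 1500.0, 800.0],
})

# Single-table transformations: month of joining and log of income.
clients["join_month"] = clients["joined"].dt.month
clients["log_income"] = np.log(clients["income"])

# Aggregations over the related table: mean and max loan amount per client.
loan_stats = (
    loans.groupby("client_id")["loan_amount"]
    .agg(mean_loan_amount="mean", max_loan_amount="max")
    .reset_index()
)

# Attach the aggregated features to the clients table.
client_features = clients.merge(loan_stats, on="client_id", how="left")
print(client_features)
```

Every new feature here has to be thought of and coded by hand, which is the effort that automated tools try to remove.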

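Featuretools, by contrast, is built on a method called "Deep Feature Synthesis", which carries out feature engineering by stacking multiple features.
Deep feature synthesis stacks multiple transformation and aggregation operations (called feature primitives in the Featuretools vocabulary) to create features from data spread across many tables.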
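To make the idea of stacking concrete, here is a minimal pandas sketch of a depth-two feature computed by hand: a sum over payments per loan, followed by a mean over loans per client. This is not how Featuretools is implemented; the toy data and the payment_amount column are assumptions for the example, and the feature names only imitate the style of names that DFS produces.

```python
import pandas as pd

# Toy tables; values and the payment_amount column are made up for illustration.
loans = pd.DataFrame({"loan_id": [10, 11, 12], "client_id": [1, 1, 2]})
payments = pd.DataFrame({
    "loan_id": [10, 10, 11, 12, 12],
    "payment_amount": [100.0, 200.0, 50.0, 300.0, 100.0],
})

# Depth 1: aggregate payments up to their parent loan (sum per loan_id).
loan_totals = (
    payments.groupby("loan_id")["payment_amount"].sum()
    .rename("SUM(payments.payment_amount)")
    .reset_index()
)
loans = loans.merge(loan_totals, on="loan_id", how="left")

# Depth 2: aggregate the depth-1 feature up to the parent client (mean per client_id).
client_feature = (
    loans.groupby("client_id")["SUM(payments.payment_amount)"].mean()
    .rename("MEAN(loans.SUM(payments.payment_amount))")
)
print(client_feature)
```

Each level of stacking follows one relationship between tables, so the depth of a feature corresponds to how many primitives were chained to build it.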
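Featuretools has two main concepts:
The first is entities; an entity can be thought of as a single table.
The second is the entityset, which is a collection of entities (tables) together with the relationships between them.
First, create an empty entityset object that will hold all the data tables: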
import featuretools as ft

# Create an empty EntitySet (this walkthrough uses the Featuretools API from before version 1.0)
es = ft.EntitySet(id='clients')
Next, add the entities. Every entity must have an index, which is a column whose values are all unique; that is, each value in the index may appear in the table only once.
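# clients: one row per client, indexed by client_id
es = es.entity_from_dataframe(entity_id='clients',
                              dataframe=clients,
                              index='client_id',
                              time_index='joined')
# loans: one row per loan, indexed by loan_id (the column used in the relationship with payments)
es = es.entity_from_dataframe(entity_id='loans',
                              dataframe=loans,
                              index='loan_id',
                              time_index='joined')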
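For a table that has no unique index, pass the parameter make_index=True and specify a name for the index.
In addition, although featuretools automatically infers the data type of each column in an entity, you can override it by passing a dictionary of column types to the variable_types parameter. For example, here the "missed" field is defined as a categorical variable.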
# payments has no unique id column, so ask featuretools to create one
es = es.entity_from_dataframe(entity_id='payments',
                              dataframe=payments,
                              variable_types={'missed': ft.variable_types.Categorical},
                              make_index=True,
                              index='payment_id',
                              time_index='payment_date')
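To specify a relationship between tables in featuretools, which it needs in order to compute aggregations, you only have to name the field that links the two tables. The clients and loans tables are linked by the client_id field, and loans and payments are linked by the loan_id field.
The code that creates the relationships between the tables and adds them to the entityset is shown below: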
# Relate the clients table to the loans table (parent column first, then child column)
r_client_previous = ft.Relationship(es['clients']['client_id'],
                                    es['loans']['client_id'])
# Add the relationship to the entityset
es = es.add_relationship(r_client_previous)
# Relate the loans table to the payments table
r_payments = ft.Relationship(es['loans']['loan_id'],
                             es['payments']['loan_id'])
# Add the relationship to the entityset
es = es.add_relationship(r_payments)
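Once the entities have been added and the relationships formalized, the entityset is complete.
Note that featuretools constructs features through the following two kinds of operation:
Aggregations: group-and-aggregate operations
Transformations: computations between columns
In featuretools you can use these primitives to create new features yourself, and you can also stack several primitives on top of one another. Below is a list of some of the feature primitives in featuretools: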

We can also define custom primitives.
Next comes feature construction, which is the most important step in automated feature engineering:
# Run deep feature synthesis with an explicit choice of primitives
features, feature_names = ft.dfs(entityset=es,
                                 target_entity='clients',
                                 agg_primitives=['mean', 'max', 'percent_true', 'last'],
                                 trans_primitives=['years', 'month', 'subtract', 'divide'])
Of course, you can also let featuretools select the features for us automatically:
# Use the default primitives and stack them up to two levels deep
features, feature_names = ft.dfs(entityset=es,
                                 target_entity='clients',
                                 max_depth=2)
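3.2 Boruta
Boruta is mainly used for feature selection, so strictly speaking it is not the kind of automated feature engineering package we are looking for.
Boruta-py is an implementation of the Boruta feature reduction strategy, in which the problem is framed as an "all-relevant" one: the algorithm keeps every feature that contributes significantly to the model. This is the opposite of the "minimal-optimal" feature set that many feature reduction algorithms aim for. The Boruta method determines a feature's importance by creating a synthetic feature made of the randomly reordered values of the target feature, and then training a simple tree-based classifier on the original feature set in which the target feature is replaced by the synthetic one. The difference in performance across all the features is used to compute their relative importance.
The Boruta function evaluates the importance of each variable in a loop. In each iteration it compares the importance of the original variables with that of the shadow variables. If an original variable's importance is significantly higher than that of the shadow variables, the original variable is considered important; if it is clearly lower, the original variable is considered unimportant. Here the original variables are the input variables on which feature selection is to be performed, and the shadow variables are variables generated from the original ones.
The generation rules are:
First, add a random perturbation to the original variables, which yields the extended variables.
Then sample from the extended variables to obtain the shadow variables.
A shadow feature can be implemented in Python along these lines: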
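# Copy the feature values from the training set (a copy, so the original column is not shuffled too)
z = train_df[f].values.copy()
# Shuffle the copy in place
np.random.shuffle(z)
# Store the shuffled values as the shadow feature
train_df[f + "shadow"] = z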
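The Boruta algorithm runs as follows:
First, it adds randomness to the given dataset by creating shuffled copies of all the features (the shadow features).
Then it trains a random forest classifier on the extended dataset and applies a feature importance measure (mean decrease in accuracy by default) to evaluate the importance of each feature; the higher the score, the more important the feature.
In each iteration it checks whether a real feature is more important than the best shadow feature (that is, whether the feature scores higher than the highest-scoring shadow feature), and it keeps removing features that it deems highly unimportant.
Finally, the algorithm stops when every feature has been confirmed or rejected, or when it reaches a specified limit on the number of random forest runs.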
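The steps above can be sketched roughly as follows. This is a simplified illustration, not the Boruta-py implementation: it uses scikit-learn's impurity-based feature_importances_ rather than mean decrease in accuracy, it only counts how often each real feature beats the best shadow feature, and it leaves out the statistical test that decides when a feature is confirmed or rejected. The synthetic data, the feature names signal and noise, and the number of rounds are all assumptions for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "signal": rng.normal(size=n),   # determines the label
    "noise": rng.normal(size=n),    # unrelated to the label
})
y = (X["signal"] > 0).astype(int)

n_rounds = 10
hits = {name: 0 for name in X.columns}

for round_no in range(n_rounds):
    # Step 1: shadow features are independently shuffled copies of each real feature.
    shadows = pd.DataFrame(
        {name + "_shadow": rng.permutation(X[name].to_numpy()) for name in X.columns}
    )
    extended = pd.concat([X, shadows], axis=1)

    # Step 2: fit a random forest on the extended data and read off the importances.
    forest = RandomForestClassifier(n_estimators=50, random_state=round_no)
    forest.fit(extended, y)
    importance = pd.Series(forest.feature_importances_, index=extended.columns)

    # Step 3: a real feature scores a hit when it beats the best shadow feature.
    best_shadow = importance[shadows.columns].max()
    for name in X.columns:
        if importance[name] > best_shadow:
            hits[name] += 1

print(hits)
```

A feature that collects hits in nearly every round is a candidate for confirmation, while one that rarely beats the shadows is a candidate for rejection; the real algorithm makes that call with a significance test over the hit counts.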
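3.3 tsfresh
tsfresh is a time series feature extraction tool based on scalable hypothesis tests. The package contains a variety of feature extraction methods and a robust feature selection algorithm.
tsfresh can automatically extract more than 100 features from a time series. These features describe basic characteristics of the series, such as the number of peaks, the mean, or the maximum, as well as more complex ones, such as the time reversal symmetry statistic.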

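This set of features can then be used to build statistical or machine learning models on the time series, for example in regression or classification tasks.
Time series often contain noise, redundancy, or irrelevant information. As a result, most of the extracted features are of no use for the machine learning task at hand. To avoid extracting irrelevant features, the tsfresh package has a built-in filtering procedure, which evaluates the explanatory power and importance of each feature for the regression or classification task at hand. It is built on well-established hypothesis testing theory and uses several test methods.
Note that before extracting features with tsfresh, the data must be restructured, generally into a (None, 2) structure (any number of rows, two columns), for example as shown in the figure below: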
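As a minimal sketch of that restructuring, the snippet below turns a small wide table (one row per series, one column per time step) into a two-column long layout. The toy values are made up; the column names id and feature simply mirror the ones used in this article's tsfresh example.

```python
import numpy as np
import pandas as pd

# Wide layout: 3 series (rows), each with 4 readings (columns).
wide = pd.DataFrame([
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 1.1, 1.2, 1.3],
    [2.0, 2.5, 3.0, 3.5],
])
n_series, n_steps = wide.shape

# Long layout: one row per reading, tagged with the id of the series it belongs to.
long_df = pd.DataFrame({
    "id": np.arange(n_series).repeat(n_steps),
    "feature": wide.to_numpy().flatten(),
})
print(long_df.shape)
```

Row-major flattening keeps the readings of each series contiguous and in time order, which is why repeating each id n_steps times lines up with the flattened values.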

Example:
import matplotlib.pylab as plt
from tsfresh import extract_features, select_features
from tsfresh.utilities.dataframe_functions import impute
from tsfresh.feature_extraction import ComprehensiveFCParameters
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import pandas as pd
import numpy as np

if __name__ == '__main__':
    N = 500  # number of time series (rows) to use
    df = pd.read_csv('UCI HAR Dataset/train/Inertial Signals/body_acc_x_train.txt',
                     sep=r'\s+', header=None)
    y = pd.read_csv('UCI HAR Dataset/train/y_train.txt',
                    sep=r'\s+', header=None).squeeze('columns')[:N]
    # plt.title('accelerometer reading')
    # plt.plot(df.iloc[0, :])
    # plt.show()

    # With time series feature engineering
    extraction_settings = ComprehensiveFCParameters()
    # Long format: one row per reading, tagged with the id of its series
    master_df = pd.DataFrame({'feature': df[:N].values.flatten(),
                              'id': np.arange(N).repeat(df.shape[1])})
    X = extract_features(timeseries_container=master_df,
                         n_jobs=0, column_id='id', impute_function=impute,
                         default_fc_parameters=extraction_settings)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    cl = DecisionTreeClassifier()
    cl.fit(X_train, y_train)
    print(classification_report(y_test, cl.predict(X_test)))

    # Without time series feature engineering
    X_1 = df.iloc[:N, :]
    X_train, X_test, y_train, y_test = train_test_split(X_1, y, test_size=.2)
    cl = DecisionTreeClassifier()
    cl.fit(X_train, y_train)
    print(classification_report(y_test, cl.predict(X_test)))
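Feature selection can additionally be applied to the dataset produced by time series feature engineering, to further improve the model's metrics.
Here we can use the tsfresh.select_features method for feature selection. However, because it only works for binary classification or regression tasks, for a multi-class problem with 6 labels we convert it into 6 binary classification problems, so that feature selection can be performed for each class through a binary classification: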
# X_train, X_test, y_train, y_test must be the split of the tsfresh features X,
# not the later split of the raw readings
relevant_features = set()
for label in y.unique():
    # One-vs-rest target: True for the current class, False otherwise
    y_train_binary = y_train == label
    X_train_filtered = select_features(X_train, y_train_binary)
    print("Number of relevant features for class {}: {}/{}".format(
        label, X_train_filtered.shape[1], X_train.shape[1]))
    relevant_features = relevant_features.union(set(X_train_filtered.columns))

# Keep the union of the features found relevant for at least one class
X_train_filtered = X_train[list(relevant_features)]
X_test_filtered = X_test[list(relevant_features)]
cl = DecisionTreeClassifier()
cl.fit(X_train_filtered, y_train)
print(classification_report(y_test, cl.predict(X_test_filtered)))
Note: in a Windows development environment, a multiprocessing error is raised with the message "The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.", which leads to an infinite loop. The fix is to run the code under an "if __name__ == '__main__':" guard. See: https://github.com/blue-yonder/tsfresh/issues/185
Below are the model results with tsfresh feature engineering, without feature engineering, and with tsfresh feature engineering plus feature selection:


4. ×ܽá
×Ô¶¯»¯ÌØÕ÷¹¤³Ì½â¾öÁËÌØÕ÷¹¹ÔìµÄÎÊÌ⣬µ«Í¬Ê±Ò²²úÉúÁËÁíÒ»¸öÎÊÌ⣺ÔÚÊý¾ÝÁ¿Ò»¶¨µÄǰÌáÏ£¬ÓÉÓÚ²úÉú¹ý¶àµÄÌØÕ÷£¬ÍùÍùÐèÒª½øÐÐÏàÓ¦µÄÌØÕ÷Ñ¡ÔñÒÔ±ÜÃâÄ£ÐÍÐÔÄܵĽµµÍ¡£ÊÂʵÉÏ£¬Òª±£Ö¤Ä£ÐÍÐÔÄÜ£¬ÆäËùÐèµÄÊý¾ÝÁ¿¼¶ÐèÒªËæ×ÅÌØÕ÷µÄÊýÁ¿³ÊÖ¸Êý¼¶Ôö³¤¡£ |
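One rough way to see where the exponential growth comes from, under the simplifying assumption that each feature is split into a fixed number of bins and that every combination of bins should be covered by at least one sample:

```python
# Assumption for illustration: each feature is discretized into 10 bins.
bins_per_feature = 10


def cells(n_features: int) -> int:
    """Number of distinct bin combinations for n_features features."""
    return bins_per_feature ** n_features


# Each extra feature multiplies the number of combinations by bins_per_feature.
for n_features in (1, 2, 3, 6):
    print(n_features, cells(n_features))
```

With a fixed dataset, every generated feature spreads the same samples over more combinations, which is why pruning features after automated generation matters.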