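Abstract: Ever wanted to take a stroll inside the Python interpreter? Want to build a tracer for Python code execution but lack the background? Don't worry: this article gives you a first look at how Python works under the hood.
[Editor's note] The post below walks you through building a bytecode-level tracing API for observing some of Python's internal mechanisms, such as how opcodes like YIELD_VALUE and YIELD_FROM are implemented and how list comprehensions, generator expressions, and other interesting Python constructs are compiled.
About the translator: Zhao Bin, an engineer at OneAPM, has used Python and Perl scripting for years and works on DevOps and test development. Outside of work: an avid reader and MOOC enthusiast.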

The translation follows.
I have recently been studying Python's execution model. I am curious about some of Python's internals: how opcodes such as YIELD_VALUE and YIELD_FROM are implemented; how list comprehensions, generator expressions, and other interesting Python features are compiled; and what happens, at the bytecode level, when an exception is raised. Reading the CPython source certainly helps answer these questions, but I still felt something was missing for understanding how bytecode executes and how the stack evolves. GDB is a good option, but I am lazy and wanted to get there by writing some Python against a higher-level interface.
So my goal was to build a bytecode-level tracing API, similar to what sys.settrace provides but with finer granularity. It was also good practice for writing C code inside the CPython implementation. This article uses Python 3.5, and we need the following:
- A new opcode for the CPython interpreter
- A way to inject that opcode into Python bytecode
- Some Python code to handle the opcode
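For comparison, the stock sys.settrace hook reports call, line, and return events rather than individual opcodes, which is the granularity gap the rest of the article sets out to close. The following is a minimal sketch; trace_lines and sample are hypothetical names introduced only for this illustration.

```python
import sys

def trace_lines(func, *args):
    """Run func(*args) and record the (event, relative line) pairs seen by sys.settrace."""
    events = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only record events that belong to the function under observation.
        if frame.f_code is code:
            events.append((event, frame.f_lineno - code.co_firstlineno))
        # Returning the tracer keeps local tracing enabled for this frame.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever hook was installed before
    return result, events

def sample(x):
    y = x + 1
    return y * 2

result, events = trace_lines(sample, 3)
for event, line in events:
    print(event, line)
```

Each source line produces a single event no matter how many opcodes it compiles to, so the stack movements inside a line stay invisible at this level.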
A new CPython opcode
The new opcode: DEBUG_OP
This new opcode, DEBUG_OP, is my first attempt at writing C code for the CPython implementation, so I kept it as simple as possible. The goal is to have a way to call some Python code whenever the opcode executes. We also want to retrieve some data about the execution context, which the opcode passes to our callback as arguments. The useful information the opcode can reach is:
- The contents of the stack
- The frame object that executed DEBUG_OP
So our opcode has to:
- Find the callback
- Create a list containing the contents of the stack
- Call the callback, passing that list and the current frame as arguments
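Before the C code, the three steps above can be modelled in plain Python. This is only a sketch of the intended behaviour, not the real opcode: debug_op, the explicit value_stack argument, and demo are hypothetical stand-ins for state that lives inside the interpreter loop.

```python
import inspect

def debug_op(frame_globals, value_stack, frame):
    """Python model of the three steps; the real opcode does this in C."""
    # 1. Find the callback by its hard-coded name in the frame's globals.
    target = frame_globals.get("op_target")
    if target is None:
        return None  # no callback installed: behave like a no-op
    # 2. Copy the value stack into a fresh list so the callback gets its own object.
    snapshot = list(value_stack)
    # 3. Call the callback with the snapshot and the current frame.
    return target(snapshot, frame)

calls = []

def op_target(stack, frame):
    calls.append((stack, frame.f_code.co_name))

def demo():
    # Pretend the interpreter currently holds two values on its stack.
    debug_op({"op_target": op_target}, [10, range], inspect.currentframe())

demo()
print(calls)
```

Handing the callback a copy rather than the live stack keeps the traced function's own state out of the callback's reach.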
Sounds simple, so let's get started! Disclaimer: all the explanations and code below are conclusions drawn after a great deal of segfault debugging. The first thing to do is give the opcode a name and a value, which means adding a line to Include/opcode.h.
    /** My own comments begin by '**' **/
    /** From: Include/opcode.h **/

    /* Instruction opcodes for compiled code */

    /** We just have to define our opcode with a free value
        0 was the first one I found **/
    #define DEBUG_OP                0

    #define POP_TOP                 1
    #define ROT_TWO                 2
    #define ROT_THREE               3
That part is done. Now we write the code that makes the opcode actually do its job.
Implementing DEBUG_OP
Before thinking about how to implement DEBUG_OP, we need to decide what its interface will look like. Having a new opcode that can call other code is pretty cool, but which code will it call, and how does the opcode find the callback? I chose the simplest approach: a hard-coded function name looked up in the frame's globals. The question then becomes: how do I look up a fixed C string in a dict? To answer it, let's look at the identifiers __enter__ and __exit__, which the Python main loop uses for context management.
We can see both identifiers used in the SETUP_WITH opcode:
    /** From: Python/ceval.c **/
    TARGET(SETUP_WITH) {
        _Py_IDENTIFIER(__exit__);
        _Py_IDENTIFIER(__enter__);
        PyObject *mgr = TOP();
        PyObject *exit = special_lookup(mgr, &PyId___exit__), *enter;
        PyObject *res;
        /** ... **/
Now, take a look at the definition of the _Py_IDENTIFIER macro:
    /** From: Include/object.h **/

    /********************* String Literals ********************************/
    /* This structure helps managing static strings. The basic usage goes like this:
       Instead of doing

           r = PyObject_CallMethod(o, "foo", "args", ...);

       do

           _Py_IDENTIFIER(foo);
           ...
           r = _PyObject_CallMethodId(o, &PyId_foo, "args", ...);

       PyId_foo is a static variable, either on block level or file level. On first
       usage, the string "foo" is interned, and the structures are linked. On interpreter
       shutdown, all strings are released (through _PyUnicode_ClearStaticStrings).

       Alternatively, _Py_static_string allows to choose the variable name.
       _PyUnicode_FromId returns a borrowed reference to the interned string.
       _PyObject_{Get,Set,Has}AttrId are __getattr__ versions using _Py_Identifier*.
    */
    typedef struct _Py_Identifier {
        struct _Py_Identifier *next;
        const char* string;
        PyObject *object;
    } _Py_Identifier;

    #define _Py_static_string_init(value) { 0, value, 0 }
    #define _Py_static_string(varname, value)  static _Py_Identifier varname = _Py_static_string_init(value)
    #define _Py_IDENTIFIER(varname) _Py_static_string(PyId_##varname, #varname)
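The interning that the header comment describes has a rough Python-level counterpart in sys.intern. The sketch below is an analogy only: it does not touch _Py_Identifier, and fake_globals is a made-up dictionary standing in for a frame's globals.

```python
import sys

# Build the same text twice at run time so the two objects need not be shared.
first = "".join(["op_", "target"])
second = "".join(["op", "_target"])
print(first == second, first is second)

# sys.intern returns one canonical object per distinct string value.
key_a = sys.intern(first)
key_b = sys.intern(second)
print(key_a is key_b)

# Looking up a dict entry with the canonical key is, roughly, what a
# lookup through a static identifier amounts to.
fake_globals = {key_a: "callback placeholder"}
print(fake_globals.get(key_b))
```

Because the string object is created once and then reused, the interpreter does not have to rebuild it from a C string every time the opcode runs.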
Well, the comment in Include/object.h explains it clearly. After a bit of searching, we find the function for looking up a fixed string in a dict, _PyDict_GetItemId, so the lookup part of our opcode looks like this:
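    /** Our callback function will be named op_target **/
    PyObject *target = NULL;
    _Py_IDENTIFIER(op_target);
    target = _PyDict_GetItemId(f->f_globals, &PyId_op_target);
    if (target == NULL) {
        /** _PyDict_GetItemId returns NULL without raising when the key is
            missing, so only inspect the error state if one is pending **/
        if (_PyErr_OCCURRED()) {
            if (!PyErr_ExceptionMatches(PyExc_KeyError))
                goto error;
            PyErr_Clear();
        }
        /** No callback installed: go on with the next opcode **/
        DISPATCH();
    }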
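To make it easier to follow, here are some notes on this code:
- f is the current frame and f->f_globals is its globals dict
- If op_target was not found and an exception is pending, we check whether that exception is a KeyError
- goto error; is how the main loop raises an exception
- PyErr_Clear() suppresses the pending exception, and DISPATCH() starts execution of the next opcode
The next step is to collect the stack information we want.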
    /** This code creates a list with all the values on the current stack **/
    PyObject *value = PyList_New(0);
    for (i = 1 ; i <= STACK_LEVEL(); i++) {
        tmp = PEEK(i);
        if (tmp == NULL) {
            tmp = Py_None;
        }
        PyList_Append(value, tmp);
    }
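Since PEEK(1) is the top of the stack, the loop above yields a list ordered top-first, which is why the traces later in the article list the most recently pushed value first. A small Python model of that loop, assuming a bottom-to-top list for the stack and a hypothetical NULL sentinel in place of a C NULL pointer:

```python
NULL = object()  # hypothetical stand-in for a C NULL pointer

def snapshot_stack(value_stack):
    """Mimic the PEEK(1)..PEEK(STACK_LEVEL()) loop on a bottom-to-top list."""
    snapshot = []
    for i in range(1, len(value_stack) + 1):
        item = value_stack[-i]                  # PEEK(i): i-th element from the top
        snapshot.append(None if item is NULL else item)
    return snapshot

# `range` was pushed first, then 10: the snapshot lists 10 before range.
print(snapshot_stack([range, 10]))
```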
The last step is to call our callback! We use call_function for that, and we learn how to use it by studying the implementation of the CALL_FUNCTION opcode.
    /** From: Python/ceval.c **/
    TARGET(CALL_FUNCTION) {
        PyObject **sp, *res;
        /** stack_pointer is a local of the main loop.
            It's the pointer to the stacktop of our frame **/
        sp = stack_pointer;
        res = call_function(&sp, oparg);
        /** call_function handles the args it consumed on the stack for us **/
        stack_pointer = sp;
        PUSH(res);
        /** Standard exception handling **/
        if (res == NULL)
            goto error;
        DISPATCH();
    }
With all this information, we can finally put together a draft of the DEBUG_OP opcode:
    TARGET(DEBUG_OP) {
        PyObject *value = NULL;
        PyObject *target = NULL;
        PyObject *res = NULL;
        PyObject **sp = NULL;
        PyObject *tmp;
        int i;
        _Py_IDENTIFIER(op_target);

        /** Find the callback in the frame globals **/
        target = _PyDict_GetItemId(f->f_globals, &PyId_op_target);
        if (target == NULL) {
            /** A missing key normally yields NULL without a pending error **/
            if (_PyErr_OCCURRED()) {
                if (!PyErr_ExceptionMatches(PyExc_KeyError))
                    goto error;
                PyErr_Clear();
            }
            DISPATCH();
        }
        /** Build the list holding the stack content **/
        value = PyList_New(0);
        if (value == NULL)
            goto error;
        Py_INCREF(target);
        for (i = 1 ; i <= STACK_LEVEL(); i++) {
            tmp = PEEK(i);
            if (tmp == NULL)
                tmp = Py_None;
            PyList_Append(value, tmp);
        }

        /** Push the callback and its two arguments, then call it **/
        PUSH(target);
        PUSH(value);
        Py_INCREF(f);
        PUSH(f);
        sp = stack_pointer;
        res = call_function(&sp, 2);
        stack_pointer = sp;
        if (res == NULL)
            goto error;
        Py_DECREF(res);
        DISPATCH();
    }
I really don't have much experience writing C code for the CPython implementation, so I may have missed some details. If you have suggestions or corrections, I look forward to your feedback.
Compile it, and it works!
Everything looks fine, but when we try to use our DEBUG_OP opcode, it fails. Since 2008, Python has used computed gotos (the original post links to more information about them). So we also need to update the goto jump table, in Python/opcode_targets.h:
    /** From: Python/opcode_targets.h **/
    /** Easy change since DEBUG_OP is the opcode number 0 **/
    static void *opcode_targets[256] = {
        //&&_unknown_opcode,
        &&TARGET_DEBUG_OP,
        &&TARGET_POP_TOP,
        /** ... **/
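The jump table is an array indexed by opcode number, so an opcode that is defined but missing from the table still lands on the unknown-opcode handler. A toy Python analogue of that dispatch scheme follows; it is a sketch, with all handler names and the "push" opcode number invented for the illustration, and nothing in it is CPython code.

```python
def unknown_opcode(state):
    raise ValueError("unknown opcode")

def op_debug(state):
    state["log"].append(list(state["stack"]))   # record a copy of the stack

def op_pop_top(state):
    state["stack"].pop()

def op_push_one(state):
    state["stack"].append(1)

# One slot per possible opcode byte, every slot defaulting to the error handler.
HANDLERS = [unknown_opcode] * 256
HANDLERS[0] = op_debug      # slot 0 is the one DEBUG_OP claims in the article
HANDLERS[1] = op_pop_top    # POP_TOP keeps its usual number
HANDLERS[2] = op_push_one   # made-up opcode, only for this toy machine

def run(bytecode):
    state = {"stack": [], "log": []}
    for opcode in bytecode:
        HANDLERS[opcode](state)   # indexed jump instead of a chain of comparisons
    return state

state = run(bytes([0, 2, 0, 1, 0]))
print(state["log"])
```

Leaving slot 0 on the default handler reproduces the failure described above: the opcode exists in the header but the table does not route to it.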
That's it: we now have a working new opcode. The only problem is that, although it exists, nothing ever calls it. Next, we inject DEBUG_OP into the bytecode of functions.
Injecting the DEBUG_OP opcode into Python bytecode
There are many ways to inject a new opcode into Python bytecode:
- Use the peephole optimizer, as Quarkslab did
- Tamper with the code that generates bytecode
- Modify a function's bytecode directly at run time (this is what we will do)
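The pile of C code above is all it takes to create the new opcode. Now let's get back to Python, strange and sometimes magical, and start understanding it!
Here is what we are going to do:
- Get the code object of the function we want to trace
- Rewrite the bytecode to inject DEBUG_OP
- Put the newly generated code object back in place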
A few notes on code objects
If you have never heard of code objects, the original post links to a short introduction; there is also related documentation online, where you can simply Ctrl+F for "code object".
One more thing to note: in the environment used for this article, code objects are immutable:
    Python 3.4.2 (default, Oct  8 2014, 10:45:20)
    [GCC 4.9.1] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = lambda y : 2
    >>> x.__code__
    <code object <lambda> at 0x7f481fd88390, file "<stdin>", line 1>
    >>> x.__code__.co_name
    '<lambda>'
    >>> x.__code__.co_name = 'truc'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: readonly attribute
    >>> x.__code__.co_consts = ('truc',)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: readonly attribute
But don't worry, we will find a way around this.
The tools we use
To modify bytecode we need a few tools:
- The dis module, for disassembling and analyzing bytecode
- dis.Bytecode, a feature new in Python 3.4 that is especially useful for disassembling and analyzing bytecode
- A way to modify a code object easily
Disassembling a code object with dis.Bytecode tells us about each opcode, its argument, and its context.
    # Python3.4
    >>> import dis
    >>> f = lambda x: x + 3
    >>> for i in dis.Bytecode(f.__code__): print (i)
    ...
    Instruction(opname='LOAD_FAST', opcode=124, arg=0, argval='x', argrepr='x', offset=0, starts_line=1, is_jump_target=False)
    Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval=3, argrepr='3', offset=3, starts_line=None, is_jump_target=False)
    Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=6, starts_line=None, is_jump_target=False)
    Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=7, starts_line=None, is_jump_target=False)
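The same iteration works on any recent CPython, although the opcode numbers, names, and offsets depend on the interpreter version; from 3.6 onward every instruction occupies two bytes, so the offsets will not match the three-byte layout of the 3.4 listing. A small version-agnostic sketch, where describe and add_three are hypothetical names used only here:

```python
import dis

def describe(func):
    """Return (offset, opname, opcode) for each instruction of func."""
    return [(ins.offset, ins.opname, ins.opcode) for ins in dis.Bytecode(func)]

add_three = lambda x: x + 3
rows = describe(add_three)
for offset, opname, opcode in rows:
    print(offset, opname, opcode)
```

The offsets collected this way are exactly what the injection code needs: they mark where each instruction starts, so nothing gets inserted in the middle of an argument.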
To modify a code object, I wrote a small class that copies a code object, lets us change the values we need, and then generates a new code object.
    class MutableCodeObject(object):
        # Constructor arguments of a code object, in order (Python 3.4/3.5)
        args_name = ("co_argcount", "co_kwonlyargcount", "co_nlocals", "co_stacksize", "co_flags", "co_code",
                     "co_consts", "co_names", "co_varnames", "co_filename", "co_name", "co_firstlineno",
                     "co_lnotab", "co_freevars", "co_cellvars")

        def __init__(self, initial_code):
            self.initial_code = initial_code
            for attr_name in self.args_name:
                attr = getattr(self.initial_code, attr_name)
                # Tuples become lists so they can be edited in place
                if isinstance(attr, tuple):
                    attr = list(attr)
                setattr(self, attr_name, attr)

        def get_code(self):
            args = []
            for attr_name in self.args_name:
                attr = getattr(self, attr_name)
                # Convert the lists back to tuples for the constructor
                if isinstance(attr, list):
                    attr = tuple(attr)
                args.append(attr)
            return self.initial_code.__class__(*args)
This class is convenient to use and solves the immutability problem described above.
    >>> x = lambda y : 2
    >>> m = MutableCodeObject(x.__code__)
    >>> m
    <new_code.MutableCodeObject object at 0x7f3f0ea546a0>
    >>> m.co_consts
    [None, 2]
    >>> m.co_consts[1] = '3'
    >>> m.co_name = 'truc'
    >>> m.get_code()
    <code object truc at 0x7f3f0ea2bc90, file "<stdin>", line 1>
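The code object constructor has changed its argument list several times since Python 3.5, so the args_name tuple above only fits the versions the article targets. On Python 3.8 and later, code objects provide a replace() method that serves the same copy-and-modify purpose. The sketch below uses it; rename_and_patch and greet are hypothetical names, and this is an alternative to the article's class rather than the author's method.

```python
def rename_and_patch(func, new_name, old_const, new_const):
    """Return a copy of func's code object with its name and one constant swapped."""
    code = func.__code__
    consts = tuple(new_const if c == old_const else c for c in code.co_consts)
    # replace() builds a new code object; the original stays untouched.
    return code.replace(co_name=new_name, co_consts=consts)

greet = lambda y: "two"
try:
    greet.__code__.co_name = "truc"      # code objects reject attribute assignment
except AttributeError as exc:
    print("read-only:", exc)

# The function's __code__ attribute itself is writable, so the copy can be swapped in.
greet.__code__ = rename_and_patch(greet, "truc", "two", "three")
print(greet.__code__.co_name, greet(None))
```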
Testing our new opcode
We now have everything we need to inject DEBUG_OP, so let's check that our implementation works. We inject the opcode into the simplest possible function:
    from new_code import MutableCodeObject

    def op_target(*args):
        print("WOOT")
        print("op_target called with args <{0}>".format(args))

    def nop():
        pass

    new_nop_code = MutableCodeObject(nop.__code__)
    # DEBUG_OP + LOAD_CONST (3 bytes) + DEBUG_OP + RETURN_VALUE
    new_nop_code.co_code = b"\x00" + new_nop_code.co_code[0:3] + b"\x00" + new_nop_code.co_code[-1:]
    new_nop_code.co_stacksize += 3
    nop.__code__ = new_nop_code.get_code()

    import dis
    dis.dis(nop)
    nop()


    # Don't forget that ./python is our custom Python implementing DEBUG_OP
    hakril@computer ~/python/CPython3.5 % ./python proof.py
      8           0 <0>
                  1 LOAD_CONST               0 (None)
                  4 <0>
                  5 RETURN_VALUE
    WOOT
    op_target called with args <([], <frame object at 0x7fde9eaebdb0>)>
    WOOT
    op_target called with args <([None], <frame object at 0x7fde9eaebdb0>)>
It looks like it works! One line deserves an explanation: new_nop_code.co_stacksize += 3
- co_stacksize is the stack size the code object requires
- DEBUG_OP pushes three items onto the stack, so we need to reserve room for them
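The declared stack depth can be inspected and enlarged on a stock interpreter too. A minimal sketch, assuming Python 3.8 or later for code.replace(); with_extra_stack is a hypothetical helper, and no DEBUG_OP is involved here, so the extra room is simply unused.

```python
def with_extra_stack(func, extra):
    """Return a copy of func's code object whose declared stack depth is enlarged."""
    code = func.__code__
    return code.replace(co_stacksize=code.co_stacksize + extra)

def nop():
    pass

before = nop.__code__.co_stacksize
nop.__code__ = with_extra_stack(nop, 3)   # room for the callback, the stack list, and the frame
print(before, nop.__code__.co_stacksize)
print(nop())
```

Over-declaring the depth is harmless; under-declaring it is what leads to memory corruption once an opcode pushes more than the frame allocated.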
Now we can inject our opcode into any Python function!
Rewriting bytecode
As the example above shows, rewriting Python bytecode looks easy. To inject our opcode between every pair of opcodes, we need the offset of each opcode and then insert ours at those positions (inserting it in the middle of an argument would be very bad). The offsets are easy to get with dis.Bytecode, like this:
    import dis

    def add_debug_op_everywhere(code_obj):
        # We get every instruction offset in the code object
        offsets = [instr.offset for instr in dis.Bytecode(code_obj)]
        # And insert a DEBUG_OP at every offset
        return insert_op_debug_list(code_obj, offsets)

    def insert_op_debug_list(code, offsets):
        # We insert the DEBUG_OP one by one
        for nb, off in enumerate(sorted(offsets)):
            # Need to adjust the offsets by the number of opcodes already inserted before
            # That's why we sort our offsets!
            code = insert_op_debug(code, off + nb)
        return code

    # Last problem: what does insert_op_debug look like?
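The off + nb adjustment is easiest to see on a raw byte string. The sketch below performs only the naive splice, with no jump handling, on the Python 3.4 bytecode of lambda x: x + 3 listed earlier; insert_at_offsets is a hypothetical helper, not part of the article's code.

```python
DEBUG_OP = b"\x00"  # the opcode value chosen earlier in the article

def insert_at_offsets(codestring, offsets, marker=DEBUG_OP):
    """Insert marker before every offset, compensating for earlier insertions."""
    result = bytearray(codestring)
    for already_inserted, offset in enumerate(sorted(offsets)):
        # Every marker inserted so far shifted the remaining bytes right by one.
        position = offset + already_inserted
        result[position:position] = marker
    return bytes(result)

# LOAD_FAST 0, LOAD_CONST 1, BINARY_ADD, RETURN_VALUE in the 3.4 encoding
original = bytes([124, 0, 0, 100, 1, 0, 23, 83])
patched = insert_at_offsets(original, [0, 3, 6, 7])
print(patched.hex())
```

Without the compensation, the second marker would land inside the argument of LOAD_FAST instead of in front of LOAD_CONST.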
Based on the example above, you might think insert_op_debug only needs to add a "\x00" at the given offset. That's a trap! The function in our first DEBUG_OP injection example had no branches; for insert_op_debug to work properly, we have to handle branching opcodes.
Python has two kinds of branches:
- Absolute branches, which behave like Instruction_Pointer = argument(instruction)
- Relative branches, which behave like Instruction_Pointer += argument(instruction)
We want these branches to keep working after we insert our opcode, so some instruction arguments have to be adjusted. The logic is as follows:
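- For each relative branch located before the insertion offset:
  - If the target address is strictly greater than the insertion offset, add 1 to the instruction argument
  - If it is equal, no increment is needed: the jump then lands on our DEBUG_OP, which executes between the jump and the original target
  - If it is smaller, inserting our opcode does not change the distance between the jump and its target
- For each absolute branch in the code object:
  - If the target address is strictly greater than the insertion offset, add 1 to the instruction argument
  - If it is equal, nothing needs to change, for the same reason as for relative branches
  - If it is smaller, inserting our opcode does not change the distance between the jump and its target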
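These rules reduce to a small pure function, which makes them easy to check in isolation. The sketch below assumes the Python 3.5 layout in which a jump instruction is three bytes long; adjust_jump_arg and its parameter names are hypothetical.

```python
def adjust_jump_arg(kind, jump_offset, arg, insert_offset, instr_size=3):
    """Return the jump argument to use after one byte is inserted at insert_offset.

    kind is "rel" or "abs"; instr_size is the size of the jump instruction itself.
    """
    if kind == "rel":
        # Relative jumps count from the instruction that follows the jump.
        target = jump_offset + instr_size + arg
        # Only a jump located before the insertion point can have the new byte
        # between itself and its target.
        if jump_offset < insert_offset < target:
            return arg + 1
        return arg
    if kind == "abs":
        # The argument is the target; shift it only if it lies past the insertion point.
        return arg + 1 if arg > insert_offset else arg
    raise ValueError("kind must be 'rel' or 'abs'")

# SETUP_LOOP at offset 0 with argument 36 targets offset 39.
print(adjust_jump_arg("rel", 0, 36, 3))     # insertion between jump and target
print(adjust_jump_arg("rel", 0, 36, 39))    # insertion exactly at the target
print(adjust_jump_arg("abs", 28, 13, 3))    # insertion before the absolute target
```

Applied once per inserted byte, the strict comparison is what makes every jump land on the DEBUG_OP placed in front of its original target.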
Here is the implementation:
    import dis
    import struct

    # Helper
    def bytecode_to_string(bytecode):
        if bytecode.arg is not None:
            # Opcode byte followed by an unsigned 16-bit little-endian argument
            return struct.pack("<BH", bytecode.opcode, bytecode.arg)
        return struct.pack("<B", bytecode.opcode)

    # Dummy class for bytecode_to_string
    class DummyInstr:
        def __init__(self, opcode, arg):
            self.opcode = opcode
            self.arg = arg

    def insert_op_debug(code, offset):
        # Python 3.5 jump opcodes (dis.hasjrel and dis.hasjabs list their numbers)
        opcode_jump_rel = ['FOR_ITER', 'JUMP_FORWARD', 'SETUP_LOOP', 'SETUP_WITH', 'SETUP_ASYNC_WITH',
                           'SETUP_EXCEPT', 'SETUP_FINALLY']
        opcode_jump_abs = ['POP_JUMP_IF_TRUE', 'POP_JUMP_IF_FALSE', 'JUMP_ABSOLUTE',
                           'JUMP_IF_TRUE_OR_POP', 'JUMP_IF_FALSE_OR_POP', 'CONTINUE_LOOP']
        res_codestring = b""
        inserted = False
        for instr in dis.Bytecode(code):
            if instr.offset == offset:
                res_codestring += b"\x00"
                inserted = True
            if instr.opname in opcode_jump_rel and not inserted:  # relative jumps are always forward
                if offset < instr.offset + 3 + instr.arg:  # inserted between jump and dest: add 1 to dest (3 for size)
                    # If equal: jump on DEBUG_OP to get info before exec instr
                    res_codestring += bytecode_to_string(DummyInstr(instr.opcode, instr.arg + 1))
                    continue
            if instr.opname in opcode_jump_abs:
                if instr.arg > offset:
                    res_codestring += bytecode_to_string(DummyInstr(instr.opcode, instr.arg + 1))
                    continue
            res_codestring += bytecode_to_string(instr)
        # replace_bytecode just replaces the original code co_code
        return replace_bytecode(code, res_codestring)
Let's see how it works:
    >>> def lol(x):
    ...     for i in range(10):
    ...         if x == i:
    ...             break

    >>> dis.dis(lol)
    101           0 SETUP_LOOP              36 (to 39)
                  3 LOAD_GLOBAL              0 (range)
                  6 LOAD_CONST               1 (10)
                  9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 12 GET_ITER
            >>   13 FOR_ITER                22 (to 38)
                 16 STORE_FAST               1 (i)

    102          19 LOAD_FAST                0 (x)
                 22 LOAD_FAST                1 (i)
                 25 COMPARE_OP               2 (==)
                 28 POP_JUMP_IF_FALSE       13

    103          31 BREAK_LOOP
                 32 JUMP_ABSOLUTE           13
                 35 JUMP_ABSOLUTE           13
            >>   38 POP_BLOCK
            >>   39 LOAD_CONST               0 (None)
                 42 RETURN_VALUE
    >>> lol.__code__ = transform_code(lol.__code__, add_debug_op_everywhere, add_stacksize=3)

    >>> dis.dis(lol)
    101           0 <0>
                  1 SETUP_LOOP              50 (to 54)
                  4 <0>
                  5 LOAD_GLOBAL              0 (range)
                  8 <0>
                  9 LOAD_CONST               1 (10)
                 12 <0>
                 13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 16 <0>
                 17 GET_ITER
            >>   18 <0>

    102          19 FOR_ITER                30 (to 52)
                 22 <0>
                 23 STORE_FAST               1 (i)
                 26 <0>
                 27 LOAD_FAST                0 (x)
                 30 <0>

    103          31 LOAD_FAST                1 (i)
                 34 <0>
                 35 COMPARE_OP               2 (==)
                 38 <0>
                 39 POP_JUMP_IF_FALSE       18
                 42 <0>
                 43 BREAK_LOOP
                 44 <0>
                 45 JUMP_ABSOLUTE           18
                 48 <0>
                 49 JUMP_ABSOLUTE           18
            >>   52 <0>
                 53 POP_BLOCK
            >>   54 <0>
                 55 LOAD_CONST               0 (None)
                 58 <0>
                 59 RETURN_VALUE

    # Setup the simplest handler EVER
    >>> def op_target(stack, frame):
    ...     print (stack)

    # GO
    >>> lol(2)
    []
    []
    [<class 'range'>]
    [10, <class 'range'>]
    [range(0, 10)]
    [<range_iterator object at 0x7f1349afab80>]
    [0, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    [2, <range_iterator object at 0x7f1349afab80>]
    [0, 2, <range_iterator object at 0x7f1349afab80>]
    [False, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    [1, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    [2, <range_iterator object at 0x7f1349afab80>]
    [1, 2, <range_iterator object at 0x7f1349afab80>]
    [False, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    [2, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    [2, <range_iterator object at 0x7f1349afab80>]
    [2, 2, <range_iterator object at 0x7f1349afab80>]
    [True, <range_iterator object at 0x7f1349afab80>]
    [<range_iterator object at 0x7f1349afab80>]
    []
    [None]
Great! We now know how to get the stack contents and the frame for every operation Python executes. The raw output above is not very practical as it stands, so in this last part we wrap the injection in something nicer.
Adding a Python wrapper
As you can see, the low-level interface works fine. The last thing to do is make op_target more convenient to use (this part is somewhat sketchier, since to my mind it is not the most interesting part of the project).
First, let's look at the information the frame argument gives us:
- f_code: the code object the current frame is executing
- f_lasti: the current operation (an index into the code object's bytecode string)
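Both attributes can be read from ordinary Python on a stock interpreter. The sketch below maps f_lasti back to a disassembled instruction; it is not the article's FrameAnalyser, and current_instruction and probe are hypothetical names. It picks the last instruction that starts at or before f_lasti, on the assumption that f_lasti never points before the instruction being executed.

```python
import dis
import inspect

def current_instruction(frame):
    """Return the dis.Instruction located at, or just before, frame.f_lasti."""
    current = None
    for ins in dis.Bytecode(frame.f_code):
        if ins.offset > frame.f_lasti:
            break
        current = ins       # keep the last instruction starting at or before f_lasti
    return current

def probe():
    frame = inspect.currentframe()
    return frame.f_code.co_name, frame.f_lasti, current_instruction(frame)

name, lasti, ins = probe()
print(name, lasti, ins.opname)
```

Finding the instruction that follows this one is then a matter of taking the next element from the same dis.Bytecode iteration.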
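From these we can work out which opcode will execute right after DEBUG_OP, which is very useful for aggregating and displaying data.
We create a class for tracing a function's internals. It:
- Changes the function's own co_code
- Installs a callback as the target function of op_debug
Once we know the next operation, we can analyze it and modify its arguments. For example, we can add an auto-follow-called-functions feature: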
    # transform_code and FrameAnalyser come from the full project and are not shown here
    ADD_STACK = 3  # DEBUG_OP pushes three extra items

    def op_target(l, f, exc=None):
        if op_target.callback is not None:
            op_target.callback(l, f, exc)

    op_target.callback = None  # no tracer is active until Trace.call installs one

    class Trace:
        def __init__(self, func):
            self.func = func

        def call(self, *args, **kwargs):
            self.add_func_to_trace(self.func)
            # Activate Trace callback for the func call
            op_target.callback = self.callback
            try:
                res = self.func(*args, **kwargs)
            except Exception as e:
                # The exception is returned as the result instead of propagating
                res = e
            op_target.callback = None
            return res

        def add_func_to_trace(self, f):
            # Is it code? is it already transformed?
            if not hasattr(f, "op_debug") and hasattr(f, "__code__"):
                f.__code__ = transform_code(f.__code__, transform=add_debug_op_everywhere, add_stacksize=ADD_STACK)
                f.__globals__['op_target'] = op_target
                f.op_debug = True

        def do_auto_follow(self, stack, frame):
            # Nothing fancy: FrameAnalyser is just the wrapper that gives the next executed instruction
            next_instr = FrameAnalyser(frame).next_instr()
            if "CALL" in next_instr.opname:
                arg = next_instr.arg
                f_index = (arg & 0xff) + (2 * (arg >> 8))
                called_func = stack[f_index]
                # If call target is not traced yet: do it
                if not hasattr(called_func, "op_debug"):
                    self.add_func_to_trace(called_func)
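The f_index arithmetic in do_auto_follow relies on the Python 3.5 encoding of CALL_FUNCTION, where the low byte of the argument counts positional arguments and the high byte counts keyword arguments (later versions encode calls differently). A short sketch of that computation on a top-first stack snapshot, with callee_stack_index as a hypothetical name:

```python
def callee_stack_index(oparg):
    """Index of the callable in a top-first stack snapshot (Python 3.5 CALL_FUNCTION)."""
    positional = oparg & 0xFF        # low byte: number of positional arguments
    keyword_pairs = oparg >> 8       # high byte: number of keyword arguments
    # Each keyword argument occupies two slots (name and value) above the callable.
    return positional + 2 * keyword_pairs

# range(10): one positional argument, snapshot taken just before CALL_FUNCTION 1
snapshot = [10, range]
print(snapshot[callee_stack_index(1)])
```

The callable sits below all of its arguments, so skipping the argument slots from the top of the snapshot lands on it.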
Now we implement a subclass of Trace that adds two methods, callback and do_report. callback is invoked at every instruction; do_report prints the information we collected.
Here is a dummy function tracer implementation:
    import collections

    class DummyTrace(Trace):
        def __init__(self, func):
            self.func = func
            self.data = collections.OrderedDict()
            self.last_frame = None
            self.known_frame = []
            self.report = []

        def callback(self, stack, frame, exc):
            # First time we see this frame: a new function was entered
            if frame not in self.known_frame:
                self.known_frame.append(frame)
                self.report.append(" === Entering New Frame {0} ({1}) ===".format(frame.f_code.co_name, id(frame)))
                self.last_frame = frame
            # Known frame different from the previous one: a call returned
            if frame != self.last_frame:
                self.report.append(" === Returning to Frame {0} {1}===".format(frame.f_code.co_name, id(frame)))
                self.last_frame = frame

            self.report.append(str(stack))
            instr = FrameAnalyser(frame).next_instr()
            offset = str(instr.offset).rjust(8)
            opname = str(instr.opname).ljust(20)
            arg = str(instr.arg).ljust(10)
            self.report.append("{0}  {1} {2} {3}".format(offset, opname, arg, instr.argval))
            self.do_auto_follow(stack, frame)

        def do_report(self):
            print("\n".join(self.report))
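Here are some examples of the implementation and how it is used. The formatting is not very easy to read; I am not good at producing user-friendly reports.
- Example 1: automatic tracing of the stack and the executed instructions
- Example 2: context management
- A tracing example for list comprehensions
- Example 3: output of the dummy tracer
- Example 4: output of the collected stack information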
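Conclusion
This small project is a good way to learn about Python's low-level internals: the interpreter's main loop, writing C code for the Python implementation, and Python bytecode. With this little tool we can watch the bytecode behaviour of some interesting Python constructs, such as generators, context managers, and list comprehensions.
The complete code for this small project is linked from the original post. Going further, we could also modify the stack of the function being traced. I am not sure that would be useful, but it would certainly be fun.
Original article: Understanding Python execution from inside: A Python assembly tracer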