Get stacktrace from stuck python process that does not accept signals

I have to run a legacy Zope2 website and have some grievances with it. The biggest issue is that, occasionally, it just locks up, running at 100% CPU load and no longer answering requests. While the problem isn't reproducible on a regular basis, one page containing 3 dynamic graphs triggers it sometimes, so I suspect some kind of race condition that leads to an endless loop or a stuck busy-wait.

The problem is, I have not yet found a way to debug this thing. There's nothing in the Zope logs and nothing in the system logs. I tried the suggestions from this question to get a stacktrace, but the only signal that has any effect is SIGKILL.

Is there another possibility to find out where exactly the process is when it gets stuck?

Best answer

If the process is stuck in a way that no other signal gets through, you might want to consider running it from a debugger, instead of trying to attach to it at runtime.

Also, it might be useful to try other debugging tactics, like turning off certain parts of the code to find the minimal case in which the problem is still reproducible, so that the cause is easier to see.

Answers

You can print out a nice stack trace using pyrasite.

First, you'll need to have gdb installed.

# Redhat, CentOS, etc
$ yum install gdb

# Ubuntu, Debian, etc
$ apt-get update && apt-get install gdb

Then, install pyrasite.

$ pip install pyrasite

Use ps or some other method to find the process ID for the stuck python process and run pyrasite-shell with it.

# Assuming process ID is 12345
$ pyrasite-shell 12345

You should now see a python REPL. Run the following in the REPL to see stack traces for all threads.

import sys, traceback
for thread_id, frame in sys._current_frames().items():
    print("Stack for thread {}".format(thread_id))
    traceback.print_stack(frame)
    print("")
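A slightly extended variant of the loop above, as a sketch: it labels each stack with the thread's name (looked up through threading.enumerate()) and collects everything into one string, so it can be printed or written to a file. The helper name format_all_stacks is made up for this example and is not part of pyrasite.

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return one string with the current stack of every thread."""
    # Map thread identifiers to names so the output is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        chunks.append("Stack for thread {} ({})".format(thread_id, name))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


print(format_all_stacks())
```

Threads that were not started through the threading module show up as "unknown" here, since only threading knows their names.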

See my answer to this SO question, use Products.signalstack. It registers the same handler as the answer you already found, at Product registration time. Perhaps it works better for you.

If not, you probably have an OS-level I/O problem on your hands, and your only hope is attaching gdb to the process. Search Stack Overflow for gdb answers; there is a wealth of information here!
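For reference, a handler of the kind mentioned above could look roughly like the sketch below. This is not the actual Products.signalstack code; the function name dump_stacks and the choice of SIGUSR1 are assumptions made for illustration (Unix only).

```python
import signal
import sys
import traceback


def dump_stacks(signum, frame):
    """Write the stack of every thread to stderr."""
    for thread_id, thread_frame in sys._current_frames().items():
        sys.stderr.write("Thread {}:\n".format(thread_id))
        traceback.print_stack(thread_frame, file=sys.stderr)


# SIGUSR1 is an arbitrary choice for this sketch; it must be registered
# from the main thread, before the process gets stuck.
signal.signal(signal.SIGUSR1, dump_stacks)
```

With this in place, kill -USR1 <PID> should print the stacks. Python-level handlers only run in the main thread between bytecode instructions, so if the main thread is stuck inside C code the handler never fires, which would match the symptom that only SIGKILL has any effect.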

While pyrasite might work, it does not handle some corner cases and can hang or fail silently.

If the package does not work as expected, it's possible to do manually what the package does under the hood, to figure out what went wrong.

  • Attach gdb to the Python process: gdb -p <PID> (may need sudo.)
  • Call the following functions by typing these commands into gdb:
set $gstate = PyGILState_Ensure()
call          PyRun_SimpleString(" <some Python code> ")
call          PyGILState_Release($gstate)

See the Python C API documentation for these functions (PyGILState_Ensure, PyGILState_Release and PyRun_SimpleString).


In case Python is not compiled with debug symbols, it's necessary to provide explicit data types for the functions.

According to the Python source code ( https://github.com/python/cpython/blob/4fe5585240f64c3d14eb635ff82b163f92074b3a/Include/pystate.h#L86-L88 ), the type PyGILState_STATE is an enum with 2 values, so we "guess" that we can use int (although it may not work).

In conclusion, according to the documentation, the "correct (subject to the restriction above)" commands for the functions are

set $gstate = ((int (*)())            PyGILState_Ensure ) ()
call          ((int (*)(const char*)) PyRun_SimpleString) (" <some Python code> ")
call          ((void(*)(int))         PyGILState_Release) ($gstate)

This solution does not rely on the Python-debugging extension for gdb. If that extension is available, it's possible to simply run py-bt instead.
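The three gdb commands above can also be assembled into a single non-interactive gdb invocation. The sketch below only builds the argument list; the helper name build_gdb_argv and the default interpreter path /usr/bin/python are assumptions for this example, and the int casts carry the same "may not work" caveat as above. It relies on gdb's -batch and -ex options.

```python
def build_gdb_argv(pid, python_code, python_exe="/usr/bin/python"):
    """Build a gdb command line that runs python_code inside process pid."""
    # Escape backslashes, double quotes and newlines so the code
    # survives as a C string literal inside the gdb expression.
    escaped = (
        python_code.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return [
        "gdb", python_exe, "-p", str(pid), "-batch",
        "-ex", "set $gstate = ((int (*)()) PyGILState_Ensure)()",
        "-ex", 'call ((int (*)(const char*)) PyRun_SimpleString)("{}")'.format(escaped),
        "-ex", "call ((void (*)(int)) PyGILState_Release)($gstate)",
    ]


# Code to inject: print the stack of every thread in the target process.
code = (
    "import sys, traceback\n"
    "for tid, frame in sys._current_frames().items():\n"
    "    traceback.print_stack(frame)\n"
)
print(build_gdb_argv(12345, code))
```

The resulting list can be passed to subprocess.call (possibly under sudo). The traceback is written to the stderr of the target process, not to the gdb terminal.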


I have a more up-to-date fork of pyrasite, (currently) named pyrasite-ng. If there's any bug, it can be reported there; hopefully I can fix it quickly.

You could try to attach a debugger to the running process. See also this question.

After running around the internet in circles for a while I finally ended up here: http://podoliaka.org/2016/04/10/debugging-cpython-gdb/ which describes in detail how all the pieces fit together. The money quote for me was gdb /usr/bin/python -p $PID: the name of the executable is required in order for gdb to find the correct debug info files.

With Python 3.3 and later you can also use faulthandler, which is part of the standard library:

import faulthandler
faulthandler.enable()
# Dump the traceback of all threads after a timeout of 10 seconds
faulthandler.dump_traceback_later(timeout=10)

For more info, check out the faulthandler documentation.
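Instead of a fixed timeout, faulthandler can also dump the tracebacks on demand when a signal arrives, via faulthandler.register (Unix only). The following is a minimal sketch: the temporary log file and the choice of SIGUSR1 are assumptions for this example, and the registration has to happen inside the process before it hangs. As far as I understand, the dump is written by a C-level signal handler, so it does not depend on the interpreter getting back to running Python code.

```python
import faulthandler
import os
import signal
import tempfile
import time

# Keep the file object alive for as long as the handler is registered.
log_file = tempfile.NamedTemporaryFile(mode="w+", suffix=".log", delete=False)
faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)

# Simulate `kill -USR1 <PID>` sent from another terminal.
os.kill(os.getpid(), signal.SIGUSR1)
time.sleep(0.1)  # give the signal a moment to be delivered

# Read back what faulthandler wrote to the file descriptor.
log_file.seek(0)
print(log_file.read())
```

In a real service the log file would be a fixed path rather than a temporary file, so that it can be inspected after sending the signal.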




