Describe the bug
When the Elastic APM Agent for PHP is enabled, child processes spawned from a PHP process deadlock after the fork if the parent (PHP) process has previously called pthread_create. This appears to be caused by a glibc bug fixed in the following commit: https://sourceware.org/git/?p=glibc.git;a=commit;h=52a103e237329b9f88a28513fe7506ffc3bd8ced. Unfortunately, Debian Bullseye ships an older glibc (2.31) that does not include the fix. I think the change that triggers the issue was introduced as part of #857. This is probably also the underlying cause of #876.
I understand that there might not be much the Elastic APM Agent for PHP can do about this, since it is caused by Debian Bullseye bundling an older version of glibc. However, I'm posting this in the hope that it helps other people experiencing the issue and that we can find a workaround for the bug.
Additional details about my investigation are provided in the Dockerfile code snippet below.
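Since the hang depends on the glibc that the process actually runs against, it can help to confirm the version inside the container first. The following is a minimal sketch, assuming a glibc-based system; `glibc_runtime_version` and the `GLIBC_VERSION_MAIN` macro are hypothetical names chosen for illustration, while `gnu_get_libc_version` is the glibc function that reports the runtime version.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Parses the "major.minor" version of the glibc the process is running
 * against. Returns 0 on success, -1 if the string could not be parsed.
 * (Hypothetical helper name, not part of glibc or the agent.) */
int glibc_runtime_version(int *major, int *minor)
{
    return sscanf(gnu_get_libc_version(), "%d.%d", major, minor) == 2 ? 0 : -1;
}

#ifdef GLIBC_VERSION_MAIN
/* Build with -DGLIBC_VERSION_MAIN to get a standalone program. */
int main(void)
{
    int major = 0, minor = 0;
    if (glibc_runtime_version(&major, &minor) != 0) {
        fprintf(stderr, "could not parse glibc version\n");
        return 1;
    }
    printf("running against glibc %d.%d\n", major, minor);
    return 0;
}
#endif
```

Compiled with `-DGLIBC_VERSION_MAIN` and run inside the php:8.0-cli image, this should report the 2.31 release mentioned above if the image is based on Debian Bullseye.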
To Reproduce
- Build the following Dockerfile:
FROM php:8.0-cli
RUN curl -fsSL https://github.com/elastic/apm-agent-php/releases/download/v1.8.3/apm-agent-php_1.8.3_all.deb > /tmp/apm-agent.deb \
&& dpkg --install /tmp/apm-agent.deb
# This triggers the bug fixed in the following glibc commit: https://sourceware.org/git/?p=glibc.git;a=commit;h=52a103e237329b9f88a28513fe7506ffc3bd8ced.
# Debian Bullseye uses glibc 2.31, which is an older release.
# The Elastic APM Agent for PHP calls pthread_atfork from a pthread_atfork callback here: https://github.com/elastic/apm-agent-php/blob/da2e59b9cd0acc58fcec067d5e282d37261dc56b/src/ext/platform_threads_linux.c#L532. This triggers the deadlock.
# The pthread_atfork handlers are called from __run_fork_handlers, which is itself called from https://github.com/bminor/glibc/blob/9ea3686266dca3f004ba874745a4087a89682617/sysdeps/nptl/fork.c#L56-L58.
# In order to trigger the bug, the process must have called pthread_create at least once, so that the multiple_threads flag is set.
# cURL calls pthread_create, which sets the multiple_threads flag: https://github.com/bminor/glibc/blob/9ea3686266dca3f004ba874745a4087a89682617/nptl/descr.h#L131-L146.
# This in turn causes __run_fork_handlers to acquire atfork_lock: https://github.com/bminor/glibc/blob/9ea3686266dca3f004ba874745a4087a89682617/nptl/register-atfork.c#L110-L117.
# When pthread_atfork tries to acquire atfork_lock, it blocks because it's already held by __run_fork_handlers: https://github.com/bminor/glibc/blob/9ea3686266dca3f004ba874745a4087a89682617/nptl/register-atfork.c#L40.
RUN echo '<?php\n\
$ch = curl_init("https://www.google.com/");\n\
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);\n\
curl_exec($ch);\n\
$p = proc_open("echo test", [["pipe", "r"], ["pipe", "w"], ["pipe", "w"]], $pipes);\n\
echo stream_get_contents($pipes[1]);\n\
' >> test.php
CMD ["php", "test.php"]
- Run the resulting Docker image.
- Notice that it does not print `test` and hangs.
Commenting out the lines that install the Elastic APM Agent for PHP fixes the problem.
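The sequence described in the Dockerfile comments can also be exercised without PHP or the agent. The following is a rough sketch under stated assumptions, not the agent's actual code: it creates one thread so that glibc treats the process as multi-threaded, registers a prepare handler that itself calls pthread_atfork, and then forks under a two-second watchdog. The handler type used by the agent is not stated in the report, so the choice of a prepare handler here is an assumption made to keep the hang in the forking process where an alarm can catch it. All function names and the `ATFORK_PROBE_MAIN` macro are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body that does nothing; creating any thread is enough to mark
 * the process as multi-threaded. */
static void *noop_thread(void *arg) { return arg; }

/* Prepare handler that registers another (empty) set of handlers. This
 * mirrors the nested pthread_atfork call described in the report. */
static void nested_register(void) { pthread_atfork(NULL, NULL, NULL); }

/* Runs inside a disposable probe process and never returns. */
static void run_probe(void)
{
    pthread_t t;
    signal(SIGALRM, SIG_DFL);            /* make sure the watchdog can kill us */
    if (pthread_create(&t, NULL, noop_thread, NULL) != 0) _exit(2);
    pthread_join(t, NULL);
    if (pthread_atfork(nested_register, NULL, NULL) != 0) _exit(2);

    alarm(2);                            /* watchdog in case fork() hangs */
    pid_t grandchild = fork();
    if (grandchild == 0) _exit(0);
    if (grandchild < 0) _exit(2);
    alarm(0);                            /* fork() returned: cancel watchdog */
    waitpid(grandchild, NULL, 0);
    _exit(0);
}

/* Returns 1 if fork() hung in the nested registration, 0 if it completed,
 * -1 if the probe could not be run. */
int probe_nested_atfork(void)
{
    pid_t probe = fork();
    if (probe < 0) return -1;
    if (probe == 0) run_probe();

    int status = 0;
    if (waitpid(probe, &status, 0) < 0) return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM) return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return 0;
    return -1;
}

#ifdef ATFORK_PROBE_MAIN
/* Build with: cc -pthread -DATFORK_PROBE_MAIN probe.c */
int main(void)
{
    int r = probe_nested_atfork();
    puts(r == 1 ? "fork hung in nested pthread_atfork"
         : r == 0 ? "fork completed" : "probe failed");
    return r == 0 ? 0 : 1;
}
#endif
```

If the analysis above is right, the probe should report a hang on the glibc shipped with Debian Bullseye and complete normally on a glibc that contains the linked fix. The probe runs in its own process so that a hang cannot take the caller down with it.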
Expected behavior
Child processes spawned from PHP processes should not deadlock when the Elastic APM Agent for PHP is enabled.