Allow to use the process name to watch process (-d) #95

Closed
opened 2024-08-23 08:57:15 +02:00 by tbertels · 7 comments
Contributor

By using pidof -s [processname] internally, pv could work without having to first manually use pidof.
Maybe a different argument name should be used instead of -d.

There would still be room for improvement in case of multiple PIDs for the same process name, but for many use cases it would make it much easier to use.

By using `pidof -s [processname]` internally, pv could work without having to first manually use pidof. Maybe a different argument name should be used instead of `-d`. There would still be room for improvement in case of multiple PIDs for the same process name, but for many use cases it would make it much easier to use.
Owner

That sounds reasonable, changing -d to accept a non-numeric argument, which will cause it to use the first process with that name.

I'm uneasy about actually spawning "pidof" so may try to implement the lookup inside PV.

To find a matching process we could look at:

  1. What executable the process is running (e.g. "/bin/cp")
  2. What the "zero" argument (argv[0]) of the process is (e.g. "cp")
  3. Which terminal the process is attached to
  4. Which user ID the process is owned by
  5. Whether the process is in a "zombie" state

I would suggest we should look for matches in this order, and choose the first match, ignoring all zombie processes:

  1. Executable path matches fully, same terminal as PV, same user as current user
  2. Leafname of executable path matches argument, same terminal, same user
  3. Process argv[0] matches fully, same terminal, same user
  4. Leafname of process argv[0] matches argument, same terminal, same user
  5. As 1 but any terminal, same user
  6. As 2 but any terminal, same user
  7. As 3 but any terminal, same user
  8. As 4 but any terminal, same user
  9. As 5 but any user
  10. As 6 but any user
  11. As 7 but any user
  12. As 8 but any user

Just calling "pidof -s" would be simpler. I don't like the idea of doing that because we're then making PV spawn a subprocess and it has to do that safely, run the real "pidof" from the right path, and capture the output without making a mess, and these all seem like they will be tricky to do right. We're then also stuck with the first match, with no control over the search order.

On the other hand, to do it without "pidof" will make PV a bit more complicated. How much more complicated I don't know, I need to go look at the source for "pidof" to see how it finds out things like the controlling terminal of a process.

Can't tell if I'm fussing over nothing and making it more than it needs to be - thoughts welcome.

That sounds reasonable, changing `-d` to accept a non-numeric argument, which will cause it to use the first process with that name. I'm uneasy about actually spawning "`pidof`" so may try to implement the lookup inside PV. To find a matching process we could look at: 1. What executable the process is running (e.g. "/bin/cp") 2. What the "zero" argument (argv[0]) of the process is (e.g. "cp") 3. Which terminal the process is attached to 4. Which user ID the process is owned by 5. Whether the process is in a "zombie" state I would suggest we should look for matches in this order, and choose the first match, ignoring all zombie processes: 1. Executable path matches fully, same terminal as PV, same user as current user 2. Leafname of executable path matches argument, same terminal, same user 3. Process argv[0] matches fully, same terminal, same user 4. Leafname of process argv[0] matches argument, same terminal, same user 5. As 1 but any terminal, same user 6. As 2 but any terminal, same user 7. As 3 but any terminal, same user 8. As 4 but any terminal, same user 9. As 5 but any user 10. As 6 but any user 11. As 7 but any user 12. As 8 but any user Just calling "`pidof -s`" would be simpler. I don't like the idea of doing that because we're then making PV spawn a subprocess and it has to do that safely, run the real "`pidof`" from the right path, and capture the output without making a mess, and these all seem like they will be tricky to do right. We're then also stuck with the first match, with no control over the search order. On the other hand, to do it without "`pidof`" will make PV a bit more complicated. How much more complicated I don't know, I need to go look at the source for "`pidof`" to see how it finds out things like the controlling terminal of a process. Can't tell if I'm fussing over nothing and making it more than it needs to be - thoughts welcome.
Author
Contributor

I would suggest checking each value returned by pidof and stopping at the first one that shows any kind of progress (non zero speed).
A more complicated way would be to list all processes with progress (showing the interesting line(s)) and ask the user which one to watch.

I would suggest checking each value returned by `pidof` and stopping at the first one that shows any kind of progress (non zero speed). A more complicated way would be to list all processes with progress (showing the interesting line(s)) and ask the user which one to watch.
Owner

If we look for progress, that means a delay in starting up, but I think it won't matter very much.

If we use pgrep instead of pidof, then it will work on BSD as well as Linux. Can you see any problems with making this change?

If we look for progress, that means a delay in starting up, but I think it won't matter very much. If we use `pgrep` instead of `pidof`, then it will work on BSD as well as Linux. Can you see any problems with making this change?
Author
Contributor

One thing I've noticed is that pgrep -x firefox will only return the main process whereas pidof firefox will return the spawn processes (which aren't named "firefox") too.

On the other hand, pidof kthreadd doesn't return anything, and pgrep kworker returns of course all kworkers (wether that's a good thing or not though is another question).

pv could also try with pidof and use pgrep otherwise.

One thing I've noticed is that `pgrep -x firefox` will only return the main process whereas `pidof firefox` will return the spawn processes (which aren't named "firefox") too. On the other hand, `pidof kthreadd` doesn't return anything, and `pgrep kworker` returns of course all kworkers (wether that's a good thing or not though is another question). pv could also try with pidof and use pgrep otherwise.
a-j-wood added this to the (deleted) milestone 2024-10-03 18:18:57 +02:00
Owner

OK so PV should try to run "pidof " and then, if pidof can't execute at all (command not found), run "pgrep " instead. It should read back the result, and use it as a list of process IDs to look for activity in. The first process in the list that shows movement on a file descriptor will then be the one it watches.

(Note to self: pidof separates the PIDs in its output with a space, while pgrep uses a newline; although pidof provides a "-d" option to change the separator, that's not available everywhere)

I've grouped this issue along with several others under a general milestone of "rewrite of watchfd", since to take this forward the file descriptor watching code needs restructuring. In particular for this issue I need to separate out the watched fd information from the overall state, so we can look at several PIDs for a short while to see which one has activity first, and then drop it all and start a regular watchfd loop on one selected PID. The whole milestone looks like this at the moment:

  • Separate the watched file descriptor information from the main state so that multiple processes can be tracked
  • Allow multiple "-d" options or just make "-d" take no options and cause it to parse all remaining arguments as PID[:FD]
  • Allow mixing of "-d PID" with "-d PID:FD"
  • Extended formatting directives to use with "-d", to allow use or omission of PID, FD, underlying file
  • FD selection criteria to use with "-d PID", perhaps to hide FDs that are not moving, or set glob patterns

Getting those done together will allow several other issues to be dealt with too.

OK so PV should try to run "pidof <arg>" and then, if pidof can't execute at all (command not found), run "pgrep <arg>" instead. It should read back the result, and use it as a list of process IDs to look for activity in. The first process in the list that shows movement on a file descriptor will then be the one it watches. (Note to self: pidof separates the PIDs in its output with a space, while pgrep uses a newline; although pidof provides a "-d" option to change the separator, that's not available everywhere) I've grouped this issue along with several others under a general milestone of "rewrite of watchfd", since to take this forward the file descriptor watching code needs restructuring. In particular for this issue I need to separate out the watched fd information from the overall state, so we can look at several PIDs for a short while to see which one has activity first, and then drop it all and start a regular watchfd loop on one selected PID. The whole milestone looks like this at the moment: * Separate the watched file descriptor information from the main state so that multiple processes can be tracked * Allow multiple "-d" options or just make "-d" take no options and cause it to parse all remaining arguments as PID[:FD] * Allow mixing of "-d PID" with "-d PID:FD" * Extended formatting directives to use with "-d", to allow use or omission of PID, FD, underlying file * FD selection criteria to use with "-d PID", perhaps to hide FDs that are not moving, or set glob patterns Getting those done together will allow several other issues to be dealt with too.
Owner

The simpler approach of just calling pgrep has been taken, since pv can now watch multiple processes (#12). The syntax is "--watchfd =NAME", where NAME is the process name - this is passed to pgrep and the resultant list of PIDs is watched.

The simpler approach of just calling `pgrep` has been taken, since `pv` can now watch multiple processes (#12). The syntax is "**--watchfd =NAME**", where _NAME_ is the process name - this is passed to `pgrep` and the resultant list of PIDs is watched.
Owner

Closing as this is now in version 1.9.42.

Closing as this is now in version 1.9.42.
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ivarch/pv#95
No description provided.