Support for backgrounding pv, and allowing it to be monitored separately #56

Closed
opened 2022-11-23 22:57:23 +01:00 by jimbobmcgee · 5 comments
jimbobmcgee commented 2022-11-23 22:57:23 +01:00 (Migrated from github.com)

I would like to be able to use pv to monitor a background transfer and periodically check the current progress. At a glance, a combination of pv -q & and pv -R $! seems like it should do the job but fails on two accounts:

  1. pv -q & immediately stops, as it wants access to the terminal (even with -q)
  2. pv -R ... affects the behaviour of the first pv in its terminal, rather than showing the progress in the current terminal

(1) is demonstrable -- just running pv -q </dev/zero >/dev/null & ends up with a stopped job that only works when foregrounded with fg
(2) is demonstrable with help from screen or tmux -- running pv -q </dev/zero >/dev/null (no &) in one window and then running pv -R $(pidof -s pv) in a second does not display anything in the second window; while running pv -R $(pidof -s pv) -p starts writing output in the first window.

pv -d $(pidof pv) does not appear to help, I assume because the first pv already has the files open, so never "sees" a new open "event". Likewise pv -d "$(pidof pv):0" doesn't work, because the first pv has opened devices, not files (I am really hoping to monitor transfer between non-files, specifically stdout-to-fifo). I've also tried dd ... &; pv -d $(pidof dd), which suffers the same.

I suppose both pv -R and pv -d would also be limited to only having state information about the transfer since the second pv started, and would not have access to the time/byte-count/avg-rate of the running first pv.

As such, my enhancement request is twofold:

  1. ability to background a pv without it stopping for terminal access
  2. a specific switch, e.g. -M <pid>, to monitor (or mirror) the existing pv process indicated by <pid>, e.g. by instructing the first process to send its transfer history over IPC to the second, and for the second to output to the terminal.
    • if a fifo-driven --redir-to / --redir-from semantic makes more sense than some other IPC, that would be fine

Note that I am not just hoping to solve the issue of backgrounding and checking in the same session -- I would hope/expect that implementing (1) would just allow CTRL+Z and judicious use of fg/bg for that purpose. I am also looking to have pv invoked somewhere else (e.g. by cron) and be able to attach to it for occasional progress updates by an operator. It just so happens that I need a long-running-writer | pv > /tmp/fifo & behaviour, so that something else can read the fifo.

(If any of this exists in the releases after 1.6.0, please advise/accept my apologies for the noise!)
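
Until something like the proposed -M <pid> exists, one possible stopgap is to have the background pv write its own progress to a file that any other session can read. The following is only a sketch under stated assumptions, not a way of attaching to a running pv: it relies on pv's -n (numeric) and -b (bytes) options to print one running byte count per line on stderr, plus -f to force output when stderr is not a terminal. The paths and the long_running_writer function are hypothetical stand-ins.

```shell
#!/bin/sh
# Sketch: background pv feeding a fifo, with progress written to a status file.
# The paths and long_running_writer are placeholders for illustration only.
workdir=$(mktemp -d)
fifo="$workdir/fifo"
status="$workdir/pv.status"
mkfifo "$fifo"
: > "$status"

long_running_writer() { head -c 1048576 /dev/zero; }   # stand-in data source

if command -v pv >/dev/null 2>&1; then
    # -n -b: print the running byte count, one line per update, on stderr.
    # -f: force output even though stderr is a file rather than a terminal.
    # With stderr redirected, pv never needs access to the terminal.
    long_running_writer | pv -f -n -b 2>"$status" >"$fifo" &
else
    long_running_writer >"$fifo" &                      # pv not installed
fi
writer_pid=$!

# Something else reads the fifo; here the bytes are simply counted.
received=$(wc -c <"$fifo" | tr -d ' ')
wait "$writer_pid"

# An operator in any session can check the most recent count like this:
tail -n 1 "$status"
```

This only reports what the first pv chose to log, so the display options must be decided when the transfer is started, which is the limitation the requested switch would remove.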

jimbobmcgee commented 2022-11-23 23:04:13 +01:00 (Migrated from github.com)

Note, the backgrounding issue may be as simple as redirecting stderr to /dev/null, e.g. pv </dev/zero >/dev/null 2>/dev/null &.

Monitoring that from a separate session is still an issue...

a-j-wood commented 2023-07-16 02:52:37 +02:00 (Migrated from github.com)

This is a use case that I'd not considered but I can see where it would be handy.

It sounds a little similar to issue #54 "Run command every n percent", in which the suggestion is to send a desktop notification every now and then - but not quite the same.

Yes, I would see it being implemented as "pv --quiet < bigfile | bigprogram > outputfile &" and then "pv --query <pid-of-first-pv>" to see how it's getting on.

Not sure specifically how it would be implemented, but IPC is probably plausible.

> the backgrounding issue may be as simple as redirecting stderr to /dev/null

Yes, but I want to see pv's stderr in my terminal.

As it stands, the background job is stopped, which is not expected, so it's a bug in pv:

$ echo x | pv -r | cat >/dev/null &

$ jobs
[1]+  Stopped                 echo x | pv -r | cat > /dev/null

$ fg

expected behavior

$ echo x | tee -a /dev/stderr | cat >/dev/null &
x

this shows no output from pv

$ echo x | pv -r 2>/dev/null | cat >/dev/null &

writing stderr of pv to a tempfile works with pv -f

$ t=$(mktemp); echo x | pv -f -r 2>$t | cat >/dev/null & wait $!; cat $t; rm $t
[4.21KiB/s]

Why? I'm using pv to debug parallel processing:

$ seq 0 9999999 | xargs printf "%7s\n" >input-padded.txt
$ n=2; p=; ts=; for i in $(seq $n); do t=$(mktemp); ts+=" $t"; time tail -c +$((1 + 8 * (10000000 / n * (i - 1)))) input-padded.txt | pv -f -r 2>$t | head -c$((8 * (10000000 / n))) >/dev/null & p+=" $!"; done; wait $p; cat $ts; rm $ts
[64.4MiB/s]
[63.3MiB/s]
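
The one-liner above is dense, so here it is unrolled as a script. This is a sketch of the same experiment, scaled down to 100000 lines so that it runs quickly; the variable names are mine, and the pv options (-f, -r) are the ones used in the original command.

```shell
#!/bin/sh
# Unrolled sketch of the parallel-read experiment, scaled down to 100000 lines.
lines=100000        # the original used 10000000
workers=2
record=8            # '%7s\n' produces 8 bytes per line
input=$(mktemp)
seq 0 $((lines - 1)) | xargs printf '%7s\n' >"$input"

chunk=$((record * (lines / workers)))   # bytes handled by each worker
logs=""
pids=""
i=1
while [ "$i" -le "$workers" ]; do
    log=$(mktemp)
    logs="$logs $log"
    offset=$((1 + chunk * (i - 1)))     # tail -c +N counts from byte 1
    if command -v pv >/dev/null 2>&1; then
        # -f forces progress output although stderr is a file; -r shows the rate
        tail -c "+$offset" "$input" | pv -f -r 2>"$log" | head -c "$chunk" >/dev/null &
    else
        tail -c "+$offset" "$input" | head -c "$chunk" >/dev/null &   # pv not installed
    fi
    pids="$pids $!"
    i=$((i + 1))
done

wait $pids          # unquoted on purpose: one argument per PID
cat $logs           # one rate report per worker
rm -f $logs "$input"
```

Each worker writes to its own log file, so the reports cannot interleave the way they would if several background pv instances shared the terminal.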

pv shouldn't block when run in the background, so that's a bug, but as you'll find when you do "bg" after "echo x | pv -r | cat >/dev/null &", pv will run in the background happily but will never write any progress information to the terminal when it's not in the foreground. This is by design.

Strangely, "echo x | pv -r | cat >/dev/null &" stops immediately and will then complete when told to "bg", but "echo x | pv -r >/dev/null &" will run in the background and complete normally without stopping.

This is when I test with the current git code as it is today.

Having more than one pv instance writing to the terminal in the background doesn't seem like it will work very well. Maybe with "-c" it could work, but I'm not sure.
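
For reference, this is how -c is normally used: several pv instances inside a single foreground pipeline, each named with -N. It is a sketch of that documented case only; whether the same approach holds up for separate background pipelines is exactly the open question here. The data source and output file are placeholders.

```shell
#!/bin/sh
# Sketch: two named pv instances sharing one terminal within a single pipeline.
out=$(mktemp)
if command -v pv >/dev/null 2>&1; then
    # -c enables cursor positioning so the two displays do not overwrite
    # each other; -N prefixes each display with a name.
    head -c 1048576 /dev/zero | pv -c -N raw | gzip | pv -c -N gzipped >"$out"
else
    head -c 1048576 /dev/zero | gzip >"$out"    # pv not installed
fi
# Check that the compressed stream arrived intact.
restored=$(gzip -dc "$out" | wc -c | tr -d ' ')
rm -f "$out"
```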

It looks like we've got a few issues rolled into one here.

  1. Running "pv </dev/zero >/dev/null &" no longer stops, but adding pipes like "pv </dev/zero | cat >/dev/null &" starts to behave strangely, so I think that signal handling relating to terminal I/O is a bit wrong.

  2. It would be useful to be able to show the progress of some other pv, such as "pv --query 12345" to show what the pv process 12345 would output if it was on this terminal.

  3. It would be useful to be able to run multiple pv instances at once on the same terminal in separate pipelines.

Would a "-ff" option to force output even when not controlling the terminal (i.e. when in the background) be of any use here?


The backgrounding issue is fixed in the latest commit.

I will close this issue and have raised a separate one (#101) about being able to query the progress of one PV from another.

a-j-wood 2024-10-04 11:35:34 +02:00
  • closed this issue
  • added the bug label
Reference
ivarch/pv#56