We have installed gpg4win-4.1.0.exe on Windows Server 2016. We use it to encrypt and decrypt files with PowerShell scripts which call gpg.exe with different parameters. Sometimes gpg.exe processes get stuck. New gpg.exe processes then also get stuck and do not work anymore. We didn’t have this problem with gpg4win-2.3.1.
We tried different things to get it working when it is broken.
- kill gpg.exe processes which are stuck and make sure no new gpg processes are started.
- kill gpg-agent.exe processes (1 or 2). This sometimes helps.
- if killing gpg-agent doesn't work, we reboot the whole server.
This now happens almost daily and we would like to know what could be the problem and what we can do about it.
Probably @aheinecke can help you but he is busy today.
You could try enabling logging for gpg-agent (via its options) and maybe find some helpful information in the logs, or post them in this thread.
Without knowing the scripts I can only speculate. Maybe some process is using a pipe for data / I/O and the writing end of the pipe is never closed? That could cause a hang. On the server, how do you provide the passphrase? Do you pass it with --passphrase-file? Otherwise a passphrase prompt may be waiting for input that never comes, instead of failing.
It could also be something trivial: GnuPG may be asking whether it is OK to overwrite a file with the same name and, getting no response, keep running and holding a lock. I assume you have “--batch” as a parameter in your calls?
You could use a tool like Process Explorer to look at the exact command line of the “hanging” process, which might tell you more. Or turn on some debug options and write the logs for gpg and gpg-agent into log files, to investigate what they were trying to do when they started to hang.
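To illustrate the points about unclosed pipes and unanswered prompts, here is a minimal Python sketch of a wrapper that feeds the passphrase on stdin, closes the pipe afterwards, and gives up after a timeout. It assumes the passphrase is passed with --passphrase-fd 0. The function names and the timeout value are made up for this example, and the decrypt command line is only one possible shape of a call.

```python
import subprocess


def build_gpg_args(gpg_exe, input_path, output_path):
    """Build a non-interactive decrypt command line (example only)."""
    return [
        gpg_exe,
        "--batch",                      # never ask interactive questions
        "--yes",                        # answer "yes" to e.g. overwrite prompts
        # Assumption: the gpg manual says loopback mode is also needed for
        # --passphrase-fd since GnuPG 2.1; drop it if your setup differs.
        "--pinentry-mode", "loopback",
        "--passphrase-fd", "0",         # read the passphrase from stdin
        "--output", output_path,
        "--decrypt", input_path,
    ]


def run_with_timeout(args, passphrase, timeout_seconds=120):
    """Run a command with the passphrase on stdin.

    Returns the CompletedProcess, or None if the timeout expired.
    subprocess.run writes the input, closes stdin, and kills the
    child process when the timeout is reached.
    """
    try:
        return subprocess.run(
            args,
            input=passphrase + "\n",
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
```

A caller could treat a None result as "gpg hung" and log the command line for later comparison. Note that this only kills the gpg.exe child itself, not a gpg-agent it may have started.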
Thanks for your reply and suggestions. Sorry for my late reply; since the move to the new forum the notification emails have ended up in my spam folder, so I thought I had no reply yet.
Fortunately gpg.exe doesn’t hang as much. In the last two weeks only a couple of times. 99.9% of the time gpg works just fine.
As a workaround I made a script to periodically look for hanging gpg.exe processes and kill them. Most of the time the next gpg.exe process runs fine. Sometimes not, and then I had to kill the gpg-agent process and/or an scdaemon process. It wasn’t necessary to reboot the system. When the problem is solved and I rerun a failed batch script, it always runs fine. So it has nothing to do with a particular script or file.
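For anyone wanting to do something similar, the watchdog idea could look roughly like the Python sketch below. This is not the actual script. It assumes the caller already has a list of (pid, name, start time) tuples from some process listing, and the ten minute threshold is an arbitrary example value. taskkill with /PID and /F is the standard Windows command for a forced kill.

```python
import subprocess
from datetime import datetime, timedelta


def select_stale(processes, now, max_age=timedelta(minutes=10)):
    """Return the PIDs of gpg.exe processes older than max_age.

    processes: iterable of (pid, name, started_at) tuples; how this
    list is obtained is left to the caller (assumption of this sketch).
    """
    stale = []
    for pid, name, started_at in processes:
        if name.lower() == "gpg.exe" and now - started_at > max_age:
            stale.append(pid)
    return stale


def taskkill_command(pid):
    """Build the Windows command line that force-kills one process."""
    return ["taskkill", "/PID", str(pid), "/F"]


def kill_stale(processes, now, max_age=timedelta(minutes=10), run=subprocess.run):
    """Kill every stale gpg.exe process and return the PIDs that were targeted."""
    targets = select_stale(processes, now, max_age)
    for pid in targets:
        # check=False: the process may already have exited on its own
        run(taskkill_command(pid), check=False)
    return targets
```

Only gpg.exe is matched here on purpose; killing gpg-agent.exe or scdaemon.exe would be a separate, more drastic step.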
I pass the passphrase with --passphrase-fd 0 and we do indeed use --batch.
I used Process Explorer to look at the process and the exact command. I couldn’t discover any difference from processes that run well.
I turned on logging to a file (I turned it off again after a while, because the scripts rely on some output that no longer reached them). The following message probably has something to do with the problem; sometimes we got a lot of these entries for minutes on end.
gpg waiting for lock E:\Apps\GNU\CERT\gnupg_spawn_agent_sentinel.lock…
Because it is also sometimes necessary to kill the gpg-agent process it seems likely that the problem has something to do with the gpg-agent. We also often have 2 gpg-agent.exe processes. I don’t know why that is, but maybe that is also an indication that there is sometimes a problem with the gpg-agent.
But I don’t know what is triggering the problem or how to investigate further.
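One way to investigate further would be to count how often each lock shows up in a saved log, to see whether it is always the same file. A minimal Python sketch, assuming the log lines contain the text "waiting for lock" followed by the path, as in the entry quoted above; the function name is made up for this example.

```python
import re
from collections import Counter

# Matches the message quoted above; the exact log format is an assumption.
LOCK_WAIT = re.compile(r"waiting for lock\s+(.+)")


def count_lock_waits(lines):
    """Count 'waiting for lock' entries per lock file path."""
    counts = Counter()
    for line in lines:
        match = LOCK_WAIT.search(line)
        if match:
            # Strip the trailing ellipsis or dots that end the message
            path = match.group(1).rstrip(" .…\r\n")
            counts[path] += 1
    return counts
```

Passing an open log file to count_lock_waits works as well, since a file object iterates over its lines.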
Your most likely problem is that two users are using that directory with different permissions, so the lock file created by user a) cannot be deleted by user b).
That might be in line with two gpg-agent processes running on your system.
Maybe an interactive user uses Kleopatra with that homedir to manage the keys and then your automatic user comes around and is locked out?
In principle this should resolve itself, because the spawn lock should be removed automatically. But if for some reason that does not happen, e.g. when the interactive user logs out, it would explain this issue. It may be better to have the interactive user work on a copy of the homedir and then copy the pubring.kbx from there into the automated environment when something changes.
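Copying the keyring over could be scripted along these lines. This is only a Python sketch under some assumptions: both directories are on the same volume, no gpg process is using the automated homedir at that moment, and the ".new" staging name is made up for this example.

```python
import os
import shutil
from pathlib import Path


def publish_keyring(working_home, automation_home, name="pubring.kbx"):
    """Copy the keyring from the working copy into the automated homedir.

    The file is first copied under a temporary name and then renamed,
    so a reader never sees a half-written keyring.
    """
    source = Path(working_home) / name
    target = Path(automation_home) / name
    staging = target.with_name(target.name + ".new")  # hypothetical staging name
    shutil.copy2(source, staging)
    # os.replace overwrites an existing target; on Windows it can fail
    # if another process still has the target file open.
    os.replace(staging, target)
    return target
```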
Thanks, something like that indeed seems likely. Although we also had the problem when I know nobody else was logged into the system. But it could be that another “automatic” user sometimes runs a gpg command. That maybe could also cause a conflict. I’ll look into that further. Thanks!
Well, it's enough if that user created the lock file. It might be (I would have to test) that we don't actually clean up our lock files explicitly; usually we just close them, and the next process that needs them locks them exclusively. That could fail if multiple users with different permissions create the file.
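To check whether lock files are being left behind, one could list them with their age when a hang occurs. A rough Python sketch; it assumes the lock files sit directly in the homedir and end in ".lock", like the gnupg_spawn_agent_sentinel.lock mentioned earlier. It does not show the file owner, which would have to be checked separately, e.g. in the file's security properties.

```python
import time
from pathlib import Path


def list_lock_files(homedir, now=None):
    """Return (file name, age in seconds) for every *.lock file in homedir."""
    now = time.time() if now is None else now
    found = []
    for path in sorted(Path(homedir).glob("*.lock")):
        try:
            age = now - path.stat().st_mtime
        except FileNotFoundError:
            # The lock was removed between listing and stat; that is fine
            continue
        found.append((path.name, round(age)))
    return found
```

A lock file that exists with no gpg or gpg-agent process running would fit the explanation given above.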