I asked about my problem on Stack Overflow, and I will ask here too…
Does anybody know what the problem could be?
I just tested this on three Debian 12 VMs, two computers running Arch, and one running Rocky 8. In all cases the socket file gets removed as soon as the first ssh-agent process is killed, so I can’t reproduce your case where the socket file remains.
    user@bookworm:~$ ssh-agent -a ~/.ssh/my-agent -t 7200
    SSH_AUTH_SOCK=/home/user/.ssh/my-agent; export SSH_AUTH_SOCK;
    SSH_AGENT_PID=51401; export SSH_AGENT_PID;
    echo Agent pid 51401;
    user@bookworm:~$ rm .ssh/my-agent
    user@bookworm:~$ ssh-agent -a ~/.ssh/my-agent -t 7200
    SSH_AUTH_SOCK=/home/user/.ssh/my-agent; export SSH_AUTH_SOCK;
    SSH_AGENT_PID=51411; export SSH_AGENT_PID;
    echo Agent pid 51411;
    user@bookworm:~$ kill 51401
    user@bookworm:~$ ls .ssh/my-agent
    ls: cannot access '.ssh/my-agent': No such file or directory
Running strace on the first ssh-agent pid produces something like:
    strace: Process 51401 attached
    restart_syscall(<... resuming interrupted poll ...>) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
    --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=2407, si_uid=1026} ---
    getpid() = 51401
    unlink("/home/user/.ssh/my-agent") = 0
    exit_group(2) = ?
    +++ exited with 2 +++
I would be curious to see what strace shows on your machine where the socket file does not get deleted. Does it show an unlink that fails for some reason, or does it not attempt to unlink at all?
Please feel free to ignore my comment if it’s too basic.
What is a socket file?
I have been using ssh for a couple of years, and this is the first time I am hearing about a socket file. Also, why is its deletion at the end of the session relevant?
Thanks in advance.
What is a socket file?
A socket file allows two processes on a Unix or Linux system to communicate with each other. In this case, the ssh client communicates with the ssh agent over this socket so that it can authenticate to servers with the private key stored inside the agent. Think of a socket like a TCP port on the network, but instead of being network connected it’s just a file on the filesystem.
I’m sure you’ve heard of the “everything on Unix is a file” paradigm? This is an example of that.
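To make that concrete, here is a minimal sketch in Python (not how ssh-agent itself is implemented). The path `demo-agent` is a made-up stand-in for something like `~/.ssh/my-agent`. It shows that binding a Unix socket creates a file of type "socket", and that a client reaches the server purely through that path.

```python
import os
import socket
import stat
import tempfile
import threading

# Hypothetical path, standing in for something like ~/.ssh/my-agent.
sock_dir = tempfile.mkdtemp()
sock_path = os.path.join(sock_dir, "demo-agent")

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)   # bind() creates the socket file on disk
server.listen(1)

def serve_once():
    # Answer a single client by echoing its message in upper case.
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())

worker = threading.Thread(target=serve_once)
worker.start()

# The file exists and its type is "socket", not a regular file.
is_socket = stat.S_ISSOCK(os.stat(sock_path).st_mode)

# The client finds the server by filesystem path, no network involved.
with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
    client.connect(sock_path)
    client.sendall(b"ping")
    reply = client.recv(1024)

worker.join()
server.close()           # closing the socket does not remove the file
os.unlink(sock_path)     # the file has to be unlinked explicitly
os.rmdir(sock_dir)

print(is_socket, reply)
```

Note that the server has to unlink the file itself when it is done, which is the cleanup step this thread is about.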
Also, why is its deletion at the end of the session relevant?
That is a question you should ask OP. I don’t really know what he’s trying to accomplish, I’m just curious about what would cause the difference in behavior.
I don’t really have an answer but why would you even want two running agents using the same socket file path, one with an unlinked socket and one with a socket actually existing in the filesystem? I assume you want to avoid a gap between stopping the old and starting the new agent?
My next step would probably be to check the OpenSSH source code for the bit that removes the socket file, to see what kind of conditions there are, and maybe also try strace or similar tools to see whether it does not remove the file because the removal fails or because it is never attempted.
I start a separate ssh-agent for every connection group, each with different SSH keys in it. I also regularly connect from my laptop to my desktop machine and forward the agent to the desktop. This is a setup I need.
I have a script that picks the ssh-agent needed for a connection group based on the Match sections in my ssh config. The script automatically starts an ssh-agent and loads the identities (private keys, hardware tokens…) into it, and the config file selects it as the IdentityAgent.
When I’m connected to my desktop from my laptop and work on the desktop, I use the forwarded agent, because I have some keys only on my laptop that I also want to use from my desktop. So I link the forwarded agent socket to the IdentityAgent that is configured in the ssh config for this connection… When there is no forwarded ssh-agent, the symlink is deleted and a new agent is started with a socket file at the same path.
It sounds a bit complicated… and yes, it is.
And I don’t get why the socket file is sometimes deleted and sometimes remains. I have now tested it from home over the remote connection. The temporary, forwarded agent socket is a symlink to my regular socket file. I killed the running ssh-agent… and the symlink was removed as well.
It is strange behaviour… a process unlinks a socket file that does not belong to it, only the name is the same… and not every time.
It is strange behaviour… a process unlinks a socket file that does not belong to it, only the name is the same
That is what I would expect it to do, actually. I would expect it to close the socket it has open and then delete (unlink) it by name.
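The "close, then unlink by name" pattern can be sketched in Python. This assumes the agent's cleanup simply unlinks the path it was started with, which is what the strace output earlier in the thread suggests; the path `my-agent` in a temp directory is hypothetical. The sketch replays the transcript: the first listener's file is removed, a second listener binds the same name, and the first one's cleanup then removes the second one's file.

```python
import os
import socket
import tempfile

sock_dir = tempfile.mkdtemp()
path = os.path.join(sock_dir, "my-agent")   # hypothetical path

# First "agent" binds the path.
old = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
old.bind(path)
old.listen(1)

# Like `rm .ssh/my-agent`: the name is gone, but `old` keeps its open socket.
os.unlink(path)

# Second "agent" reuses the same name for a brand-new socket file.
new = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
new.bind(path)
new.listen(1)

# First "agent" exits: close the socket, then unlink by the remembered name.
old.close()
os.unlink(path)   # removes whatever is at that path now: the new socket file

path_exists = os.path.exists(path)

# The second listener is still running, but clients can no longer find it.
try:
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
    reachable = True
except FileNotFoundError:
    reachable = False

new.close()
os.rmdir(sock_dir)
print(path_exists, reachable)
```

The unlink call has no way of knowing that the name now belongs to a different socket, so the cleanup succeeds and the second listener is left without a path.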
I would expect it to delete the socket on which the process is listening. What if I rename the socket (for some reason)? Then the socket file should be deleted as well.
Directory operations like unlinking (deletion) traditionally work via paths, not open file handles.
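A small Python sketch of what path-based unlinking means for the rename and symlink cases raised in this thread. The file names are made up for illustration. After a rename, unlinking the old name finds nothing and the renamed file stays; if a symlink sits at the old name, unlink removes the link itself and leaves the target alone.

```python
import os
import socket
import tempfile

sock_dir = tempfile.mkdtemp()
path = os.path.join(sock_dir, "my-agent")             # hypothetical names
renamed = os.path.join(sock_dir, "my-agent.renamed")

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(1)

# Someone renames the socket file while the listener is running.
os.rename(path, renamed)

# Cleanup by the remembered name finds nothing to remove...
try:
    os.unlink(path)
    unlink_result = "removed"
except FileNotFoundError:
    unlink_result = "ENOENT"

# ...and the renamed socket file stays behind.
renamed_survives = os.path.exists(renamed)

# If a symlink sits at the old name, unlink removes the link, not its target.
os.symlink(renamed, path)
os.unlink(path)
link_gone = not os.path.lexists(path)
target_survives = os.path.exists(renamed)

srv.close()
os.unlink(renamed)
os.rmdir(sock_dir)
print(unlink_result, renamed_survives, link_gone, target_survives)
```

In both cases the outcome depends only on what the name refers to at the moment of the call, not on which process created the file.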
Lol I love that there’s no activity on SO, but plenty here 😁
strace the process and see if it attempts deletion
Maybe check the method you’re using to identify which agent process is the old one??? Idk. Sanity check that the pid is lower?
I check the pid of each process I started, so I know which one is the older… and yes, the older one has the lower pid. :)
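As a side note for future readers: pids can wrap around, so a lower pid does not strictly guarantee an older process. A rough, Linux-only sketch of a sturdier check compares start times read from `/proc/<pid>/stat`. The helper names `process_start_ticks` and `older_pid` are made up for this example.

```python
import os

def process_start_ticks(pid: int) -> int:
    """Return the start time of `pid` in clock ticks since boot (Linux only)."""
    with open(f"/proc/{pid}/stat", "rb") as f:
        data = f.read().decode("utf-8", "replace")
    # Field 2 (comm) is wrapped in parentheses and may contain spaces or ')',
    # so split only what follows the last ')'.
    fields = data[data.rindex(")") + 2:].split()
    # `fields` starts at field 3 (state); starttime is field 22, so index 19.
    return int(fields[19])

def older_pid(pid_a: int, pid_b: int) -> int:
    """Return whichever pid started first (pid_a on a tie)."""
    return min((pid_a, pid_b), key=process_start_ticks)

if __name__ == "__main__":
    print(process_start_ticks(os.getpid()))
```

With two agent pids at hand, something like `older_pid(51401, 51411)` would report which one started first, provided both processes are still alive.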
Lol 🤣 sorry, you never know who’ll be reading this in the future :P