"java.net.SocketException: Socket closed" when in a cluster mode + Docker + acquireHostList enabled #384

Closed
@wajda

Description

The issue was first discovered in AbsaOSS/spline#869.

The error occurs under a combination of circumstances: cluster mode + Docker + acquireHostList=true

My understanding of what is happening is the following.
When a VST connection is established, the respective HostHandler asks the VstCommunication class to refresh the host list from the server. When the new hosts are added to the set, the old ones (unless they point to exactly the same ip:port) are immediately discarded, along with all associated connection pools and sockets.
The problem is that the connection instance that has just been created and that triggered the host list refresh in the first place, the one being returned from the VstCommunication.connect() method, holds a pointer to a host that might have just been discarded (and its socket closed) during this refresh. As a result, in these circumstances VstCommunication.connect() returns a connection that is already dead at the moment of creation, and any subsequent use of it fails with "java.net.SocketException: Socket closed".
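
To illustrate the sequence described above, here is a minimal, self-contained sketch. It is not the driver's code: `Host`, `HostSet`, `refresh` and this `connect` are hypothetical stand-ins that only model the behaviour as I understand it (a host picked for the new connection, then a refresh that closes every host whose ip:port is not in the list advertised by the server).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleHostSketch {

    /** Hypothetical stand-in for a host entry that owns a socket / connection pool. */
    static final class Host {
        final String address; // "ip:port" as this side knows it
        private boolean closed;

        Host(String address) { this.address = address; }

        void close() { closed = true; }

        boolean isClosed() { return closed; }
    }

    /** Hypothetical stand-in for the driver's host set. */
    static final class HostSet {
        private final List<Host> hosts = new ArrayList<>();

        Host add(String address) {
            Host host = new Host(address);
            hosts.add(host);
            return host;
        }

        /** Keeps hosts whose address is advertised; closes and drops all others. */
        void refresh(List<String> advertised) {
            List<Host> kept = new ArrayList<>();
            for (Host host : hosts) {
                if (advertised.contains(host.address)) {
                    kept.add(host);
                } else {
                    host.close(); // the associated socket is closed here
                }
            }
            for (String address : advertised) {
                boolean known = false;
                for (Host host : kept) {
                    if (host.address.equals(address)) {
                        known = true;
                    }
                }
                if (!known) {
                    kept.add(new Host(address));
                }
            }
            hosts.clear();
            hosts.addAll(kept);
        }
    }

    /** Models the problematic order: pick a host, refresh the list, return the original host. */
    static Host connect(HostSet set, String clientSideAddress, List<String> advertised) {
        Host current = set.add(clientSideAddress);
        set.refresh(advertised); // may close 'current'
        return current;          // caller receives a possibly dead host
    }

    public static void main(String[] args) {
        // Client reaches the server through a published port; the server advertises its internal address.
        Host docker = connect(new HostSet(), "localhost:8529", Arrays.asList("172.17.0.2:8529"));
        System.out.println("mismatched address, closed: " + docker.isClosed()); // true

        // Same address on both sides: the host survives the refresh.
        Host direct = connect(new HostSet(), "10.0.0.5:8529", Arrays.asList("10.0.0.5:8529"));
        System.out.println("matching address, closed: " + direct.isClosed()); // false
    }
}
```

The addresses are made up for the example. The point is only the ordering: as long as the refresh can close the host that the freshly created connection still references, the caller gets a dead connection whenever the client-side address differs from the advertised one.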

This is exactly what happens when ArangoDB runs in a virtualized environment (Docker in our case) where the networking is set up so that the client process addresses the server via a different IP (or host name) than the one the server sees from inside its own network.

The issue is reproducible by spinning up a DB cluster via arangodb-starter in Docker and running the ArangoDBTest.execute_acquireHostList_enabled() test method against it.
