[Bug] Networks function failure #3905

Open · 1 task done

Cikaros opened this issue May 31, 2025 · 15 comments

Comments

@Cikaros

Cikaros commented May 31, 2025

Bug Issue Report

Describe the problem

After upgrading from version 0.45.1 to version 0.45.2, the Networks functionality in NetBird stopped working properly: the configured services can no longer be accessed.

To Reproduce

Steps to reproduce the behavior:

  1. Upgrade NetBird from version 0.45.1 to 0.45.2.
  2. Attempt to access a configured service through the Networks feature.
  3. See error or no connectivity.

Expected behavior

The Network services should be accessible and functioning correctly as they were in the previous version (0.45.1).

Are you using NetBird Cloud?

Self-hosted NetBird control plane.

NetBird version

netbird version: 0.45.2

Is any other VPN software installed?

If yes, which one?

  • [x] Yes
  • [ ] No
@mlsmaycon
Collaborator

@Cikaros, we are missing the required debug logs. Can you please run:

netbird debug bundle -U -AS

If the failure is still happening, can you also collect the WireGuard status:

sudo wg show | egrep 'peer|allowed'

@mlsmaycon
Collaborator

Can you share the WireGuard status too?

You might need to install wireguard-tools via Homebrew.

Also, is this route running in high availability mode?

@Cikaros
Author

Cikaros commented May 31, 2025

It is not in high-availability mode; there is only one active node, and it is currently online.

@mlsmaycon
Collaborator

OK, it seems like the client didn't add the 172.18.0.10/32 range to your node's routing table.

We will check the logs to see if we find anything.
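A quick way to check whether that /32 route actually made it into the node's routing table (a generic sketch, not a NetBird command; interface names such as wt0 or utunN vary by platform):

```shell
# Check whether the 172.18.0.10/32 route is present (generic sketch).
# Linux:
ip route show 2>/dev/null | grep '172.18.0.10' || echo "route missing (Linux)"
# macOS:
netstat -nr -f inet 2>/dev/null | grep '172.18.0' || echo "route missing (macOS)"
```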

@Cikaros
Author

Cikaros commented May 31, 2025

Sure, the previous version is available.

@Cikaros
Author

Cikaros commented Jun 2, 2025

netbird 0.45.3 still has this issue.

@Cikaros
Author

Cikaros commented Jun 2, 2025

While checking the Signal server logs, I found the following:

2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:43Z INFO ./caller_not_available:0: 2025/06/02 10:01:43 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:46Z INFO ./caller_not_available:0: 2025/06/02 10:01:46 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:49Z INFO ./caller_not_available:0: 2025/06/02 10:01:49 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2025-06-02T10:01:49Z INFO ./caller_not_available:0: 2025/06/02 10:01:49 WARNING: [core] [Server #1]grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

@mlsmaycon
Collaborator

@Cikaros, thanks for sharing a status update on this. We have been attempting to reproduce the issue, so far without success.

Can you confirm whether the issue happens after returning from sleep mode? Please also run the following commands:

# enable trace logs
netbird debug log level trace
# collect netbird debugging data for 2 minutes
netbird debug for 2m --upload-bundle --anonymize

In parallel, run your usual network tests against this resource (172.18.0.10/32), and also run the following commands to collect the routing table and WireGuard info:

sudo wg show | egrep 'peer|allowed'
netstat -nr | grep 172

If the route works during the 2 minutes, we might need to wait until it fails again; at that point, run the following commands to get a debug bundle and collect the routing table and WireGuard info once more:

netbird debug bundle --upload-bundle --anonymize
sudo wg show | egrep 'peer|allowed'
netstat -nr | grep 172

In all cases, please share the upload key for the bundle and the output of the netstat and wg commands.
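The capture steps above can be combined into one small script (a sketch using only the commands already shown in this thread; the output file names are arbitrary):

```shell
#!/bin/sh
# Sketch: run the 2-minute debug capture while snapshotting routing/WireGuard state.
# Uses only the commands from this thread; wg-state.txt and routes.txt are arbitrary names.
netbird debug log level trace
netbird debug for 2m --upload-bundle --anonymize &   # capture runs in background
debug_pid=$!
sudo wg show | grep -E 'peer|allowed' > wg-state.txt # WireGuard peers and allowed IPs
netstat -nr | grep 172 > routes.txt                  # routes for the 172.x resource
wait "$debug_pid"                                    # prints the upload key when done
echo "Share the upload key plus wg-state.txt and routes.txt"
```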

@Cikaros
Author

Cikaros commented Jun 2, 2025

Current server setup:

Server A:
  • NetBird client 0.45.3
  • DNS (in Docker) at 172.18.0.10

macOS B:
  • NetBird client 0.45.3

Windows C:
  • NetBird client 0.45.3

Using Networks, the DNS on Server A (172.18.0.10) is intended to be accessible from both B and C, with custom DNS settings pushing 172.18.0.10 to B and C.

At the moment, both B and C cannot access 172.18.0.10 (even though the relevant ports have been opened).
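To separate routing problems from DNS-level problems, B and C can probe the resource directly (a sketch; it assumes bash, coreutils `timeout`, and optionally dig are installed — none of this is NetBird-specific):

```shell
# Sketch: probe the DNS resource at 172.18.0.10 directly from peer B or C.
check_tcp() {
  # Prints "open"/"closed" for host:port using bash's /dev/tcp (assumes bash is present).
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}
check_tcp 172.18.0.10 53
# DNS-level query (requires dig from dnsutils/bind-utils):
command -v dig >/dev/null && dig @172.18.0.10 example.com +time=2 +tries=1 || echo "dig unavailable or no answer"
```

If the TCP check reports "open" but dig times out, the route is fine and the problem is in the DNS service itself; if even the TCP check fails, the route is the issue.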

@mlsmaycon
Collaborator

@Cikaros from my last comment: #3905 (comment)

You can replace the netstat command with route print; for the wg command, you would need to install WireGuard from https://www.wireguard.com/install/

@Cikaros
Author

Cikaros commented Jun 2, 2025

Please grab the information as soon as possible; I will delete the record afterwards.

@mlsmaycon
Collaborator

I copied the info above.

Can you confirm whether the route was working while you captured the info?

@Cikaros
Author

Cikaros commented Jun 2, 2025

The resource was not accessible during the capture.

@Cikaros
Author

Cikaros commented Jun 6, 2025

@mlsmaycon During troubleshooting, I discovered that this issue originates from the routing peer. In previous versions, the routing peer could forward traffic to any IP resource it could reach directly. In the upgraded version, resources on the physical network interface are still reachable, but IP resources located on virtual bridges created by Docker are no longer accessible.
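One possible cause worth ruling out (my assumption, not confirmed anywhere in this thread): on Linux, Docker sets the iptables FORWARD chain policy to DROP, which can silently block traffic a routing peer forwards into a Docker bridge. A few generic host-side checks:

```shell
# Sketch: generic Linux checks for forwarding into Docker bridges
# (an assumption to rule out, not a confirmed diagnosis for this issue).
sysctl net.ipv4.ip_forward                        # should print "net.ipv4.ip_forward = 1"
sudo iptables -S FORWARD 2>/dev/null | head -n 1  # Docker commonly sets "-P FORWARD DROP"
ip link show type bridge || true                  # Docker-created bridges (docker0, br-...)
```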

@Cikaros
Author

Cikaros commented Jun 7, 2025

The problem might be related to the operating system: there is no such issue when using the Docker version of the routing peer. Do any of the developers have ideas? I'll go and verify this.
