-
Notifications
You must be signed in to change notification settings - Fork 3k
Adjustment of netsocket tests to the lossy mesh network limitation. #11654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d5cfc7b
to
34c3302
Compare
@tymoteuszblochmobica, thank you for your changes. |
TESTS/netsocket/dns/main.cpp
Outdated
printf("MBED: IP address is '%s'\n", address ? address : "null"); | ||
#if MBED_CONF_TARGET_NETWORK_DEFAULT_INTERFACE_TYPE == MESH | ||
uint8_t addr_hex[16]; | ||
stoip6(address, strlen(address), addr_hex); | ||
if ((addr_hex[0] & 0xe0) != 0x20) { | ||
printf("Adress is not globlal Please check Border Router TEST SKIP \n"); | ||
GREENTEA_TESTSUITE_RESULT(false); | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This same code block has now been copied on many places. Can it me just one function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could follow the way this is done with a function+macro for SKIP_IF_TCP_UNSUPPORTED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, @tymoteuszblochmobica pointed out to me, that this is already only copied once per test directory. We can wrap it with a function, but we'll have to declare and define this function for every test directory (tcp, tls, dns, udp) anyway. We've been through similar discussion in #10037 and we agreed that there is no function sharing between different test suites, unfortunately.
So I'm afraid this cannot be just one function.
TESTS/netsocket/tcp/main.cpp
Outdated
#if MBED_CONF_TARGET_NETWORK_DEFAULT_INTERFACE_TYPE != MESH | ||
Case("TCPSOCKET_RECV_100K", TCPSOCKET_RECV_100K), | ||
Case("TCPSOCKET_RECV_100K_NONBLOCK", TCPSOCKET_RECV_100K_NONBLOCK), | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this makes sense?
Does it really help if we start skipping test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a big challenge to pass 100k recv in mesh.
After increasing timeouts up to 20 minutes tests mostly pass but still sometimes fails.
Smaller timeouts causes 100% fails
Its rfc864 tests so receiving DUT cant slow down transmision if mesh network is unable to deliver packets on time.
So problem with TCP retransmision traffic will arise.
Maybe the solution it to keep this increased timeouts for mesh and run tests only in RF box under no other devices "RF traffic "and "clear channel" condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test skip reverted. It is successful under local mesh BR. Testing mesh performance under heavy RF traffic is beyond the scope of netsockets tests.
TESTS/netsocket/udp/main.cpp
Outdated
#if MBED_CONF_TARGET_NETWORK_DEFAULT_INTERFACE_TYPE != MESH | ||
Case("UDPSOCKET_ECHOTEST_BURST_NONBLOCK", UDPSOCKET_ECHOTEST_BURST_NONBLOCK), | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something we should not skip.
Its basically close to what DTLS handshake is sending. Not accurate, but still emulates it.
Now we are just hiding problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test skip reverted. It is successful under local mesh BR. Testing mesh performance under heavy RF traffic is beyond the scope of netsockets tests.
TESTS/netsocket/udp/main.cpp
Outdated
#if MBED_CONF_TARGET_NETWORK_DEFAULT_INTERFACE_TYPE != MESH | ||
Case("UDPSOCKET_ECHOTEST_BURST", UDPSOCKET_ECHOTEST_BURST), | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.. I don't see the point of skipping this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test skip reverted. It is successful under local mesh BR. Testing mesh performance under heavy RF traffic is beyond the scope of netsockets tests.
static const int pkt_sizes[PKTS] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, \ | ||
100, 200, 300 | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then who tests that those bigger packets can pass if we are not going to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test skip reverted. It is successful under local mesh BR. Testing mesh performance under heavy RF traffic is beyond the scope of netsockets tests.
@SeppoTakalo @tymoteuszblochmobica What is the status for this PR? Looks like the review has not been completed |
472d254
to
512e237
Compare
Seppo can you review again? |
3703305
to
7c8abc5
Compare
TESTS/netsocket/tcp/main.cpp
Outdated
uint8_t addr_hex[16]; | ||
stoip6(address, strlen(address), addr_hex); | ||
if ((addr_hex[0] & 0xe0) != 0x20) { | ||
printf("Adress is not globlal Please check Border Router TEST SKIP \n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix typos adress -> Address
and globlal -> global
(+ 2 other files where you have copy pasted same mistake)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
7c8abc5
to
1d6985b
Compare
CI started |
Test run: FAILEDSummary: 1 of 5 test jobs failed Failed test jobs:
|
@tymoteuszblochmobica , apparently your check if the test is running for mesh network didn't work, although I can't see any obvious problem with it.
I doubt that our CI is running any mesh network tests. This shouldn't trigger. |
@tymoteuszblochmobica Please talk to @VeliMattiLahtela |
Strange
|
What strange there is about that? For example test configuration on mesh, see: https://github.com/ARMmbed/mbed-os/blob/master/tools/test_configs/6lowpanInterface_router.json#L26..L36 |
TESTS/netsocket/tls/main.cpp
Outdated
stoip6(address, strlen(address), addr_hex); | ||
if ((addr_hex[0] & 0xe0) != 0x20) { | ||
printf("Address is not global Please check Border Router TEST SKIP \n"); | ||
GREENTEA_TESTSUITE_RESULT(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be TEST_SKIP_MESSAGE("Mesh not working")
I don't really know the difference why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to skip the tests? We just want them to fail with a reasonable message, pointing to the border router issue, to avoid unnecessary debugging/bug reports.
1d6985b
to
6d6076d
Compare
Fixed |
6d6076d
to
b6b4519
Compare
TESTS/netsocket/tcp/main.cpp
Outdated
#if MBED_CONF_TARGET_NETWORK_DEFAULT_INTERFACE_TYPE == MESH | ||
uint8_t addr_hex[16]; | ||
stoip6(address, strlen(address), addr_hex); | ||
if ((addr_hex[0] & 0xe0) != 0x20) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are these why is there a check for address for mesh
To me this looks like bug in code that is circumvented in tests?
In applications we should wait for Global UP event and then everything should work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikter , this PR is now only trying to improve error reporting due to mesh network in RAAS being heavily loaded and often unable to perform UPD tests. We can tell that this situation occurs by observing the IP address our DUT got. If we did not receive a global address, the test will fail anyway, so we just want to minimize the amount of unnecessary debugging and print a reasonable error message suggesting
Now that you mentioned it, perhaps it is neater to wait for GLOBAL_UP
event instead of checking if the address is global or not, but the essence of the changes remains the same - don't run the tests if the global connection failed and print a reasonable error message, instead.
@tymoteuszblochmobica , can we try observing the events instead of observing the IP address? I imagine we could set a blocking semaphore after connect()
returns and only release it in a GLOBAL_UP
event callback. If the semaphore is not released in time, the test will timeout. If the last message printed is "Waiting for GLOBAL_UP" I think that will be a clear indication that there is a connectivity problem. Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with GLOBAL_UP wait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test checking that address is 20**** meaning some default public IPv6 address and if not skip it.
This should be fixed in mesh API if there is a bug that ULA prefix is global
If you check file meshinterfacenanostack or any other mesh there is a check when local or Global is UP and there Global UP is given even for ULA prefixes How is mesh.connect supposed to work wait until GLOBAL_UP or LOCAL_UP. if it should wait for GLOBAL_UP the bug is in the mesh apis if it should stop even at LOCAL_UP the bug is in test case and it should follow the same check as mesh_minimal_application and do following while loop where it waits the Global Address. not failure |
Regarding what I agree with you, @mikter , that it is better to wait for GLOBAL_UP event until test times out. @tymoteuszblochmobica , do you also agree? Can you modify the code so tests wait until |
There's no clearly right behaviour here. It's similar to lwIP's dilemma about what do do when asked for IPv4+IPv6, preferring IPv6, and it gets IPv4 first. How long should "connect" wait for IPv6 before returning? Should it wait, even? There we did decide to make it wait a bit longer to see if IPv6 came up, on the assumption that apps might just be written to do a simple "connect" then check "IPv4 or IPv6 address"? You could do the same thing here - wait a bit longer after the local up, to see if we get a global address, for simple apps that are going to check the address immediately after connect return. It would be consistent. |
b6b4519
to
1aacaf3
Compare
Replaced with GLOBAL_UP waiting loop. |
@mikter Please review |
1aacaf3
to
983c958
Compare
Rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@0xc0170 , it seems that this is now also ready to go (as soon as the current release is over). Would you please adjust the labels?
CI started |
Test run: SUCCESSSummary: 5 of 5 test jobs passed |
Description
Netsocket tests are modified according to loosy mesh network environment.
Some tests are skipped.
The failures need to be investigated by the infra team.
Local network setup, passes all tests
Pull request type
Reviewers
@mikter
@SeppoTakalo
@michalpasztamobica
Release Notes