Update timing tests to be robust #4599

Merged: 5 commits merged on Jul 7, 2017

Conversation

@studavekar (Contributor) commented Jun 21, 2017

In the new approach, the host controls the device activity: it decides when the test
should start and finish, and the pass/fail status is sent to the device. Also deprecate
the test cases which can't be tested accurately.

Status

READY

Migrations

If this PR changes any APIs or behaviors, give a short description of what API users should do when this PR is merged.

NO

Related PRs

List related PRs against other branches: None

Todos

  • Tests

@0xc0170 (Contributor) left a comment

Minor things that I believe could help us to maintain/improve these tests


DigitalOut led1(LED1);
void update_tick_thread() {

Please fix the style; if we are changing this, we should align it. This should be a separate commit.

utest::v1::status_t one_ticker_case_setup_handler_t(const Case *const source, const size_t index_of_case) {
ticker1 = new Ticker();
return greentea_case_setup_handler(source, index_of_case);
utest::v1::status_t one_ticker_case_setup_handler_t(const Case * const source, const size_t index_of_case) {

I believe this was good as it was, *const source? The same style as int *pointer?


def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

Can you add some comments to these host files? There are a lot of magic numbers and computations. Comments would explain what the test does and would help with future edits (at least for me :-) ).



#define SIGNAL_PRINT_TICK 0x01
volatile uint32_t callback_trigger_count = 0;

I might be asking too much, but I never had a clear idea of what this test does. Can we add something here in the code file to explain it?

@studavekar (Contributor, Author)

@bridadan can you please review? As Martin suggested, I will add a few comments to explain the magic numbers :)

@bridadan (Contributor)

@studavekar will do. Would you also mind posting a log from running one of these tests?

@studavekar (Contributor, Author)

Here are the logs for the run

test_run.txt

@bridadan (Contributor) left a comment

Most of the logic looks good! Mostly just commenting on cleaning up some dead code and some print statements.


# this value is obtained for measurements when there is 0 transport delay and we want accurancy of 5%
time_measurement_multiplier = 80

def _callback_result(self, key, value, timestamp):
# We should not see result data in this test
self.__result = False

def _callback_end(self, key, value, timestamp):

Since you're not registering an end callback in setup, I'd remove this function.


# this value is obtained for measurements when there is 0 transport delay and we want accurancy of 5%
time_measurement_multiplier = 80

def _callback_result(self, key, value, timestamp):

Doesn't look like this is being used, I'd remove this.


def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

Little math nitpick:

This:

(self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

Can be simplified to this:

((self.round_trip_final_start + self.round_trip_final_end) - (self.round_trip_base_start + self.round_trip_base_end)) / 2.0

^ that can technically be simplified further, but I think it's a bit more obvious what you're doing if you leave it that way.

Not necessary to change, though, if you think it's more readable as is.


def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0
transport_error = self.round_trip_final_time / 2.0 + self.round_trip_base_time / 2.0

Same thing here about simplifying. Not necessary though.

self.log("min_range : {} , max_range : {}".format(min_range, max_range))

if (min_pass_value >= min_range) and (max_pass_value <= max_range):
self.log("Test passed !!!")

Can we change this print to something a bit more formal: "Time within passing range"

self.log("Test passed !!!")
self.__result = True
elif (min_pass_value > max_range) or (max_pass_value < min_pass_value):
self.log("Test Failed !!!")

Can we change this print to something a bit more formal: "Time outside of passing range. Timing drift seems to be present."

self.__result = False
else:
self.__result = True

self.log("Test Failed/Retry !!!")

Can we change this print to something a bit more formal: "Test inconclusive due to transport delay, retrying"


self.log("Time required to measure drift accurately : {} ".format(measurement_stretch))
if measurement_stretch > self.max_measurement_time:
self.log("Measurement time is high to accurately qualify device timer")

The log message is a bit vague here. Could it be: "Time required to determine device timer is too high due to transport delay, skipping"

if measurement_stretch > self.max_measurement_time:
self.log("Measurement time is high to accurately qualify device timer")
else:
time.sleep(measurement_stretch)

Could you print here that you'll be waiting for x seconds? Just so the user knows how long to wait.

self.log("Actual Time : {}".format(device_time))
self.log("Nominal Time : {}".format(nominal_time))

self.log("min_pass : {} , max_pass_value : {}".format(min_pass_value, max_pass_value))

Could you print min_pass_value instead of min_pass for consistency?

@studavekar (Contributor, Author)

Triggering a CI request to check whether this affects other platforms; I have tested locally on a couple of devices.

/morph test

@mbed-bot

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 608

Build failed!

@studavekar (Contributor, Author)

/morph test

@mbed-bot

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 610

Test failed!

Case("Timers: 2x tickers", two_ticker_case_setup_handler_t, test_case_2x_ticker, two_ticker_case_teardown_handler_t),
Case("Timers: 2x callbacks", two_ticker_case_setup_handler_t, test_case_2x_callbacks,two_ticker_case_teardown_handler_t),
};
Case("Timers: 1x ticker", one_ticker_case_setup_handler_t,test_case_1x_ticker, one_ticker_case_teardown_handler_t),

Looks like the indent got really wacky here!

Thread gt_conn_thread(osPriorityNormal, TEST_STACK_SIZE);

tick_thread.start(update_tick_thread);
gt_conn_thread.start(gt_comm_wait_thread);
@bridadan (Contributor) commented Jun 22, 2017

I would recommend not creating another thread for this and just running it on the main stack to reduce memory requirements.

Oops, just realized you need control over the priorities, so disregard my comment

In the new approach, the host controls the device activity: it decides when the test
should start and finish, and the pass/fail status is sent to the device. Also deprecate
the test cases which can't be tested accurately.
1) RtosTimer has been deprecated in favor of EventQueue.
2) wait_us can't be reliably tested.
1) The mbed-os-tests-mbedmicro-rtos-mbed-basic tests used to fail when run in a loop after
20-25 runs. This was because the stack used by the test application was in the range of
744-840 bytes, so the stack size was bumped to 1024 bytes.

2) Corrected the assert and removed the assert which was not required.
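
A minimal sketch of the stack-size change described above, assuming TEST_STACK_SIZE is set to the 1024 bytes from the commit message; the thread body and names here are illustrative only, not the PR's actual test code.

#include "mbed.h"
#include "rtos.h"

#define TEST_STACK_SIZE 1024   // observed usage was 744-840 bytes, so 1 KB gives headroom

// Placeholder thread body; the real test toggles state / reports ticks here.
void tick_body() {
    while (true) {
        Thread::wait(1000);
    }
}

int main() {
    // Give the test thread an explicit 1 KB stack instead of relying on the default size.
    Thread tick_thread(osPriorityNormal, TEST_STACK_SIZE);
    tick_thread.start(tick_body);
    tick_thread.join();
}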
@studavekar (Contributor, Author)

Have taken care of most of the review comments; will trigger a nightly run to check whether this patch helps with the results.

/morph test-nightly

@mbed-bot

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 618

Test failed!

@studavekar (Contributor, Author)

The NCS36510 ticker test failed; as per the logs, it looks like the timer is ticking faster on this device. Need to investigate it more.

@0xc0170 (Contributor) commented Jun 29, 2017

@studavekar Any update for this patch?

@c1728p9 (Contributor) commented Jun 29, 2017

You could probably get rid of most of these tests since they are testing the same functionality as the us ticker. A more concise test could exercise the us ticker in the HAL directly by reading the time rather than incrementing a variable on an interrupt.

In addition, if you test against a ticker_data_t object, then you could use the same test for both the us ticker (via get_us_ticker_data) and the lp ticker (via get_lp_ticker_data).
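
A minimal sketch of the direct-read approach suggested above, not code from this PR: the helper, the 100 ms reference delay, and the prints are illustrative assumptions, while ticker_read, get_us_ticker_data, and get_lp_ticker_data are the HAL calls being referred to.

#include "mbed.h"
#include "ticker_api.h"
#include "us_ticker_api.h"
#if DEVICE_LOWPOWERTIMER
#include "lp_ticker_api.h"
#endif

// Read the ticker before and after a known delay and return the elapsed time it
// reports, in microseconds; the same body works for any ticker_data_t.
static uint32_t measure_elapsed_us(const ticker_data_t *data, int delay_ms) {
    uint32_t start = ticker_read(data);
    wait_ms(delay_ms);   // reference delay (in the PR's scheme the host provides the reference)
    return ticker_read(data) - start;
}

int main() {
    printf("us ticker: %lu us\r\n",
           (unsigned long) measure_elapsed_us(get_us_ticker_data(), 100));
#if DEVICE_LOWPOWERTIMER
    printf("lp ticker: %lu us\r\n",
           (unsigned long) measure_elapsed_us(get_lp_ticker_data(), 100));
#endif
}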

@studavekar (Contributor, Author)

We have seen that with the current implementation the NCS36510 timer is off by 1 sec, which caused the failure; that is not the case with other devices. However, as suggested, we can read the us ticker values directly.

@0xc0170 (Contributor) commented Jun 30, 2017

@c1728p9 Valid points. I reviewed the ticker test and came to the conclusion that it is not only testing timing via interrupts, but also how fast you can remove/add a new callback. I suspect this is why the onsemi board fails: detach() (removing from the queue) is under a critical section, and onsemi has 2x 16-bit tickers using an ISR. I suspect this is the main problem (it accumulates over time); if you remove detach/attach and have just one callback, it ticks as it should, with no time drift. I tried to improve the ticker in the onsemi target (there is some code in its HAL that can be removed), but it was still failing.
Even the first test case (1x callback) is on the limit for onsemi (I get failures from time to time):

[1498820977.89][HTST][INF] min_pass : 8.73335007429 , max_pass : 9.54240006208 for 5.0%%
[1498820977.89][HTST][INF] min_inconclusive : 8.63360005617 , max_inconclusive : 9.65265008211
[1498820977.89][SERI][TXD] {{pass;0}}
[1498820977.89][HTST][INF] Time reported by device: 9.458

Very, very close for this target :) We can squeeze out a few more milliseconds.

We shall rethink the entire ticker test. As above, we can test the us ticker in HAL tests, and this attach/detach might require more time than this test currently allows. Reading the tickers directly would be much more accurate.

We should move forward with this; the timing failures over the last couple of days are blocking patches. I can help; let me know your thoughts and let's get this in.

@0xc0170 (Contributor) commented Jun 30, 2017

This is my proposal:

Remove attach/detach from this test; use 2 tickers, one firing at 0, 2, 4 second intervals, the other at 1, 3, 5. Same result as this one, it should not have any time drift caused by attach/detach, and it covers the functionality we care about in this test. Detach/attach might be another test case; here is my commit 0xc0170@31c3734
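
A minimal sketch of the two-ticker idea above; the names, the combined counter, and the printed output are illustrative assumptions, not the contents of commit 0xc0170@31c3734.

#include "mbed.h"

Ticker ticker_even;                 // fires at 2, 4, 6, ... seconds
Ticker ticker_odd;                  // fires at 3, 5, 7, ... seconds (started 1 s later)
volatile uint32_t tick_count = 0;

void on_tick() {
    ++tick_count;                   // combined, the two tickers give a 1 s tick
}

int main() {
    // Interleave two 2 s tickers so a tick fires every second, with no
    // attach/detach afterwards (and therefore no detach-induced drift).
    ticker_even.attach(&on_tick, 2.0f);
    wait(1.0f);
    ticker_odd.attach(&on_tick, 2.0f);

    while (true) {
        wait(5.0f);
        printf("ticks: %lu\r\n", (unsigned long) tick_count);
    }
}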

The timing is still close to the upper limit but passing on ARM and GCC ARM. Shall we make this change?

I also have ncs changes, but it hangs from time to time; I am exploring what is going on. My goal is that in the ticker ISR we should not read(), but rather use a remainder; here is a snippet: 0xc0170@05fe987

@0xc0170 (Contributor) commented Jun 30, 2017

The timing is still close to the upper limit but passing on ARM and GCC ARM. Shall we make this change?
I also have ncs changes, but it hangs from time to time; I am exploring what is going on. My goal is that in the ticker ISR we should not read(), but rather use a remainder; here is a snippet: 0xc0170/mbed-os@05fe987

An update: this commit, 0xc0170@b3d3e3c, is now passing for GCC and ARM. We are not at the upper limit anymore, but somewhere in the middle; compare this to what I posted above.

[1498824908.79][HTST][INF] min_pass : 8.80935000181 , max_pass : 9.62219978571 for 5.0%%
[1498824908.79][HTST][INF] min_inconclusive : 8.70579980612 , max_inconclusive : 9.736650002
[1498824908.79][HTST][INF] Time reported by device: 9.211

@0xc0170 closed this on Jun 30, 2017
@0xc0170 reopened this on Jun 30, 2017
@0xc0170 (Contributor) commented Jun 30, 2017

I accidentally closed this, sorry! Reopened. Please review my two commits; tests are now passing with some margin! (This is with heap/stack statistics enabled; previously it was failing even without them.)

This is for 1x callback, GCC ARM, on the onsemi target; we can see that we have moved from times close to the upper limit to well below it.

[1498825219.20][HTST][INF] min_pass : 8.88820017576 , max_pass : 9.70304993391 for 5.0%%
[1498825219.20][HTST][INF] min_inconclusive : 8.7789499402 , max_inconclusive : 9.82380019426
[1498825219.20][HTST][INF] Time reported by device: 9.292

Based on this, we might not even need to update the ticker test, just the onsemi ticker! Tested on this PR, all passing.

@studavekar changed the title from "Update timing tests to be robust." to "Update timing tests to be robust" on Jun 30, 2017
@0xc0170 (Contributor) commented Jul 3, 2017

@studavekar What is the status of this PR? Besides the ncs fix, which should land soon, what else is missing from this patch? Just CI, and then it is ready for review/integration?

It is important to fix the ticker/timer tests and their implementations. As I noticed, the one device left is the Maxim one, which might also have some ticker problems. Has anyone investigated this already? I don't have this target available at the moment to investigate.

@studavekar (Contributor, Author)

@0xc0170 once the ncs fix is in, we should be good with this PR. I have not investigated the Maxim device.

@theotherjimmy (Contributor)

/morph test-nightly

@studavekar (Contributor, Author)

Will also trigger an export build in parallel to this PR to check that it can handle the load and actually fixes the issue.

@mbed-bot commented Jul 3, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 723

Test failed!

@studavekar (Contributor, Author)

features_netsocket_cellular_generic_modem_driver_tests_unit_tests_default tests failed on all the platforms; #4672 should fix it.

@studavekar (Contributor, Author)

/morph test-nightly

@mbed-bot commented Jul 4, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 726

Test failed!

@0xc0170 (Contributor) commented Jul 4, 2017

MAX32625MBED-IAR.tests-mbedmicro-rtos-mbed-basic.tests-mbedmicro-rtos-mbed-basic failure :( @studavekar Do you have this target to test and reproduce this?

@bridadan (Contributor) left a comment

@0xc0170 @studavekar I can see why this one failed! And it's very fixable 😄 Please see my comments on where you need to make the changes. I'll explain below.

This test failed because of a "feature" of greentea. If greentea doesn't see a response to the first __sync message, it will send another one after a second. This is done for the off chance that the device hasn't fully booted when the first one is sent. However, this can also be triggered by high transport delay.

From the log:

[1499167387.28][HTST][INF] host test executor ver. 1.1.8
[1499167387.28][HTST][INF] copy image onto target...
[1499167387.28][COPY][INF] Waiting up to 60 sec for '04150000b08042ed00000000000000000000000097969904' mount point (current is 'H:')...
        1 file(s) copied.
[1499167393.71][HTST][INF] starting host test process...
[1499167394.07][CONN][INF] starting connection process...
[1499167394.07][CONN][INF] notify event queue about extra 60 sec timeout for serial port pooling
[1499167394.07][CONN][INF] initializing serial port listener... 
[1499167394.07][PLGN][INF] Waiting up to 60 sec for '04150000b08042ed00000000000000000000000097969904' serial port (current is 'COM8')...
[1499167394.08][HTST][INF] setting timeout to: 60 sec
[1499167394.43][SERI][INF] serial(port=COM8, baudrate=9600, read_timeout=0.01, write_timeout=5)
[1499167394.43][SERI][INF] reset device using 'default' plugin...
[1499167394.68][SERI][INF] waiting 1.00 sec after reset
[1499167395.68][SERI][INF] wait for it...
[1499167395.68][SERI][TXD] mbedmbedmbedmbedmbedmbedmbedmbedmbedmbed
[1499167395.68][CONN][INF] sending up to 2 __sync packets (specified with --sync=2)
[1499167395.68][CONN][INF] sending preamble '69235bae-b7bb-435c-a1fd-f8370feebe91'
[1499167395.68][SERI][TXD] {{__sync;69235bae-b7bb-435c-a1fd-f8370feebe91}}
[1499167396.70][CONN][INF] resending new preamble 'cd7132ca-d89f-4f91-ac13-b36caa40763f' after 1.02 sec
[1499167396.70][SERI][TXD] {{__sync;cd7132ca-d89f-4f91-ac13-b36caa40763f}}

The key line there is resending new preamble 'cd7132ca-d89f-4f91-ac13-b36caa40763f' after 1.02 sec. So at this point, the device should have two __sync messages in its serial buffer.

However, greentea will only remove the first __sync message from the buffer, and the test will then continue. For these tests, you send the initial request from the device with the key timing_drift_check_start, and then a call to greentea_parse_kv is made. This will actually pull the second __sync message from the serial buffer, NOT the response from the host test for timing_drift_check_start. This causes the test to use the UUID sent as the start time, making the test fail.

So TL;DR, in the places where I've specified in this review, just call greentea_parse_kv in a loop and check to make sure the key received matches the key you're expecting. It's only necessary to do this at the start of each test case.
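
A minimal sketch of that loop, not the exact code added in this PR; the helper name, buffer sizes, and the reply key are assumptions here, while greentea_send_kv and greentea_parse_kv are the greentea-client calls involved.

#include "greentea-client/test_env.h"
#include <string.h>

// Keep parsing key/value pairs until the host sends the key we expect,
// discarding any stray __sync (or other unexpected) message along the way.
void wait_for_key(const char *expected_key) {
    char key[32] = {0};
    char value[32] = {0};
    do {
        greentea_parse_kv(key, value, sizeof(key), sizeof(value));
    } while (strcmp(key, expected_key) != 0);
}

void start_of_test_case() {
    // At the start of each test case: ask the host to begin, then loop until its
    // reply arrives, ignoring anything else left in the serial buffer.
    greentea_send_kv("timing_drift_check_start", 1);
    wait_for_key("timing_drift_check_start");   // reply key assumed for illustration
}

Only the start of each test case needs this; once the stray __sync has been consumed, the subsequent key/value exchanges line up again.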

ticker1->attach_us(&ticker_callback_0, ONE_MILLI_SEC);

// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

Please call this in a loop, double checking that the key is what you expect it to be.

ticker1->attach_us(ticker_callback_1_switch_to_2, ONE_MILLI_SEC);

// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

Please call this in a loop, double checking that the key is what you expect it to be.

timeout.attach_us(set_incremeant_count, ONE_MILLI_SEC);

// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

Please call this in a loop, double checking that the key is what you expect it to be.

Thread tick_thread(osPriorityNormal, TEST_STACK_SIZE);
tick_thread.start(print_tick_thread);
// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

Please call this in a loop, double checking that the key is what you expect it to be.

@studavekar (Contributor, Author)

Yeah, I thought the UART stream was corrupted, given the string below received on the host.

 found KV pair in stream: {{__sync;842000}}, queued... 

We can loop until we get the proper key.

@bridadan (Contributor) commented Jul 5, 2017

Yeah, not corruption; it is really just greentea attempting to ensure a handshake happens when the system is under heavy load. When tests don't rely heavily on the host PC for messages containing test information, you never notice it doing this and it just works. So in this case you just have to handle this behavior in the test.

This is required as greentea can send multiple __sync packets and cause the test
to be in an unexpected state.
@studavekar (Contributor, Author) commented Jul 5, 2017

Have added a check to validate the key received from greentea. Also triggered a morph test and an export build in parallel to load the systems; we shouldn't see any timing test failures under such load.

/morph test-nightly

@mbed-bot commented Jul 5, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 735

Test failed!

@studavekar (Contributor, Author)

Changes are good; there are no timing test failures :). The tests-mbed_functional-functionpointer test failed, but it doesn't seem to be related.

@studavekar (Contributor, Author)

/morph test-nightly

@mbed-bot commented Jul 6, 2017

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 737

All builds and test passed!


6 participants