Skip to content

SPIFBlockDevice doesn't play nice on shared SPI bus #11732. #12681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 1, 2020

Conversation

mtomczykmobica
Copy link

@mtomczykmobica mtomczykmobica commented Mar 24, 2020

Fixes #11732 by use valid constructor SPI

With current SPI pass use_gpio_ssel to the SPI constructor. SPI write command use lock/unlock and select deselect inside. So no more needed to use this methods outside the SPI class.

Summary of changes

Impact of changes

Migration actions required

Documentation

Not needed to change documentation.


Pull request type

[X] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[] Feature update (New feature / Functionality change / New API)
[] Major update (Breaking change E.g. Return code change / API behaviour change)

Test results

[] No Tests required for this change (E.g docs only update)
[X] Covered by existing mbed-os tests (Greentea or Unittest)
[] Tests / results supplied as part of this PR

Reviewers

@SeppoTakalo
@VeijoPesonen
@0xc0170

@mtomczykmobica mtomczykmobica changed the title [IOTSTOR-990] SPIFBlockDevice doesn't play nice on shared SPI bus. Fi… SPIFBlockDevice doesn't play nice on shared SPI bus #11732. Fi… Mar 24, 2020
@mtomczykmobica mtomczykmobica changed the title SPIFBlockDevice doesn't play nice on shared SPI bus #11732. Fi… SPIFBlockDevice doesn't play nice on shared SPI bus #11732. Mar 24, 2020
@ciarmcom ciarmcom requested review from 0xc0170, SeppoTakalo, VeijoPesonen and a team March 24, 2020 10:00
@ciarmcom
Copy link
Member

@mtomczykmobica, thank you for your changes.
@VeijoPesonen @SeppoTakalo @0xc0170 @ARMmbed/mbed-os-storage @ARMmbed/mbed-os-maintainers please review.

@@ -1129,29 +1129,22 @@ void SDBlockDevice::_spi_wait(uint8_t count)

void SDBlockDevice::_spi_init()
{
_spi.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the locks removed in here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lock and unlock is made in select and unselect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but _select and _deselect are not called here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lock and unlock are called separately in each frequency, format and set_default_write_value -function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VeijoPesonen exactly Jarno is right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so there is no reason to protect the sequence somehow. Which raises a question that should SPI.h expose those lock/unlock-functions if all the public functions call those implicitly? @kjbracey-arm any thoughts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably lock/unlock shouldn't be public API any more. Most driver classes have virtual protected lock/unlock that lets you override them for a UnlockedX class or whatever, but for this class they were then made public to let apps use them as a a bus lock.

select/deselect basically replace them for app use. But there's no way to deprecate lock as a public method while keeping it available for the override purpose.

Now, you don't normally need to manually call select or deselect either, but you do need to call them if you want to group multiple transactions together.

An analogue in POSIX is flock/funlock - each stdio call like printf does call them internally, but you'd want to call them yourself if you wanted to keep two printfs together.

So these two successfully lock against each other (X can't occur between A and B).

 // Task 1
 spi.write(X); 

 // Task 2
 spi.select();
 spi.write(A);
 spi.write(B);
 spi.deselect();

But only because X has the implicit select.

(Don't make the mistake I made a few years ago of writing code like that when I had a bus for which a single op DIDN'T have a built-in implicit mutex lock. If it didn't you'd need to put the manual select around the single op too. I got confused between the "lock two things" together function and the "mutex against other tasks" function).

@@ -470,9 +470,6 @@ spif_bd_error SPIFBlockDevice::_spi_send_read_command(int read_inst, uint8_t *bu
uint32_t dummy_bytes = _dummy_and_mode_cycles / 8;
int dummy_byte = 0;

// csel must go low for the entire command (Inst, Address and Data)
_cs = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be replaced with _spi.select()?
This also applies to the lines below...
Maybe there is a good reason, but I don't understand it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is write function:
int SPI::write(int value)
{
select();
int ret = spi_master_write(&_peripheral->spi, value);
deselect();
return ret;
}
so there is done select and deselect.
additionaly lock and unlock is made in select and unselect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True about select and deselect, but looking at the comment csel must go low for the entire command (Inst, Address and Data) flipping the chip select off and on in every write function actually changes the behaviour of the whole function here. Chip select isn't low for the entire command now. Was that intentional?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i correct understand SPI is only interface between block device, so when we want write/ready to block device something we select cs and do operations on spi (one block device operation can take few spi actions). Function _

int DataFlashBlockDevice::read(void *buffer, bd_addr_t addr, bd_size_t size)

_ make few that operation and i cannot see reason to all time hold down cs (in my opinion cs should be down only when write spi operation).
@SeppoTakalo please review my understanding.

_spi_wait(10);
_spi.unlock();
}

void SDBlockDevice::_select()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually understand what is the purpose of this function. I would have assumed lock would be acquired here and released in _deselect

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spi_wait() is:
// SPI function to wait for count
void SDBlockDevice::_spi_wait(uint8_t count)
{
for (uint8_t i = 0; i < count; ++i) {
_spi.write(SPI_FILL_CHAR);
}
}
and _spi.write:
int SPI::write(int value)
{
select();
int ret = spi_master_write(&_peripheral->spi, value);
deselect();
return ret;
}

and deselect:
void SPI::deselect()
{
if (--_select_count == 0) {
_set_ssel(1);
}
unlock();
}
So unlock is in spi_wait.

Copy link
Contributor

@jarlamsa jarlamsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like some clarification about the issue mentioned about chip select must be low for the entire read command.

@@ -470,9 +470,6 @@ spif_bd_error SPIFBlockDevice::_spi_send_read_command(int read_inst, uint8_t *bu
uint32_t dummy_bytes = _dummy_and_mode_cycles / 8;
int dummy_byte = 0;

// csel must go low for the entire command (Inst, Address and Data)
_cs = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True about select and deselect, but looking at the comment csel must go low for the entire command (Inst, Address and Data) flipping the chip select off and on in every write function actually changes the behaviour of the whole function here. Chip select isn't low for the entire command now. Was that intentional?

@mtomczykmobica
Copy link
Author

I would like some clarification about the issue mentioned about chip select must be low for the entire read command.

If i correct understand SPI is only interface between block device, so when we want write/ready to block device something we select cs and do operations on spi (one block device operation can take few spi actions). Function _

int DataFlashBlockDevice::read(void *buffer, bd_addr_t addr, bd_size_t size)

_ make few that operation and i cannot see reason to all time hold down cs (in my opinion cs should be down only when write spi operation).
@SeppoTakalo please review my understanding.

@SeppoTakalo
Copy link
Contributor

SPI flash memories define transactions as a complete sequences, I don't believe we could toggle the select line within a block read/write operation without having side effect.
It might work, but I suspect its really unrealiable.

Instead, I suggest that you use SPI::select() and SPI::deselect() in places where _cs = 0 and _cs = 1 have been used.

Secondly, study whether you can completely drop SDBlockDevice::_select() and SDBlockDevice::_deselect() and just use SPI::select() and SPI::deselect(). I don't really understand why they send extra fill cycles to SPI bus when CS is not toggled. This looks like a workaround for buggy device or driver, we should see in tests whether it is actually needed.

adbridge
adbridge previously approved these changes Mar 26, 2020
Copy link
Contributor

@adbridge adbridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT looks ok

@mergify mergify bot added needs: CI and removed needs: review labels Mar 26, 2020
@0xc0170
Copy link
Contributor

0xc0170 commented Mar 26, 2020

CI started

@mergify mergify bot added needs: work and removed needs: CI labels Mar 26, 2020
@mbed-ci
Copy link

mbed-ci commented Mar 27, 2020

Test run: FAILED

Summary: 3 of 6 test jobs failed
Build number : 1
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_dynamic-memory-usage
  • jenkins-ci/mbed-os-ci_cloud-client-pytest
  • jenkins-ci/mbed-os-ci_greentea-test

@mergify mergify bot dismissed adbridge’s stale review March 30, 2020 07:19

Pull request has been modified.

@mtomczykmobica
Copy link
Author

SPI flash memories define transactions as a complete sequences, I don't believe we could toggle the select line within a block read/write operation without having side effect.
It might work, but I suspect its really unrealiable.

Instead, I suggest that you use SPI::select() and SPI::deselect() in places where _cs = 0 and _cs = 1 have been used.

Secondly, study whether you can completely drop SDBlockDevice::_select() and SDBlockDevice::_deselect() and just use SPI::select() and SPI::deselect(). I don't really understand why they send extra fill cycles to SPI bus when CS is not toggled. This looks like a workaround for buggy device or driver, we should see in tests whether it is actually needed.

OK, I've changed code as suggested. let's wait for result.

@kjbracey
Copy link
Contributor

kjbracey commented Mar 30, 2020

I don't believe we could toggle the select line within a block read/write operation without having side effect.

Even if you could (from the device protocol point of view) it might make it non-thread-safe. (Unless there's another higher-level locking mechanism). Another thread might be accessing the same device. If your two commands are "set address pointer" and "read data from address", or something like that, then you need to maintain some lock on the device to stop anyone else messing with the address pointer between the two.

Doing the SPI::select is an easy way to achieve that lock, even if you could technically release the CSEL for an instant. The "select" has a free "lock" on the device (although it actually locks the entire bus).

If they were slow commands, on the other hand, a high-level mutex separate from the SPI bus lock would be worthwhile. You might not want to hold the SPI bus for the entire duration of an erase cycle, say.

Instead, I suggest that you use SPI::select() and SPI::deselect() in places where _cs = 0 and _cs = 1 have been used.

Absolutely, you need to do this to maintain the same grouping and timing, adding only the locking.

Secondly, study whether you can completely drop SDBlockDevice::_select() and SDBlockDevice::_deselect()

But I would be a bit wary about pulling out those timing fudges. SD devices are many and varied, and I wouldn't be at all surprised if some couldn't cope with the "immediate" select that SPI alone gives you. We discussed adding timing parameters to SPI itself for that, but hasn't happened yet.

@mtomczykmobica
Copy link
Author

Even if you could (from the device protocol point of view) it might make it non-thread-safe. (Unless there's another higher-level locking mechanism). Another thread might be accessing the same device. If your two commands are "set address pointer" and "read data from address", or something like that, then you need to maintain some lock on the device to stop anyone else messing with the address pointer between the two.

SPI select and deselect functions have included lock and unlock function. So it is save.

@mtomczykmobica
Copy link
Author

mtomczykmobica commented Mar 30, 2020

I don't believe we could toggle the select line within a block read/write operation without having side effect.

Even if you could (from the device protocol point of view) it might make it non-thread-safe. (Unless there's another higher-level locking mechanism). Another thread might be accessing the same device. If your two commands are "set address pointer" and "read data from address", or something like that, then you need to maintain some lock on the device to stop anyone else messing with the address pointer between the two.

select and deselect have included lock and unlock functions. New fix implements using select and deselect SPI functions.

Doing the SPI::select is an easy way to achieve that lock, even if you could technically release the CSEL for an instant. The "select" has a free "lock" on the device (although it actually locks the entire bus).

Done in this way now.

If they were slow commands, on the other hand, a high-level mutex separate from the SPI bus lock would be worthwhile. You might not want to hold the SPI bus for the entire duration of an erase cycle, say.

Instead, I suggest that you use SPI::select() and SPI::deselect() in places where _cs = 0 and _cs = 1 have been used.

Absolutely, you need to do this to maintain the same grouping and timing, adding only the locking.

Done

Secondly, study whether you can completely drop SDBlockDevice::_select() and SDBlockDevice::_deselect()

But I would be a bit wary about pulling out those timing fudges. SD devices are many and varied, and I wouldn't be at all surprised if some couldn't cope with the "immediate" select that SPI alone gives you. We discussed adding timing parameters to SPI itself for that, but hasn't happened yet.

@mergify mergify bot dismissed kjbracey’s stale review March 30, 2020 12:07

Pull request has been modified.

SeppoTakalo
SeppoTakalo previously approved these changes Mar 30, 2020
@mergify mergify bot added needs: CI and removed needs: work labels Mar 30, 2020
@mergify mergify bot added needs: work and removed needs: CI labels Mar 30, 2020
@mergify mergify bot dismissed stale reviews from kjbracey and SeppoTakalo March 30, 2020 13:32

Pull request has been modified.

…ck to use select and deselect, use correct spi constructor
@mergify mergify bot added needs: CI and removed needs: work labels Mar 31, 2020
@0xc0170
Copy link
Contributor

0xc0170 commented Mar 31, 2020

CI started

@mbed-ci
Copy link

mbed-ci commented Apr 1, 2020

Test run: SUCCESS

Summary: 6 of 6 test jobs passed
Build number : 2
Build artifacts

@0xc0170 0xc0170 merged commit 7a0a168 into ARMmbed:master Apr 1, 2020
@mergify mergify bot removed the ready for merge label Apr 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SPIFBlockDevice doesn't play nice on shared SPI bus
10 participants