DW3000 SPI collision errors

kylepd · May 10, 2024, 2:34pm

I am using the DWM3001C and occasionally get SPI error callbacks from the driver with status_hi set to 0x0800, indicating an SPI collision. When I query the SPI collision register I get 0x04 (CIA RAM conflict). This occurs while using double buffering, I think while reading diagnostics with dwt_readdiagnostics.

These logs have the headers of SPI operations leading up to the error.

############## /dev/cu.usbmodem0007601470141: [00:00:10.012,725] <inf> dw3000: RX OK event
/dev/cu.usbmodem0007601470141: [00:00:10.012,756] <inf> dw3000: wr: header
/dev/cu.usbmodem0007601470141: 85                                               |.                
/dev/cu.usbmodem0007601470141: [00:00:10.012,817] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 26                                               |&                
/dev/cu.usbmodem0007601470141: [00:00:10.012,908] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 3c                                               |<                
############## /dev/cu.usbmodem0007601470141: [00:00:10.013,092] <wrn> dw3000: IRQ callback work already busy
############## /dev/cu.usbmodem0007601470141: [00:00:10.013,427] <inf> dw3000: RX done
/dev/cu.usbmodem0007601470141: [00:00:10.013,458] <inf> dw3000: wr: header
/dev/cu.usbmodem0007601470141: a7                                               |.                
/dev/cu.usbmodem0007601470141: [00:00:10.013,519] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 3e                                               |>                
/dev/cu.usbmodem0007601470141: [00:00:10.013,580] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 41 10                                            |A.               
/dev/cu.usbmodem0007601470141: [00:00:10.013,641] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 41 20                                            |A                
/dev/cu.usbmodem0007601470141: [00:00:10.013,671] <inf> dw3000: wr: header
/dev/cu.usbmodem0007601470141: c1 10                                            |..               
/dev/cu.usbmodem0007601470141: [00:00:10.013,732] <inf> dw3000: wr: header
/dev/cu.usbmodem0007601470141: c1 20                                            |.                
/dev/cu.usbmodem0007601470141: [00:00:10.013,793] <inf> dw3000: rd: header
/dev/cu.usbmodem0007601470141: 42 80                                            |B.               
/dev/cu.usbmodem0007601470141: [00:00:10.013,824] <err> dw3000: SPI error callback sys_hi: 0x0800, collission: 0x04

The line "dw3000: IRQ callback work already busy" marks the point where the IRQ GPIO went high; work was then scheduled to handle it once the current interrupt handler had finished. So I think this is when the SPI collision occurred.

What are the likely causes of SPI conflicting with the CIA when accessing RAM, and how can I avoid it?
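For readers following the log, the header bytes can be decoded with a small helper. This is a minimal sketch, assuming the DW3000 SPI header layout as I understand it from the user manual: bit 7 of the first byte is the write flag, bit 6 selects the two-byte form, bits 5..1 carry the register file ID or fast command code, and in the two-byte form the 7-bit offset is split across bit 0 of the first byte and bits 7..2 of the second. All type and function names here are hypothetical and not part of the Qorvo driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the bit layout is an assumption taken from the
 * DW3000 SPI transaction format and should be checked against the manual. */
typedef enum { HDR_INVALID, HDR_FAST_CMD, HDR_SHORT, HDR_FULL } hdr_kind_t;

typedef struct {
    hdr_kind_t kind;
    bool write;    /* bit 7 of byte 0: 1 = write, 0 = read */
    uint8_t base;  /* 5-bit register file ID, or fast command code */
    uint8_t sub;   /* 7-bit offset inside the register file (two-byte form) */
    uint8_t mode;  /* low 2 bits of byte 1 (0 = plain read/write) */
} spi_hdr_t;

spi_hdr_t decode_spi_header(const uint8_t *hdr, size_t len)
{
    spi_hdr_t out = { HDR_INVALID, false, 0, 0, 0 };
    if (hdr == NULL || len == 0) {
        return out;
    }

    out.write = (hdr[0] & 0x80u) != 0;
    out.base = (uint8_t)((hdr[0] >> 1) & 0x1Fu);

    if ((hdr[0] & 0x40u) == 0) {
        /* One-byte header: a write with bit 0 set is a fast command,
         * anything else addresses offset 0 of the register file. */
        out.kind = (out.write && (hdr[0] & 0x01u)) ? HDR_FAST_CMD : HDR_SHORT;
        return out;
    }

    if (len < 2) {
        return out; /* two-byte form needs the second byte */
    }

    out.kind = HDR_FULL;
    out.sub = (uint8_t)(((hdr[0] & 0x01u) << 6) | (hdr[1] >> 2));
    out.mode = (uint8_t)(hdr[1] & 0x03u);
    return out;
}
```

Under that assumed layout, "a7" decodes to fast command 0x13, "41 10" and "41 20" to reads of register file 0x00 at offsets 0x44 and 0x48, and the final "42 80" to a read of register file 0x01 at offset 0x20. That would fit a double-buffer toggle, a status read and clear, and then the collision-register query, though the mapping to register names should be confirmed against the user manual.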

Yves_Bernard_Qorvo · May 21, 2024, 4:44pm

Hi Kyle,

This issue does not ring a bell. How fast did you configure the SPI?

I can see you’re on a Linux host, and Qorvo does not provide a Linux driver for the DW3000, so it’s difficult for us to comment here.

kylepd · May 24, 2024, 8:00am

SPI is at 8 MHz.

As for Linux, I’m not using it. The terminal logs come from a UART adapter connected to the board I am working on. We are using the binary drivers provided by Qorvo, which is why I mentioned the call to dwt_readdiagnostics.

kylepd · July 8, 2024, 11:10am

This is still occurring with SPI at 32 MHz on a DW33110w, @Yves_Bernard_Qorvo.

AndyA · July 8, 2024, 11:24am

It looks like your code is trying to start an SPI transaction when an SPI transaction is already in progress. Either you need to structure your code so that it doesn’t do this or handle the error condition by either waiting and retrying or coping with the failure in some other way.

kylepd · July 8, 2024, 1:04pm

I don’t believe this is the case, for two reasons:

1. We have a semaphore around SPI.
2. We only have one SPI peripheral connected to the driver, so it’s not possible to run two transactions over it at once (the semaphore is there to guard the CS pin).

AndyA · July 8, 2024, 1:44pm

If you don’t think that’s the case, then what do you think it means by a collision on an SPI bus?

SPI buses normally have only one master. They are synchronous buses where data is only sent on the selected edge of the clock signal, and the only thing that can drive the clock signal is the bus master. The only thing that can drive the MOSI line is the bus master. Multiple devices can, if fitted, drive the MISO line at the same time, but only if the bus master selects both devices at the same time. A slave device cannot initiate a data transfer, can only drive its output line when selected, and can only change its output data on the edge of the clock, which is driven by the master.

As such, the only way to get a bus collision at the physical hardware level is either to have hardware with two masters connected to the same bus, or to have multiple slaves on the same bus and invalid firmware that selects both at the same time.
Some systems do allow multi-master operation, using select lines to determine the current master, and will consider it a collision if both devices attempt to be master at the same time, but this is not a common configuration. I doubt that is how your system is configured, since you’ve not mentioned having multiple processors on the bus.

Beyond a multi-processor, multi-master configuration, hardware-level collisions are basically impossible, and the hardware typically has no way of detecting one if it were to happen.

If the driver is reporting a collision then the logical conclusion is that your semaphore isn’t working and you are trying to perform two SPI accesses to the device at the same time. If you have other devices on the same SPI bus it is also possible you are trying to access a different device on the same bus at the same time as the DW chip.
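One way to check whether two SPI accesses ever overlap is to wrap the transfer function in a re-entrancy detector. The following is a minimal sketch using C11 atomics. Every name in it is hypothetical, and the fake transfer functions stand in for the real bus driver, which is not shown anywhere in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for whatever performs one full SPI transaction. */
typedef void (*spi_xfer_fn)(const uint8_t *tx, uint8_t *rx, size_t len);

static atomic_flag spi_busy = ATOMIC_FLAG_INIT;
static atomic_uint spi_overlap_count;

/* Runs xfer unless another transfer is already in flight.
 * Returns 0 on success, -1 if an overlap was detected and counted. */
int guarded_spi_transfer(spi_xfer_fn xfer, const uint8_t *tx, uint8_t *rx, size_t len)
{
    if (atomic_flag_test_and_set(&spi_busy)) {
        atomic_fetch_add(&spi_overlap_count, 1u);
        return -1;
    }
    xfer(tx, rx, len);
    atomic_flag_clear(&spi_busy);
    return 0;
}

unsigned spi_overlaps(void)
{
    return atomic_load(&spi_overlap_count);
}

/* Stand-in for the real bus driver: echoes tx into rx. */
static void fake_xfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        rx[i] = tx[i];
    }
}

/* Simulates an interrupt handler starting a second transfer
 * while the first one is still in progress. */
static void fake_xfer_interrupted(const uint8_t *tx, uint8_t *rx, size_t len)
{
    uint8_t inner_tx = 0, inner_rx = 0;
    (void)guarded_spi_transfer(fake_xfer, &inner_tx, &inner_rx, 1);
    fake_xfer(tx, rx, len);
}
```

If spi_overlaps() stays at zero while the collision error still fires, overlapping host-side transactions can be ruled out as the cause. Passing fake_xfer_interrupted to guarded_spi_transfer shows the counter incrementing when a nested transfer is attempted.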

kylepd · July 8, 2024, 1:58pm

I think it means the SPI access is trying to interact with a register while an operation internal to the DW chip is still ongoing.

To be clear, are you from Qorvo, and do you know what this error from their chip actually is? If not, your answer is just speculation, which isn’t helping the thread.

AndyA · July 8, 2024, 2:37pm

No, I’m not from Qorvo.

And you’re correct, this does look like an internal error from their drivers. My apologies, the vast majority of firmware related questions here end up being people with limited experience and some fairly basic bugs in their code. It becomes habit to assume the error is due to them missing something.
Looking at the source for the driver (the old one before it went closed source) there is only one mention of SPI_COLLISION I can see and that’s commented out. So it is some error condition they can report but the route to it being reported isn’t clear.

To be honest, your best bet is to get a copy of the old code, hope that bit hasn’t changed significantly from the version you’re using now, and try to track back what conditions will result in those flags getting set. Or run the old open source version of the driver, which you can then debug to see what happens.

Or you could wait for a reply from someone at Qorvo. I’ve been waiting for a reply to a technical question for around 5 years but you never know, you may have better luck.

simon.desfarges · July 8, 2024, 5:04pm

Hello,

After the chip raises the IRQ, it is possible that the dwt_readdiagnostics command arrives too soon, before the CIA has finished its job. This could happen if the payload is large and the interrupt mask is not set correctly: the CIA processes part of the UWB frame, raises the IRQ, and continues processing the payload (CRC of the payload). To verify this hypothesis, you can try adding a short “sleep” before dwt_readdiagnostics. You can also review the flags passed to dwt_setinterrupt and make sure they are the correct ones.
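A bounded poll is one alternative to a fixed sleep for testing this hypothesis: wait until a given status bit is set, give up after a timeout, and only then read the diagnostics. The sketch below is generic and makes several assumptions. The callback types, the fake chip, and all names are hypothetical, and the mask for the CIA-done bit would have to come from the driver headers. In real code read_status would wrap the driver's status read, and dwt_readdiagnostics would be called only after the function returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks wrapping the platform's status read and delay. */
typedef uint32_t (*read_status_fn)(void *ctx);
typedef void (*delay_us_fn)(void *ctx, uint32_t us);

/* Polls until all bits in mask are set, or until timeout_us has been
 * spent in delays. Returns true if the bits were seen in time. */
bool wait_for_status_bits(read_status_fn read_status, delay_us_fn delay_us, void *ctx,
                          uint32_t mask, uint32_t poll_us, uint32_t timeout_us)
{
    uint32_t waited = 0;

    if (poll_us == 0) {
        poll_us = 1; /* avoid spinning forever without making progress */
    }
    for (;;) {
        if ((read_status(ctx) & mask) == mask) {
            return true;
        }
        if (waited >= timeout_us) {
            return false;
        }
        delay_us(ctx, poll_us);
        waited += poll_us;
    }
}

/* Fake chip for illustration: reports done_mask after a number of reads. */
typedef struct {
    uint32_t reads_until_done;
    uint32_t done_mask;
    uint32_t slept_us;
} fake_chip_t;

static uint32_t fake_read_status(void *ctx)
{
    fake_chip_t *chip = ctx;
    if (chip->reads_until_done > 0) {
        chip->reads_until_done--;
        return 0;
    }
    return chip->done_mask;
}

static void fake_delay_us(void *ctx, uint32_t us)
{
    fake_chip_t *chip = ctx;
    chip->slept_us += us;
}
```

Logging how long the poll waits on real hardware would also show whether the diagnostics read is racing the CIA at all: if the bit is always set on the first read, the race is unlikely to be the explanation.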

kylepd · July 9, 2024, 10:31am

The CIA done bit in status is set at the point we are reading the diagnostics register.