Coordinating multiple initiators

Harshal · July 24, 2023, 12:55pm

Hello everyone,

I am not sure whether this question belongs here, but any help would be very much appreciated.

I am working with the DWM3001CDK, with multiple initiators (3) and responders (2-5) making up an SS-TWR network. I am using the sample code from DWM3000 as a starting point. The setup is such that one initiator sends ranging messages to a group of responder addresses (1-10), gets back their responses, and makes way (i.e., waits its turn without transmitting) for another initiator to do the same. I am opening ports to JLink using a C++ socket library.
However, after a varying amount of time (2 minutes to a few hours), the ranging values of at least one of the initiator devices can seemingly no longer be retrieved. The stability improves when I add a delay/sleep between the ranging requests to different devices (connected to different ports). The same setup is very stable when running just one initiator. I believe that the above system was created so that multiple initiators do not try to talk to the same responder at the “same” time, but in contrast, running the ranging asynchronously keeps the initiators and their data alive for longer. The ranging does freeze momentarily at random, but it recovers much more easily than with the synchronized calls to get UWB ranges.

Thanks,
Harshal

AndyA · July 26, 2023, 7:51am

How is the synchronisation done? Is it handled at the radio level or by an external controller telling them when to run? If it is an external controller, how deterministic is it? How much variation is there between when you want an initiator to start and when it actually sends a radio packet?

Do the radio packets sent in each direction include the initiator ID so that each device can tell who they are for?

Ideally you’d design a system that fails gracefully and recovers, so if things do talk at the same time nothing crashes and things automatically re-synchronise and don’t clash next time around.

Harshal · July 26, 2023, 2:56pm

Hi AndyA,

An external controller sends the trigger over the network (ethernet/WiFi) to trigger each of the units connected to different Raspberry Pis. Since this can get pretty non-deterministic, the external controller accepts ‘no ranging data’ in case of lost packets, but otherwise I would have thought that the difference between command time and trigger time shouldn’t play a role, perhaps wrongly(?).

The packets in each direction include initiator and responder IDs to selectively filter responses. Each sensor unit has a unique ID as well. I should clarify here that the initiator is issuing a lot of redundant ranging requests (one for each existing responder = 5-10), although this is temporary. I might be using up more air time and working at a lower frequency, but would that be overloading it?

The external controller was intended for exactly that reason, i.e., to serialize the ranging requests, so that regardless of the network variability, a response would be either received or lost.

Thanks for your input and patience!

AndyA · July 26, 2023, 4:03pm

Given the delays on wifi at times, unless you are leaving some fairly large guard periods between units (which will mess with your maximum measurement rate), there is a real risk of overlaps with that approach.
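To make the guard-period trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a simple round-robin in which each initiator gets one ranging burst followed by one guard period. The burst length and guard values are made-up illustrative numbers, not measurements from this setup; only the initiator count of 3 comes from the original post.

```python
def cycle_rate_hz(n_initiators, burst_s, guard_s):
    """Full round-robin cycles per second when every initiator gets
    one ranging burst followed by one guard period."""
    return 1.0 / (n_initiators * (burst_s + guard_s))


# Hypothetical numbers, for illustration only.
N_INITIATORS = 3   # from the original post
BURST_S = 0.010    # assumed time for one initiator to range all responders

for guard_s in (0.0, 0.005, 0.050):
    rate = cycle_rate_hz(N_INITIATORS, BURST_S, guard_s)
    print(f"guard {guard_s * 1e3:4.0f} ms -> {rate:5.1f} cycles/s")
```

Once the guard period has to cover network jitter of tens of milliseconds, it dominates the cycle time and the achievable update rate drops accordingly.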
Just for reference, the approach I took was that each initiator is allocated a portion of the airtime. They all know the order in which to transmit, and all packets include where we are in that order.
Each unit listens for a while on startup. If it hears another initiator then it calculates where that unit is in the time sequence and how long until its turn. If after a while it’s not heard anything it just goes ahead whenever it wants.
In the time between its turns, each initiator keeps listening and refining its estimate of when its next turn starts.
This way, assuming you can keep their configuration up to date, it’s possible to get very tight time synchronisation with very minimal risk of overlap. If there is a collision, it normally sorts itself out in a cycle or two.
Assuming everything is within range, I normally see a measurement success rate of 99% or more with minimal down time, and my total measurement rate for multiple devices is the same as for a single device.
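The slot calculation described above might look roughly like the sketch below. This is not the actual firmware, only the timing logic written out in Python. It assumes equal-length slots, that every packet carries the sender's index in the transmit order, and that the receiver can work out when the sender's slot started. All names here are hypothetical, and on real hardware the timestamps would come from the radio rather than from plain floats.

```python
import math


def time_until_my_slot(heard_index, heard_slot_start, now,
                       my_index, n_initiators, slot_s):
    """Seconds from `now` until the start of our next slot.

    heard_index      -- position in the transmit order of the unit we heard
    heard_slot_start -- estimated time at which that unit's slot began
    """
    cycle_s = n_initiators * slot_s
    # Work back to the start of the cycle containing the heard slot,
    # then forward to our own slot in that same cycle.
    cycle_start = heard_slot_start - heard_index * slot_s
    my_start = cycle_start + my_index * slot_s
    # If that slot is already in the past, skip ahead whole cycles.
    if my_start < now:
        my_start += math.ceil((now - my_start) / cycle_s) * cycle_s
    return my_start - now


def first_tx_delay(heard, now, my_index, n_initiators, slot_s):
    """Startup rule: `heard` is None if the listen window expired in
    silence, otherwise a (heard_index, heard_slot_start) tuple."""
    if heard is None:
        return 0.0  # nobody else is on air, so start whenever we like
    heard_index, heard_slot_start = heard
    return time_until_my_slot(heard_index, heard_slot_start, now,
                              my_index, n_initiators, slot_s)
```

Re-running the same calculation every time another initiator is heard is what lets each unit keep refining its estimate between turns, so a collision tends to clear after a cycle or two instead of persisting.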

Harshal · July 27, 2023, 6:20am

I see what you mean. I will check what improvements I can get with this approach instead. Thank you for the guidance!