CIR data extraction with dwm1001

Hello everyone,

I am developing CIR data extraction based on the dwm1001 examples provided by Decawave on GitHub. I have successfully modified ss_twr_init and ss_twr_resp so that they exchange messages as expected, but I’m getting errors in my CIR extraction that I can’t easily solve, as I’m a newbie in firmware and may be misunderstanding something. This is my main function in the responder to analyze the CIR:

int ss_init_run(void) {
  uart_print("Tx starts\r\n");
  /* Loop forever initiating ranging exchanges. */

  /* Write frame data to DW1000 and prepare transmission. See NOTE 3 below. */
  tx_poll_msg[ALL_MSG_SN_IDX] = frame_seq_nb;
  dwt_write32bitreg(SYS_STATUS_ID, SYS_STATUS_TXFRS);
  dwt_writetxdata(sizeof(tx_poll_msg), tx_poll_msg, 0); /* Zero offset in TX buffer. */
  dwt_writetxfctrl(sizeof(tx_poll_msg), 0, 1); /* Zero offset in TX buffer, ranging. */

  /* Start transmission, indicating that a response is expected so that reception is enabled automatically after the frame is sent and the delay
  * set by dwt_setrxaftertxdelay() has elapsed. */
  dwt_starttx(DWT_START_TX_IMMEDIATE | DWT_RESPONSE_EXPECTED);
  tx_count++;
  uart_printf("Transmission # : %d\r\n",tx_count);
  status_reg = dwt_read32bitreg(SYS_STATUS_ID);
  if (status_reg & SYS_STATUS_TXFRS) {
    uart_print("Tx OK\r\n");
  } else {
    uart_print("Tx ERROR\r\n");
  }


  /* We assume that the transmission is achieved correctly, poll for reception of a frame or error/timeout. See NOTE 4 below. */
  while (!((status_reg = dwt_read32bitreg(SYS_STATUS_ID)) & (SYS_STATUS_RXFCG | SYS_STATUS_ALL_RX_TO | SYS_STATUS_ALL_RX_ERR))) //TODO: Software timeout
  {};

  /* Increment frame sequence number after transmission of the poll message (modulo 256). */
  frame_seq_nb++;

  if (status_reg & SYS_STATUS_RXFCG) {
    uart_printf("Reception OK\r\n");
    dwt_write32bitreg(SYS_STATUS_ID, SYS_STATUS_RXFCG);
    uart_print("CIR_START\r\n");

    for (int offset = 0; offset < CIR_LEN; offset += BlockSize) {
      int samples = (CIR_LEN - offset < BlockSize) ? CIR_LEN - offset : BlockSize;
      dwt_readaccdata(accumulator_data, samples * 4 + 1, offset * 4);

      // Debug: Print first bytes of each block
      uart_printf("Offset %d, First bytes: %d %d %d %d\r\n", offset, accumulator_data[0], accumulator_data[1], accumulator_data[2], accumulator_data[3]);

      for (int i = 0; i < samples; i++) {
        uint8 lowRealPart = accumulator_data[i * 4 + 1];
        uint8 highRealPart = accumulator_data[i * 4 + 2];
        uint8 lowImaginaryPart = accumulator_data[i * 4 + 3];
        uint8 highImaginaryPart = accumulator_data[i * 4 + 4];
        int16_t decimalRealPart = (int16_t)((highRealPart << 8) | lowRealPart);
        int16_t decimalImaginaryPart = (int16_t)((highImaginaryPart << 8) | lowImaginaryPart);

        uart_printf("%d, %d, %d\r\n", offset + i, decimalRealPart, decimalImaginaryPart);

        // Flush UART every 5 samples
        if ((i + 1) % 5 == 0) {
          while (app_uart_flush() != NRF_SUCCESS) {
            vTaskDelay(pdMS_TO_TICKS(1));
          }
        }
      }
    }
    uart_print("CIR_END\r\n");
  } else if (status_reg & SYS_STATUS_ALL_RX_ERR) {
    uart_printf("Reception ERROR\r\n");
    if (status_reg & SYS_STATUS_RXFCE) uart_printf("--->CRC ERROR\r\n");
    else if (status_reg & SYS_STATUS_RXPHE) uart_printf("--->Header ERROR\r\n");
    else if (status_reg & SYS_STATUS_RXOVRR) uart_printf("--->Buffer Overflow ERROR\r\n");
    else if (status_reg & SYS_STATUS_RXSFDTO) uart_printf("--->Prefix detection ERROR\r\n");
    dwt_rxreset();
  } else if (status_reg & SYS_STATUS_ALL_RX_TO) {
    uart_printf("TimeOut ERROR\r\n");
    dwt_rxreset();
  } else {
    uart_printf("status_reg ERROR\r\n");
  }

  return 0;
}

My CIR_LEN is 1016, but if I set BlockSize to 254, the initiator prints 13-14 samples and then hangs. If I set it to a lower number, for example 64, it prints 0s constantly. I developed the code based on other posts from this forum, so, as I said, maybe I’m misunderstanding something, but I can’t understand why different BlockSize values lead to such different behavior, especially when it is a parameter that only sets how many samples are read and printed per block.
Also, trying to allocate the buffer with malloc constantly gives me errors, and I do not know why, when my memory heap is 8 kB (and I was trying to allocate it outside the loop). I had to declare it statically:

static uint8 accumulator_data_array[BlockSize * 4 + 1];
static uint8 *accumulator_data = accumulator_data_array;
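To make the decoding step easier to check in isolation, here is a minimal host-side sketch of what the inner loop above is meant to do. It assumes the layout my code relies on: the first byte of each accumulator read is a dummy byte, and each sample follows as four little-endian bytes (real low, real high, imaginary low, imaginary high). The names cir_sample_t, cir_decode and cir_chunk_len are hypothetical helpers for illustration, not part of the Decawave driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One complex CIR sample: signed 16-bit real and imaginary parts. */
typedef struct {
    int16_t re;
    int16_t im;
} cir_sample_t;

/*
 * Decode `count` samples from one raw accumulator read.
 * Assumption: raw[0] is a dummy byte to discard, and every sample is
 * stored as real-low, real-high, imag-low, imag-high.
 */
static void cir_decode(const uint8_t *raw, size_t raw_len,
                       cir_sample_t *out, size_t count)
{
    /* The buffer must hold the dummy byte plus 4 bytes per sample. */
    assert(raw_len >= count * 4 + 1);

    for (size_t i = 0; i < count; i++) {
        const uint8_t *p = &raw[1 + i * 4];
        out[i].re = (int16_t)(uint16_t)(p[0] | ((uint16_t)p[1] << 8));
        out[i].im = (int16_t)(uint16_t)(p[2] | ((uint16_t)p[3] << 8));
    }
}

/*
 * Number of samples in the chunk that starts at `offset`.
 * Assumes offset <= total.
 */
static size_t cir_chunk_len(size_t total, size_t offset, size_t block)
{
    size_t remaining = total - offset;
    return remaining < block ? remaining : block;
}
```

Running the decoder on a hand-built buffer on a PC separates decoding mistakes from problems in the SPI read or the UART output, since the expected values are known in advance.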

Any suggestion is welcome.

Thank you so much.

I’ve narrowed down the issue and discovered that the problem is in the UART put function I’m using to send/print the data over UART.
First of all, I wrote a small print function because I wasn’t able to map printf to the UART, so I used the app_uart API.
The problem is very strange: before reading the accumulator data, I can send as much as I want through uart_print, but after reading the accumulator data, different things happen depending on the block size. For example, if my block size is 254, it hangs after 14 prints (I have checked the pointer and the array, and all the data is there; whether I start from the first samples or the last ones, only 14 prints go through). If my block size is 64, it prints every sample, but they are all 0. Finally, if my block size is 200, I can print 17 samples, but the first ones are 0.
This issue is probably related to memory, but I do not know exactly what is happening or how to solve it. I hope someone can help me.
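For context, this is roughly the shape of the print helper, written as a host-side sketch rather than my exact firmware code. It formats into a fixed buffer with vsnprintf and pushes the bytes one at a time through a put callback, giving up after a bounded number of retries instead of spinning forever. The names put_byte_fn, uart_printf_sketch, fake_put and fake_put_busy are hypothetical; on the target, the callback would wrap the real UART put call, and the assumption is that it returns non-zero when the byte cannot be queued.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical output hook: 0 on success, non-zero if the byte was not queued. */
typedef int (*put_byte_fn)(uint8_t byte);

/*
 * Format into a fixed buffer and send it byte by byte.
 * Returns the number of bytes sent, or -1 on a formatting error or when
 * a byte is still rejected after `max_retries` retries.
 */
static int uart_printf_sketch(put_byte_fn put, unsigned max_retries,
                              const char *fmt, ...)
{
    char buf[64];
    va_list args;

    assert(put != NULL && fmt != NULL);

    va_start(args, fmt);
    int n = vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);

    if (n < 0) {
        return -1;
    }
    if ((size_t)n >= sizeof buf) {
        n = (int)(sizeof buf) - 1; /* output was truncated to the buffer */
    }

    for (int i = 0; i < n; i++) {
        unsigned tries = 0;
        while (put((uint8_t)buf[i]) != 0) {
            if (++tries > max_retries) {
                return -1; /* give up instead of hanging */
            }
            /* On the target, a short delay or task yield would go here. */
        }
    }
    return n;
}

/* Host-side stand-in for the UART: rejects every other call. */
static char fake_out[128];
static size_t fake_len;
static unsigned fake_calls;

static int fake_put(uint8_t byte)
{
    if (fake_calls++ % 2 == 0) {
        return 1; /* simulate a busy transmitter */
    }
    if (fake_len < sizeof fake_out - 1) {
        fake_out[fake_len++] = (char)byte;
    }
    return 0;
}

/* Host-side stand-in for a UART that never accepts data. */
static int fake_put_busy(uint8_t byte)
{
    (void)byte;
    return 1;
}
```

With a bounded retry count, a transmitter that stops accepting bytes shows up as a -1 return value rather than a hang, which makes it easier to tell whether the UART path or something earlier (such as the accumulator read) is at fault.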

Thank you so much.