Make SPI receive buffer 32-bit aligned on 8266 #125

Closed
wants to merge 1 commit into from

earlephilhower

The ESP8266 SPI implementation needs a 4-byte aligned read buffer,
same as for transmit. Fix the ESP8266 SPI driver wrapper to ensure
this alignment occurs, using same method as for transmit.

Without this change, occasional LoadStoreAlignment exceptions occur
depending on the input buffer alignment.
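
As a rough host-side illustration of what "4-byte aligned" means here, the sketch below tests a pointer's low two bits and shows a `memcpy`-based store that works at any alignment. The helper names `isWordAligned` and `storeWordAnyAlignment` are hypothetical and are not part of SdFat or the ESP8266 core; the `memcpy` store is a generic portable technique, not the method this PR uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when p sits on a 4-byte boundary, i.e. when a
// 32-bit load or store through p is safe on a core that traps on
// unaligned word accesses.
inline bool isWordAligned(const void* p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 0x3u) == 0;
}

// Hypothetical helper: store a 32-bit value at p regardless of alignment.
// memcpy makes no alignment assumption about p, unlike a store through a
// uint32_t* obtained by casting.
inline void storeWordAnyAlignment(std::uint8_t* p, std::uint32_t value) {
  std::memcpy(p, &value, sizeof value);
}
```

A buffer handed to a driver by user code can start at any address, so a check like `isWordAligned(buf)` is only true when the caller happens to pass a suitably aligned buffer.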

@greiman
Owner

greiman commented Jan 22, 2019

I will add this soon. I must add it to my local copy since the driver is used in several libraries and I update them with scripts.

@greiman greiman closed this Jan 22, 2019
@greiman
Owner

greiman commented Jan 22, 2019

I looked at 2.4.0 and it does byte transfers for input and doesn't need 32-bit alignment.

2.5.0beta uses 32-bit transfers but looks like it may input too many bytes.

See this.

Here is the code. Note the two lines near the bottom with <<<<<<<.

/**
 * Note:
 *  in and out need to be aligned to 32Bit
 *  or you get an Fatal exception (9)
 * @param out uint8_t *
 * @param in  uint8_t *
 * @param size uint32_t
 */
void SPIClass::transferBytes(const uint8_t * out, uint8_t * in, uint32_t size) {
    while(size) {
        if(size > 64) {
            transferBytes_(out, in, 64);
            size -= 64;
            if(out) out += 64;
            if(in) in += 64;
        } else {
            transferBytes_(out, in, size);
            size = 0;
        }
    }
}

/**
 * Note:
 *  in and out need to be aligned to 32Bit
 *  or you get an Fatal exception (9)
 * @param out uint8_t *
 * @param in  uint8_t *
 * @param size uint8_t (max 64)
 */
void SPIClass::transferBytes_(const uint8_t * out, uint8_t * in, uint8_t size) {
    while(SPI1CMD & SPIBUSY) {}
    // Set in/out Bits to transfer

    setDataBits(size * 8);

    volatile uint32_t * fifoPtr = &SPI1W0;
    uint8_t dataSize = ((size + 3) / 4);

    if(out) {
        uint32_t * dataPtr = (uint32_t*) out;
        while(dataSize--) {
            *fifoPtr = *dataPtr;
            dataPtr++;
            fifoPtr++;
        }
    } else {
        // no out data only read fill with dummy data!
        while(dataSize--) {
            *fifoPtr = 0xFFFFFFFF;
            fifoPtr++;
        }
    }

    SPI1CMD |= SPIBUSY;
    while(SPI1CMD & SPIBUSY) {}

    if(in) {
        uint32_t * dataPtr = (uint32_t*) in;
        fifoPtr = &SPI1W0;
        dataSize = ((size + 3) / 4);  // <<<<<<< Will round up size
        while(dataSize--) {
            *dataPtr = *fifoPtr;  // <<<<<< 32-bit store
            dataPtr++;
            fifoPtr++;
        }
    }
}
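
To make the round-up concrete, here is a small host-side stand-in for the read-back loop above. It is only a sketch: `fifoWords` and `readBackWords` are hypothetical names, the FIFO is an ordinary array rather than the `SPI1W0` registers, and `memcpy` stands in for the 32-bit store so the example stays well defined on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Word count used by the read-back loop: size rounded up to whole
// 32-bit words, exactly as in ((size + 3) / 4).
inline std::uint8_t fifoWords(std::uint8_t size) {
  return static_cast<std::uint8_t>((size + 3) / 4);
}

// Copies whole words from a fake FIFO into dst and returns the number of
// bytes written, which can exceed the requested size.
inline std::size_t readBackWords(const std::uint32_t* fifo, std::uint8_t* dst,
                                 std::uint8_t size) {
  const std::uint8_t words = fifoWords(size);
  std::size_t written = 0;
  for (std::uint8_t w = 0; w < words; ++w) {
    std::memcpy(dst + written, &fifo[w], 4);  // stands in for the 32-bit store
    written += 4;
  }
  return written;
}
```

With `size == 5` the loop runs for two words and writes 8 bytes, so the 3 bytes after the caller's buffer are overwritten. A size that is a multiple of four writes exactly the requested number of bytes.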

@earlephilhower
Author

Yes, you're correct. We're going w/the 32bit mode from now on with that specific call for 2.5.x and forward.

That extra write is a good catch, I didn't note it as I was eyeballing things. The "correct" way now on 8266 is to use void SPIClass::transfer(void *buf, uint16_t count) which handles this.

Would you like me to submit a PR w/those changes? (basically replace transferBytes with transfer() on the 8266 SPI code)

@greiman
Owner

greiman commented Jan 23, 2019

I don't think void SPIClass::transfer(void *buf, uint16_t count) will work.

It appears to send the content of the buffer and then receive into the same buffer. SD cards look at the bytes sent on the SPI bus for a command while sending bytes to the master on a read.

You should send 0XFF over the SPI bus when receiving from an SD.

You end a multi-sector read by sending CMD12, which is 0X4C. The SD looks for the 0X40 bit so 0X00 might work.

See the lines with <<<<<<

void SPIClass::transfer(void *buf, uint16_t count) {
    uint8_t *cbuf = reinterpret_cast<uint8_t*>(buf);

    // cbuf may not be 32bits-aligned
    for (; (((unsigned long)cbuf) & 3) && count; cbuf++, count--)
        *cbuf = transfer(*cbuf);    // <<<<<<<<<<<<<<

    // cbuf is now aligned
    // count may not be a multiple of 4
    uint16_t count4 = count & ~3;
    transferBytes(cbuf, cbuf, count4);  // <<<<<<<<<<<<<

    // finish the last <4 bytes
    cbuf += count4;
    count -= count4;
    for (; count; cbuf++, count--)
        *cbuf = transfer(*cbuf);   // <<<<<<<<<<<<<<<<
}
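
The concern about stale buffer contents going out on the bus can be sketched with the SD command framing: the first byte of a command has a 0 start bit, a 1 transmission bit, and a 6-bit command index. The helper names below are hypothetical, and how a particular card reacts to stray bytes during a read is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

// First byte of an SD SPI command frame: start bit 0, transmission bit 1,
// then the 6-bit command index.
constexpr std::uint8_t sdCommandByte(std::uint8_t index) {
  return static_cast<std::uint8_t>(0x40u | (index & 0x3Fu));
}

// True when a byte clocked out by the host carries the "01" prefix that
// opens a command frame.
constexpr bool looksLikeCommandStart(std::uint8_t b) {
  return (b & 0xC0u) == 0x40u;
}

static_assert(sdCommandByte(12) == 0x4C, "CMD12 first byte is 0x4C");
```

Neither 0xFF nor 0x00 has the "01" prefix, whereas leftover buffer data such as 0x4C or 0x51 does, which is why an in-place transfer of an uninitialized receive buffer is risky.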

@greiman
Owner

greiman commented Jan 23, 2019

I did the following quick fix to SdSpiESP8266.cpp.

uint8_t SdAltSpiDriver::receive(uint8_t* buf, size_t n) {
  // Adjust to 32-bit alignment.
  while ((reinterpret_cast<uintptr_t>(buf) & 0X3) && n) {
    *buf++ = SPI.transfer(0xff);
    n--;
  }
  // Buff now 32-bit aligned
  size_t n4 = 4*(n/4);
  // Do multiple of four byte transfers.
  SPI.transferBytes(0, buf, n4);
  buf += n4;
  n -= n4;
  while (n) {
    *buf++ = SPI.transfer(0xff);
    n--;
  }
  return 0;
}
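
The head/body/tail arithmetic in that fix can be checked on a host without any SPI hardware. The sketch below only computes the three byte counts; `Split` and `splitForWordTransfer` are hypothetical names introduced for illustration and do not exist in SdFat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte counts for the three phases of the receive above.
struct Split {
  std::size_t head;  // single-byte transfers until the address is 4-byte aligned
  std::size_t body;  // multiple-of-four block handed to the word-wise transfer
  std::size_t tail;  // remaining 0..3 single-byte transfers
};

inline Split splitForWordTransfer(std::uintptr_t addr, std::size_t n) {
  std::size_t head = (4u - (addr & 0x3u)) & 0x3u;  // 0 when already aligned
  if (head > n) head = n;                          // short reads may never align
  const std::size_t rest = n - head;
  const std::size_t body = rest & ~static_cast<std::size_t>(3);
  return Split{head, body, rest - body};
}
```

For a 512-byte read into a buffer starting at an address ending in 1, this gives 3 head bytes, a 508-byte body, and 1 tail byte, so the word-wise call never sees an unaligned pointer or a size that would be rounded up.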

I did some benchmarks with 2.4.0 and 2.5.0beta2. 32-bit helps some with read.

2.4.0, SdFat V1:

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
405.06,9367,1048,1262
424.82,8792,1082,1204

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1000.14,1009,491,510
1000.14,1165,491,510

2.5.0beta2, SdFat V1:

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
422.31,8799,1090,1211
421.88,8828,1090,1212

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1200.12,837,410,425
1200.40,1007,410,425

The new version of SdFat with dedicated SPI access is much faster. Write is about six times faster for 512 byte writes. Read is about two times faster.

2.5.0beta2, SdFat V2 dedicated SPI:

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2787.07,4709,177,182
2828.05,10556,177,180

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
2547.12,420,199,200
2547.12,412,199,199
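
As a quick check of the "about six times" and "about two times" figures, the sketch below averages the two runs of each 2.5.0beta2 result above and takes the V2/V1 ratio. The constant names are hypothetical; the numbers are copied from the benchmark output.

```cpp
#include <cassert>

// Mean of the two runs reported for each configuration, in KB/sec.
constexpr double mean2(double a, double b) { return (a + b) / 2.0; }

constexpr double kV1Write = mean2(422.31, 421.88);    // 2.5.0beta2, SdFat V1
constexpr double kV1Read  = mean2(1200.12, 1200.40);
constexpr double kV2Write = mean2(2787.07, 2828.05);  // 2.5.0beta2, SdFat V2 dedicated SPI
constexpr double kV2Read  = mean2(2547.12, 2547.12);

constexpr double kWriteSpeedup = kV2Write / kV1Write;
constexpr double kReadSpeedup  = kV2Read / kV1Read;
```

The ratios come out between 6.6 and 6.7 for writes and at about 2.1 for reads, consistent with the summary.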

@earlephilhower
Author

Great, that's basically what I was going to do (and what transfer() does, albeit with unwanted reads). We've got an issue open to clean up the transferBytes call, but it looks like it will not make it in 2.5.0-rel. When that's done, basically the same code you've implemented will be moved into transferBytes proper.

earlephilhower added a commit to earlephilhower/ESP8266SdFat that referenced this pull request Jan 28, 2019
The ESP8266 SPI implementation needs a 4-byte aligned read buffer,
same as for transmit.  Fix the ESP8266 SPI driver wrapper to ensure
this alignment occurs.

Includes fix from @greiman to ensure no overwrite on non-by-four
transfers
greiman#125 (comment)
earlephilhower added a commit to earlephilhower/Arduino that referenced this pull request Jan 28, 2019
@earlephilhower earlephilhower deleted the spi8266 branch February 1, 2019 21:08
@earlephilhower
Author

Just FYI, esp8266/Arduino#5709 was merged allowing unaligned transfers in and out of SPI::transferBytes as a result of this. Your existing code works, as well as the quick and dirty patch you suggested.
