-
Notifications
You must be signed in to change notification settings - Fork 7.5k
USB device transfer #5983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
USB device transfer #5983
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5983 +/- ##
==========================================
- Coverage 53.06% 52.88% -0.19%
==========================================
Files 430 412 -18
Lines 41071 40274 -797
Branches 7911 7801 -110
==========================================
- Hits 21795 21297 -498
+ Misses 16052 15761 -291
+ Partials 3224 3216 -8
Continue to review full report at Codecov.
|
* | ||
* @return enpoint max packet size (mps) | ||
*/ | ||
int usb_dc_ep_mps(u8_t ep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is misleading since wMaxPacketSize is not always equal to buffer or fifo size of a endpoint. Can we rename it to max buffer size, usb_dc_ep_mbs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really need the Max Packet Size here, in order to know how the transfer will be split, what is the size of a short packet, is Zero lenght Packet requested, etc...
subsys/usb/usb_device.c
Outdated
@@ -826,6 +828,9 @@ int usb_set_config(struct usb_cfg_data *config) | |||
if (!config) | |||
return -EINVAL; | |||
|
|||
if (usb_dev.config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This here breaks USB DFU sample. The configuration may change again at runtime. if (usb_dev.config == config) {SYS_LOG_WRN(...); return -EALREADY;}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for highlighting this issue, I thought we had to balance calls to usb_set_config and usb_deconfig. But indeed documentation doesn't say we have to 'deconfigure' prior reconfiguring.
subsys/usb/usb_device.c
Outdated
@@ -950,6 +974,10 @@ int usb_enable(struct usb_cfg_data *config) | |||
config->endpoint[i].ep_cb); | |||
if (ret < 0) | |||
return ret; | |||
|
|||
/* assign endpoint transfer slot */ | |||
usb_dev.transfer[i].ep = config->endpoint[i].ep_addr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dynamic assignment might be more flexible, control endpoint does not really need it. Maybe memory pool or similar can be used to store usb_transfer_data? Or better a function or class should initialize a usb_transfer_data structure and then use it as parameter for usb_transfer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or better a function or class should initialize a usb_transfer_data structure and then use it as parameter for usb_transfer?
I'm not against this, But I would like to keep some underlying data private (e.g semaphore, transfer counter...). To keep it simple, I'm going to change for dynamic assignment but keep the parameters untouched (for now).
subsys/usb/usb_device.c
Outdated
int i; | ||
|
||
for (i = 0; i < ARRAY_SIZE(usb_dev.transfer); i++) { | ||
if (usb_dev.transfer[i].ep == ep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
subsys/usb/usb_device.c
Outdated
trans->status = status; | ||
k_sem_give(&trans->sem); | ||
|
||
if (trans->cb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{}
subsys/usb/usb_device.c
Outdated
/** Transfer buffer size */ | ||
size_t bsize; | ||
/** Transferred size */ | ||
size_t size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/size/tsize?
include/usb/usb_device.h
Outdated
usb_transfer_callback cb); | ||
|
||
/** | ||
* @brief Start a transfer and wait for completion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of "wait", please mention it as "block wait" which makes it a bit clear on the wait part.
subsys/usb/usb_device.c
Outdated
/* we need to temporaly change the ep callback */ | ||
usb_dc_ep_set_callback(ep, usb_handle_transfer); | ||
|
||
key = irq_lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some concerns of using the IRQ locking here. Please correct me if my understanding is incorrect.
As per my understanding, when it comes to usb_dc_ep_write(), I think the current drivers (both DW and STM32) use a straight forward method of writing into the HW registers word by word and so things are very much synchronous and simple. I'm currently developing a driver for Nordic's nRF52840 which has a partially offloaded architecture that uses DMA to copy to/from HW local buffer and then HW takes care all the read and write operations. For example, the DMA copy done and successful USB write on the bus are indicated as events through interrupts.
If we go by the current method having the IRQ locked, then I have to disable the respective interrupts and implement polling mechanism to wait for the events instead of handling them from interrupt context - making the whole thing synchronous defeating the purpose of having the DMA and HW offload advantages.
Could you please take the DMA and partial HW offload architecture use cases into consideration as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If usb_handle_transfer()
is supposed to be a synchronous function then I am not convinced this is a good idea, doing a full transfer of bytes with interrupts disabled. Disabling interrupts for an extended period of time is a serious issue for the hard real-time components of the RTOS, such as the BLE Link Layer when combined with the USB driver in order to provide a "USB Bluetooth Dongle" build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experiments with BT-USB so far, I have found irq_lock() has been causing packet drops and even caused asserts in the BT link layer. I had to modify my code to lock by disabling only USB peripheral interrupts. That way the critical code within the USB driver is still safe to execute while allowing other interrupts to happen elsewhere.
I have a little thought here. Can we have a lock for USB subsystem alone as a driver API so that the USB stack can make use of it?
For example, we can call the locking APIs as usb_dc_irq_enable() and usb_dc_irq_disable().
In most architectures we have a dedicated IRQ line for USB device controller that can be enabled or disabled as and when it's required.
void usb_dc_irq_enable(void)
{
irq_enable(ARCH_SPECIFIC_IRQ_LINE);
}
void usb_dc_irq_disable(void)
{
irq_disable(ARCH_SPECIFIC_IRQ_LINE);
}
But these too should only be used for short period of time from the USB stack, not during the whole write() operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also to change the callback without irq_lock could go wrong, line 1163. @loicpoulain maybe you can make it simpler, at least for now. Do not change the endpoint callback here, but set it directly to usb_handle_transfer in your class for example when initializing it. Change the function usb_transfer so that usb_transfer_data can be used as parameter and call usb_dc_ep_write first time from usb_transfer. Then IN transfer should go without irq_lock (I think so 😄). OUT is not that easy, assume that the host can start a transfer every time after enumeration. You can initialize the OUT transfer from usb_handle_transfer. I suspect that it can not be so generic, do you have a branch with the example where you use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently developing a driver for Nordic's nRF52840 which has a partially offloaded architecture that uses DMA to copy to/from HW local buffer and then HW takes care all the read and write operations. For example, the DMA copy done and successful USB write on the bus are indicated as events through interrupts.
If I understand correctly, your dc_ep_write and dc_ep_read functions are blocking operations => so they can not be executed in atomic/IRQ context, correct ? What about usb_ep_callback, is it call from IRQ or thread context ? To be honest I would expect usb_write 'atomic' and fully asynchronous since the usb_ep_callback is called on completion.
If usb_handle_transfer() is supposed to be a synchronous function then I am not convinced this is a good idea, doing a full transfer of bytes with interrupts disabled. Disabling interrupts for an extended period of time is a serious issue for the hard real-time components of the RTOS, such as the BLE Link Layer when combined with the USB driver in order to provide a "USB Bluetooth Dongle" build.
Here I assume (maybe wrongly) that ep_read/ep_write are short operations.
usb_dc_ep_set_callback (usb_handle_transfer) is called from IRQ context (at least with stm32 and dw drivers). We can not offload this to a thread/bottom-half cause usb_dc_ep_read needs to be done into the callback itself (data is lost after that).
@loicpoulain maybe you can make it simpler, at least for now. Do not change the endpoint callback here, but set it directly to usb_handle_transfer in your class for example when initializing it.
Good idea.
OUT is not that easy, assume that the host can start a transfer every time after enumeration. You can initialize the OUT transfer from usb_handle_transfer. I suspect that it can not be so generic, do you have a branch with the example where you use it?
Once class driver is ready (e.g. after initialization), a read transfer can be triggered. I can prepare a branch or add example to this pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, your dc_ep_write and dc_ep_read functions are blocking operations => so they can not be executed in atomic/IRQ context, correct ? What about usb_ep_callback, is it call from IRQ or thread context ? To be honest I would expect usb_write 'atomic' and fully asynchronous since the usb_ep_callback is called on completion.
In nRF52840, the usb_dc_ep_write() usb_dc_ep_read() functions are not blocking operations. They are fully asynchronous and would call the usb_ep_callback after write is complete in case of write operation.
The callback would be called from thread context because we don't want huge data processing to happen in interrupt context as this would affect further interrupts to be held waiting at the HW. Application could do anything in its write or read callback handler.
Our driver's write() does something like this:
a) Prepares the DMA engine to copy from RAM to HW's local buffer (with USBD interrupts disabled)
b) Initiate the IN transaction (with USBD interrupts disabled)
c) Return to caller (USBD interrupts are enabled)
d) USBD interrupt indicates the DMA completion followed by write completion. Driver handles them in worker thread context.
e) Worker thread calls the write callback to indicate write complete.
So, the irq_lock() added before calling usb_handler_transfer will affect the interrupt handling at the driver layer.
subsys/usb/usb_device.c
Outdated
ret = trans->status; | ||
} | ||
|
||
/* restaure ep callback */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"restaure" is the French word for restore? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops ;-)
subsys/usb/usb_device.c
Outdated
int bytes; | ||
|
||
if (!trans->bsize) { /* transfer complete */ | ||
if (trans->zero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use braces here
subsys/usb/usb_device.c
Outdated
/* we need to temporaly change the ep callback */ | ||
usb_dc_ep_set_callback(ep, usb_handle_transfer); | ||
|
||
key = irq_lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, your dc_ep_write and dc_ep_read functions are blocking operations => so they can not be executed in atomic/IRQ context, correct ? What about usb_ep_callback, is it call from IRQ or thread context ? To be honest I would expect usb_write 'atomic' and fully asynchronous since the usb_ep_callback is called on completion.
In nRF52840, the usb_dc_ep_write() usb_dc_ep_read() functions are not blocking operations. They are fully asynchronous and would call the usb_ep_callback after write is complete in case of write operation.
The callback would be called from thread context because we don't want huge data processing to happen in interrupt context as this would affect further interrupts to be held waiting at the HW. Application could do anything in its write or read callback handler.
Our driver's write() does something like this:
a) Prepares the DMA engine to copy from RAM to HW's local buffer (with USBD interrupts disabled)
b) Initiate the IN transaction (with USBD interrupts disabled)
c) Return to caller (USBD interrupts are enabled)
d) USBD interrupt indicates the DMA completion followed by write completion. Driver handles them in worker thread context.
e) Worker thread calls the write callback to indicate write complete.
So, the irq_lock() added before calling usb_handler_transfer will affect the interrupt handling at the driver layer.
@sundar-subramaniyan, I've fully reworked the code according to the different comments, but need to test on both existing controllers, I will push the new PR by tomorrow. |
8cf8835
to
23f5753
Compare
subsys/usb/usb_device.c
Outdated
|
||
/* some buggy? hardware/sw send short packet with | ||
null byte instead of ZLP, so discard the null byte */ | ||
if (bytes == 1 && *(trans->buffer -1) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the last byte in the transfer is a real '\0'?
In that case, bytes % usb_dc_ep_mps(ep) == 1 and bytes == 1 and we might incorrectly discard it.
If the SW/HW is buggy, is trans->bsize == 0 always? If so, you can add that check in here rather than above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the last byte in the transfer is a real '\0'?
I think this is a gray area, Linux wrongly supposes that hardware is buggy and send a NULL byte instead of a ZLP by default, at least for usbnet (cf http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/net/usb/usbnet.c#L1391), so we have to live with it. Maybe I could introduce a flag that class drivers have to set in order to manually discard this byte.
307cb88
to
97586fd
Compare
@loicpoulain please rebase your patches on top of master instead of having merge commits in the PR |
7be55f0
to
5e38406
Compare
@sundar-subramaniyan , @jfischer-phytec-iot, I reworked the PR accoding your comments. |
@loicpoulain Looks good to me so far, the IRQ locking concerns are now addressed. I'll look at the other changes later today and update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked at the usb_transfer() and usb_transfer_sync() changes in usb_device.c and they look good to me.
@jfischer-phytec-iot @finikorg , any comment ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I like this idea, but at the moment we do not handle case when we have 65 bytes packet (counted on USB endpoint) with last byte zero. (For ICMPv6 payload would be 3 bytes)
$ ping6 -I enx00005e005301 fe80::200:5eff:fe00:5300 -s 3 -p 0 -c 1
tested with echo_server with ECM configuration. We have recent discussion about this issue in PR #6368
Maybe you merge that patch to your ECM changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see that issue with zero byte padding be addressed, after extensive changes it would be hard to get clean merge.
5e38406
to
f07baed
Compare
This reverts commit b23d599. Signed-off-by: Loic Poulain <[email protected]>
This function returns endpoint max packet size (mps). Signed-off-by: Loic Poulain <[email protected]>
The transfer API provides 'high' level functions to manage sending and reception of USB data. A USB (class) driver has to register the generic usb_transfer_ep_callback as endpoint status callback in order to use the API. With this API, the class driver does not need to take care of low-level usb transfer management (packet splitting, ZLP, synchronization...). The usb_transfer methods will split transfer into multiple transactions depending endpoint max size and controller capabilities. Once the transfer is completed, class driver is notified by a callback. The usb_transfer method can be executed in IRQ/atomic context. A usb_transfer synchronous helper exists which block-waits until transfer completion. In write case, a transfer is complete when all data has been sent. In read case, a transfer is complete when the exact amount of data requested has been received or if a short-pkt (including ZLP) is received. transfer methods are thread-safe. A transfer can be cancelled at any time. Signed-off-by: Loic Poulain <[email protected]>
This value can be used in sub-functions for buffer mgmt. Signed-off-by: Loic Poulain <[email protected]>
For now tx-fifo sizes are not configured (cf usb_dw_set_fifo). Default fifo size is 136 bytes and sending more than 136 bytes makes the data split into incorrect chunk size. This patch prevents this by reducing available fifo size to a multiple of the endpoint max packet size. Signed-off-by: Loic Poulain <[email protected]>
Reading is managed by transfer API. Signed-off-by: Loic Poulain <[email protected]>
Add new status event indicating an interface has been selected. Interface and its endpoint(s) are enabled. Signed-off-by: Loic Poulain <[email protected]>
The interface and its enpoints are only enabled once interface is selected by the host. Signed-off-by: Loic Poulain <[email protected]>
There are 174 RX/TX net pkt reserved, this make build fail with some boards due to ram overflow (e.g quark_se_c1000). Fix this by reducing NET_PKT RX/TX reserve to 100, which seems fair enough for this use case. Moreover, don't think there is any reason to have more NET_PKT than NET_BUF. Signed-off-by: Loic Poulain <[email protected]>
Replace low-level usb_read/usb_write with transfer functions. Signed-off-by: Loic Poulain <[email protected]>
f07baed
to
16b67d8
Compare
@finikorg I've reworked and rebased the ECM part to take care of this extra byte re-using the function extracting the pkt size from ethernet/ip header. |
@jfischer-phytec-iot, @finikorg , could you please review the new version. |
goto done; | ||
} | ||
|
||
if (trans->flags & USB_TRANS_WRITE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I did not get the code right, but should we have here while(true){} loop? we can break loop (instead of goto) on !trans->bsize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the following change I get it working with my USB sample. (I used _sync version of API).
diff --git a/subsys/usb/usb_device.c b/subsys/usb/usb_device.c
index ce94b9ddf4..a5639e0d4a 100644
--- a/subsys/usb/usb_device.c
+++ b/subsys/usb/usb_device.c
@@ -1073,6 +1073,8 @@ static void usb_transfer_work(struct k_work *item)
goto done;
}
+ while (true) {
+
if (trans->flags & USB_TRANS_WRITE) {
if (!trans->bsize) {
if (!(trans->flags & USB_TRANS_NO_ZLP)) {
@@ -1116,6 +1118,8 @@ static void usb_transfer_work(struct k_work *item)
usb_dc_ep_read_continue(ep);
}
+ }
+
done:
if (trans->status != -EBUSY && trans->cb) { /* Transfer complete */
usb_transfer_callback cb = trans->cb;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why introducing this while ? the transfer work is 'normally' called each time we have data to read or to write, so that we are driven by events without active polling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call usb_transfer_sync() with WRITE operation callback never gets called without while loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I used in the EEM rework (included in this PR), it was working. Do you have a specific sample I could try? On device write operation complete, device driver calls the endpoint callback from which we call this work, then we can continue writing the next data chunk and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say we have one small data chunk (i.e. Bluetooth), you have a check for !trans->bsize
before actual write, so that trans->status == -EBUSY
and callback does not get called. You can change hci_usb sample with:
switch (bt_buf_get_type(buf)) {
case BT_BUF_EVT:
- try_write(BTUSB_ENDP_INT, buf);
+ usb_transfer_sync(BTUSB_ENDP_INT, buf->data, buf->len,
+ USB_TRANS_WRITE);
break;
case BT_BUF_ACL_IN:
- try_write(BTUSB_ENDP_BULK_IN, buf);
+ usb_transfer_sync(BTUSB_ENDP_BULK_IN, buf->data,
+ buf->len, USB_TRANS_WRITE);
break;
default:
SYS_LOG_ERR("Unknown type %u", bt_buf_get_type(buf));
break;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, Actually when you want to use transfer method, you have to configure your endpoint accordingly by setting ep callback to the existing transfer callback, this is something which has been discussed with @jfischer-phytec-iot previously, but maybe the documentation in usb_device.h should be clarified The following changes work on my side:
--- a/samples/bluetooth/hci_usb/src/main.c
+++ b/samples/bluetooth/hci_usb/src/main.c
@@ -480,7 +480,7 @@ static void btusb_iso_in(u8_t ep, enum usb_dc_ep_cb_status_code ep_status)
/* Describe EndPoints configuration */
static struct usb_ep_cfg_data btusb_ep[] = {
{
- .ep_cb = btusb_int_in,
+ .ep_cb = usb_transfer_ep_callback,
.ep_addr = BTUSB_ENDP_INT
},
{
@@ -488,7 +488,7 @@ static struct usb_ep_cfg_data btusb_ep[] = {
.ep_addr = BTUSB_ENDP_BULK_OUT
},
{
- .ep_cb = btusb_bulk_in,
+ .ep_cb = usb_transfer_ep_callback,
.ep_addr = BTUSB_ENDP_BULK_IN
},
{
@@ -689,10 +689,12 @@ void main(void)
switch (bt_buf_get_type(buf)) {
case BT_BUF_EVT:
- try_write(BTUSB_ENDP_INT, buf);
+ usb_transfer_sync(BTUSB_ENDP_INT, buf->data, buf->len,
+ USB_TRANS_WRITE);
break;
case BT_BUF_ACL_IN:
- try_write(BTUSB_ENDP_BULK_IN, buf);
+ usb_transfer_sync(BTUSB_ENDP_BULK_IN, buf->data,
+ buf->len, USB_TRANS_WRITE);
break;
default:
SYS_LOG_ERR("Unknown type %u", bt_buf_get_type(buf));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarification, now I see that it should work, will test also later.
@jfischer-phytec-iot can you take a second look at this? |
@carlescufi will do |
@@ -420,6 +420,12 @@ static int usb_dw_tx(u8_t ep, const u8_t *const data, | |||
return -EAGAIN; | |||
} | |||
|
|||
/* For now tx-fifo sizes are not configured (cf usb_dw_set_fifo). Here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subsys/usb/usb_device.c
Outdated
@@ -1223,7 +1227,7 @@ int usb_transfer(u8_t ep, u8_t *data, size_t dlen, unsigned int flags, | |||
k_work_submit(&trans->work); | |||
} else { | |||
/* ready to read, clear NAK */ | |||
usb_dc_ep_read_continue(ep); | |||
ret = usb_dc_ep_read_continue(ep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It belongs to ef343a5
The transfer API provides 'high' level functions to manage sending and
reception of USB data. A USB (class) driver has to register the generic
usb_transfer_ep_callback as endpoint status callback in order to use
the API.
With this API, the class driver does not need to take care of low-level
usb transfer management (packet splitting, ZLP, synchronization...).
The usb_transfer methods will split transfer into multiple transactions
depending endpoint max size and controller capabilities.
Once the transfer is completed, class driver is notified by a callback.
The usb_transfer method can be executed in IRQ/atomic context.
A usb_transfer synchronous helper exists which block-waits until
transfer completion.
In write case, a transfer is complete when all data has been sent.
In read case, a transfer is complete when the exact amount of data
requested has been received or if a short-pkt (including ZLP) is
received.
transfer methods are thread-safe.
A transfer can be cancelled at any time.