Usb not recognized after boot #15
Labels
No labels
CI enhacement
CS10 (chromestick)
HIGH PRIOROITY
Low Priority
Solved
TODO
arm64
armhf
bug
c100 (veyron minnie)
duplicate
enhancement
good first issue
help wanted
invalid
minor bug
question
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
ev4/PrawnOS#15
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This is a regression after kernel version 4.16.18
Narrowing down which commit caused this.
Starting at core.c https://github.com/torvalds/linux/commits/v4.17/drivers/usb/dwc2/core.c
Full list of commits to dwc2 that have made it into 4.17 https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/usb/dwc2?h=linux-4.17.y
params.c has some interesting commits as well https://github.com/torvalds/linux/commits/master/drivers/usb/dwc2/params.c specifically
Reverting the merge at commit 6d23ee9caa6790aea047f9aca7f3c03cb8d96eb6 fixes the issue
With and without the commit, there are only a few differences in the dmesg logs with usb_dwc2_debug enabled.
The major one is
dwc2 ff540000.usb: In host mode, hprt0=00001501
with the commit in question
and
dwc2 ff540000.usb: In host mode, hprt0=00001101
without the commit.
the only reference to these bits I've found is:
#define HPRT0_LNSTS_MASK (0x3 << 10)
#define HPRT0_LNSTS_SHIFT 10
HPRT may be USB host peripheral interrupt handlers
Found some dwc2 documentation in the implementation in xinu
This file describes all of the dwc2 registers, including the hprt0 or "Host Port Control and Status Register"
xinu-os/xinu@4384915cc6/system/platforms/arm-rpi/usb_dwc_regs.hFrom this reference https://www.beyondlogic.org/usbnutshell/usb2.shtml:
So, that bit difference may just be a red herring as the device I've been using for testing is a "full speed" usb 2.0 device and could have been pulling that bit high for the reasons described above.
Other potentially useful references the xinu repo has:
https://github.com/xinu-os/xinu/blob/master/system/platforms/arm-rpi/usb_dwc_regs.h
https://github.com/xinu-os/xinu/blob/master/system/platforms/arm-rpi/usb_dwc_hcd.c
Check if host even enters hibernation:
Or exits hibernation:
If so,
Test this fix from the 4.18 rc1 usb tree https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git/commit/drivers/usb/dwc2?h=fixes-for-v4.18-rc1&id=22bb5cfdf13ae70c6a34508a16cfeee48f162443
Next place to checkout is the extra reset in the log for the non-functional
Maybe try this patch as well https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git/commit/drivers/usb/dwc2?h=fixes-for-v4.17-rc3&id=438fea2a6325933868aebc20279e2669c9a21207This is already in linux-4.17.y
When a usb device is plugged in, two lines are printed:
the second line referring to the function
dwc2_handle_session_req_intr()is not printed in the functional version.This may indicate the host is putting the controller into low power mode.
dwc2_handle_session_req_intr()is called by this code indwc2_handle_common_intr()GINTSTS_SESSREQINTis bit 30 of the 32 bit binary number stored at#define GINTSTS HSOTG_REG(0x014)In the log with nonfunctional usb, upon plug in gintsts and gintmskare defiened as so (shown in hex):
so
gintsts=01000000000000000000000000000001but in the functional log the are defined as:
so
gintstsshould begintsts=00000101000000000000000000100001gintstsin the nonfunctional log has a 1 in the 30th bit, sodwc2_handle_session_req_int()is called.and in the functional log,
gintstshas ones in the 5th, 24th and 26th which are defined inhw.has:GINTSTS_PRTINTis port interrupt, defined by the xinu project asGINTSTS_NPTXFEMPandGINTSTS_PTXFEMPare something about TX fifo empty?So, what we know now is that
gintstsandgintmskare both being set incorrectly, causing the usb controller to not set up the new device properly. The thing to track next seems to be GINTSTS_PRTINT, as that should be set to 1 when a new device is plugged in.In
dwc2_handle_hcd_intr()ifGINTSTS_PRTINTis set,dwc2_port_intr()is called.dwc2_port_intr()states that four HPRT0 bits have to be cleared to clearGINTSTS_PRTINT.This could indicate that
GINTSTS_PRTINTmay be set when some subset or all of these bits are set.This is cleared up by the if tree right below the bit clear which shows
GINTSTS_PRTINTmay be set byHPRT0_ENA, orHPRT0_ENACHGorHPRT0_OVRCURRCHG, withHPRT0_ENAbeing handled only ifHPRT0_ENACHGis set as well.Unfortunately these only print if verbose_debug is enabled in the kernel, which is nearly unusable if you happen to boot from usb as well as every block transfer spams dmesg with 30 lines.
TODO: Edit these debug messages to use the
dev_dbgfunction instead ofdev_vdbgand get logsWe can guess that
HPRT0_CONNDETis the like likely suspect, as we are connecting a device however if the port defaults to disabled,HPRT0_ENACHGmay be set and checked beforeHPRT0_CONNDET.Either way, we'll look into
HPRT0_CONNDETfirst.Call stack when usb device plugged in:
dwc2_handle_common_intr()
dwc2_handle_session_req_intr()
Add dev_dbg(hsotg->dev, "HAL DEBUG func=%s, GINTSTS=%08x\n",func,gintsts);
to everywhere that touches hsotg->regs + HPRT0, sometimes just a variable called hprt, in the functional version to get an idea of what the call stack is supposed to look like.
In fucntional
GUSBCFGis:0x20001408aka100000000000000001010000001000In the non functional, it is:
0x2000140Faka100000000000000001010000001111Try reversing this part of the commit:
Despite this being for gadgets, maybe theres some weird interactions.
The only other things I see sticking out are these lines:
I'd assume these have to be restored right away when a usb device is plugged back in, however they are not.
change this to normal debug as well. Located in hcd_intr.c:
In
hcd_intr.ccalls
dwc2_port_intr()and sends it down the correct, functional path.
and the only line shared by the functional and non-functional versions is printed by:
Which was called by
which ends up calling
in the non functional version
So the process in the non functional one seems to be correct up until
Specifically here:
dwc2_handle_session_req_intrshould be checking two things, which LPM is being used and if the device is in host or gadget mode.If
lx_stateisDWC_LPM2thendwc2_exit_hibernationshould be used, notdwc2_exit_partial_power_down.Only
dwc2_host_enter_hibernationanddwc2_gadget_enter_hibernationsetlx_statetoDWC_LPM2Specifically,
dwc2_gadget_enter_hibernationstates:dwc2_enter_partial_power_downdoes not:This is further supported by fields in
struct dwc2_core_paramsincore.hdwc2_exit_partial_power_downshows partial power down is marked by:which is defined as:
Interestingly,
dwc2_enter_partial_power_downis less pickyTODO: This is probably a bugEdit: its not,
_dwc2_hcd_suspend()which callsdwc2_enter_partial_power_downcheck forDWC2_POWER_DOWN_PARAM_PARTIAL:Not sure whether to pass
resetas true or false to `dwc2_exit_hibernation:Only difference is the delay??
TODO: Solve this mystery.
dwc2_enter_hibernationcheckspower_downas well:issue,
hsotg->lx_stateis set toDWC2_L2bydwc2_host_enter_hibernationand by_dwc2_hcd_suspendwhich is the only placedwc2_enter_partial_power_downis called for the host.long story short,
hsotg->lx_state == DWC2_L2cannot be used it differentiate these states.dwc2_enter_hibernation()is only called in two places. Once incore_intr.cfor gadgets, so not relevant.The other place is in
hcd.cin the functiondwc2_hcd_hub_control()So, which is being called and setting
hsotg->lx_state = DWC2_L2?Once we know that, we'll have a better idea of how
dwc2_handle_session_req_intr()should be maing a decision.dwc2_hcd_hub_control()is called by_dwc2_hcd_hub_control()and is.hub_controlin thedwc2_hc_driverstruct.dwc2_hc_driver.hub_controlis never explicitly called from within the dwc2 driver, but it's possible it is called by the over arching usb driver?so
dwc2_enter_partial_power_downis called by_dwc2_hcd_suspend()which then setshsotg->lx_state = DWC2_L2_dwc2_hcd_suspend()is.bus_suspendin thedwc2_hc_driverstruct, as you can see above.dwc2_hc_driver.bus_suspendis never explicitly called from within the dwc2 driver, but it's possible it is called by the over arching usb driver?recall:
Either one of the two "enter" functions is not setting
GINTSTS/ 'GINTMSK' properly, ordwc2_handle_session_req_intr()is not correct.TODO: Determine which "enter" function is used.
Looking closer, hibernation has a print statement not found in the logs:
and since
are only printed by those two enter functions,
dwc2_enter_partial_power_downmust be responsible.param.power_downis defined like this:And also this
TODO: test partially reverting this change
Re-include
dwc2_force_mode(hsotg, true);but without the return variableThe changes to
dwc2_enter_partial_power_down()were minimal, basically just renaming and changing it to depend onparam.power_downinstead ofparam.hibernate_dwc2_hcd_suspend()which callsdwc2_enter_partial_power_downalso is changed to call
dwc2_vbus_supply_exit(hsotg)as well as depending on the new paramThe addition of
dwc2_vbus_supply_exit(hsotg)may be interestingdwc2_vbus_supply_initonly gets called if_dwc2_hcd_resumeis called, so if for some reason it hasn't been the vbus isn't brought back.It could also be the case that
_dwc2_hcd_resumeis getting called, but doesnt get todwc2_exit_partial_power_down()for one of these reasons:this seems plausible, making image to test
Build image that will test:
WhetherIt is not :(_dwc2_hcd_resumeis called when a usb device is plugged inWhether
_dwc2_hcd_suspendgets to callingdwc2_exit_partial_power_down(): It does!WhetherThis didn't help on its owndwc2_force_mode(hsotg, true);is required indwc2_force_dr_modeas described in https://github.com/SolidHal/Librean/issues/15#issuecomment-406034545Confirm that
dwc2_enter_partial_power_downis the "enter" function used for https://github.com/SolidHal/Librean/issues/15#issuecomment-406009031In another image, test the following instead of the
dwc2_force_modetestdwc2_vbus_supply_exitis to blame, by commenting it out.In a third image, test these together:
dwc2_vbus_supply_exitis to blame, by commenting it out.dwc2_force_mode(hsotg, true);is required indwc2_force_dr_modeas described in https://github.com/SolidHal/Librean/issues/15#issuecomment-406034545 (This didn't help on its own)SOOOO:
_dwc2_hcd_resumecallsdwc2_exit_partial_power_down!But,
_dwc2_hcd_resumedoesnt calldwc2_read_common_intrwhich prints the first of the two linesdwc2_read_common_intris called bydwc2_handle_common_interand gets called 5 times before it prints the gintsts messageSynopsys has requested I try this patch: https://www.spinics.net/lists/linux-usb/msg169664.html
Edit: Didn't help
Tool for getting the caller function:
Finally got onto the linux usb mailing list: https://marc.info/?l=linux-usb&m=153204494813919&w=2
The fix was to include:
usb: dwc2:in the subject line. The silly spam filters were almost the death of meBuilding two images to compare
gintstsandgintmskbefore and after suspend/partial power down with and without the blammed commit.The relevant lines, in the non functional version:
In the non functional only one line is printed in
_dwc2_hcd_suspend()No return line, no lines about backing up registers or entering hibernation (what partial power down was called in this commit)
SO!
There are a number of reasons for partial powerdown to get skipped:
(I added the debug messages just now)
Only one of them prints....
Which means it goes to;
Since the param
hibernationwas swapped out forpower_down, this no longer functions properlySo, mystery solved. Now, how to fix it?
As I had found above:
So something is setting
hw_params.power_optimizedand
hsotg->hw_paramsis defined as:and set in
params.clike this:For example/comparison hibernation is here too:
Aside: The double exclamation mark confused me for a bit but is used for:
And in
hw.hwe find:In
params.cFIXED IT!!
Mailing list discussion:
Re: REGRESSION: usb: dwc2: USB device not seen after boot
Mailing list patch thread:
[PATCH] usb: dwc2: disable power_down on rockchip devices
Keeping this issue open until the patch is merged into mainline
Fixed the patch to use tabs instead of spaces, can be found in
patches-testedPatch has been confirmed working. closing