e1000-devel Mailing List for Formerly Intel Ethernet Drivers

Moved to github.com/intel

Brought to you by: aloktion, anguy11, asunderr, emiltan, and 21 others

e1000-devel — Discussion of the Intel Ethernet out-of-tree drivers

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Hans de G. <hde...@re...> - 2019-04-29 15:02:21

Hi,

On 22-04-19 12:20, Semyon Verchenko wrote:
> 
> On 18.04.2019 18:12, Hans de Goede wrote:
>> Hi Semyon,
>>
>> On 18-04-19 15:26, Semyon Verchenko wrote:
>>>
>>> On 18.04.2019 16:09, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 09-04-19 17:31, Andy Shevchenko wrote:
>>
>> <snip>
>>
>>>> Ok.
>>>>
>>>> Семен Верченко, this means that we are going to need DMI info from
>>>> the board in question. I thought we already had that, but I now see that
>>>> your original report did not have that a
>>>>
>>>> Please run as root:
>>>>
>>>> dmidecode &> dmidecode.log
>>>>
>>>> And then reply to this email with the generated dmidecode.log file
>>>> attached. Once I have that file I can prepare a patch fixing this.
>>
>> Thank you for the DMI decode.
>>
>> Attached are 3 patches, can you please test if adding those 3 patches
>> to your kernel fixes the problem?
>>
>> Thanks & Regards,
>>
>> Hans
>>
> Hi Hans,
> It seems that these patches fix the problem (at least interfaces are visible through ip addr and it's possible to ping something from them).
> 
> Thanks for fixing this issue.

Thank you for testing the fix, I've submitted the patch fixing this upstream.

Regards,

Hans

Re: [E1000-devel] [PATCH 2/2] e1000e: start network tx queue only when link is up

From: Oleksandr N. <ole...@re...> - 2019-04-26 14:12:12

On Wed, Apr 17, 2019 at 11:13:20AM +0300, Konstantin Khlebnikov wrote:
> The driver does not want to keep packets in the tx queue when the link is lost.
> The present code only resets the NIC to flush them; it does not prevent
> new packets from being queued. Moreover, the reset sequence itself can generate
> new packets via netconsole, so the NIC falls into an endless reset loop.
> 
> This patch wakes the tx queue only when the NIC is ready to send packets.
> 
> This is the proper fix for the problem addressed by commit 0f9e980bf5ee
> ("e1000e: fix cyclic resets at link up with active tx").
> 
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Suggested-by: Alexander Duyck <ale...@gm...>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index ba96e52aa8d1..fe643d66aa10 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4209,7 +4209,7 @@ void e1000e_up(struct e1000_adapter *adapter)
>  		e1000_configure_msix(adapter);
>  	e1000_irq_enable(adapter);
>  
> -	netif_start_queue(adapter->netdev);
> +	/* tx queue started by watchdog timer when link is up */
>  
>  	e1000e_trigger_lsc(adapter);
>  }
> @@ -4607,6 +4607,7 @@ int e1000e_open(struct net_device *netdev)
>  	pm_runtime_get_sync(&pdev->dev);
>  
>  	netif_carrier_off(netdev);
> +	netif_stop_queue(netdev);
>  
>  	/* allocate transmit descriptors */
>  	err = e1000e_setup_tx_resources(adapter->tx_ring);
> @@ -4667,7 +4668,6 @@ int e1000e_open(struct net_device *netdev)
>  	e1000_irq_enable(adapter);
>  
>  	adapter->tx_hang_recheck = false;
> -	netif_start_queue(netdev);
>  
>  	hw->mac.get_link_status = true;
>  	pm_runtime_put(&pdev->dev);
> @@ -5289,6 +5289,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>  			if (phy->ops.cfg_on_link_up)
>  				phy->ops.cfg_on_link_up(hw);
>  
> +			netif_wake_queue(netdev);
>  			netif_carrier_on(netdev);
>  
>  			if (!test_bit(__E1000_DOWN, &adapter->state))
> @@ -5302,6 +5303,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>  			/* Link status message must follow this format */
>  			pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>  			netif_carrier_off(netdev);
> +			netif_stop_queue(netdev);
>  			if (!test_bit(__E1000_DOWN, &adapter->state))
>  				mod_timer(&adapter->phy_info_timer,
>  					  round_jiffies(jiffies + 2 * HZ));
> 

Tested-by: Oleksandr Natalenko <ole...@re...>

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

Re: [E1000-devel] [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"

From: Oleksandr N. <ole...@re...> - 2019-04-26 14:11:37

On Wed, Apr 17, 2019 at 11:13:16AM +0300, Konstantin Khlebnikov wrote:
> This reverts commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61.
> 
> That change caused a false-positive warning about a hardware hang:
> 
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
>    TDH                  <0>
>    TDT                  <1>
>    next_to_use          <1>
>    next_to_clean        <0>
> buffer_info[next_to_clean]:
>    time_stamp           <fffba7a7>
>    next_to_watch        <0>
>    jiffies              <fffbb140>
>    next_to_watch.status <0>
> MAC Status             <40080080>
> PHY Status             <7949>
> PHY 1000BASE-T Status  <0>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> 
> Apart from the warning, everything works fine.
> The original issue will be fixed properly in the following patch.
> 
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Reported-by: Joseph Yasi <joe...@gm...>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |   15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7acc61e4f645..ba96e52aa8d1 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5309,13 +5309,8 @@ static void e1000_watchdog_task(struct work_struct *work)
>  			/* 8000ES2LAN requires a Rx packet buffer work-around
>  			 * on link down event; reset the controller to flush
>  			 * the Rx packet buffer.
> -			 *
> -			 * If the link is lost the controller stops DMA, but
> -			 * if there is queued Tx work it cannot be done.  So
> -			 * reset the controller to flush the Tx packet buffers.
>  			 */
> -			if ((adapter->flags & FLAG_RX_NEEDS_RESTART) ||
> -			    e1000_desc_unused(tx_ring) + 1 < tx_ring->count)
> +			if (adapter->flags & FLAG_RX_NEEDS_RESTART)
>  				adapter->flags |= FLAG_RESTART_NOW;
>  			else
>  				pm_schedule_suspend(netdev->dev.parent,
> @@ -5338,6 +5333,14 @@ static void e1000_watchdog_task(struct work_struct *work)
>  	adapter->gotc_old = adapter->stats.gotc;
>  	spin_unlock(&adapter->stats64_lock);
>  
> +	/* If the link is lost the controller stops DMA, but
> +	 * if there is queued Tx work it cannot be done.  So
> +	 * reset the controller to flush the Tx packet buffers.
> +	 */
> +	if (!netif_carrier_ok(netdev) &&
> +	    (e1000_desc_unused(tx_ring) + 1 < tx_ring->count))
> +		adapter->flags |= FLAG_RESTART_NOW;
> +
>  	/* If reset is necessary, do it outside of interrupt context. */
>  	if (adapter->flags & FLAG_RESTART_NOW) {
>  		schedule_work(&adapter->reset_task);
> 

Tested-by: Oleksandr Natalenko <ole...@re...>

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer

Re: [E1000-devel] [PATCH 2/2] e1000e: start network tx queue only when link is up

From: Brown, A. F <aar...@in...> - 2019-04-26 00:05:11

> From: Konstantin Khlebnikov [mailto:khl...@ya...]
> Sent: Wednesday, April 17, 2019 1:13 AM
> To: ne...@vg...; int...@li...; linux-
> ke...@vg...; Kirsher, Jeffrey T <jef...@in...>
> Cc: Sasha Levin <sa...@ke...>; Joseph Yasi <joe...@gm...>;
> Brown, Aaron F <aar...@in...>; Alexander Duyck
> <ale...@gm...>; e10...@li...
> Subject: [PATCH 2/2] e1000e: start network tx queue only when link is up
> 
> The driver does not want to keep packets in the tx queue when the link is lost.
> The present code only resets the NIC to flush them; it does not prevent
> new packets from being queued. Moreover, the reset sequence itself can generate
> new packets via netconsole, so the NIC falls into an endless reset loop.
> 
> This patch wakes the tx queue only when the NIC is ready to send packets.
> 
> This is the proper fix for the problem addressed by commit 0f9e980bf5ee
> ("e1000e: fix cyclic resets at link up with active tx").
> 
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Suggested-by: Alexander Duyck <ale...@gm...>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
Tested-by: Aaron Brown <aar...@in...>
Again, more of a regression check than a test that the patch solves the problem as I did not manage to trigger the hang.

Re: [E1000-devel] [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"

From: Brown, A. F <aar...@in...> - 2019-04-26 00:00:52

> From: Konstantin Khlebnikov [mailto:khl...@ya...]
> Sent: Wednesday, April 17, 2019 1:13 AM
> To: ne...@vg...; int...@li...; linux-
> ke...@vg...; Kirsher, Jeffrey T <jef...@in...>
> Cc: Sasha Levin <sa...@ke...>; Joseph Yasi <joe...@gm...>;
> Brown, Aaron F <aar...@in...>; Alexander Duyck
> <ale...@gm...>; e10...@li...
> Subject: [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"
> 
> This reverts commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61.
> 
> That change caused a false-positive warning about a hardware hang:
> 
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
>    TDH                  <0>
>    TDT                  <1>
>    next_to_use          <1>
>    next_to_clean        <0>
> buffer_info[next_to_clean]:
>    time_stamp           <fffba7a7>
>    next_to_watch        <0>
>    jiffies              <fffbb140>
>    next_to_watch.status <0>
> MAC Status             <40080080>
> PHY Status             <7949>
> PHY 1000BASE-T Status  <0>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> 
> Apart from the warning, everything works fine.
> The original issue will be fixed properly in the following patch.
> 
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Reported-by: Joseph Yasi <joe...@gm...>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |   15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)

Tested-by: Aaron Brown <aar...@in...>
This was more of a regression check as I never did manage to replicate the tx hang, even with seemingly the same hardware.

Re: [E1000-devel] [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"

From: Joseph Y. <joe...@gm...> - 2019-04-23 00:51:27

On Wed, Apr 17, 2019 at 4:13 AM Konstantin Khlebnikov
<khl...@ya...> wrote:
>
> This reverts commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61.
>
> That change caused a false-positive warning about a hardware hang:
>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
>    TDH                  <0>
>    TDT                  <1>
>    next_to_use          <1>
>    next_to_clean        <0>
> buffer_info[next_to_clean]:
>    time_stamp           <fffba7a7>
>    next_to_watch        <0>
>    jiffies              <fffbb140>
>    next_to_watch.status <0>
> MAC Status             <40080080>
> PHY Status             <7949>
> PHY 1000BASE-T Status  <0>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>
> Apart from the warning, everything works fine.
> The original issue will be fixed properly in the following patch.
>
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Reported-by: Joseph Yasi <joe...@gm...>
Tested-by: Joseph Yasi <joe...@gm...>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |   15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7acc61e4f645..ba96e52aa8d1 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5309,13 +5309,8 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         /* 8000ES2LAN requires a Rx packet buffer work-around
>                          * on link down event; reset the controller to flush
>                          * the Rx packet buffer.
> -                        *
> -                        * If the link is lost the controller stops DMA, but
> -                        * if there is queued Tx work it cannot be done.  So
> -                        * reset the controller to flush the Tx packet buffers.
>                          */
> -                       if ((adapter->flags & FLAG_RX_NEEDS_RESTART) ||
> -                           e1000_desc_unused(tx_ring) + 1 < tx_ring->count)
> +                       if (adapter->flags & FLAG_RX_NEEDS_RESTART)
>                                 adapter->flags |= FLAG_RESTART_NOW;
>                         else
>                                 pm_schedule_suspend(netdev->dev.parent,
> @@ -5338,6 +5333,14 @@ static void e1000_watchdog_task(struct work_struct *work)
>         adapter->gotc_old = adapter->stats.gotc;
>         spin_unlock(&adapter->stats64_lock);
>
> +       /* If the link is lost the controller stops DMA, but
> +        * if there is queued Tx work it cannot be done.  So
> +        * reset the controller to flush the Tx packet buffers.
> +        */
> +       if (!netif_carrier_ok(netdev) &&
> +           (e1000_desc_unused(tx_ring) + 1 < tx_ring->count))
> +               adapter->flags |= FLAG_RESTART_NOW;
> +
>         /* If reset is necessary, do it outside of interrupt context. */
>         if (adapter->flags & FLAG_RESTART_NOW) {
>                 schedule_work(&adapter->reset_task);
>

Re: [E1000-devel] [PATCH 2/2] e1000e: start network tx queue only when link is up

From: Joseph Y. <joe...@gm...> - 2019-04-23 00:51:07

On Wed, Apr 17, 2019 at 4:13 AM Konstantin Khlebnikov
<khl...@ya...> wrote:
>
> The driver does not want to keep packets in the tx queue when the link is lost.
> The present code only resets the NIC to flush them; it does not prevent
> new packets from being queued. Moreover, the reset sequence itself can generate
> new packets via netconsole, so the NIC falls into an endless reset loop.
>
> This patch wakes the tx queue only when the NIC is ready to send packets.
>
> This is the proper fix for the problem addressed by commit 0f9e980bf5ee
> ("e1000e: fix cyclic resets at link up with active tx").
>
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Suggested-by: Alexander Duyck <ale...@gm...>
Tested-by: Joseph Yasi <joe...@gm...>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index ba96e52aa8d1..fe643d66aa10 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4209,7 +4209,7 @@ void e1000e_up(struct e1000_adapter *adapter)
>                 e1000_configure_msix(adapter);
>         e1000_irq_enable(adapter);
>
> -       netif_start_queue(adapter->netdev);
> +       /* tx queue started by watchdog timer when link is up */
>
>         e1000e_trigger_lsc(adapter);
>  }
> @@ -4607,6 +4607,7 @@ int e1000e_open(struct net_device *netdev)
>         pm_runtime_get_sync(&pdev->dev);
>
>         netif_carrier_off(netdev);
> +       netif_stop_queue(netdev);
>
>         /* allocate transmit descriptors */
>         err = e1000e_setup_tx_resources(adapter->tx_ring);
> @@ -4667,7 +4668,6 @@ int e1000e_open(struct net_device *netdev)
>         e1000_irq_enable(adapter);
>
>         adapter->tx_hang_recheck = false;
> -       netif_start_queue(netdev);
>
>         hw->mac.get_link_status = true;
>         pm_runtime_put(&pdev->dev);
> @@ -5289,6 +5289,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         if (phy->ops.cfg_on_link_up)
>                                 phy->ops.cfg_on_link_up(hw);
>
> +                       netif_wake_queue(netdev);
>                         netif_carrier_on(netdev);
>
>                         if (!test_bit(__E1000_DOWN, &adapter->state))
> @@ -5302,6 +5303,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         /* Link status message must follow this format */
>                         pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>                         netif_carrier_off(netdev);
> +                       netif_stop_queue(netdev);
>                         if (!test_bit(__E1000_DOWN, &adapter->state))
>                                 mod_timer(&adapter->phy_info_timer,
>                                           round_jiffies(jiffies + 2 * HZ));
>

Re: [E1000-devel] [PATCH 2/2] e1000e: start network tx queue only when link is up

From: Joseph Y. <joe...@gm...> - 2019-04-23 00:50:04

On Wed, Apr 17, 2019 at 4:13 AM Konstantin Khlebnikov <
khl...@ya...> wrote:

> The driver does not want to keep packets in the tx queue when the link is lost.
> The present code only resets the NIC to flush them; it does not prevent
> new packets from being queued. Moreover, the reset sequence itself can generate
> new packets via netconsole, so the NIC falls into an endless reset loop.
>
> This patch wakes the tx queue only when the NIC is ready to send packets.
>
> This is the proper fix for the problem addressed by commit 0f9e980bf5ee
> ("e1000e: fix cyclic resets at link up with active tx").
>
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Suggested-by: Alexander Duyck <ale...@gm...>
>
Tested-by: Joseph Yasi <joe...@gm...>

> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index ba96e52aa8d1..fe643d66aa10 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4209,7 +4209,7 @@ void e1000e_up(struct e1000_adapter *adapter)
>                 e1000_configure_msix(adapter);
>         e1000_irq_enable(adapter);
>
> -       netif_start_queue(adapter->netdev);
> +       /* tx queue started by watchdog timer when link is up */
>
>         e1000e_trigger_lsc(adapter);
>  }
> @@ -4607,6 +4607,7 @@ int e1000e_open(struct net_device *netdev)
>         pm_runtime_get_sync(&pdev->dev);
>
>         netif_carrier_off(netdev);
> +       netif_stop_queue(netdev);
>
>         /* allocate transmit descriptors */
>         err = e1000e_setup_tx_resources(adapter->tx_ring);
> @@ -4667,7 +4668,6 @@ int e1000e_open(struct net_device *netdev)
>         e1000_irq_enable(adapter);
>
>         adapter->tx_hang_recheck = false;
> -       netif_start_queue(netdev);
>
>         hw->mac.get_link_status = true;
>         pm_runtime_put(&pdev->dev);
> @@ -5289,6 +5289,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         if (phy->ops.cfg_on_link_up)
>                                 phy->ops.cfg_on_link_up(hw);
>
> +                       netif_wake_queue(netdev);
>                         netif_carrier_on(netdev);
>
>                         if (!test_bit(__E1000_DOWN, &adapter->state))
> @@ -5302,6 +5303,7 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         /* Link status message must follow this format */
>                         pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>                         netif_carrier_off(netdev);
> +                       netif_stop_queue(netdev);
>                         if (!test_bit(__E1000_DOWN, &adapter->state))
>                                 mod_timer(&adapter->phy_info_timer,
>                                           round_jiffies(jiffies + 2 * HZ));
>
>

Re: [E1000-devel] [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"

From: Joseph Y. <joe...@gm...> - 2019-04-23 00:49:31

On Wed, Apr 17, 2019 at 4:13 AM Konstantin Khlebnikov <
khl...@ya...> wrote:

> This reverts commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61.
>
> That change caused a false-positive warning about a hardware hang:
>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
>    TDH                  <0>
>    TDT                  <1>
>    next_to_use          <1>
>    next_to_clean        <0>
> buffer_info[next_to_clean]:
>    time_stamp           <fffba7a7>
>    next_to_watch        <0>
>    jiffies              <fffbb140>
>    next_to_watch.status <0>
> MAC Status             <40080080>
> PHY Status             <7949>
> PHY 1000BASE-T Status  <0>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>
> Apart from the warning, everything works fine.
> The original issue will be fixed properly in the following patch.
>
> Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
> Reported-by: Joseph Yasi <joe...@gm...>
>
Tested-by: Joseph Yasi <joe...@gm...>

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c |   15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7acc61e4f645..ba96e52aa8d1 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5309,13 +5309,8 @@ static void e1000_watchdog_task(struct work_struct *work)
>                         /* 8000ES2LAN requires a Rx packet buffer work-around
>                          * on link down event; reset the controller to flush
>                          * the Rx packet buffer.
> -                        *
> -                        * If the link is lost the controller stops DMA, but
> -                        * if there is queued Tx work it cannot be done.  So
> -                        * reset the controller to flush the Tx packet buffers.
>                          */
> -                       if ((adapter->flags & FLAG_RX_NEEDS_RESTART) ||
> -                           e1000_desc_unused(tx_ring) + 1 < tx_ring->count)
> +                       if (adapter->flags & FLAG_RX_NEEDS_RESTART)
>                                 adapter->flags |= FLAG_RESTART_NOW;
>                         else
>                                 pm_schedule_suspend(netdev->dev.parent,
> @@ -5338,6 +5333,14 @@ static void e1000_watchdog_task(struct work_struct *work)
>         adapter->gotc_old = adapter->stats.gotc;
>         spin_unlock(&adapter->stats64_lock);
>
> +       /* If the link is lost the controller stops DMA, but
> +        * if there is queued Tx work it cannot be done.  So
> +        * reset the controller to flush the Tx packet buffers.
> +        */
> +       if (!netif_carrier_ok(netdev) &&
> +           (e1000_desc_unused(tx_ring) + 1 < tx_ring->count))
> +               adapter->flags |= FLAG_RESTART_NOW;
> +
>         /* If reset is necessary, do it outside of interrupt context. */
>         if (adapter->flags & FLAG_RESTART_NOW) {
>                 schedule_work(&adapter->reset_task);
>
>

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Semyon V. <sem...@fa...> - 2019-04-22 10:20:31

On 18.04.2019 18:12, Hans de Goede wrote:
> Hi Semyon,
>
> On 18-04-19 15:26, Semyon Verchenko wrote:
>>
>> On 18.04.2019 16:09, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 09-04-19 17:31, Andy Shevchenko wrote:
>
> <snip>
>
>>> Ok.
>>>
>>> Семен Верченко, this means that we are going to need DMI info from
>>> the board in question. I thought we already had that, but I now see that
>>> your original report did not have that a
>>>
>>> Please run as root:
>>>
>>> dmidecode &> dmidecode.log
>>>
>>> And then reply to this email with the generated dmidecode.log file
>>> attached. Once I have that file I can prepare a patch fixing this.
>
> Thank you for the DMI decode.
>
> Attached are 3 patches, can you please test if adding those 3 patches
> to your kernel fixes the problem?
>
> Thanks & Regards,
>
> Hans
>
Hi Hans,
It seems that these patches fix the problem (at least interfaces are 
visible through ip addr and it's possible to ping something from them).

Thanks for fixing this issue.

Re: [E1000-devel] Regarding LVMMC error of Intel igb driver

From: Fujinaka, T. <tod...@in...> - 2019-04-18 17:45:21

It's hard to tell what you're doing, but I'd suggest downloading the datasheet and looking at the bits in the register you just referred to.

The easiest way to get the data sheet is to google for "i350 data sheet" and select the version on the intel web site. You may want to download the specification update as well.

A quick check of 0x3400 0000 looks to show a problem in queue 1, a malicious driver behavior from a VLAN Insertion error. Sounds like you're doing something odd. 0x7400 0000 is the same error on a different queue.

Todd Fujinaka
Software Application Engineer
Datacenter Engineering Group
Intel Corporation
tod...@in...


-----Original Message-----
From: Wyborny, Carolyn [mailto:car...@in...] 
Sent: Thursday, April 18, 2019 9:04 AM
To: KyungWon Park <kw....@gm...>
Cc: e10...@li...
Subject: Re: [E1000-devel] Regarding LVMMC error of Intel igb driver

Hello,

I do not work on that driver anymore, but our current support can be obtained using the e1000-devel list I’ve copied above.

If you do not get support through this email, please let me know.

Thanks,

Carolyn

From: KyungWon Park [mailto:kw....@gm...]
Sent: Thursday, April 18, 2019 6:00 PM
To: Wyborny, Carolyn <car...@in...>
Subject: Regarding LVMMC error of Intel igb driver

Hello

I've found your e-mail address on
https://git.digitalstrom.org/bsp/linux/commit/1516f0a6492a3d1bd9fbebeac331950986ec9a9b

I was running a server at home which has Linux (Ubuntu 18.04) installed with KVM.
I've passed an Intel igb VF through to my VM via SR-IOV, but

"igb 0000:82:00.1 enp130s0f1: malformed Tx packet detected and dropped, LVMMC:0x34000000"

pops up every 2 seconds on dmesg.

I am using intel i350-T4 nic for sr-iov

I searched the internet for what "LVMMC" is and what "0x34000000" means, but the only thing I could find was the link above.
Sometimes "LVMMC:0x74000000" errors pop up too.

What triggers LVMMC? I've turned off vf spoofcheck through ip utility in linux but seems like it doesn't work. (Still kernel warns me that VM tried to change MAC address which was set administratively, and reload the vf driver)

Thank you in advance.

Best.





_______________________________________________
E1000-devel mailing list
E10...@li...
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired

Re: [E1000-devel] Regarding LVMMC error of Intel igb driver

From: Wyborny, C. <car...@in...> - 2019-04-18 16:38:17

Hello,

I do not work on that driver anymore, but our current support can be obtained using the e1000-devel list I’ve copied above.

If you do not get support through this email, please let me know.

Thanks,

Carolyn

From: KyungWon Park [mailto:kw....@gm...]
Sent: Thursday, April 18, 2019 6:00 PM
To: Wyborny, Carolyn <car...@in...>
Subject: Regarding LVMMC error of Intel igb driver

Hello

I've found your e-mail address on
https://git.digitalstrom.org/bsp/linux/commit/1516f0a6492a3d1bd9fbebeac331950986ec9a9b

I was running a server at home which has Linux (Ubuntu 18.04) installed with KVM.
I've passed an Intel igb VF through to my VM via SR-IOV, but

"igb 0000:82:00.1 enp130s0f1: malformed Tx packet detected and dropped, LVMMC:0x34000000"

pops up every 2 seconds on dmesg.

I am using intel i350-T4 nic for sr-iov

I searched the internet for what "LVMMC" is and what "0x34000000" means, but the only thing I could find was the link above.
Sometimes "LVMMC:0x74000000" errors pop up too.

What triggers LVMMC? I've turned off vf spoofcheck through ip utility in linux but seems like it doesn't work. (Still kernel warns me that VM tried to change MAC address which was set administratively, and reload the vf driver)

Thank you in advance.

Best.

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Hans de G. <hde...@re...> - 2019-04-18 15:13:07

Hi Semyon,

On 18-04-19 15:26, Semyon Verchenko wrote:
> 
> On 18.04.2019 16:09, Hans de Goede wrote:
>> Hi,
>>
>> On 09-04-19 17:31, Andy Shevchenko wrote:

<snip>

>> Ok.
>>
>> Семен Верченко, this means that we are going to need DMI info from
>> the board in question. I thought we already had that, but I now see that
>> your original report did not have that a
>>
>> Please run as root:
>>
>> dmidecode &> dmidecode.log
>>
>> And then reply to this email with the generated dmidecode.log file
>> attached. Once I have that file I can prepare a patch fixing this.

Thank you for the DMI decode.

Attached are 3 patches, can you please test if adding those 3 patches
to your kernel fixes the problem?

Thanks & Regards,

Hans

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Semyon V. <sem...@fa...> - 2019-04-18 13:26:58

On 18.04.2019 16:09, Hans de Goede wrote:
> Hi,
>
> On 09-04-19 17:31, Andy Shevchenko wrote:
>> On Mon, Apr 08, 2019 at 08:43:10PM +0200, Hans de Goede wrote:
>>> On 08-04-19 19:21, Andy Shevchenko wrote:
>>>> On Thu, Apr 04, 2019 at 04:43:03PM +0200, Hans de Goede wrote:
>>>>> On 29-03-19 16:53, Hans de Goede wrote:
>>>>>> On 3/29/19 2:59 PM, Семен Верченко wrote:
>>>>
>>>>>> Hmm, so 4 ethernet cards and 4 enabled / marked as critical clocks.
>>>>>>
>>>>>> Supporting this through get_clk is going to require a DMI table in the igb driver
>>>>>> combined with checking which PCI "slot" the card is to get the correct clock
>>>>>> for each ethernet controller.
>>>>>>
>>>>>> I believe that just restoring the old behavior to mark all clocks enabled
>>>>>> on boot as critical, but then limited to this system based on a dmi match,
>>>>>> is the best solution here.
>>>>>>
>>>>>> Andy?
>>>>>
>>>>> Andy? Now that we've the patch ready for the other system which needs to
>>>>> have the CLK_IS_CRITICAL workaround and enables this based on DMI info,
>>>>> I believe the best fix for this system is to simply add it to that DMI
>>>>> table?
>>>>
>>>> I reviewed v4, supposed to go via CLK tree.
>>>
>>> Right, but that patch adds the quirk for the system with the USB hub,
>>> do you agree, that given that each ethernet controller seems to be
>>> using its own clock, it is best to use a DMI quirk for this case too?
>>>
>>> If you agree then someone needs to prepare a follow-up patch on top of
>>> v4 which adds the DMI info for this board to the table.
>>
>> I hope we may find a better solution in the future, but for now as a quick fix
>> the proposed can be done.
>
> Ok.
>
> Семен Верченко, this means that we are going to need DMI info from
> the board in question. I thought we already had that, but I now see that
> your original report did not have that.
>
> Please run as root:
>
> dmidecode &> dmidecode.log
>
> And then reply to this email with the generated dmidecode.log file
> attached. Once I have that file I can prepare a patch fixing this.
>
> Regards,
>
> Hans
>

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Hans de G. <hde...@re...> - 2019-04-18 13:09:25

Hi,

On 09-04-19 17:31, Andy Shevchenko wrote:
> On Mon, Apr 08, 2019 at 08:43:10PM +0200, Hans de Goede wrote:
>> On 08-04-19 19:21, Andy Shevchenko wrote:
>>> On Thu, Apr 04, 2019 at 04:43:03PM +0200, Hans de Goede wrote:
>>>> On 29-03-19 16:53, Hans de Goede wrote:
>>>>> On 3/29/19 2:59 PM, Семен Верченко wrote:
>>>
>>>>> Hmm, so 4 ethernet cards and 4 enabled / marked as critical clocks.
>>>>>
>>>>> Supporting this through get_clk is going to require a DMI table in the igb driver
>>>>> combined with checking which PCI "slot" the card is to get the correct clock
>>>>> for each ethernet controller.
>>>>>
>>>>> I believe that just restoring the old behavior to mark all clocks enabled
>>>>> on boot as critical, but then limited to this system based on a dmi match,
>>>>> is the best solution here.
>>>>>
>>>>> Andy?
>>>>
>>>> Andy? Now that we've the patch ready for the other system which needs to
>>>> have the CLK_IS_CRITICAL workaround and enables this based on DMI info,
>>>> I believe the best fix for this system is to simply add it to that DMI
>>>> table?
>>>
>>> I reviewed v4, supposed to go via CLK tree.
>>
>> Right, but that patch adds the quirk for the system with the USB hub,
>> do you agree, that given that each ethernet controller seems to be
>> using its own clock, it is best to use a DMI quirk for this case too?
>>
>> If you agree then someone needs to prepare a follow-up patch on top of
>> v4 which adds the DMI info for this board to the table.
> 
> I hope we may find a better solution in the future, but for now as a quick fix
> the proposed can be done.

Ok.

Семен Верченко, this means that we are going to need DMI info from
the board in question. I thought we already had that, but I now see that
your original report did not have that.

Please run as root:

dmidecode &> dmidecode.log

And then reply to this email with the generated dmidecode.log file
attached. Once I have that file I can prepare a patch fixing this.

Regards,

Hans
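
The DMI-quirk approach discussed in this thread (treat all clocks found enabled at boot as critical, but only on boards listed in a table) can be illustrated with a small userspace sketch. This is not the kernel patch: the struct, the function name and the table entry are hypothetical placeholders, and the comparison here is a plain exact string match chosen for simplicity. The real strings would come from the dmidecode output requested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One quirk entry: the DMI strings that identify an affected board. */
struct dmi_quirk {
	const char *sys_vendor;
	const char *product_name;
};

/* Hypothetical placeholder entry; real values would be taken from the
 * dmidecode output of the affected board. */
static const struct dmi_quirk critical_clk_boards[] = {
	{ "Example Vendor", "Example Board" },
};

/* Return true when the given DMI strings match a quirk entry, i.e. when
 * all clocks found enabled at boot should be treated as critical. */
static bool board_needs_critical_clocks(const char *vendor, const char *product)
{
	size_t n = sizeof(critical_clk_boards) / sizeof(critical_clk_boards[0]);

	assert(vendor && product);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(vendor, critical_clk_boards[i].sys_vendor) == 0 &&
		    strcmp(product, critical_clk_boards[i].product_name) == 0)
			return true;
	}
	return false;
}
```

The point of the table is that the workaround stays limited to known boards, so other systems keep the default clock handling.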

[E1000-devel] [PATCH 2/2] e1000e: start network tx queue only when link is up

From: Konstantin K. <khl...@ya...> - 2019-04-17 08:29:18

The driver does not want to keep packets in the tx queue when the link is
lost. But the present code only resets the NIC to flush them and does not
prevent queuing new packets. Moreover, the reset sequence itself can generate
new packets via netconsole, and the NIC falls into an endless reset loop.

This patch wakes the tx queue only when the NIC is ready to send packets.

This is the proper fix for the problem addressed by commit 0f9e980bf5ee
("e1000e: fix cyclic resets at link up with active tx").

Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
Suggested-by: Alexander Duyck <ale...@gm...>
---
 drivers/net/ethernet/intel/e1000e/netdev.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index ba96e52aa8d1..fe643d66aa10 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4209,7 +4209,7 @@ void e1000e_up(struct e1000_adapter *adapter)
 		e1000_configure_msix(adapter);
 	e1000_irq_enable(adapter);
 
-	netif_start_queue(adapter->netdev);
+	/* tx queue started by watchdog timer when link is up */
 
 	e1000e_trigger_lsc(adapter);
 }
@@ -4607,6 +4607,7 @@ int e1000e_open(struct net_device *netdev)
 	pm_runtime_get_sync(&pdev->dev);
 
 	netif_carrier_off(netdev);
+	netif_stop_queue(netdev);
 
 	/* allocate transmit descriptors */
 	err = e1000e_setup_tx_resources(adapter->tx_ring);
@@ -4667,7 +4668,6 @@ int e1000e_open(struct net_device *netdev)
 	e1000_irq_enable(adapter);
 
 	adapter->tx_hang_recheck = false;
-	netif_start_queue(netdev);
 
 	hw->mac.get_link_status = true;
 	pm_runtime_put(&pdev->dev);
@@ -5289,6 +5289,7 @@ static void e1000_watchdog_task(struct work_struct *work)
 			if (phy->ops.cfg_on_link_up)
 				phy->ops.cfg_on_link_up(hw);
 
+			netif_wake_queue(netdev);
 			netif_carrier_on(netdev);
 
 			if (!test_bit(__E1000_DOWN, &adapter->state))
@@ -5302,6 +5303,7 @@ static void e1000_watchdog_task(struct work_struct *work)
 			/* Link status message must follow this format */
 			pr_info("%s NIC Link is Down\n", adapter->netdev->name);
 			netif_carrier_off(netdev);
+			netif_stop_queue(netdev);
 			if (!test_bit(__E1000_DOWN, &adapter->state))
 				mod_timer(&adapter->phy_info_timer,
 					  round_jiffies(jiffies + 2 * HZ));
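
The queue gating that this patch introduces can be modelled in a few lines of userspace C. This is a simplified sketch, not driver code: struct nic_model and the model_* functions are hypothetical stand-ins for the netdev state touched by netif_stop_queue(), netif_wake_queue(), netif_carrier_on() and netif_carrier_off(), and model_xmit() assumes that a sender checks the stopped flag before enqueuing.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the state the patch manipulates. */
struct nic_model {
	bool carrier_ok;      /* stands in for netif_carrier_ok() */
	bool queue_stopped;   /* stands in for netif_queue_stopped() */
	unsigned int queued;  /* packets accepted into the Tx ring */
};

/* Open: carrier off and Tx queue stopped until the watchdog sees link. */
static void model_open(struct nic_model *n)
{
	assert(n);
	n->carrier_ok = false;
	n->queue_stopped = true;
	n->queued = 0;
}

/* Watchdog: wake the queue on link up, stop it again on link down. */
static void model_watchdog(struct nic_model *n, bool link_up)
{
	if (link_up && !n->carrier_ok) {
		n->queue_stopped = false;  /* wake queue, then carrier on */
		n->carrier_ok = true;
	} else if (!link_up && n->carrier_ok) {
		n->carrier_ok = false;     /* carrier off, then stop queue */
		n->queue_stopped = true;
	}
}

/* Transmit attempt: refused while the queue is stopped. */
static bool model_xmit(struct nic_model *n)
{
	if (n->queue_stopped)
		return false;
	n->queued++;
	return true;
}
```

In this model nothing can land in the ring between open and the first link-up event, which is the window in which the endless reset loop was observed.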

[E1000-devel] [PATCH 1/2] Revert "e1000e: fix cyclic resets at link up with active tx"

From: Konstantin K. <khl...@ya...> - 2019-04-17 08:29:17

This reverts commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61.

That change caused a false-positive warning about a hardware hang:

e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
   TDH                  <0>
   TDT                  <1>
   next_to_use          <1>
   next_to_clean        <0>
buffer_info[next_to_clean]:
   time_stamp           <fffba7a7>
   next_to_watch        <0>
   jiffies              <fffbb140>
   next_to_watch.status <0>
MAC Status             <40080080>
PHY Status             <7949>
PHY 1000BASE-T Status  <0>
PHY Extended Status    <3000>
PCI Status             <10>
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Besides the warning, everything works fine.
The original issue will be fixed properly in the following patch.

Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
Reported-by: Joseph Yasi <joe...@gm...>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
---
 drivers/net/ethernet/intel/e1000e/netdev.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 7acc61e4f645..ba96e52aa8d1 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5309,13 +5309,8 @@ static void e1000_watchdog_task(struct work_struct *work)
 			/* 8000ES2LAN requires a Rx packet buffer work-around
 			 * on link down event; reset the controller to flush
 			 * the Rx packet buffer.
-			 *
-			 * If the link is lost the controller stops DMA, but
-			 * if there is queued Tx work it cannot be done.  So
-			 * reset the controller to flush the Tx packet buffers.
 			 */
-			if ((adapter->flags & FLAG_RX_NEEDS_RESTART) ||
-			    e1000_desc_unused(tx_ring) + 1 < tx_ring->count)
+			if (adapter->flags & FLAG_RX_NEEDS_RESTART)
 				adapter->flags |= FLAG_RESTART_NOW;
 			else
 				pm_schedule_suspend(netdev->dev.parent,
@@ -5338,6 +5333,14 @@ static void e1000_watchdog_task(struct work_struct *work)
 	adapter->gotc_old = adapter->stats.gotc;
 	spin_unlock(&adapter->stats64_lock);
 
+	/* If the link is lost the controller stops DMA, but
+	 * if there is queued Tx work it cannot be done.  So
+	 * reset the controller to flush the Tx packet buffers.
+	 */
+	if (!netif_carrier_ok(netdev) &&
+	    (e1000_desc_unused(tx_ring) + 1 < tx_ring->count))
+		adapter->flags |= FLAG_RESTART_NOW;
+
 	/* If reset is necessary, do it outside of interrupt context. */
 	if (adapter->flags & FLAG_RESTART_NOW) {
 		schedule_work(&adapter->reset_task);
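
The condition `e1000_desc_unused(tx_ring) + 1 < tx_ring->count` in the hunk above is a "ring has pending Tx work" test. The sketch below shows the arithmetic in userspace C under the assumption that the ring keeps one slot permanently empty, so that next_to_use == next_to_clean means "empty". struct ring_model and both function names are illustrative, not the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the Tx ring bookkeeping (illustrative only). */
struct ring_model {
	unsigned int count;          /* number of descriptors in the ring */
	unsigned int next_to_use;    /* next slot the driver will fill */
	unsigned int next_to_clean;  /* next slot awaiting completion */
};

/* Free descriptors, keeping one slot empty to tell "full" from "empty". */
static unsigned int ring_desc_unused(const struct ring_model *r)
{
	if (r->next_to_clean > r->next_to_use)
		return r->next_to_clean - r->next_to_use - 1;
	return r->count + r->next_to_clean - r->next_to_use - 1;
}

/* Same shape as the test in the patch: unused + 1 < count. */
static bool ring_has_pending_tx(const struct ring_model *r)
{
	assert(r->count > 0);
	return ring_desc_unused(r) + 1 < r->count;
}
```

With the values from the hang report (next_to_use 1, next_to_clean 0), a single queued descriptor is enough to make the test true. The ring size is not stated in the report, so any count used with this sketch is an assumption.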

Re: [E1000-devel] [e1000e REGRESSION BISECTED] Detected Hardware Unit Hang with 5.0.7

From: Konstantin K. <khl...@ya...> - 2019-04-17 06:03:32

On 16.04.2019 20:12, Alexander Duyck wrote:
> On Mon, Apr 15, 2019 at 11:22 AM Joseph Yasi <joe...@gm...> wrote:
>>
>> Hello,
>> I reported a regression that happened after upgrading from 5.0.6 to 5.0.7:
>> https://bugzilla.kernel.org/show_bug.cgi?id=203175
>>
>> This is fixed by reverting commit
>> 7f0a3a436e88a71b96694c029f01a9a8eade3d5d e1000e: fix cyclic resets at link
>> up with active tx. A few others have reported the same hang in bugzilla.
>>
>> Thanks,
>> Joe Yasi
>>
>> dmesg of hang:
>> [Sat Apr  6 00:12:10 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
>> Duplex, Flow Control: Rx/Tx
>> [Sat Apr  6 00:12:10 2019] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link
>> becomes ready
>> [Sat Apr  6 00:12:12 2019] e1000e 0000:00:1f.6 eth0: Detected Hardware Unit
>> Hang:
>>
>>                               TDH                  <0>
>>
>>                               TDT                  <1>
>>
>>                               next_to_use          <1>
>>
>>                               next_to_clean        <0>
>>
>>                             buffer_info[next_to_clean]:
>>
>>                               time_stamp           <fffba7a7>
>>
>>                               next_to_watch        <0>
>>                               jiffies              <fffbb140>
>>                               next_to_watch.status <0>
>>                             MAC Status             <40080080>
>>                             PHY Status             <7949>
>>                             PHY 1000BASE-T Status  <0>
>>                             PHY Extended Status    <3000>
>>                             PCI Status             <10>
>> [Sat Apr  6 00:12:14 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
>> Duplex, Flow Control: Rx/Tx
>>
>> lspci -vv
>> 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2)
>> I219-V
>>          Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
>>          Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx+
>>          Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>          Latency: 0
>>          Interrupt: pin A routed to IRQ 145
>>          Region 0: Memory at df400000 (32-bit, non-prefetchable) [size=128K]
>>          Capabilities: [c8] Power Management version 3
>>                  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
>>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                  Address: 00000000fee00518  Data: 0000
>>          Capabilities: [e0] PCI Advanced Features
>>                  AFCap: TP+ FLR+
>>                  AFCtrl: FLR-
>>                  AFStatus: TP-
>>          Kernel driver in use: e1000e
>>          Kernel modules: e1000
>>
> 
> So the commit ID you reported doesn't match up to the value in the
> kernel. I believe the patch you are talking about is:
> commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61
> Author: Konstantin Khlebnikov <khl...@ya...>
> Date:   Mon Jan 14 16:29:30 2019 +0300
> 
>      e1000e: fix cyclic resets at link up with active tx
> 
>      I'm seeing series of e1000e resets (sometimes endless) at system boot
>      if something generates tx traffic at this time. In my case this is
>      netconsole who sends message "e1000e 0000:02:00.0: Some CPU C-states
>      have been disabled in order to enable jumbo frames" from e1000e itself.
>      As result e1000_watchdog_task sees used tx buffer while carrier is off
>      and start this reset cycle again.
> 
>      [   17.794359] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
>      [   17.794714] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>      [   22.936455] e1000e 0000:02:00.0 eth1: changing MTU from 1500 to 9000
>      [   23.033336] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   26.102364] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
>      [   27.174495] 8021q: 802.1Q VLAN Support v1.8
>      [   27.174513] 8021q: adding VLAN 0 to HW filter on device eth1
>      [   30.671724] cgroup: cgroup: disabling cgroup2 socket matching
> due to net_prio or net_cls activation
>      [   30.898564] netpoll: netconsole: local port 6666
>      [   30.898566] netpoll: netconsole: local IPv6 address
> 2a02:6b8:0:80b:beae:c5ff:fe28:23f8
>      [   30.898567] netpoll: netconsole: interface 'eth1'
>      [   30.898568] netpoll: netconsole: remote port 6666
>      [   30.898568] netpoll: netconsole: remote IPv6 address
> 2a02:6b8:b000:605c:e61d:2dff:fe03:3790
>      [   30.898569] netpoll: netconsole: remote ethernet address
> b0:a8:6e:f4:ff:c0
>      [   30.917747] console [netcon0] enabled
>      [   30.917749] netconsole: network logging started
>      [   31.453353] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.185730] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.321840] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.465822] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.597423] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.745417] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   34.877356] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   35.005441] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   35.157376] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   35.289362] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   35.417441] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>      [   37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
> 
>      This patch flushes tx buffers only once when carrier is off
>      rather than at each watchdog iteration.
> 
>      Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
>      Tested-by: Aaron Brown <aar...@in...>
>      Signed-off-by: Jeff Kirsher <jef...@in...>
> 
> A quick review of the patch shows that it is fundamentally flawed
> since all it is doing is moving reset to the path where the link goes
> down. However that doesn't even really resolve the original issue
> since the complaint was that the NIC was resetting because netconsole
> was queueing packets while the link was down. Without the reset the
> packets are just going to queue up on the interface and the first time
> the interface comes up it will trigger a Tx hang message as has been
> seen here.

Not exactly: the reset itself adds new packets to the tx queue, and
this triggers a new NIC reset at the next watchdog iteration.
The link state stays down because the uplink switch reacts with some delay,
and it looks like each reset restarts this delay.

> 
> I would recommend reverting the above patch and then addressing the
> original problem. The question we should be asking is why are we
> enqueueing packets on a ring of the device when it doesn't have link?
> 
> A better fix might be to remove the netif_start_queue in the e1000e_up
> call, replace it with netif_stop_queue in e1000e_open, place a call to
> netif_wake_queue just before the netif_carrier_on in the watchdog
> task, and to add a call to netif_stop_queue just after the
> netif_carrier_off in the watchdog task. That should prevent us from
> enqueuing packets on an interface with no link, and would still allow
> us to flush packets out if they somehow got by all that and were still
> enqueued to the Tx queue.
> 

Yep, this looks like the proper solution.

Re: [E1000-devel] [e1000e REGRESSION BISECTED] Detected Hardware Unit Hang with 5.0.7

From: Joseph Y. <joe...@gm...> - 2019-04-16 17:19:01

On Tue, Apr 16, 2019, 1:12 PM Alexander Duyck <ale...@gm...>
wrote:

> On Mon, Apr 15, 2019 at 11:22 AM Joseph Yasi <joe...@gm...> wrote:
> >
> > Hello,
> > I reported a regression that happened after upgrading from 5.0.6 to
> 5.0.7:
> > https://bugzilla.kernel.org/show_bug.cgi?id=203175
> >
> > This is fixed by reverting commit
> > 7f0a3a436e88a71b96694c029f01a9a8eade3d5d e1000e: fix cyclic resets at
> link
> > up with active tx. A few others have reported the same hang in bugzilla.
> >
> > Thanks,
> > Joe Yasi
> >
> > dmesg of hang:
> > [Sat Apr  6 00:12:10 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
> > Duplex, Flow Control: Rx/Tx
> > [Sat Apr  6 00:12:10 2019] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link
> > becomes ready
> > [Sat Apr  6 00:12:12 2019] e1000e 0000:00:1f.6 eth0: Detected Hardware
> Unit
> > Hang:
> >
> >                              TDH                  <0>
> >
> >                              TDT                  <1>
> >
> >                              next_to_use          <1>
> >
> >                              next_to_clean        <0>
> >
> >                            buffer_info[next_to_clean]:
> >
> >                              time_stamp           <fffba7a7>
> >
> >                              next_to_watch        <0>
> >                              jiffies              <fffbb140>
> >                              next_to_watch.status <0>
> >                            MAC Status             <40080080>
> >                            PHY Status             <7949>
> >                            PHY 1000BASE-T Status  <0>
> >                            PHY Extended Status    <3000>
> >                            PCI Status             <10>
> > [Sat Apr  6 00:12:14 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
> > Duplex, Flow Control: Rx/Tx
> >
> > lspci -vv
> > 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2)
> > I219-V
> >         Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
> >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr-
> > Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > <TAbort- <MAbort- >SERR- <PERR- INTx-
> >         Latency: 0
> >         Interrupt: pin A routed to IRQ 145
> >         Region 0: Memory at df400000 (32-bit, non-prefetchable)
> [size=128K]
> >         Capabilities: [c8] Power Management version 3
> >                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> > PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> >         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> >                 Address: 00000000fee00518  Data: 0000
> >         Capabilities: [e0] PCI Advanced Features
> >                 AFCap: TP+ FLR+
> >                 AFCtrl: FLR-
> >                 AFStatus: TP-
> >         Kernel driver in use: e1000e
> >         Kernel modules: e1000
> >
>
> So the commit ID you reported doesn't match up to the value in the
> kernel. I believe the patch you are talking about is:
> commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61
> Author: Konstantin Khlebnikov <khl...@ya...>
>

It matches the commit ID in the linux-5.0.y stable branch.
Yes, 0f9e980bf5ee1a97e2e401c846b2af989eb21c61 is the upstream commit ID.

> Date:   Mon Jan 14 16:29:30 2019 +0300
>
>     e1000e: fix cyclic resets at link up with active tx
>
>     I'm seeing series of e1000e resets (sometimes endless) at system boot
>     if something generates tx traffic at this time. In my case this is
>     netconsole who sends message "e1000e 0000:02:00.0: Some CPU C-states
>     have been disabled in order to enable jumbo frames" from e1000e itself.
>     As result e1000_watchdog_task sees used tx buffer while carrier is off
>     and start this reset cycle again.
>
>     [   17.794359] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
>     [   17.794714] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>     [   22.936455] e1000e 0000:02:00.0 eth1: changing MTU from 1500 to 9000
>     [   23.033336] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   26.102364] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
>     [   27.174495] 8021q: 802.1Q VLAN Support v1.8
>     [   27.174513] 8021q: adding VLAN 0 to HW filter on device eth1
>     [   30.671724] cgroup: cgroup: disabling cgroup2 socket matching
> due to net_prio or net_cls activation
>     [   30.898564] netpoll: netconsole: local port 6666
>     [   30.898566] netpoll: netconsole: local IPv6 address
> 2a02:6b8:0:80b:beae:c5ff:fe28:23f8
>     [   30.898567] netpoll: netconsole: interface 'eth1'
>     [   30.898568] netpoll: netconsole: remote port 6666
>     [   30.898568] netpoll: netconsole: remote IPv6 address
> 2a02:6b8:b000:605c:e61d:2dff:fe03:3790
>     [   30.898569] netpoll: netconsole: remote ethernet address
> b0:a8:6e:f4:ff:c0
>     [   30.917747] console [netcon0] enabled
>     [   30.917749] netconsole: network logging started
>     [   31.453353] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.185730] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.321840] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.465822] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.597423] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.745417] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   34.877356] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   35.005441] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   35.157376] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   35.289362] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   35.417441] e1000e 0000:02:00.0: Some CPU C-states have been
> disabled in order to enable jumbo frames
>     [   37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
> Flow Control: None
>
>     This patch flushes tx buffers only once when carrier is off
>     rather than at each watchdog iteration.
>
>     Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
>     Tested-by: Aaron Brown <aar...@in...>
>     Signed-off-by: Jeff Kirsher <jef...@in...>
>
> A quick review of the patch shows that it is fundamentally flawed
> since all it is doing is moving reset to the path where the link goes
> down. However that doesn't even really resolve the original issue
> since the complaint was that the NIC was resetting because netconsole
> was queueing packets while the link was down. Without the reset the
> packets are just going to queue up on the interface and the first time
> the interface comes up it will trigger a Tx hang message as has been
> seen here.
>
> I would recommend reverting the above patch and then addressing the
> original problem. The question we should be asking is why are we
> enqueueing packets on a ring of the device when it doesn't have link?
>
> A better fix might be to remove the netif_start_queue in the e1000e_up
> call, replace it with netif_stop_queue in e1000e_open, place a call to
> netif_wake_queue just before the netif_carrier_on in the watchdog
> task, and to add a call to netif_stop_queue just after the
> netif_carrier_off in the watchdog task. That should prevent us from
> enqueuing packets on an interface with no link, and would still allow
> us to flush packets out if they somehow got by all that and were still
> enqueued to the Tx queue.
>

Re: [E1000-devel] [e1000e REGRESSION BISECTED] Detected Hardware Unit Hang with 5.0.7

From: Alexander D. <ale...@gm...> - 2019-04-16 17:12:38

On Mon, Apr 15, 2019 at 11:22 AM Joseph Yasi <joe...@gm...> wrote:
>
> Hello,
> I reported a regression that happened after upgrading from 5.0.6 to 5.0.7:
> https://bugzilla.kernel.org/show_bug.cgi?id=203175
>
> This is fixed by reverting commit
> 7f0a3a436e88a71b96694c029f01a9a8eade3d5d e1000e: fix cyclic resets at link
> up with active tx. A few others have reported the same hang in bugzilla.
>
> Thanks,
> Joe Yasi
>
> dmesg of hang:
> [Sat Apr  6 00:12:10 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
> Duplex, Flow Control: Rx/Tx
> [Sat Apr  6 00:12:10 2019] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link
> becomes ready
> [Sat Apr  6 00:12:12 2019] e1000e 0000:00:1f.6 eth0: Detected Hardware Unit
> Hang:
>
>                              TDH                  <0>
>
>                              TDT                  <1>
>
>                              next_to_use          <1>
>
>                              next_to_clean        <0>
>
>                            buffer_info[next_to_clean]:
>
>                              time_stamp           <fffba7a7>
>
>                              next_to_watch        <0>
>                              jiffies              <fffbb140>
>                              next_to_watch.status <0>
>                            MAC Status             <40080080>
>                            PHY Status             <7949>
>                            PHY 1000BASE-T Status  <0>
>                            PHY Extended Status    <3000>
>                            PCI Status             <10>
> [Sat Apr  6 00:12:14 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full
> Duplex, Flow Control: Rx/Tx
>
> lspci -vv
> 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2)
> I219-V
>         Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0
>         Interrupt: pin A routed to IRQ 145
>         Region 0: Memory at df400000 (32-bit, non-prefetchable) [size=128K]
>         Capabilities: [c8] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                 Address: 00000000fee00518  Data: 0000
>         Capabilities: [e0] PCI Advanced Features
>                 AFCap: TP+ FLR+
>                 AFCtrl: FLR-
>                 AFStatus: TP-
>         Kernel driver in use: e1000e
>         Kernel modules: e1000
>

So the commit ID you reported doesn't match up to the value in the
kernel. I believe the patch you are talking about is:
commit 0f9e980bf5ee1a97e2e401c846b2af989eb21c61
Author: Konstantin Khlebnikov <khl...@ya...>
Date:   Mon Jan 14 16:29:30 2019 +0300

    e1000e: fix cyclic resets at link up with active tx

    I'm seeing series of e1000e resets (sometimes endless) at system boot
    if something generates tx traffic at this time. In my case this is
    netconsole who sends message "e1000e 0000:02:00.0: Some CPU C-states
    have been disabled in order to enable jumbo frames" from e1000e itself.
    As result e1000_watchdog_task sees used tx buffer while carrier is off
    and start this reset cycle again.

    [   17.794359] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex,
Flow Control: None
    [   17.794714] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
    [   22.936455] e1000e 0000:02:00.0 eth1: changing MTU from 1500 to 9000
    [   23.033336] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   26.102364] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [   27.174495] 8021q: 802.1Q VLAN Support v1.8
    [   27.174513] 8021q: adding VLAN 0 to HW filter on device eth1
    [   30.671724] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
    [   30.898564] netpoll: netconsole: local port 6666
    [   30.898566] netpoll: netconsole: local IPv6 address 2a02:6b8:0:80b:beae:c5ff:fe28:23f8
    [   30.898567] netpoll: netconsole: interface 'eth1'
    [   30.898568] netpoll: netconsole: remote port 6666
    [   30.898568] netpoll: netconsole: remote IPv6 address 2a02:6b8:b000:605c:e61d:2dff:fe03:3790
    [   30.898569] netpoll: netconsole: remote ethernet address b0:a8:6e:f4:ff:c0
    [   30.917747] console [netcon0] enabled
    [   30.917749] netconsole: network logging started
    [   31.453353] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.185730] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.321840] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.465822] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.597423] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.745417] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   34.877356] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   35.005441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   35.157376] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   35.289362] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   35.417441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
    [   37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

    This patch flushes tx buffers only once when carrier is off
    rather than at each watchdog iteration.

    Signed-off-by: Konstantin Khlebnikov <khl...@ya...>
    Tested-by: Aaron Brown <aar...@in...>
    Signed-off-by: Jeff Kirsher <jef...@in...>

A quick review of the patch shows that it is fundamentally flawed
since all it is doing is moving the reset to the path where the link goes
down. However, that doesn't even really resolve the original issue,
since the complaint was that the NIC was resetting because netconsole
was queueing packets while the link was down. Without the reset the
packets are just going to queue up on the interface, and the first time
the interface comes up it will trigger a Tx hang message, as has been
seen here.

I would recommend reverting the above patch and then addressing the
original problem. The question we should be asking is why are we
enqueueing packets on a ring of the device when it doesn't have link?

A better fix might be to remove the netif_start_queue in the e1000e_up
call, replace it with netif_stop_queue in e1000e_open, place a call to
netif_wake_queue just before the netif_carrier_on in the watchdog
task, and to add a call to netif_stop_queue just after the
netif_carrier_off in the watchdog task. That should prevent us from
enqueuing packets on an interface with no link, and would still allow
us to flush packets out if they somehow got by all that and were still
enqueued to the Tx queue.
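
The ordering proposed above can be sketched as a small user-space model.
This is only an illustration under stated assumptions: every model_* name
is hypothetical, the struct merely stands in for the state that
netif_stop_queue, netif_wake_queue, netif_carrier_on and netif_carrier_off
would toggle, the flush is simplified to the carrier-off transition, and
none of it is taken from the e1000e sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model. Every model_* name is hypothetical and only
 * mirrors a kernel helper mentioned in the mail; this is not driver code. */
struct model_netdev {
	bool carrier_ok;    /* what netif_carrier_on/off would toggle */
	bool queue_stopped; /* what netif_stop_queue/wake_queue would toggle */
	int tx_pending;     /* packets sitting on the Tx ring */
};

/* Open path: leave the queue stopped until the watchdog reports link. */
static void model_open(struct model_netdev *d)
{
	d->carrier_ok = false;
	d->tx_pending = 0;
	d->queue_stopped = true;   /* stop instead of start */
}

/* Watchdog: wake just before carrier-on, stop just after carrier-off. */
static void model_watchdog(struct model_netdev *d, bool link_up)
{
	if (link_up && !d->carrier_ok) {
		d->queue_stopped = false;  /* models netif_wake_queue */
		d->carrier_ok = true;      /* models netif_carrier_on */
	} else if (!link_up && d->carrier_ok) {
		d->carrier_ok = false;     /* models netif_carrier_off */
		d->queue_stopped = true;   /* models netif_stop_queue */
		d->tx_pending = 0;         /* flush anything that slipped through */
	}
}

/* Transmit attempt: returns true only if the packet reached the ring. */
static bool model_xmit(struct model_netdev *d)
{
	if (d->queue_stopped)
		return false;
	d->tx_pending++;
	return true;
}

/* Scenario from the thread: a sender (e.g. netconsole) tries to transmit
 * while the link is down, then the link comes up and later drops. */
static void model_demo(void)
{
	struct model_netdev d;

	model_open(&d);
	assert(!model_xmit(&d));        /* no link yet: nothing is queued */
	assert(d.tx_pending == 0);

	model_watchdog(&d, true);       /* link comes up */
	assert(model_xmit(&d));
	assert(d.tx_pending == 1);

	model_watchdog(&d, false);      /* link goes down */
	assert(d.queue_stopped && d.tx_pending == 0);
	assert(!model_xmit(&d));
}
```

Calling model_demo() from a main() walks through the case described in the
thread: with the queue held stopped until the watchdog sees link, nothing
lands on the ring while the link is down, so there is nothing stale to
trigger a Tx hang report when the link comes up.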

[E1000-devel] [e1000e REGRESSION BISECTED] Detected Hardware Unit Hang with 5.0.7

From: Joseph Y. <joe...@gm...> - 2019-04-14 16:44:01

Hello,
I reported a regression that happened after upgrading from 5.0.6 to 5.0.7:
https://bugzilla.kernel.org/show_bug.cgi?id=203175

This is fixed by reverting commit
7f0a3a436e88a71b96694c029f01a9a8eade3d5d ("e1000e: fix cyclic resets at
link up with active tx"). A few others have reported the same hang in Bugzilla.

Thanks,
Joe Yasi

dmesg of hang:
[Sat Apr  6 00:12:10 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[Sat Apr  6 00:12:10 2019] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[Sat Apr  6 00:12:12 2019] e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
                             TDH                  <0>
                             TDT                  <1>
                             next_to_use          <1>
                             next_to_clean        <0>
                           buffer_info[next_to_clean]:
                             time_stamp           <fffba7a7>
                             next_to_watch        <0>
                             jiffies              <fffbb140>
                             next_to_watch.status <0>
                           MAC Status             <40080080>
                           PHY Status             <7949>
                           PHY 1000BASE-T Status  <0>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
[Sat Apr  6 00:12:14 2019] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
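
As a reading aid for the dump above: time_stamp and jiffies are both tick
counts, so their difference indicates how long the descriptor at
next_to_clean had been pending when the hang was reported. The sketch
below does that arithmetic. It assumes the two values can be treated as
32-bit counters and, for the millisecond conversion, a tick rate of
HZ=1000; neither is stated in the report, and the model_* helper names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two 32-bit tick samples. Unsigned subtraction
 * gives the right answer even if the counter wrapped in between. */
static uint32_t model_ticks_between(uint32_t earlier, uint32_t later)
{
	return later - earlier;
}

/* Convert ticks to milliseconds for an assumed tick rate hz (ticks/s). */
static uint32_t model_ticks_to_ms(uint32_t ticks, uint32_t hz)
{
	assert(hz != 0);
	return (uint32_t)(((uint64_t)ticks * 1000u) / hz);
}
```

With the values from the dump, model_ticks_between(0xfffba7a7, 0xfffbb140)
is 0x999, i.e. 2457 ticks. Under the HZ=1000 assumption that is roughly
2.5 seconds, which is in line with the hang being logged about two seconds
after the link-up message.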

lspci -vv
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
        Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 145
        Region 0: Memory at df400000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00518  Data: 0000
        Capabilities: [e0] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: e1000e
        Kernel modules: e1000

[E1000-devel] Hallo

From: Mark W. <ns...@ce...> - 2019-04-09 21:28:21

Hello,

I work at Credit Suisse in London investment banking. I came across your contact details during a private search. I am firmly convinced that you are very honest, committed, and able to support me in this business.

I am contacting you to act as the beneficiary for our bank's deceased customer, who left funds worth £15,800,000 (fifteen million, eight hundred thousand pounds), so that the full amount of his deposit is released and paid out to you as the beneficiary of the deceased.

I will give you more information about this fund as soon as you reply with your details below.

Full name
Mobile phone number

Awaiting your mail.

Regards,
Mark Woolston

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Andy S. <and...@li...> - 2019-04-09 15:31:38

On Mon, Apr 08, 2019 at 08:43:10PM +0200, Hans de Goede wrote:
> On 08-04-19 19:21, Andy Shevchenko wrote:
> > On Thu, Apr 04, 2019 at 04:43:03PM +0200, Hans de Goede wrote:
> > > On 29-03-19 16:53, Hans de Goede wrote:
> > > > On 3/29/19 2:59 PM, Семен Верченко wrote:
> > 
> > > > Hmm, so 4 ethernet cards and 4 enabled / marked as critical clocks.
> > > > 
> > > > Supporting this through get_clk is going to require a DMI table in the igb driver
> > > > combined with checking which PCI "slot" the card is to get the correct clock
> > > > for each ethernet controller.
> > > > 
> > > > I believe that just restoring the old behavior to mark all clocks enabled
> > > > on boot as critical, but then limited to this system based on a dmi match,
> > > > is the best solution here.
> > > > 
> > > > Andy?
> > > 
> > > Andy? Now that we've the patch ready for the other system which needs to
> > > have the CLK_IS_CRITICAL workaround and enables this based on DMI info,
> > > I believe the best fix for this system is to simply add it to that DMI
> > > table?
> > 
> > I reviewed v4, supposed to go via CLK tree.
> 
> Right, but that patch adds the quirk for the system with the USB hub,
> do you agree, that given that each ethernet controller seems to be
> using its own clock, it is best to use a DMI quirk for this case too?
> 
> If you agree then someone needs to prepare a follow-up patch on top of
> v4 which adds the DMI info for this board to the table.

I hope we may find a better solution in the future, but for now, as a quick fix,
the proposed approach can be used.

-- 
With Best Regards,
Andy Shevchenko

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Hans de G. <hde...@re...> - 2019-04-08 18:43:22

Hi,

On 08-04-19 19:21, Andy Shevchenko wrote:
> On Thu, Apr 04, 2019 at 04:43:03PM +0200, Hans de Goede wrote:
>> On 29-03-19 16:53, Hans de Goede wrote:
>>> On 3/29/19 2:59 PM, Семен Верченко wrote:
> 
>>> Hmm, so 4 ethernet cards and 4 enabled / marked as critical clocks.
>>>
>>> Supporting this through get_clk is going to require a DMI table in the igb driver
>>> combined with checking which PCI "slot" the card is to get the correct clock
>>> for each ethernet controller.
>>>
>>> I believe that just restoring the old behavior to mark all clocks enabled
>>> on boot as critical, but then limited to this system based on a dmi match,
>>> is the best solution here.
>>>
>>> Andy?
>>
>> Andy? Now that we've the patch ready for the other system which needs to
>> have the CLK_IS_CRITICAL workaround and enables this based on DMI info,
>> I believe the best fix for this system is to simply add it to that DMI
>> table?
> 
> I reviewed v4, supposed to go via CLK tree.

Right, but that patch adds the quirk for the system with the USB hub,
do you agree, that given that each ethernet controller seems to be
using its own clock, it is best to use a DMI quirk for this case too?

If you agree then someone needs to prepare a follow-up patch on top of
v4 which adds the DMI info for this board to the table.

Regards,

Hans
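
The DMI-quirk approach discussed in this message can be pictured with a
small user-space model: a table of board identifiers, and a pass that
marks every clock found enabled at boot as critical only when the running
board is in that table. Everything here is a hypothetical stand-in. The
vendor and product strings are placeholders (the thread does not give the
real DMI strings), and the model_* types and functions are not the
kernel's DMI or clock APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a DMI match table. */
struct model_dmi_id {
	const char *sys_vendor;
	const char *product_name;
};

/* Placeholder entries: the real board identifiers are not in the thread. */
static const struct model_dmi_id model_critical_clk_boards[] = {
	{ "ExampleVendor", "BoardWithUsbHub" },
	{ "ExampleVendor", "BoardWithFourNics" },
};

/* Minimal clock state needed for the illustration. */
struct model_clk {
	bool enabled_at_boot;
	bool critical;
};

/* Returns true if the given board appears in the quirk table. */
static bool model_dmi_match(const char *vendor, const char *product)
{
	size_t n = sizeof(model_critical_clk_boards) /
		   sizeof(model_critical_clk_boards[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(vendor, model_critical_clk_boards[i].sys_vendor) == 0 &&
		    strcmp(product, model_critical_clk_boards[i].product_name) == 0)
			return true;
	}
	return false;
}

/* On a matching board, mark every clock that was enabled at boot as
 * critical. Returns the number of clocks newly marked. */
static size_t model_mark_boot_clocks_critical(struct model_clk *clks, size_t n,
					       const char *vendor,
					       const char *product)
{
	size_t marked = 0;

	assert(clks != NULL || n == 0);
	if (!model_dmi_match(vendor, product))
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (clks[i].enabled_at_boot && !clks[i].critical) {
			clks[i].critical = true;
			marked++;
		}
	}
	return marked;
}
```

On a board that is not in the table the function changes nothing, which is
the point of limiting the old "enabled at boot means critical" behavior to
specific systems. Adding another board then only means adding another
table entry.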

Re: [E1000-devel] igb driver with Intel Atom Bay Trail issue

From: Andy S. <and...@li...> - 2019-04-08 17:21:22

On Thu, Apr 04, 2019 at 04:43:03PM +0200, Hans de Goede wrote:
> On 29-03-19 16:53, Hans de Goede wrote:
> > On 3/29/19 2:59 PM, Семен Верченко wrote:

> > Hmm, so 4 ethernet cards and 4 enabled / marked as critical clocks.
> > 
> > Supporting this through get_clk is going to require a DMI table in the igb driver
> > combined with checking which PCI "slot" the card is to get the correct clock
> > for each ethernet controller.
> > 
> > I believe that just restoring the old behavior to mark all clocks enabled
> > on boot as critical, but then limited to this system based on a dmi match,
> > is the best solution here.
> > 
> > Andy?
> 
> Andy? Now that we've the patch ready for the other system which needs to
> have the CLK_IS_CRITICAL workaround and enables this based on DMI info,
> I believe the best fix for this system is to simply add it to that DMI
> table?

I reviewed v4, supposed to go via CLK tree.

-- 
With Best Regards,
Andy Shevchenko
