e1000-devel Mailing List for Formerly Intel Ethernet Drivers

Moved to github.com/intel

e1000-devel — Discussion of the Intel Ethernet out-of-tree drivers

Re: [E1000-devel] Sometimes e1000e doesn't find all interfaces

From: Allan, B. W <bru...@in...> - 2010-07-30 20:59:25

On Thursday, July 22, 2010 1:19 PM, Brian De Wolf wrote:
> On Thu, 22 Jul 2010 11:41:51 -0700
> "Allan, Bruce W" <bru...@in...> wrote:
> 
>> 
>> Strange that the console kernel parameters apparently cause this.
>> 
> 
> It doesn't look like it's that deterministic, unfortunately.  While
> getting the data you asked for, it worked once with the console
> parameters added, and it also failed once with the console
> parameters removed. These seem to be exceptions, though, as all the
> other times it acted as I expected. For now, it serves as a good
> trigger for the condition, at least.
> 
>> Can you provide the full (not just the ethernet devices) lspci -t and
>> lspci -vvv outputs for when it is working and when it is not?  One
>> or more PCI bridges might not be recognized by the system and if
>> that is the case there is no way any PCI device hanging off the
>> bridge will be detected (not much can be done about that by the
>> driver). 
> 
> Alright, I made these two sets of data by running these two commands
> while rebooting:
> lspci -vvv &> /root/lspci-$(ifconfig -a | wc -l)-$(date +%s)
> lspci -t &> /root/lspci-t-$(ifconfig -a | wc -l)-$(date +%s)
> 
> The full outputs for lspci -vvv got to around 50k, so I gzipped them.

Apparently, in the case where your 80003ES2LAN dual-port adapter does not show up, the PCI Express downstream port to which the adapter is attached (enumerated as 02:02.0 in the good case) is not even detected.  It looks like the downstream PCIe port may have caused a master abort on the upstream PCI bridge (00:02.0).  I'm not sure there is anything we can do from the driver's perspective when the PCIe port is not properly detected and operational.  I assume this is an on-board adapter (LOM) rather than an add-in NIC, which is unfortunate since you cannot move it to another PCIe slot.

You might want to contact your hardware vendor (since this may be a hardware issue), or the lin...@vg... and/or lin...@vg... mailing lists to bring this up with the PCI and ACPI developers/maintainers, respectively (they, too, will probably want to see the lspci outputs and maybe the output from dmidecode).  Feel free to keep e1000-devel on the distribution list if you'd like.

Sorry I couldn't be of more help,
Bruce.
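
The comparison described above (which devices appear in the good lspci capture but not in the bad one) can be sketched in Python. This is only an illustration under assumptions: the function names are made up for this example, and it relies on lspci printing each device on a line that starts with its bus address, with detail lines indented beneath it.

```python
import re

# A device header line starts with a bus address such as "02:02.0",
# optionally prefixed by a PCI domain such as "0000:".
DEVICE_LINE = re.compile(
    r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$"
)


def parse_devices(lspci_text):
    """Map bus address -> description for every device header line."""
    devices = {}
    for line in lspci_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:  # indented detail lines do not match and are skipped
            devices[match.group(1)] = match.group(2)
    return devices


def missing_devices(good_text, bad_text):
    """Return devices present in the good capture but absent in the bad one."""
    good = parse_devices(good_text)
    bad = parse_devices(bad_text)
    return {addr: desc for addr, desc in good.items() if addr not in bad}
```

Feeding it the contents of one working and one failing capture should list the bridge or port that disappeared, along with everything that hangs off it.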

Re: [E1000-devel] e1000e crashes with 2.6.34.x and ThinkPad T60

From: Allan, B. W <bru...@in...> - 2010-07-30 17:42:13

On Friday, July 30, 2010 5:56 AM, Marc Haber wrote:
> On Mon, Jul 26, 2010 at 09:13:45AM -0700, Allan, Bruce W wrote:
>> Adding e1000-devel (the Intel LAN developers list).
>> 
>> Please supply the full dmesg you meant to attach with the original
>> report, as well as the output of lspci -vvv.
> 
> Stupid me.
> 
> Greetings
> Marc

Please also provide an eeprom dump from the wired LOM via 'ethtool -e ethX'.

Thanks,
Bruce.

Re: [E1000-devel] e1000e crashes with 2.6.34.x and ThinkPad T60

From: Marc H. <mh+...@zu...> - 2010-07-30 13:36:01

Attachments: lspci-vvv dmesg.e1000e

On Mon, Jul 26, 2010 at 09:13:45AM -0700, Allan, Bruce W wrote:
> Adding e1000-devel (the Intel LAN developers list).
> 
> Please supply the full dmesg you meant to attach with the original
> report, as well as the output of lspci -vvv.

Stupid me.

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Mannheim, Germany  |  lose things."    Winona Ryder | Fon: *49 621 72739834
Nordisch by Nature |  How to make an American Quilt | Fax: *49 3221 2323190

Re: [E1000-devel] Problem with e1000e, 802.1Q VLAN's and IPMI

From: Andrey P. <pa...@ce...> - 2010-07-30 07:41:45

On 208, 07 27, 2010 at 04:05:52 +0400, Andrey Panin wrote:
> On 207, 07 26, 2010 at 09:45:06AM -0700, Brandeburg, Jesse wrote:
> > 
> > On Mon, 26 Jul 2010, Andrey Panin wrote:
> > > I have a problem using IPMI on Super Micro X7SBL motherboard (AOC-IPMI20-E BMC).
> > > BMC shares ethernet port with onboard e1000e. It works when IPMI traffic is 
> > > untagged (CrcStripping=0 module option used), but if I try to use 802.1Q vlan
> > > for IPMI traffic BMC stops responding right after "ifup eth0". 
> > 
> > We've heard of this issue (or similar) before.  I believe for the other 
> > guys they had to run IPMI traffic untagged, if your main network is 
> > running untagged.  The problem in this case is that the tags are being 
> > stripped in hardware even for the SMBUS packets, at which point the BMC 
> > that is stupid and doesn't understand offloading hardware gets confused 
> > that the traffic it is receiving doesn't have a vlan tag.
> 
> I suspected something like that. Unfortunately Super Micro isn't interested
> in fixing their BMC, at least latest firmware doesn't fix the issue.
> 
> > > I tested 2.6.26 (debian stable kernel) and 2.6.35-rc6.
> > > Looks like there is some problem with 802.1Q tags stripping/inserting.
> > 
> > Thanks for testing the latest kernel.
> > 
> > > There was no such problem with e1000 driver from 2.6.18.
> > 
> > Hm, that is an interesting statement.  The major changes to the driver 
> > include stripping tags all the time when not in promisc mode, and there 
> > have been some fixes and attempts to fix the ipmi issues some 
> > (particularly supermicro) BMCs have.  I'm quite surprised it worked for 
> > you at all (there have been quite a few issues in this area)
> 
> After some sleep I'm not so sure that it was really working :(
> I'll retest it more thoroughly this evening.

After retest I can confirm that tagged IPMI works with 2.6.18 e1000 driver.
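
To make "tags being stripped in hardware" concrete, here is a minimal Python sketch of what removing an 802.1Q tag does to a raw Ethernet frame. It is an illustration only, not driver code: the function name strip_vlan_tag is made up for this example. The only facts relied on are from IEEE 802.1Q: the tag is 4 bytes inserted after the source MAC address, it begins with TPID 0x8100, and the low 12 bits of the following 16-bit field carry the VLAN ID.

```python
import struct

TPID_8021Q = 0x8100   # EtherType value that marks an 802.1Q tag
MAC_HEADER_LEN = 12   # destination MAC (6 bytes) + source MAC (6 bytes)
VLAN_TAG_LEN = 4      # TPID (2 bytes) + TCI (2 bytes)


def strip_vlan_tag(frame):
    """Return (vlan_id, untagged_frame); vlan_id is None if no tag is present."""
    if len(frame) >= MAC_HEADER_LEN + VLAN_TAG_LEN + 2:
        tpid, tci = struct.unpack_from("!HH", frame, MAC_HEADER_LEN)
        if tpid == TPID_8021Q:
            vlan_id = tci & 0x0FFF  # low 12 bits of the TCI
            untagged = frame[:MAC_HEADER_LEN] + frame[MAC_HEADER_LEN + VLAN_TAG_LEN:]
            return vlan_id, untagged
    return None, frame
```

A consumer that sees only the stripped frame has no way to recover the VLAN ID from the frame itself, which is consistent with the BMC confusion described in the thread.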

Re: [E1000-devel] getting max throughput from AMD Opteron w/igb-2.2.9

From: Alexander D. <ale...@in...> - 2010-07-28 17:05:45

Ed Ravin wrote:
> On Tue, Jul 27, 2010 at 09:42:16AM -0700, Alexander Duyck wrote:
> 
>> The fact that the ring size seems to affect the number of packets  
>> dropped per second implies that there may be some sort of latency issue.  
>>  One thing you might try is different values for rx-usecs via ethtool  
>> -C.  You may find that fixing the value at something fairly low like 33  
>> usecs per interrupt may help to reduce the number of rx_fifo_errors.
> 
> "ethtool -C eth0 rx-usecs 33" is accepted, but "ethtool -c eth0" shows
> the values unchanged.  This is with igb-2.2.9.

I tried to reproduce the issue with rx-usecs here but didn't have much 
luck.  What version of ethtool are you currently running?  Perhaps the 
driver is having an issue with a specific version of ethtool.

>> Another factor you will need to take into account is that the ring  
>> memory should be allocated on the same node the hardware is on.  You  
>> should be able to accomplish that by using taskset with the correct CPU  
>> mask for the physical ID you are using when calling modprobe/insmod and  
>> the ifconfig commands to bring up the interfaces.  This should help to  
>> decrease the memory latency and increase the throughput available to the  
>> adapter.
> 
> The taskset/modprobe trick along with putting all the queues on the same
> physical CPU seems to provide the best performance, when using these
> igb-2.2.9 settings:
> 
>  taskset 0002 modprobe igb RSS=0,0 InterruptThrottleRate=3,3
> 
>  root@big-tester:~# eth_affinity_tool show eth0 eth1
>  16 CPUs detected
> 
>  eth0:  ffff 0001 0002 0004 0008 0010 0020 0040 0080
>  eth1:  ffff 0001 0002 0004 0008 0010 0020 0040 0080
> 

Based on this it sounds like the PCIe bus for the network interface is 
likely connected to node 0 on your system.  So for performance reasons 
we will want to keep everything within CPU mask 0xFF, so that both 
the memory and the PCIe bus stay local to the node.
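
For readers unfamiliar with the masks used with taskset and interrupt affinity: they are hexadecimal bitmaps in which bit n stands for CPU n. A small Python sketch of the conversion follows; the helper names are made up for this example.

```python
def cpus_to_mask(cpus):
    """Build an affinity bitmask with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask):
    """List the CPU numbers whose bits are set in an affinity bitmask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(f"{cpus_to_mask(range(8)):#x}")  # CPUs 0-7 -> 0xff
print(mask_to_cpus(0x0002))            # mask 0002 -> [1]
```

So the mask 0002 passed to taskset in the quoted command selects CPU 1 only, while 0xFF covers CPUs 0 through 7.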

>> Thanks for the information.  One other item I would be interested in  
>> seeing is the kind of numbers we are talking about.  If you could  
>> provide me with an ethtool -S dump from 10 seconds of one of your tests  
>> that might be useful for me to better understand the kind of pressures  
>> the system is under.
> 
> Here's a sample 10-second run of "ethtool -S" on the receiving interface,
> after being piped through beforeafter, so these are 10 seconds of each
> counter.  Counters that are missing were zero, i.e., no changes in the
> 10 second run.  So dividing these numbers by 10 gives you the per-second
> rate.
> 
> In this interval, we're sending far more packets than
> can be processed:
> 
> Interval from 20100727.174619 to 20100727.174629
> 
> NIC statistics:
>      rx_packets: 11595293
>      rx_bytes: 742098688
>      rx_long_byte_count: 742098688
>      rx_fifo_errors: 163800
>      rx_queue_0_packets: 754216
>      rx_queue_0_bytes: 45252960
>      rx_queue_0_drops: 20475
>      rx_queue_1_packets: 760734
>      rx_queue_1_bytes: 45644040
>      rx_queue_1_drops: 20475
>      rx_queue_2_packets: 736546
>      rx_queue_2_bytes: 44192760
>      rx_queue_2_drops: 20475
>      rx_queue_3_packets: 742368
>      rx_queue_3_bytes: 44542080
>      rx_queue_3_drops: 20475
>      rx_queue_4_packets: 661758
>      rx_queue_4_bytes: 39705480
>      rx_queue_4_drops: 20475
>      rx_queue_5_packets: 713095
>      rx_queue_5_bytes: 42785706
>      rx_queue_5_drops: 20475
>      rx_queue_6_packets: 696702
>      rx_queue_6_bytes: 41802120
>      rx_queue_6_drops: 20475
>      rx_queue_7_packets: 705726
>      rx_queue_7_bytes: 42343560
>      rx_queue_7_drops: 20475
> 
> And in this interval, we're sending packets just a bit faster than they
> can be processed:
> 
> Interval from 20100727.175747 to 20100727.175757
> 
> NIC statistics:
>      rx_packets: 3877553
>      rx_bytes: 248163392
>      rx_long_byte_count: 248163392
>      rx_fifo_errors: 6515
>      rx_queue_0_packets: 484608
>      rx_queue_0_bytes: 29076480
>      rx_queue_0_drops: 81
>      rx_queue_1_packets: 484690
>      rx_queue_1_bytes: 29081400
>      rx_queue_2_packets: 484690
>      rx_queue_2_bytes: 29081400
>      rx_queue_3_packets: 484360
>      rx_queue_3_bytes: 29061600
>      rx_queue_3_drops: 342
>      rx_queue_4_packets: 483207
>      rx_queue_4_bytes: 28992420
>      rx_queue_4_drops: 1497
>      rx_queue_5_packets: 480183
>      rx_queue_5_bytes: 28810980
>      rx_queue_5_drops: 4521
>      rx_queue_6_packets: 484615
>      rx_queue_6_bytes: 29076900
>      rx_queue_6_drops: 74
>      rx_queue_7_packets: 484690
>      rx_queue_7_bytes: 29081400
> 
> Again, to keep things readable, my counter-processing script doesn't
> list statistics that didn't increment during the 10 second interval,
> so the stats from "ethtool -S" not listed were all zero.
> 
> Note that the traffic I'm using to get these numbers (lots of small UDP
> packets)  is a denial-of-service scenario - we're more interested
> in real-world performance for routing but that's harder to simulate, and
> we need to be able to handle DoS attacks so this is the benchmark we're
> using.

Based on this I would think that a single CPU should be able to handle 
routing at this rate.  One thought that occurred to me is that we might 
be spreading the load too wide, which could cause extra latency since 
the queues are generating interrupts instead of polling.  Another thing 
to try would be reducing the number of queues to either 1 or 2 and 
probably disabling QueuePairs.  You might try RX on CPU0 and TX on CPU1, 
and if that consumes an entire CPU you might try putting another RX on 
CPU2 and TX on CPU3.

Thanks,

Alex

Re: [E1000-devel] how do I disable sending of carrier on 82576 and 82599 XAUI?

From: Chris F. <chr...@ge...> - 2010-07-28 16:12:06

On 07/27/2010 02:50 PM, Peter P Waskiewicz Jr wrote:

> There is no proper way to shut down XAUI for 82599 unfortunately.  What
> we've recommended to other people is to set AUTOC.LMS to 000b (1G link,
> no auto-neg).  That should be enough to "break" the physical link on the
> wire, then write AUTOC.Restart_AN.

In our system it appears that setting AUTOC.LMS to 000b ends up causing
the far end to synchronize at 1Gb/s.  Too smart for its own good, I guess.

Some good news though...apparently my earlier experiments were being
done at the wrong location in the code and were being overridden by
subsequent code.  Once I put it in a suitable place, setting AUTOC.LMS
to IXGBE_AUTOC_LMS_10G_SERIAL was enough to "break" the link.

Thanks for the help,

Chris

-- 
Chris Friesen
Software Developer
GENBAND
chr...@ge...
www.genband.com

Re: [E1000-devel] getting max throughput from AMD Opteron w/igb-2.2.9

From: Ed R. <er...@pa...> - 2010-07-28 05:01:13

On Tue, Jul 27, 2010 at 09:42:16AM -0700, Alexander Duyck wrote:

> The fact that the ring size seems to affect the number of packets  
> dropped per second implies that there may be some sort of latency issue.  
>  One thing you might try is different values for rx-usecs via ethtool  
> -C.  You may find that fixing the value at something fairly low like 33  
> usecs per interrupt may help to reduce the number of rx_fifo_errors.

"ethtool -C eth0 rx-usecs 33" is accepted, but "ethtool -c eth0" shows
the values unchanged.  This is with igb-2.2.9.


> After looking over your lspci dump I am assuming you are running a  
> Supermicro motherboard with the AMD SR5690/SP5100 chipset.  If that is  
> the case you will probably find that one physical ID works much better  
> than the other for network performance because the SR5690 that the 82576  
> is connected to is going to be node local for one of the sockets and  
> remote for the other.
>
> Another factor you will need to take into account is that the ring  
> memory should be allocated on the same node the hardware is on.  You  
> should be able to accomplish that by using taskset with the correct CPU  
> mask for the physical ID you are using when calling modprobe/insmod and  
> the ifconfig commands to bring up the interfaces.  This should help to  
> decrease the memory latency and increase the throughput available to the  
> adapter.

The taskset/modprobe trick along with putting all the queues on the same
physical CPU seems to provide the best performance, when using these
igb-2.2.9 settings:

 taskset 0002 modprobe igb RSS=0,0 InterruptThrottleRate=3,3

 root@big-tester:~# eth_affinity_tool show eth0 eth1
 16 CPUs detected

 eth0:  ffff 0001 0002 0004 0008 0010 0020 0040 0080
 eth1:  ffff 0001 0002 0004 0008 0010 0020 0040 0080

> Thanks for the information.  One other item I would be interested in  
> seeing is the kind of numbers we are talking about.  If you could  
> provide me with an ethtool -S dump from 10 seconds of one of your tests  
> that might be useful for me to better understand the kind of pressures  
> the system is under.

Here's a sample 10-second run of "ethtool -S" on the receiving interface,
after being piped through beforeafter, so these are 10 seconds of each
counter.  Counters that are missing were zero, i.e., no changes in the
10 second run.  So dividing these numbers by 10 gives you the per-second
rate.

In this interval, we're sending far more packets than
can be processed:

Interval from 20100727.174619 to 20100727.174629

NIC statistics:
     rx_packets: 11595293
     rx_bytes: 742098688
     rx_long_byte_count: 742098688
     rx_fifo_errors: 163800
     rx_queue_0_packets: 754216
     rx_queue_0_bytes: 45252960
     rx_queue_0_drops: 20475
     rx_queue_1_packets: 760734
     rx_queue_1_bytes: 45644040
     rx_queue_1_drops: 20475
     rx_queue_2_packets: 736546
     rx_queue_2_bytes: 44192760
     rx_queue_2_drops: 20475
     rx_queue_3_packets: 742368
     rx_queue_3_bytes: 44542080
     rx_queue_3_drops: 20475
     rx_queue_4_packets: 661758
     rx_queue_4_bytes: 39705480
     rx_queue_4_drops: 20475
     rx_queue_5_packets: 713095
     rx_queue_5_bytes: 42785706
     rx_queue_5_drops: 20475
     rx_queue_6_packets: 696702
     rx_queue_6_bytes: 41802120
     rx_queue_6_drops: 20475
     rx_queue_7_packets: 705726
     rx_queue_7_bytes: 42343560
     rx_queue_7_drops: 20475

And in this interval, we're sending packets just a bit faster than they
can be processed:

Interval from 20100727.175747 to 20100727.175757

NIC statistics:
     rx_packets: 3877553
     rx_bytes: 248163392
     rx_long_byte_count: 248163392
     rx_fifo_errors: 6515
     rx_queue_0_packets: 484608
     rx_queue_0_bytes: 29076480
     rx_queue_0_drops: 81
     rx_queue_1_packets: 484690
     rx_queue_1_bytes: 29081400
     rx_queue_2_packets: 484690
     rx_queue_2_bytes: 29081400
     rx_queue_3_packets: 484360
     rx_queue_3_bytes: 29061600
     rx_queue_3_drops: 342
     rx_queue_4_packets: 483207
     rx_queue_4_bytes: 28992420
     rx_queue_4_drops: 1497
     rx_queue_5_packets: 480183
     rx_queue_5_bytes: 28810980
     rx_queue_5_drops: 4521
     rx_queue_6_packets: 484615
     rx_queue_6_bytes: 29076900
     rx_queue_6_drops: 74
     rx_queue_7_packets: 484690
     rx_queue_7_bytes: 29081400

Again, to keep things readable, my counter-processing script doesn't
list statistics that didn't increment during the 10 second interval,
so the stats from "ethtool -S" not listed were all zero.
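
The arithmetic described above (interval deltas divided by 10 to get per-second rates) can be sketched in Python. This is not the counter-processing script mentioned in this message; the function names are made up for this example, and the sample is an excerpt of the second interval's counters.

```python
def parse_counters(ethtool_text):
    """Parse 'name: value' lines of counter deltas into a dict of ints."""
    counters = {}
    for line in ethtool_text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():  # skips the "NIC statistics:" header
            counters[name.strip()] = int(value)
    return counters


def per_second(counters, interval_seconds):
    """Convert interval deltas into per-second rates."""
    return {name: value / interval_seconds for name, value in counters.items()}


def total_queue_drops(counters):
    """Sum the rx_queue_N_drops counters across all queues."""
    return sum(
        value
        for name, value in counters.items()
        if name.startswith("rx_queue_") and name.endswith("_drops")
    )


SAMPLE = """\
NIC statistics:
     rx_packets: 3877553
     rx_fifo_errors: 6515
     rx_queue_0_drops: 81
     rx_queue_3_drops: 342
     rx_queue_4_drops: 1497
     rx_queue_5_drops: 4521
     rx_queue_6_drops: 74
"""

counters = parse_counters(SAMPLE)
print(per_second(counters, 10)["rx_packets"])  # 387755.3 packets per second
print(total_queue_drops(counters))             # 6515
```

In both intervals shown, the per-queue drop counters add up to rx_fifo_errors (8 x 20475 = 163800 in the first, 6515 in the second).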

Note that the traffic I'm using to get these numbers (lots of small UDP
packets)  is a denial-of-service scenario - we're more interested
in real-world performance for routing but that's harder to simulate, and
we need to be able to handle DoS attacks so this is the benchmark we're
using.

Thanks,

	-- Ed

Re: [E1000-devel] [patch -next] ixgbe: potential null dereference

From: David M. <da...@da...> - 2010-07-28 03:48:40

From: Jeff Kirsher <jef...@in...>
Date: Tue, 27 Jul 2010 17:10:02 -0700

> On Tue, Jul 27, 2010 at 03:05, Dan Carpenter <er...@gm...> wrote:
>> The e_dev_err() macro dereferences "adapter" which is NULL here.
>>
>> Signed-off-by: Dan Carpenter <er...@gm...>
> 
> Acked-by: Jeff Kirsher <jef...@in...>

Applied.

Re: [E1000-devel] [patch -next] ixgbe: potential null dereference

From: Jeff K. <jef...@in...> - 2010-07-28 00:10:09

On Tue, Jul 27, 2010 at 03:05, Dan Carpenter <er...@gm...> wrote:
> The e_dev_err() macro dereferences "adapter" which is NULL here.
>
> Signed-off-by: Dan Carpenter <er...@gm...>

Acked-by: Jeff Kirsher <jef...@in...>

Re: [E1000-devel] [PATCHv3] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Jeff K. <jef...@in...> - 2010-07-27 23:08:25

On Tue, Jul 27, 2010 at 16:05, Sonny Rao <son...@us...> wrote:
> On Tue, Jul 27, 2010 at 03:45:42PM -0700, Jeff Kirsher wrote:
>>
>> You also seem to be missing igb.
>
> This patch is similar to what was fixed in ixgbe in this patch:
>
> http://marc.info/?l=e1000-devel&m=126593062701537&w=3
>
> We should add read memory barriers to all the similar cases across the
> Intel ethernet driver family.  In the case of ixgbevf, igb, and igbvf
> I've also added a missing barrier to the clean_tx_irq path because I
> missed it in my last patch.
>
> Without the barrier a processor can speculate a load ahead of the load
> which looks at the status bit and get stale information causing a
> number of different issues including invalid packet length, NULL
> pointers, or bad data since checksumming was assumed to be done
> in hardware.
>
> v2: I missed the e100 the first time
> v3: I missed igb and igbvf, third time's the charm?
>
> Signed-off-by: Milton Miller <mi...@bg...>
> Signed-off-by: Sonny Rao <son...@us...>
> cc: stable <st...@ke...>
>

Thanks, I have added the patch to my queue.

-- 
Cheers,
Jeff

[E1000-devel] [PATCHv3] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Sonny R. <son...@us...> - 2010-07-27 23:04:44

On Tue, Jul 27, 2010 at 03:45:42PM -0700, Jeff Kirsher wrote:
> 
> You also seem to be missing igb.

This patch is similar to what was fixed in ixgbe in this patch:

http://marc.info/?l=e1000-devel&m=126593062701537&w=3

We should add read memory barriers to all the similar cases across the
Intel ethernet driver family.  In the case of ixgbevf, igb, and igbvf
I've also added a missing barrier to the clean_tx_irq path because I
missed it in my last patch.

Without the barrier a processor can speculate a load ahead of the load
which looks at the status bit and get stale information causing a
number of different issues including invalid packet length, NULL
pointers, or bad data since checksumming was assumed to be done
in hardware.

v2: I missed the e100 the first time
v3: I missed igb and igbvf, third time's the charm?

Signed-off-by: Milton Miller <mi...@bg...>
Signed-off-by: Sonny Rao <son...@us...>
cc: stable <st...@ke...>

Index: linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000/e1000_main.c	2010-07-27 16:15:18.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c	2010-07-27 16:22:21.000000000 -0500
@@ -3638,6 +3638,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -3844,6 +3845,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/e1000e/netdev.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000e/netdev.c	2010-07-27 16:22:38.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000e/netdev.c	2010-07-27 16:25:23.000000000 -0500
@@ -774,6 +774,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -1081,6 +1082,7 @@ static bool e1000_clean_rx_irq_ps(struct
 			break;
 		(*work_done)++;
 		skb = buffer_info->skb;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		/* in the packet split case this is header only */
 		prefetch(skb->data - NET_IP_ALIGN);
@@ -1280,6 +1282,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:27:23.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:41:57.000000000 -0500
@@ -1977,6 +1977,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *a
 			break;
 
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 		status = rx_desc->status;
 		skb = buffer_info->skb;
 		buffer_info->skb = NULL;
Index: linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:30:51.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:40:19.000000000 -0500
@@ -231,6 +231,7 @@ static bool ixgbevf_clean_tx_irq(struct 
 	while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
 	       (count < tx_ring->work_limit)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			struct sk_buff *skb;
 			tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
@@ -518,6 +519,7 @@ static bool ixgbevf_clean_rx_irq(struct 
 			break;
 		(*work_done)++;
 
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 		if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
 			hdr_info = le16_to_cpu(ixgbevf_get_hdr_info(rx_desc));
 			len = (hdr_info & IXGBE_RXDADV_HDRBUFLEN_MASK) >>
Index: linux-2.6.35-rc5/drivers/net/e100.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e100.c	2010-07-27 17:36:44.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e100.c	2010-07-27 17:41:38.000000000 -0500
@@ -1928,6 +1928,7 @@ static int e100_rx_indicate(struct nic *
 
 	netif_printk(nic, rx_status, KERN_DEBUG, nic->netdev,
 		     "status=0x%04X\n", rfd_status);
+	rmb(); /* read size after status bit */
 
 	/* If data isn't ready, nothing to indicate */
 	if (unlikely(!(rfd_status & cb_complete))) {
Index: linux-2.6.35-rc5/drivers/net/igb/igb_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/igb/igb_main.c	2010-07-27 17:50:47.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/igb/igb_main.c	2010-07-27 17:57:11.000000000 -0500
@@ -5335,6 +5335,7 @@ static bool igb_clean_tx_irq(struct igb_
 
 	while ((eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
+		rmb();	/* read buffer_info after eop_desc status */
 		for (cleaned = false; !cleaned; count++) {
 			tx_desc = E1000_TX_DESC_ADV(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
@@ -5540,6 +5541,7 @@ static bool igb_clean_rx_irq_adv(struct 
 		if (*work_done >= budget)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		skb = buffer_info->skb;
 		prefetch(skb->data - NET_IP_ALIGN);
Index: linux-2.6.35-rc5/drivers/net/igbvf/netdev.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/igbvf/netdev.c	2010-07-27 17:51:00.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/igbvf/netdev.c	2010-07-27 17:59:15.000000000 -0500
@@ -248,6 +248,7 @@ static bool igbvf_clean_rx_irq(struct ig
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		buffer_info = &rx_ring->buffer_info[i];
 
@@ -780,6 +781,7 @@ static bool igbvf_clean_tx_irq(struct ig
 
 	while ((eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
+		rmb();	/* read buffer_info after eop_desc status */
 		for (cleaned = false; !cleaned; count++) {
 			tx_desc = IGBVF_TX_DESC_ADV(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];

Re: [E1000-devel] [PATCHv2] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Sonny R. <son...@us...> - 2010-07-27 22:50:29

On Tue, Jul 27, 2010 at 03:45:42PM -0700, Jeff Kirsher wrote:
> > --
> 
> You also seem to be missing igb.

You're right. At first I looked at igb and didn't see the analogous bit
of code, but I apparently didn't look hard enough. Stand by for
version 3.

-- 
Sonny Rao, LTC OzLabs, BML team

Re: [E1000-devel] [PATCH] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Jeff K. <jef...@in...> - 2010-07-27 22:49:57

On Tue, Jul 27, 2010 at 15:46, Sonny Rao <son...@us...> wrote:
> On Tue, Jul 27, 2010 at 03:41:46PM -0700, Jeff Kirsher wrote:
>> On Tue, Jul 27, 2010 at 15:34, Sonny Rao <son...@us...> wrote:
>> > This patch is similar to what was fixed in ixgbe in this patch:
>> >
>> > http://marc.info/?l=e1000-devel&m=126593062701537&w=3
>> >
>> > We should add read memory barriers to all the similar cases across the
>> > Intel ethernet driver family.  In the case of ixgbevf I've also added
>> > a missing barrier to the clean_tx_irq path because I missed it in my
>> > last patch.
>> >
>> > Without the barrier a processor can speculate a load ahead of the load
>> > which looks at the status bit and get stale information causing a
>> > number of different issues including invalid packet length, NULL
>> > pointers, or bad data since checksumming was assumed to be done
>> > in hardware.
>> >
>> > Signed-off-by: Milton Miller <mi...@bg...>
>> > Signed-off-by: Sonny Rao <son...@us...>
>> > cc: stable <st...@ke...>
>> >
>>
>> I already have a similar patch in my queue from you Sonny, although I
>> see that this patch has made a few more changes.  Is this version 2?
>
> Well, the previous one was for the clean_tx_irq functions this one is
> for the clean_rx_irq functions.  I'd gotten the two confused when I
> referenced Anton's original patch -- which was also a clean_rx_irq
> patch.  So they are touching different code paths but fixing similar
> problems.
>
>
> --
> Sonny Rao, LTC OzLabs, BML team
> --

Ok, just wanted to make sure.  In the first patch (which I already
have in my queue) that cleans up clean_tx_irq, you're missing the igb
driver as well.

-- 
Cheers,
Jeff

Re: [E1000-devel] [PATCHv2] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Jeff K. <jef...@in...> - 2010-07-27 22:45:50

On Tue, Jul 27, 2010 at 15:44, Sonny Rao <son...@us...> wrote:
> Add missing memory barriers to clean_rx_irq functions in Intel Drivers
>
> This patch is similar to what was fixed in ixgbe in this patch:
>
> http://marc.info/?l=e1000-devel&m=126593062701537&w=3
>
> We should add read memory barriers to all the similar cases across the
> Intel ethernet driver family.  In the case of ixgbevf I've also added
> a missing barrier to the clean_tx_irq path because I missed it in my
> last patch.
>
> Without the barrier a processor can speculate a load ahead of the load
> which looks at the status bit and get stale information causing a
> number of different issues including invalid packet length, NULL
> pointers, or bad data since checksumming was assumed to be done
> in hardware.
>
> v2: I missed the e100 the first time
>
> Signed-off-by: Milton Miller <mi...@bg...>
> Signed-off-by: Sonny Rao <son...@us...>
> cc: stable <st...@ke...>
>
> Index: linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.35-rc5.orig/drivers/net/e1000/e1000_main.c        2010-07-27 16:15:18.000000000 -0500
> +++ linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c     2010-07-27 16:22:21.000000000 -0500
> @@ -3638,6 +3638,7 @@ static bool e1000_clean_jumbo_rx_irq(str
>                if (*work_done >= work_to_do)
>                        break;
>                (*work_done)++;
> +               rmb(); /* read descriptor and rx_buffer_info after status DD */
>
>                status = rx_desc->status;
>                skb = buffer_info->skb;
> @@ -3844,6 +3845,7 @@ static bool e1000_clean_rx_irq(struct e1
>                if (*work_done >= work_to_do)
>                        break;
>                (*work_done)++;
> +               rmb(); /* read descriptor and rx_buffer_info after status DD */
>
>                status = rx_desc->status;
>                skb = buffer_info->skb;
> Index: linux-2.6.35-rc5/drivers/net/e1000e/netdev.c
> ===================================================================
> --- linux-2.6.35-rc5.orig/drivers/net/e1000e/netdev.c   2010-07-27 16:22:38.000000000 -0500
> +++ linux-2.6.35-rc5/drivers/net/e1000e/netdev.c        2010-07-27 16:25:23.000000000 -0500
> @@ -774,6 +774,7 @@ static bool e1000_clean_rx_irq(struct e1
>                if (*work_done >= work_to_do)
>                        break;
>                (*work_done)++;
> +               rmb();  /* read descriptor and rx_buffer_info after status DD */
>
>                status = rx_desc->status;
>                skb = buffer_info->skb;
> @@ -1081,6 +1082,7 @@ static bool e1000_clean_rx_irq_ps(struct
>                        break;
>                (*work_done)++;
>                skb = buffer_info->skb;
> +               rmb();  /* read descriptor and rx_buffer_info after status DD */
>
>                /* in the packet split case this is header only */
>                prefetch(skb->data - NET_IP_ALIGN);
> @@ -1280,6 +1282,7 @@ static bool e1000_clean_jumbo_rx_irq(str
>                if (*work_done >= work_to_do)
>                        break;
>                (*work_done)++;
> +               rmb();  /* read descriptor and rx_buffer_info after status DD */
>
>                status = rx_desc->status;
>                skb = buffer_info->skb;
> Index: linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c
> ===================================================================
> --- linux-2.6.35-rc5.orig/drivers/net/ixgb/ixgb_main.c  2010-07-27 16:27:23.000000000 -0500
> +++ linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c       2010-07-27 16:41:57.000000000 -0500
> @@ -1977,6 +1977,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *a
>                        break;
>
>                (*work_done)++;
> +               rmb();  /* read descriptor and rx_buffer_info after status DD */
>                status = rx_desc->status;
>                skb = buffer_info->skb;
>                buffer_info->skb = NULL;
> Index: linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c
> ===================================================================
> --- linux-2.6.35-rc5.orig/drivers/net/ixgbevf/ixgbevf_main.c    2010-07-27 16:30:51.000000000 -0500
> +++ linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c 2010-07-27 16:40:19.000000000 -0500
> @@ -231,6 +231,7 @@ static bool ixgbevf_clean_tx_irq(struct
>        while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
>               (count < tx_ring->work_limit)) {
>                bool cleaned = false;
> +               rmb(); /* read buffer_info after eop_desc */
>                for ( ; !cleaned; count++) {
>                        struct sk_buff *skb;
>                        tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
> @@ -518,6 +519,7 @@ static bool ixgbevf_clean_rx_irq(struct
>                        break;
>                (*work_done)++;
>
> +               rmb(); /* read descriptor and rx_buffer_info after status DD */
>                if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
>                        hdr_info = le16_to_cpu(ixgbevf_get_hdr_info(rx_desc));
>                        len = (hdr_info & IXGBE_RXDADV_HDRBUFLEN_MASK) >>
> Index: linux-2.6.35-rc5/drivers/net/e100.c
> ===================================================================
> --- linux-2.6.35-rc5.orig/drivers/net/e100.c    2010-07-27 17:36:44.000000000 -0500
> +++ linux-2.6.35-rc5/drivers/net/e100.c 2010-07-27 17:41:38.000000000 -0500
> @@ -1928,6 +1928,7 @@ static int e100_rx_indicate(struct nic *
>
>        netif_printk(nic, rx_status, KERN_DEBUG, nic->netdev,
>                     "status=0x%04X\n", rfd_status);
> +       rmb(); /* read size after status bit */
>
>        /* If data isn't ready, nothing to indicate */
>        if (unlikely(!(rfd_status & cb_complete))) {
> --

You also seem to be missing igb.

-- 
Cheers,
Jeff

Re: [E1000-devel] [PATCH] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Sonny R. <son...@us...> - 2010-07-27 22:45:32

On Tue, Jul 27, 2010 at 03:41:46PM -0700, Jeff Kirsher wrote:
> On Tue, Jul 27, 2010 at 15:34, Sonny Rao <son...@us...> wrote:
> > This patch is similar to what was fixed in ixgbe in this patch:
> >
> > http://marc.info/?l=e1000-devel&m=126593062701537&w=3
> >
> > We should add read memory barriers to all the similar cases across the
> > Intel ethernet driver family.  In the case of ixgbevf I've also added
> > a missing barrier to the clean_tx_irq path because I missed it in my
> > last patch.
> >
> > Without the barrier a processor can speculate a load ahead of the load
> > which looks at the status bit and get stale information causing a
> > number of different issues including invalid packet length, NULL
> > pointers, or bad data since checksumming was assumed to be done
> > in hardware.
> >
> > Signed-off-by: Milton Miller <mi...@bg...>
> > Signed-off-by: Sonny Rao <son...@us...>
> > cc: stable <st...@ke...>
> >
> 
> I already have a similar patch in my queue from you Sonny, although I
> see that this patch has made a few more changes.  Is this version 2?

Well, the previous one was for the clean_tx_irq functions; this one is
for the clean_rx_irq functions.  I'd gotten the two confused when I
referenced Anton's original patch -- which was also a clean_rx_irq
patch.  So they are touching different code paths but fixing similar
problems.


-- 
Sonny Rao, LTC OzLabs, BML team

[E1000-devel] [PATCHv2] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Sonny R. <son...@us...> - 2010-07-27 22:43:22

Add missing memory barriers to clean_rx_irq functions in Intel Drivers

This patch is similar to what was fixed in ixgbe in this patch:

http://marc.info/?l=e1000-devel&m=126593062701537&w=3

We should add read memory barriers to all the similar cases across the
Intel ethernet driver family.  In the case of ixgbevf I've also added 
a missing barrier to the clean_tx_irq path because I missed it in my
last patch.

Without the barrier a processor can speculate a load ahead of the load
which looks at the status bit and get stale information causing a
number of different issues including invalid packet length, NULL
pointers, or bad data since checksumming was assumed to be done
in hardware.

v2: I missed the e100 the first time

Signed-off-by: Milton Miller <mi...@bg...>
Signed-off-by: Sonny Rao <son...@us...>
cc: stable <st...@ke...>

Index: linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000/e1000_main.c	2010-07-27 16:15:18.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c	2010-07-27 16:22:21.000000000 -0500
@@ -3638,6 +3638,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -3844,6 +3845,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/e1000e/netdev.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000e/netdev.c	2010-07-27 16:22:38.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000e/netdev.c	2010-07-27 16:25:23.000000000 -0500
@@ -774,6 +774,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -1081,6 +1082,7 @@ static bool e1000_clean_rx_irq_ps(struct
 			break;
 		(*work_done)++;
 		skb = buffer_info->skb;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		/* in the packet split case this is header only */
 		prefetch(skb->data - NET_IP_ALIGN);
@@ -1280,6 +1282,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:27:23.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:41:57.000000000 -0500
@@ -1977,6 +1977,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *a
 			break;
 
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 		status = rx_desc->status;
 		skb = buffer_info->skb;
 		buffer_info->skb = NULL;
Index: linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:30:51.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:40:19.000000000 -0500
@@ -231,6 +231,7 @@ static bool ixgbevf_clean_tx_irq(struct 
 	while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
 	       (count < tx_ring->work_limit)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			struct sk_buff *skb;
 			tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
@@ -518,6 +519,7 @@ static bool ixgbevf_clean_rx_irq(struct 
 			break;
 		(*work_done)++;
 
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 		if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
 			hdr_info = le16_to_cpu(ixgbevf_get_hdr_info(rx_desc));
 			len = (hdr_info & IXGBE_RXDADV_HDRBUFLEN_MASK) >>
Index: linux-2.6.35-rc5/drivers/net/e100.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e100.c	2010-07-27 17:36:44.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e100.c	2010-07-27 17:41:38.000000000 -0500
@@ -1928,6 +1928,7 @@ static int e100_rx_indicate(struct nic *
 
 	netif_printk(nic, rx_status, KERN_DEBUG, nic->netdev,
 		     "status=0x%04X\n", rfd_status);
+	rmb(); /* read size after status bit */
 
 	/* If data isn't ready, nothing to indicate */
 	if (unlikely(!(rfd_status & cb_complete))) {
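
The ordering problem described in the patch can be sketched in user space. The following is only an illustration under stated assumptions: it uses C11 atomics instead of the kernel's rmb(), and every demo_ name, the DEMO_STAT_DD value and the descriptor layout are hypothetical, not taken from any of the drivers. The consumer checks the "done" bit first and only then reads the rest of the descriptor, with an acquire fence between the two loads so the second read cannot be satisfied with stale data.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_STAT_DD 0x01u            /* hypothetical "descriptor done" bit */

struct demo_rx_desc {
    uint16_t length;                  /* filled in by the producer first */
    atomic_uint status;               /* producer sets DD last */
};

/* Producer side, standing in for the device writing back a descriptor. */
static void demo_device_writeback(struct demo_rx_desc *d, uint16_t len)
{
    d->length = len;
    /* Release store: the length write is visible before DD is. */
    atomic_store_explicit(&d->status, DEMO_STAT_DD, memory_order_release);
}

/* Consumer side: returns the length, or -1 if the descriptor is not done. */
static int demo_clean_rx(struct demo_rx_desc *d)
{
    unsigned int status =
        atomic_load_explicit(&d->status, memory_order_relaxed);

    if (!(status & DEMO_STAT_DD))
        return -1;

    /* User-space analogue of rmb(): loads after this fence are ordered
     * after the status load above. */
    atomic_thread_fence(memory_order_acquire);

    return d->length;
}
```

Calling demo_clean_rx() on a zeroed descriptor returns -1; after demo_device_writeback(&d, 1500) it returns 1500. Without the fence, a weakly ordered CPU would be free to load length before status, which is the failure the patch description names.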

Re: [E1000-devel] [PATCH] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Jeff K. <jef...@in...> - 2010-07-27 22:42:14

On Tue, Jul 27, 2010 at 15:34, Sonny Rao <son...@us...> wrote:
> This patch is similar to what was fixed in ixgbe in this patch:
>
> http://marc.info/?l=e1000-devel&m=126593062701537&w=3
>
> We should add read memory barriers to all the similar cases across the
> Intel ethernet driver family.  In the case of ixgbevf I've also added
> a missing barrier to the clean_tx_irq path because I missed it in my
> last patch.
>
> Without the barrier a processor can speculate a load ahead of the load
> which looks at the status bit and get stale information causing a
> number of different issues including invalid packet length, NULL
> pointers, or bad data since checksumming was assumed to be done
> in hardware.
>
> Signed-off-by: Milton Miller <mi...@bg...>
> Signed-off-by: Sonny Rao <son...@us...>
> cc: stable <st...@ke...>
>

I already have a similar patch in my queue from you Sonny, although I
see that this patch has made a few more changes.  Is this version 2?

-- 
Cheers,
Jeff

[E1000-devel] [PATCH] Add missing memory barriers to clean_rx_irq functions in Intel Drivers

From: Sonny R. <son...@us...> - 2010-07-27 22:33:56

This patch is similar to what was fixed in ixgbe in this patch:

http://marc.info/?l=e1000-devel&m=126593062701537&w=3

We should add read memory barriers to all the similar cases across the
Intel ethernet driver family.  In the case of ixgbevf I've also added 
a missing barrier to the clean_tx_irq path because I missed it in my
last patch.

Without the barrier a processor can speculate a load ahead of the load
which looks at the status bit and get stale information causing a
number of different issues including invalid packet length, NULL
pointers, or bad data since checksumming was assumed to be done
in hardware.

Signed-off-by: Milton Miller <mi...@bg...>
Signed-off-by: Sonny Rao <son...@us...>
cc: stable <st...@ke...>

Index: linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000/e1000_main.c	2010-07-27 16:15:18.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000/e1000_main.c	2010-07-27 16:22:21.000000000 -0500
@@ -3638,6 +3638,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -3844,6 +3845,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/e1000e/netdev.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/e1000e/netdev.c	2010-07-27 16:22:38.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/e1000e/netdev.c	2010-07-27 16:25:23.000000000 -0500
@@ -774,6 +774,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -1081,6 +1082,7 @@ static bool e1000_clean_rx_irq_ps(struct
 			break;
 		(*work_done)++;
 		skb = buffer_info->skb;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		/* in the packet split case this is header only */
 		prefetch(skb->data - NET_IP_ALIGN);
@@ -1280,6 +1282,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
Index: linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:27:23.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgb/ixgb_main.c	2010-07-27 16:41:57.000000000 -0500
@@ -1977,6 +1977,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *a
 			break;
 
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 		status = rx_desc->status;
 		skb = buffer_info->skb;
 		buffer_info->skb = NULL;
Index: linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c
===================================================================
--- linux-2.6.35-rc5.orig/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:30:51.000000000 -0500
+++ linux-2.6.35-rc5/drivers/net/ixgbevf/ixgbevf_main.c	2010-07-27 16:40:19.000000000 -0500
@@ -231,6 +231,7 @@ static bool ixgbevf_clean_tx_irq(struct 
 	while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
 	       (count < tx_ring->work_limit)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			struct sk_buff *skb;
 			tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
@@ -518,6 +519,7 @@ static bool ixgbevf_clean_rx_irq(struct 
 			break;
 		(*work_done)++;
 
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 		if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
 			hdr_info = le16_to_cpu(ixgbevf_get_hdr_info(rx_desc));
 			len = (hdr_info & IXGBE_RXDADV_HDRBUFLEN_MASK) >>

Re: [E1000-devel] how do I disable sending of carrier on 82576 and 82599 XAUI?

From: Peter P W. Jr <pet...@in...> - 2010-07-27 20:48:52

On Tue, 2010-07-27 at 08:44 -0700, Chris Friesen wrote:
> Hi,
> 

Hi Chris,

> In order to simplify our network management, I've been asked to modify
> the igb and ixgbe drivers so that the far end doesn't detect link
> carrier when the local devices are downed using "ifdown" or equivalent.

We have seen a surge of this question coming in recently...

> We have two types of devices, 82576 and 82599 (using XAUI).
> 
> On the 82576 I'm using the PCTRL register.  I'm setting bit 11 (power
> down phy), clearing bit 12 (disabling autonegotiation), and then setting
> bit 9 (restarting autonegotiation).  This seems to work--are there any
> gotchas of which I should be aware, or any better options?

This is the process we've given to a few other folks looking to do the
same thing on 82576.  This should work just fine.

> On the 82599 I can't find any way to disable the XAUI link as such.  I
> tried telling it to use a different protocol to force a mismatch but
> this resulted in the device itself losing carrier, but the far end still
> claiming to see carrier.  (Specifically, I tried changing the
> IXGBE_AUTOC_LMS_10G_LINK_NO_AN bits in AUTOC to
> IXGBE_AUTOC_LMS_10G_SERIAL and then set the IXGBE_AUTOC_AN_RESTART bit.
>  I've tried other LMS bits as well as various 10G_PMA_PMD_PARALLEL bits
> but they all behaved the same.)  Is there a "proper" way to shut down
> the XAUI interface such that the other end cannot possibly detect a link
> as present?

There is no proper way to shut down XAUI for 82599 unfortunately.  What
we've recommended to other people is to set AUTOC.LMS to 000b (1G link,
no auto-neg).  That should be enough to "break" the physical link on the
wire, then write AUTOC.Restart_AN.
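
As a rough sketch of that register update (not driver code): the helper below clears the LMS field and sets the restart bit in a 32-bit AUTOC value. The DEMO_ constants are hypothetical names; the field position (bits 15:13) and the restart bit (bit 12) are assumptions recalled from the ixgbe headers and should be checked against the 82599 datasheet before use.

```c
#include <stdint.h>

/* Assumed AUTOC layout; verify against the datasheet. */
#define DEMO_AUTOC_LMS_SHIFT  13
#define DEMO_AUTOC_LMS_MASK   (0x7u << DEMO_AUTOC_LMS_SHIFT)
#define DEMO_AUTOC_AN_RESTART 0x00001000u

/* Returns the AUTOC value to write back: LMS = 000b (1G link, no
 * auto-neg) plus the restart bit, all other bits left untouched. */
static uint32_t demo_autoc_break_link(uint32_t autoc)
{
    autoc &= ~DEMO_AUTOC_LMS_MASK;
    autoc |= DEMO_AUTOC_AN_RESTART;
    return autoc;
}
```

In a driver the input would come from a read of AUTOC and the result would be written back to the same register; the sketch only covers the bit manipulation.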

Let me know if this helps.

Cheers,
-PJ Waskiewicz

Re: [E1000-devel] [net-next-2.6 PATCH] igbvf, ixgbevf: use dev_hw_addr_random

From: David M. <da...@da...> - 2010-07-27 20:18:27

From: Stefan Assmann <sas...@re...>
Date: Tue, 27 Jul 2010 11:24:50 +0200

> From: Stefan Assmann <sas...@re...>
> 
> Both igbvf and ixgbevf should set addr_assign_type to NET_ADDR_RANDOM
> so udev creates persistent net rules by matching the device path.
> Do this by using the dev_hw_addr_random helper function.
> 
> Signed-off-by: Stefan Assmann <sas...@re...>

Applied.
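
For readers unfamiliar with what such a helper has to do, here is a minimal user-space sketch, not the kernel implementation. It takes six arbitrary bytes, forces them into a locally administered unicast MAC address, and records that the address was randomly assigned. The struct, the demo_ names and the DEMO_NET_ADDR_RANDOM value are hypothetical stand-ins.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN        6
#define DEMO_NET_ADDR_RANDOM 1        /* hypothetical marker value */

struct demo_netdev {
    uint8_t dev_addr[DEMO_ETH_ALEN];
    uint8_t addr_assign_type;
};

/* seed holds DEMO_ETH_ALEN bytes from any random source. */
static void demo_hw_addr_random(struct demo_netdev *dev,
                                const uint8_t seed[DEMO_ETH_ALEN])
{
    memcpy(dev->dev_addr, seed, DEMO_ETH_ALEN);
    dev->dev_addr[0] &= 0xfe;         /* clear the multicast bit */
    dev->dev_addr[0] |= 0x02;         /* set the locally administered bit */
    dev->addr_assign_type = DEMO_NET_ADDR_RANDOM;
}
```

The recorded assignment type is what lets user space tell a random address from a permanent one, which is why the patch description ties it to udev matching on the device path instead of the MAC.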

Re: [E1000-devel] getting max throughput from AMD Opteron w/igb-2.2.9

From: Alexander D. <ale...@in...> - 2010-07-27 16:42:24

Ed Ravin wrote:
> On Mon, Jul 26, 2010 at 08:52:20AM -0700, Alexander Duyck wrote:
>> My recommendations are to do the following:
>> 1.  Set the RX and TX ring sizes to 256.  This makes it so that all the
>> descriptors for each ring fit within a single 4K page.
> 
> Tried that. rx_fifo_errors went up to over 6000 per second.

The fact that the ring size seems to affect the number of packets
dropped per second implies that there may be some sort of latency issue.
One thing you might try is different values for rx-usecs via ethtool
-C.  You may find that fixing the value at something fairly low like 33
usecs per interrupt helps to reduce the number of rx_fifo_errors.
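
The arithmetic behind both numbers can be sketched as follows. This assumes a 16-byte descriptor and a 4096-byte page, and treats rx-usecs as the minimum gap between interrupts; the demo_ names are hypothetical and the real driver may round the interval differently.

```c
#include <stddef.h>

#define DEMO_DESC_BYTES 16u           /* assumed descriptor size */
#define DEMO_PAGE_BYTES 4096u         /* assumed page size */

/* Bytes of descriptor memory needed for a ring of n entries. */
static size_t demo_ring_bytes(unsigned int n)
{
    return (size_t)n * DEMO_DESC_BYTES;
}

/* Non-zero if the whole ring fits in one page. */
static int demo_ring_fits_page(unsigned int n)
{
    return demo_ring_bytes(n) <= DEMO_PAGE_BYTES;
}

/* Upper bound on interrupts per second for a given rx-usecs value.
 * 0 is returned for rx_usecs == 0, meaning "no fixed ceiling" here. */
static unsigned int demo_max_irqs_per_sec(unsigned int rx_usecs)
{
    return rx_usecs ? 1000000u / rx_usecs : 0u;
}
```

Under these assumptions a 256-entry ring needs exactly 4096 bytes, a 512-entry ring no longer fits in one page, and rx-usecs 33 caps each vector at roughly 30,000 interrupts per second.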

>> 2.  You may want to just stack all of the same queues on the same CPU,
>> so rx/tx 0 for both ports on CPU 0, rx/tx1 on CPU 1, etc.  This way you
>> can keep the memory local and reduce cross cpu and cross node allocation
>> and free.
> 
> This is the default layout after loading the 2.2.9 driver:
> 
> # eth_affinity_tool show --verbose
> 16 CPUs detected
> 
> eth0:
>         25: eth0: ffff
>         26: eth0-rx-0: 0001
>         27: eth0-rx-1: 0002
>         28: eth0-rx-2: 0004
>         29: eth0-rx-3: 0008
>         30: eth0-tx-0: 0001
>         31: eth0-tx-1: 0002
>         32: eth0-tx-2: 0004
>         33: eth0-tx-3: 0008
> 
> eth1:
>         34: eth1: ffff
>         35: eth1-rx-0: 0001
>         36: eth1-rx-1: 0002
>         37: eth1-rx-2: 0004
>         38: eth1-rx-3: 0008
>         39: eth1-tx-0: 0001
>         40: eth1-tx-1: 0002
>         41: eth1-tx-2: 0004
>         42: eth1-tx-3: 0008
> 

This looks okay, but you may want to try running with QueuePairs on so 
that you have 8 RX/TX queues and spread the work over more cores.

>> 3.  You could probably also set the RSS value to 0 and see how many
>> queues this gives you.
> 
> With the stock 2.6.34.1 igb driver, version 2.1.0-k2, which does not seem
> to be tuneable, I get 8 paired RX/TX queues.   With 2.2.9, I can get
> either 4 RX and 4 TX queues per interface, or with RSS=0 and not setting
> QueuePairs to 0, I get 8 RX/TX queues per interface.

Since you are running a pair of 8 core processors it might be best to 
just set things up for 8 queues and spread them over a single physical 
processor ID.  I realize that didn't give you the best performance but I 
suspect the reason for that is that the adapter is actually closer to 
one of the CPUs than the other.

>> Depending on what hardware you have there may be
>> more queues available and if the CPUS contain a stack of queues as I
>> suggested in item 2 then spreading it out over more CPUs would be
>> advisable.
> 
> Don't see much difference when doing that.  The worst performance was
> when everything was on CPUs with the same physical ID.  The best performance
> appears to be the 2.1.0-k2 driver in its unchangeable configuration, which
> is much different than when on an Intel Xeon non-NUMA platform - on that
> one the 2.2.9 has the best performance.

After looking over your lspci dump I am assuming you are running a 
Supermicro motherboard with the AMD SR5690/SP5100 chipset.  If that is 
the case you will probably find that one physical ID works much better 
than the other for network performance because the SR5690 that the 82576 
is connected to is going to be node local for one of the sockets and 
remote for the other.

Another factor you will need to take into account is that the ring 
memory should be allocated on the same node the hardware is on.  You 
should be able to accomplish that by using taskset with the correct CPU 
mask for the physical ID you are using when calling modprobe/insmod and 
the ifconfig commands to bring up the interfaces.  This should help to 
decrease the memory latency and increase the throughput available to the 
adapter.
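
To make the masks concrete, here is a small sketch of how the hex values are formed. It assumes at most 32 CPUs and uses the node layout from the numactl output quoted below (node 0 = CPUs 0-7, node 3 = CPUs 8-15); which of the two nodes the adapter is local to is not established in this thread. The demo_ names are hypothetical.

```c
#include <stdint.h>

/* Mask with only the given CPU set, as used for a per-queue IRQ affinity. */
static uint32_t demo_cpu_mask(unsigned int cpu)
{
    return cpu < 32 ? (uint32_t)1 << cpu : 0;
}

/* Mask covering CPUs first..last inclusive, as used for a whole node. */
static uint32_t demo_cpu_range_mask(unsigned int first, unsigned int last)
{
    uint32_t mask = 0;

    for (unsigned int cpu = first; cpu <= last && cpu < 32; cpu++)
        mask |= (uint32_t)1 << cpu;
    return mask;
}
```

demo_cpu_mask(3) gives 0x8, matching the 0008 shown for queue 3 in the affinity listing above; demo_cpu_range_mask(0, 7) gives 0xff and demo_cpu_range_mask(8, 15) gives 0xff00, which are the values one would hand to taskset for node 0 and node 3 respectively.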

>> 4.  One other thing that might be useful would be to put a static entry
>> into your ARP table for the destination IPs you are routing to.  I have
>> seen instances where packets are dropped due to a delay in obtaining
>> the MAC address via ARP.
> 
> Already had static ARP for the destination IPs.
> 
> I fetched numactl and numastat from the Debian repository.
> Interestingly, "numastat" says there aren't any misses:
> 
> root@big-tester:~# numastat
>                            node0           node3
> numa_hit                 3873922         1052324
> numa_miss                      0               0
> numa_foreign                   0               0
> interleave_hit              5601            5331
> local_node               3733844         1050500
> other_node                140078            1824
> 
> but "numactl" seems to have some issues so I don't know if I can trust
> any of these numbers:
> 
> root@big-tester:~# numactl -H
> available: 4 nodes (0-3)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 2047 MB
> node 0 free: 1740 MB
> libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
> node 1 cpus:
> node 1 size: <not available>
> node 1 free: <not available>
> node 2 cpus:
> node 2 size: <not available>
> node 2 free: <not available>
> node 3 cpus: 8 9 10 11 12 13 14 15
> node 3 size: 2046 MB
> node 3 free: 1934 MB
> node distances:
> node   0   1   2   3
>   0:  10  20   0   0
>   1:   0   0   0  134529
>   2:   0  134529 -143519016 -143519016
>   3: -143519016 -143519016   0   0
> 
>> That is what I can think up off of the top of my head.  Other than that
>> if you could provide more information on the system it would be useful.
>> Perhaps an lspci -vvv, a dump of /proc/cpuinfo, and /proc/zoneinfo. With
>> that I can give more detailed steps on the layout that might provide the
>> best performance.
> 
> 
> Requested dumps below - cpuinfo, zoneinfo, lspci.  Thank you!

Thanks for the information.  One other item I would be interested in 
seeing is the kind of numbers we are talking about.  If you could 
provide me with an ethtool -S dump from 10 seconds of one of your tests 
that might be useful for me to better understand the kind of pressures 
the system is under.

Thanks,

Alex

Re: [E1000-devel] e1000e probe failure on 2.6.34

From: Tantilov, E. S <emi...@in...> - 2010-07-27 16:20:28

Fabio Varesano wrote:
> Hi guys, any progress on this? We are all still stuck with no eth0...

Could you please file a bug at e1000.sf.net with all the information (kernel config, lspci, ethtool -e from working kernel, etc).

Thanks,
Emil

[E1000-devel] how do I disable sending of carrier on 82576 and 82599 XAUI?

From: Chris F. <chr...@ge...> - 2010-07-27 15:49:58

Hi,

In order to simplify our network management, I've been asked to modify
the igb and ixgbe drivers so that the far end doesn't detect link
carrier when the local devices are downed using "ifdown" or equivalent.

We have two types of devices, 82576 and 82599 (using XAUI).

On the 82576 I'm using the PCTRL register.  I'm setting bit 11 (power
down phy), clearing bit 12 (disabling autonegotiation), and then setting
bit 9 (restarting autonegotiation).  This seems to work--are there any
gotchas of which I should be aware, or any better options?
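
The bit sequence just described can be written as a pure function on the 16-bit register value. This is a sketch only: the DEMO_ names are hypothetical, the bit positions are the ones given in the paragraph above, and the PHY read and write around it are left out.

```c
#include <stdint.h>

/* Bit positions as described in the message above. */
#define DEMO_PCTRL_RESTART_AN (1u << 9)
#define DEMO_PCTRL_POWER_DOWN (1u << 11)
#define DEMO_PCTRL_AN_ENABLE  (1u << 12)

/* Returns the value to write back so the far end loses carrier. */
static uint16_t demo_pctrl_link_down(uint16_t pctrl)
{
    pctrl |= DEMO_PCTRL_POWER_DOWN;             /* set bit 11 */
    pctrl &= (uint16_t)~DEMO_PCTRL_AN_ENABLE;   /* clear bit 12 */
    pctrl |= DEMO_PCTRL_RESTART_AN;             /* set bit 9 */
    return pctrl;
}
```

For example, an input of 0x1140 yields 0x0B40. All other bits pass through unchanged, so speed and duplex settings already in the register are preserved.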

On the 82599 I can't find any way to disable the XAUI link as such.  I
tried telling it to use a different protocol to force a mismatch but
this resulted in the device itself losing carrier, but the far end still
claiming to see carrier.  (Specifically, I tried changing the
IXGBE_AUTOC_LMS_10G_LINK_NO_AN bits in AUTOC to
IXGBE_AUTOC_LMS_10G_SERIAL and then set the IXGBE_AUTOC_AN_RESTART bit.
 I've tried other LMS bits as well as various 10G_PMA_PMD_PARALLEL bits
but they all behaved the same.)  Is there a "proper" way to shut down
the XAUI interface such that the other end cannot possibly detect a link
as present?

Thanks,

Chris

-- 
Chris Friesen
Software Developer
GENBAND
chr...@ge...
www.genband.com

Re: [E1000-devel] Problem with e1000e, 802.1Q VLAN's and IPMI

From: Andrey P. <pa...@ce...> - 2010-07-27 12:06:05

Attachments: dmesg pci dmi

On 207, 07 26, 2010 at 09:45:06AM -0700, Brandeburg, Jesse wrote:
> 
> On Mon, 26 Jul 2010, Andrey Panin wrote:
> > I have a problem using IPMI on Super Micro X7SBL motherboard (AOC-IPMI20-E BMC).
> > BMC shares ethernet port with onboard e1000e. It works when IPMI traffic is 
> > untagged (CrcStripping=0 module option used), but if I try to use 802.1Q vlan
> > for IPMI traffic BMC stops responding right after "ifup eth0". 
> 
> We've heard of this issue (or similar) before.  I believe for the other 
> guys they had to run IPMI traffic untagged, if your main network is 
> running untagged.  The problem in this case is that the tags are being 
> stripped in hardware even for the SMBUS packets, at which point the BMC 
> that is stupid and doesn't understand offloading hardware gets confused 
> that the traffic it is receiving doesn't have a vlan tag.

I suspected something like that. Unfortunately Super Micro isn't interested
in fixing their BMC; at least the latest firmware doesn't fix the issue.

> > I tested 2.6.26 (debian stable kernel) and 2.6.35-rc6.
> > Looks like there is some problem with 802.1Q tags stripping/inserting.
> 
> Thanks for testing the latest kernel.
> 
> > There was no such problem with e1000 driver from 2.6.18.
> 
> Hm, that is an interesting statement.  The major changes to the driver 
> include stripping tags all the time when not in promisc mode, and there 
> have been some fixes and attempts to fix the ipmi issues some 
> (particularly supermicro) BMCs have.  I'm quite surprised it worked for 
> you at all (there have been quite a few issues in this area)

After some sleep I'm not so sure that it was really working :(
I'll retest it more thoroughly this evening.

> > Unfortunately I can't provide additional info right now, but I can provide
> > it slightly later and I'm ready to test patches.
> 
> I can send you a patch to disable vlan stripping in hardware which will 
> probably get you working, but I don't think it is production worthy.

This solution looks too drastic to me, taking into account that this box will
be used as a broadband access server. I'd better change my network to keep IPMI
traffic untagged.

Best regards.

[E1000-devel] [patch -next] ixgbe: potential null dereference

From: Dan C. <er...@gm...> - 2010-07-27 10:06:30

The e_dev_err() macro dereferences "adapter" which is NULL here.

Signed-off-by: Dan Carpenter <er...@gm...>

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 9203759..0360260 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6549,8 +6549,8 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 			err = dma_set_coherent_mask(&pdev->dev,
 						    DMA_BIT_MASK(32));
 			if (err) {
-				e_dev_err("No usable DMA configuration, "
-					  "aborting\n");
+				dev_err(&pdev->dev,
+					"No usable DMA configuration, aborting\n");
 				goto err_dma;
 			}
 		}
@@ -6560,7 +6560,8 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	err = pci_request_selected_regions(pdev, pci_select_bars(pdev,
 	                                   IORESOURCE_MEM), ixgbe_driver_name);
 	if (err) {
-		e_dev_err("pci_request_selected_regions failed 0x%x\n", err);
+		dev_err(&pdev->dev,
+			"pci_request_selected_regions failed 0x%x\n", err);
 		goto err_pci_reg;
 	}
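
The shape of the bug can be sketched outside the kernel. This assumes, as the patch description says, that the logging macro reaches the device through an "adapter" pointer that has not been set up yet at this point in probe; the structs and demo_ names below are hypothetical stand-ins.

```c
#include <stddef.h>

struct demo_pci_dev { const char *name; };
struct demo_adapter { struct demo_pci_dev *pdev; };

/* Same shape as a macro that goes through the adapter: only valid once
 * adapter is non-NULL, so it must not be used early in probe. */
static const char *demo_name_via_adapter(const struct demo_adapter *adapter)
{
    return adapter->pdev->name;
}

/* Safe early in probe: needs only the device pointer passed to probe. */
static const char *demo_name_via_pdev(const struct demo_pci_dev *pdev)
{
    return pdev->name;
}
```

Both helpers return the same name once the adapter exists; the fix in the patch amounts to using the second form on the paths that run before the adapter has been allocated.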
