Submitter | Joe Korty |
---|---|
Date | 2010-05-14 19:11:35 |
Message ID | <20100514191135.GA3418@tsunami.ccur.com> |
Permalink | /patch/1333/ |
State | Rejected |
Comments
On Fri, May 14, 2010 at 1:11 PM, Joe Korty <joe.korty@ccur.com> wrote: > Promote heap sizing to first-class Kconfig citizenship. > > Changing the heap size is something that those, like me, > with large PCI device trees need to do. Therefore heap > size should appear as a normal, user-answerable question > within the Kconfig build system. I think the difference here is that you're a developer (not a user) once you start touching the code. Users shouldn't have to worry about the heap size. It should be set larger in your mainboard Kconfig if the mainboard needs more heap space. Thanks, Myles
On 5/14/10 9:19 PM, Myles Watson wrote: > On Fri, May 14, 2010 at 1:11 PM, Joe Korty <joe.korty@ccur.com> wrote: > >> Promote heap sizing to first-class Kconfig citizenship. >> >> Changing the heap size is something that those, like me, >> with large PCI device trees need to do. Therefore heap >> size should appear as a normal, user-answerable question >> within the Kconfig build system. >> > I think the difference here is that you're a developer (not a user) > once you start touching the code. Users shouldn't have to worry about > the heap size. It should be set larger in your mainboard Kconfig if > the mainboard needs more heap space. > I agree with Myles here. If the heap size is not good enough, coreboot is broken and needs to be fixed. Stefan
On Fri, May 14, 2010 at 03:19:45PM -0400, Myles Watson wrote: > On Fri, May 14, 2010 at 1:11 PM, Joe Korty <joe.korty@ccur.com> wrote: > > Promote heap sizing to first-class Kconfig citizenship. > > > > Changing the heap size is something that those, like me, > > with large PCI device trees need to do. Therefore heap > > size should appear as a normal, user-answerable question > > within the Kconfig build system. > > I think the difference here is that you're a developer (not a user) > once you start touching the code. Users shouldn't have to worry about > the heap size. It should be set larger in your mainboard Kconfig if > the mainboard needs more heap space. Some background: The reason I'm looking at coreboot is that standard BIOSes (apparently) run out of memory while doing the bus walk, when I plug a PCI-e expansion chassis into the motherboard and populate it. The BIOS will either lock up or the OS will boot but what the OS sees for a PCI Bus (via lspci -tv) is clearly corrupt. So my job was/is to do an experiment to see if our problems are indeed due to out-of-memory issues in standard BIOSes, and if so, if coreboot could be a useful way around this issue. And indeed, the first time I booted coreboot with a populated PCI-e chassis attached, I got an out-of-memory halt from coreboot. Increasing CONFIG_HEAP_SIZE to 0x10000 (ie, 4x) got the system to boot, and lspci -tv looks good also. I have yet to try intermediate values. Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental. Regards, Joe
On Fri, May 14, 2010 at 1:49 PM, Joe Korty <joe.korty@ccur.com> wrote: > On Fri, May 14, 2010 at 03:19:45PM -0400, Myles Watson wrote: >> On Fri, May 14, 2010 at 1:11 PM, Joe Korty <joe.korty@ccur.com> wrote: >> > Promote heap sizing to first-class Kconfig citizenship. >> > >> > Changing the heap size is something that those, like me, >> > with large PCI device trees need to do. Therefore heap >> > size should appear as a normal, user-answerable question >> > within the Kconfig build system. >> >> I think the difference here is that you're a developer (not a user) >> once you start touching the code. Users shouldn't have to worry about >> the heap size. It should be set larger in your mainboard Kconfig if >> the mainboard needs more heap space. > > > Some background: > The reason I'm looking at coreboot is that standard BIOSes > (apparently) run out of memory while doing the bus walk, > when I plug a PCI-e expansion chassis into the motherboard > and populate it. The BIOS will either lock up or the OS > will boot but what the OS sees for a PCI Bus (via lspci > -tv) is clearly corrupt. I wonder if that could be partly due to the ACPI implementation too. > So my job was/is to do an experiment to see if our problems > are indeed due to out-of-memory issues in standard BIOSes, and > if so, if coreboot could be a useful way around this issue. > > And indeed, the first time I booted coreboot with a > populated PCI-e chassis attached, I got an out-of-memory > halt from coreboot. Increasing CONFIG_HEAP_SIZE to > 0x10000 (ie, 4x) got the system to boot, and lspci -tv > looks good also. I have yet to try intermediate values. It seems like you have a pretty specific special case. Maybe we should create a CONFIG_EXTRA_HEAP that depends on CONFIG_EXPERT that lets you add heap. > Unfortunately we have an even bigger PCI-e loaded expansion > chassis (configuration #2), for which coreboot also hangs. > It's not an out-of-memory hang; it happens (apparently) > during the bus walk. I haven't looked into this hang in > detail yet, so I don't have much to report. But I do fear > it may be something more fundamental. Sounds like fun. Thanks, Myles
On Fri, May 14, 2010 at 03:56:00PM -0400, Myles Watson wrote: > It seems like you have a pretty specific special case. :) From my point of view, large systems are the standard case and normal desktops are the oddballs..... Regards, Joe > Sounds like fun. It's been educational and mind-stretching and worth it just for that.
Joe, we have visited this type of issue from time to time. The heap size, if it is related to a mainboard (and it is) belongs in the mainboard Kconfig and should not be user-visible. The reason is that if it is visible then that visibility implies that it can be safely changed, much as the baud rate can be safely changed. That is clearly wrong: many values of heap size will result in a locked up platform. Thus, heap size can be set in mainboard Kconfig, but should not be user visible. As for your pci problem, I suspect it's not out of memory for the original bios but a bug in the bios or the hardware itself. We've had lots of chipset/pci card combinations over the years that confused the bioses, badly. It just happens. thanks ron
On Fri, May 14, 2010 at 05:38:04PM -0400, ron minnich wrote: > Joe, we have visited this type of issue from time to time. The heap > size, if it is related to a mainboard (and it is) belongs in the > mainboard Kconfig and should not be user-visible. The reason is that > if it is visible then that visibility implies that it can be safely > changed, much as the baud rate can be safely changed. That is clearly > wrong: many values of heap size will result in a locked up platform. Hi Ron, Thanks for the update. I haven't had any problems increasing heap size but that could just be my motherboard. What failure modes become possible when the heap size is increased? Joe
On Fri, May 14, 2010 at 7:18 PM, Joe Korty <joe.korty@ccur.com> wrote: > What failure modes become possible when the heap size > is increased? suppose someone for whatever reason sets it to a preposterous size. Not likely but we've seen that sort of thing happen. it's not necessary to have it user visible, and that alone is a good reason not to put it there. ron
> On Fri, May 14, 2010 at 05:38:04PM -0400, ron minnich wrote: > > Joe, we have visited this type of issue from time to time. The heap > > size, if it is related to a mainboard (and it is) belongs in the > > mainboard Kconfig and should not be user-visible. The reason is that > > if it is visible then that visibility implies that it can be safely > > changed, much as the baud rate can be safely changed. That is clearly > > wrong: many values of heap size will result in a locked up platform. > > Hi Ron, > Thanks for the update. I haven't had any problems > increasing heap size but that could just be my motherboard. It's easy to run out of RAM if you increase it too much, especially if the stack gets too large. For most boards, stack*processors + heap + code = 1M. The bigger worry is that someone will decrease the RAM size, which is a boot-time failure. Build-time failures are easier to handle. Thanks, Myles
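The budget Myles describes (stack*processors + heap + code = 1M) can be written as a small check. This is only a sketch under stated assumptions: the 1M total is taken from his message, while `ram_budget_ok()`, its parameters, and the macro name are hypothetical and are not coreboot code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Total region assumed by the rule of thumb quoted above (1 MiB). */
#define RAM_BUDGET_BYTES (1024u * 1024u)

/*
 * Hypothetical helper, not a coreboot function.
 * Returns true when stack_size * num_cpus + heap_size + code_size
 * fits inside the budget. The sum is done in 64 bits so that a
 * preposterous heap size cannot wrap around and appear to fit.
 */
static bool ram_budget_ok(uint32_t stack_size, uint32_t num_cpus,
                          uint32_t heap_size, uint32_t code_size)
{
    uint64_t used = (uint64_t)stack_size * num_cpus
                  + (uint64_t)heap_size
                  + (uint64_t)code_size;

    return used <= RAM_BUDGET_BYTES;
}
```

A check of this kind, run at build time, would turn an oversized heap into the build-time failure that Myles prefers over a boot-time one. The sizes passed in would have to come from the real mainboard configuration.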
On Fri, May 14, 2010 at 1:49 PM, Joe Korty <joe.korty@ccur.com> wrote: > Some background: > The reason I'm looking at coreboot is that standard BIOSes > (apparently) run out of memory while doing the bus walk, > when I plug a PCI-e expansion chassis into the motherboard > and populate it. The BIOS will either lock up or the OS > will boot but what the OS sees for a PCI Bus (via lspci > -tv) is clearly corrupt. > > So my job was/is to do an experiment to see if our problems > are indeed due to out-of-memory issues in standard BIOSes, and > if so, if coreboot could be a useful way around this issue. > > And indeed, the first time I booted coreboot with a > populated PCI-e chassis attached, I got an out-of-memory > halt from coreboot. Increasing CONFIG_HEAP_SIZE to > 0x10000 (ie, 4x) got the system to boot, and lspci -tv > looks good also. I have yet to try intermediate values. Could you try the latest? Devices now take ~ 1/4 the space that they used to take. > Unfortunately we have an even bigger PCI-e loaded expansion > chassis (configuration #2), for which coreboot also hangs. > It's not an out-of-memory hang; it happens (apparently) > during the bus walk. I haven't looked into this hang in > detail yet, so I don't have much to report. But I do fear > it may be something more fundamental. If you send the log to the list we might be able to help. Thanks, Myles
On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote: > On Fri, May 14, 2010 at 1:49 PM, Joe Korty <joe.korty@ccur.com> wrote: > > Unfortunately we have an even bigger PCI-e loaded expansion > > chassis (configuration #2), for which coreboot also hangs. > > It's not an out-of-memory hang; it happens (apparently) > > during the bus walk. I haven't looked into this hang in > > detail yet, so I don't have much to report. But I do fear > > it may be something more fundamental. > > If you send the log to the list we might be able to help. Hi Myles, I've solved this one, kind of. It is PCI IO Space overflow, we are going over 0xffff which apparently is a hard limit. I imagine this is there so that inb, outw, etc instructions can be used to reference these devices. But if one doesn't use such instructions (instead using memory mapped PCI IO space), I see no reason why Linux and coreboot couldn't work with PCI IO Space addresses > 0xffff. Regards, Joe
> > If you send the log to the list we might be able to help. > > Hi Myles, > I've solved this one, kind of. It is PCI IO Space > overflow, we are going over 0xffff which apparently is > a hard limit. I imagine this is there so that inb, outw, > etc instructions can be used to reference these devices. > > But if one doesn't use such instructions (instead using > memory mapped PCI IO space), I see no reason why Linux > and coreboot couldn't work with PCI IO Space addresses > > 0xffff. The resource allocator doesn't care. Just find the places where the I/O flag is checked and the limit is set to 0xffff and try setting it larger. I would look in src/devices/pci_device.c and src/northbridge/your_northbridge/northbridge.c first. I'm not sure what will break, but we should be able to fix it pretty easily. Thanks, Myles
Hi Joe, On 21.05.2010 22:07, Joe Korty wrote: > On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote: > >> On Fri, May 14, 2010 at 1:49 PM, Joe Korty <joe.korty@ccur.com> wrote: >> >>> Unfortunately we have an even bigger PCI-e loaded expansion >>> chassis (configuration #2), for which coreboot also hangs. >>> It's not an out-of-memory hang; it happens (apparently) >>> during the bus walk. I haven't looked into this hang in >>> detail yet, so I don't have much to report. But I do fear >>> it may be something more fundamental. >>> >> If you send the log to the list we might be able to help. >> > > I've solved this one, kind of. It is PCI IO Space > overflow, we are going over 0xffff which apparently is > a hard limit. I imagine this is there so that inb, outw, > etc instructions can be used to reference these devices. > > But if one doesn't use such instructions (instead using > memory mapped PCI IO space), I see no reason why Linux > and coreboot couldn't work with PCI IO Space addresses > >> 0xffff. >> I'm interested in how you want to map port IO space to memory. Please explain. AFAIK PCI register space is totally independent of port IO space which is totally independent of memory space. You can access PCI register space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of any mechanisms to map IO ports to memory or the other way round. Thanks, Carl-Daniel
I wrote: >> Unfortunately, the latest coreboot still gets an out-of-mem condition >> when the large pci-e chassis is attached. >> >> I've attached two coreboot logs, both are the latest svn but the second >> one has heap size at 0x10000 so that I can send you the log of what a >> good boot might look like. On Fri, May 21, 2010 at 04:33:12PM -0400, Myles Watson wrote: > That's a lot of devices. So maybe we need a Kconfig option called > ADDITIONAL_HEAP that's available on the EXPERT menu. It's possible that > making links into lists will make you fit, but I'd expect someone to shrink > the default heap when there's that much extra space for everyone else. I certainly think that that would be OK. It is impractical to come up with a default heap size to cover the largest possible IO configuration, since that would be very large indeed. Heck, even my failing large IO configuration is not really very large. I am putting only one expansion chassis on the system. I expect that we will eventually get customers that will want two or even three expansion chassis (each chassis holds 20 PCI-e cards). In one sense this is kinda exciting. Mainframes have always been about large IO. That's what distinguishes them from PCs. With these tweaks, PC-like machines can start to eat away at the bottom of that market. Regards, Joe
Joe Korty wrote: > I've solved this one, kind of. It is PCI IO Space > overflow, we are going over 0xffff which apparently is > a hard limit. On x86 it is very much a hard limit. Not so on other architectures. > I imagine this is there so that inb, outw, > etc instructions can be used to reference these devices. > > But if one doesn't use such instructions (instead using > memory mapped PCI IO space), The feasibility of that is totally device dependent. PCI devices can expose all combinations of I/O and memory, and only the device driver knows which one to use how. > I see no reason why Linux and coreboot couldn't work with PCI IO > Space addresses > 0xffff. The I/O opcodes on x86 are limited to 16 bit addresses. Since this is part of the architecture, both Linux and coreboot make this assumption on x86 systems. Joe Korty wrote: > Heck, even my failing large IO configuration is not really > very large. I am putting only one expansion chassis on > the system. Either you are just totally out of luck with the I/O space situation, or there is room for improvement in coreboot. Not at all impossible. What cards did you have in this expansion chassis? Would it be possible for you to provide lspci -vv output on that system? Does the system boot if the chassis is completely empty? > I expect that we will eventually get customers that will want two > or even three expansion chassis (each chassis holds 20 PCI-e > cards). How do the chassis connect upstream, on PCI level? How does that upstream-facing component divide address space? Does it reserve a chunk for everything that can connect downstream? How big a chunk? > In one sense this is kinda exciting. Mainframes have > always been about large IO. That's what distinguishes them > from PCs. With these tweaks, PC-like machines can start > to eat away at the bottom of that market. Only if 16 bits is enough for all I/O BARs that the plugged-in cards need. Maybe the allocation algorithm in coreboot can be optimized to pack things better into those 16 bits, but worst case you've simply hit an architecture limitation with x86. :\ //Peter
On Fri, May 21, 2010 at 05:04:26PM -0400, Carl-Daniel Hailfinger wrote: > > On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote: > >> On Fri, May 14, 2010 at 1:49 PM, Joe Korty <joe.korty@ccur.com> wrote: > > I've solved this one, kind of. It is PCI IO Space > > overflow, we are going over 0xffff which apparently is > > a hard limit. I imagine this is there so that inb, outw, > > etc instructions can be used to reference these devices. > > > > But if one doesn't use such instructions (instead using > > memory mapped PCI IO space), I see no reason why Linux > > and coreboot couldn't work with PCI IO Space addresses > >> 0xffff. > > I'm interested in how you want to map port IO space to memory. > Please explain. > > AFAIK PCI register space is totally independent of port IO space which > is totally independent of memory space. You can access PCI register > space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of > any mechanisms to map IO ports to memory or the other way round. Well, all I know at this point is that the Linux kernel sources have code that maps inb etc either to the instructions or to a memory dereference, and the .config for that chooses memory dereference for x86. It's gonna be fun seeing if high-IO-address space can be made to work.. Regards, Joe
>> I'm interested in how you want to map port IO space to memory. >> Please explain. >> >> AFAIK PCI register space is totally independent of port IO space which >> is totally independent of memory space. You can access PCI register >> space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of >> any mechanisms to map IO ports to memory or the other way round. > > Well, all I know at this point is that the Linux kernel > sources have code that maps inb etc either to the > instructions or to a memory dereference, and the .config > for that chooses memory dereference for x86. > > It's gonna be fun seeing if high-IO-address space can be > made to work.. We should be able to support any mapping that you can make work in Linux. It will be fun to see. Thanks, Myles
On Fri, May 21, 2010 at 09:28:55PM -0400, Peter Stuge wrote: > Joe Korty wrote: > > I've solved this one, kind of. It is PCI IO Space > > overflow, we are going over 0xffff which apparently is > > a hard limit. > ... > What cards did you have in this expansion chassis? Would it be > possible for you to provide lspci -vv output on that system? > Does the system boot if the chassis is completely empty? Hi Peter, That particular expansion chassis load is no longer accessible to me at the moment (sent elsewhere). I'm pretty sure I can reconstruct it but I first have to scrape up the parts. The real problem is PCI-e bridges rounding everything up to 4Kbyte boundaries. Doing this for IO space is a real pain, it doesn't take very many PCI-e bridges (and every PCI-e card seems to have a bridge within it) to make us go over the 0xffff IO Space limit. I saw each Quad Ethernet taking 8K of IO space. Thus it doesn't take very many of these cards to fill up IO space. The address space allocation (from /proc/ioports, from memory) follows this pattern: a000-a01f a020-a03f b000-b01f b020-b03f From the rounding it appears that this pci-e board internally has two pci-e busses, each with two ethernets, perhaps fronted with a pci-e mux giving the board its connection to the outside world. > How do the chassis connect upstream, on PCI level? How does that > upstream-facing component divide address space? Does it reserve > a chunk for everything that can connect downstream? How big a chunk? The PCIe bridges seem to be rounding everything to 4k boundaries. I haven't found any documentation on what the PCIe standard says that limit, if any, should actually be. Regards, Joe
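The arithmetic behind Joe's observation can be sketched in a few lines of C. The 4 KiB bridge granularity and the 0xffff limit come from the thread; the function names, the macro names, and the idea of a `reserved_bytes` allowance for chipset and legacy ports are assumptions made for illustration, not coreboot's resource allocator.

```c
#include <stdint.h>

#define X86_IO_SPACE_BYTES 0x10000u /* 16-bit port addresses: 0x0000..0xffff */
#define BRIDGE_IO_GRANULE  0x1000u  /* bridge I/O windows come in 4 KiB units */

/*
 * Round the I/O a bridge must forward up to the 4 KiB granularity.
 * Assumes small inputs (no overflow handling); illustrative only.
 */
static uint32_t bridge_io_window(uint32_t bytes_needed)
{
    if (bytes_needed == 0)
        return 0; /* nothing behind the bridge uses I/O ports */
    return (bytes_needed + BRIDGE_IO_GRANULE - 1) & ~(BRIDGE_IO_GRANULE - 1);
}

/*
 * Hypothetical estimate of how many identical cards fit into x86 I/O
 * space when each card carries bridges_per_card bridges and each bridge
 * has bytes_per_bridge of port space behind it. reserved_bytes stands
 * for ports already claimed elsewhere (an assumed parameter).
 */
static uint32_t max_cards(uint32_t bridges_per_card,
                          uint32_t bytes_per_bridge,
                          uint32_t reserved_bytes)
{
    uint32_t per_card = bridges_per_card * bridge_io_window(bytes_per_bridge);

    if (reserved_bytes >= X86_IO_SPACE_BYTES)
        return 0;
    if (per_card == 0)
        return UINT32_MAX; /* cards that use no I/O ports are not limited */
    return (X86_IO_SPACE_BYTES - reserved_bytes) / per_card;
}
```

Taking the /proc/ioports pattern at face value (two bridges per card, each with 0x40 bytes of ports behind it), every card costs 2 x 0x1000 = 8K, so `max_cards(2, 0x40, 0)` is 8 even before any other ports are subtracted. That is well short of a chassis holding 20 such cards.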
pci bridges have always rounded to 4k multiples ... it's in the very earliest spec. ron
On Sat, May 22, 2010 at 12:54:08PM -0400, ron minnich wrote: > pci bridges have always rounded to 4k multiples ... it's in the very > earliest spec. Thanks. It does seem that PCI-e uses more bridges than the older PCIs; if so that would explain the excessive spreading-out of IO port addresses under PCI-e. Joe
On 22.05.2010 20:54, Joe Korty wrote: > On Sat, May 22, 2010 at 12:54:08PM -0400, ron minnich wrote: > >> pci bridges have always rounded to 4k multiples ... it's in the very >> earliest spec. >> > > Thanks. It does seem that PCI-e uses more bridges than the older PCIs; > if so that would explain the excessive spreading-out of IO port addresses > under PCI-e. > I think someone mentioned that some PCIe bridges can map IO port space to memory space to give non-x86 systems access to IO port space (many architectures do not have a separate IO port space). If you manage to find such a PCIe bridge chip with mem<->IOport mapping capability and if you can hack up Linux to use the memory accessor functions for the devices behind such a bridge, you can work around IOport resource space constraints. It would definitely be interesting to see if that is possible in practice without breaking lots of stuff. Regards, Carl-Daniel
Patch
Index: trunk/src/Kconfig
===================================================================
--- trunk.orig/src/Kconfig	2010-05-14 10:24:35.000000000 -0400
+++ trunk/src/Kconfig	2010-05-14 10:25:00.000000000 -0400
@@ -80,6 +80,17 @@
 	  Enables the use of ccache for faster builds.
 	  Requires ccache in path.
 
+config HEAP_SIZE
+	hex "Heap size (in bytes)"
+	default 0x4000
+	help
+	  The primary coreboot heap user is the PCI
+	  bus walk. Therefore heap size may need to be
+	  increased on systems that have exceptionally
+	  large and/or deep PCI device trees.
+
+	  If unsure, use the default.
+
 endmenu
 
 source src/mainboard/Kconfig
@@ -124,10 +135,6 @@
 	bool
 	default n
 
-config HEAP_SIZE
-	hex
-	default 0x4000
-
 config DEBUG
 	bool
 	default n
Index: trunk/src/lib/malloc.c
===================================================================
--- trunk.orig/src/lib/malloc.c	2010-05-14 10:24:35.000000000 -0400
+++ trunk/src/lib/malloc.c	2010-05-14 10:25:00.000000000 -0400
@@ -14,7 +14,10 @@
 {
 	void *p;
 
-	MALLOCDBG("%s Enter, size %ld, free_mem_ptr %p\n", __func__, size, free_mem_ptr);
+	MALLOCDBG("%s Enter, size %ld, %d of %d bytes available.\n",
+		__func__, size,
+		(int)(free_mem_end_ptr - free_mem_ptr),
+		(int)(&_eheap - &_heap));
 
 	/* Checking arguments */
 	if (size < 0)
Promote heap sizing to first-class Kconfig citizenship.

Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.

Also change the malloc debug message to more clearly indicate how much memory is left.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
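The reworked debug message reports free space as the distance between two pointers. A minimal sketch of that arithmetic in a bump allocator follows. The static array, `heap_available()` and `bump_alloc()` are illustrative names and not the code in src/lib/malloc.c; unlike coreboot, which halts when the heap runs out (as described in the thread), this version simply returns NULL.

```c
#include <stddef.h>
#include <stdio.h>

#define HEAP_SIZE 0x4000 /* mirrors the default value in the patch */

/* Stand-in for the heap region; the names below are illustrative. */
static unsigned char heap[HEAP_SIZE];
static unsigned char *free_mem_ptr = heap;
static unsigned char *const free_mem_end_ptr = heap + HEAP_SIZE;

/* Bytes still unallocated: end of heap minus current bump pointer. */
static size_t heap_available(void)
{
    return (size_t)(free_mem_end_ptr - free_mem_ptr);
}

/* Hand out the next `size` bytes, or NULL if they do not fit. */
static void *bump_alloc(size_t size)
{
    size_t avail = heap_available();
    void *p;

    /* Same information the patched MALLOCDBG line is meant to show. */
    printf("%s: size %zu, %zu of %d bytes available\n",
           __func__, size, avail, HEAP_SIZE);

    if (size == 0 || size > avail)
        return NULL;

    p = free_mem_ptr;
    free_mem_ptr += size;
    return p;
}
```

Because nothing is ever freed in this scheme, the available figure only shrinks, so logging it on each call shows how close a large PCI bus walk comes to exhausting the heap.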