Patchwork Add option 'compress ramstage'

Submitter Sven Schnelle
Date 2011-05-02 14:13:33
Message ID <1304345613-28138-1-git-send-email-svens@stackframe.org>
Permalink /patch/2933/
State New

Comments

Sven Schnelle - 2011-05-02 14:13:33
Add an option to make compression of ramstage configurable. Right now
it is always compressed. On my Thinkpad, the complete boot to grub takes
4s, with around 1s required for decompressing ramstage. This is probably
caused by the fact the decompression does a lot of single byte/word/qword
accesses, which are really slow on SPI buses. So give the user the option
to store ramstage uncompressed, if he has enough memory.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
---
 Makefile.inc |    4 ++++
 src/Kconfig  |    8 ++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)
Patrick Georgi - 2011-05-02 15:42:02
On 02.05.2011 16:13, Sven Schnelle wrote:
> Add an option to make compression of ramstage configurable. Right now
> it is always compressed. On my Thinkpad, the complete boot to grub takes
> 4s, with around 1s required for decompressing ramstage. This is probably
> caused by the fact the decompression does a lot of single byte/word/qword
> accesses, which are really slow on SPI buses. So give the user the option
> to store ramstage uncompressed, if he has enough memory.
> 
> Signed-off-by: Sven Schnelle <svens@stackframe.org>
Acked-by: Patrick Georgi <patrick@georgi-clan.de>
Scott - 2011-05-02 15:45:32
Sven Schnelle wrote:

]Add an option to make compression of ramstage configurable. Right now
]it is always compressed. On my Thinkpad, the complete boot to grub takes
]4s, with around 1s required for decompressing ramstage. This is probably
]caused by the fact the decompression does a lot of single byte/word/qword
]accesses, which are really slow on SPI buses. So give the user the option
]to store ramstage uncompressed, if he has enough memory.
]
]Signed-off-by: Sven Schnelle <svens@stackframe.org>

Hello Sven,

Thanks, I like having this option. For AMD Persimmon I get these
boot times for coreboot+seabios+dos ssd drive:

                                      Standard    No compress
SPI 33 MHz fast mode, prefetch        0.690       0.717
SPI 33 MHz fast mode, no prefetch     0.933       1.041
AMD Simnow                            9.0         3.0

For this project, disabling compression slows booting on real hardware
slightly, but it vastly improves SimNow boot time.

Is your SB SPI prefetch enabled? Using the cycle logging feature of
DediProg EM100 shows that for AMD, SPI reads are all dwords until the
SB SPI prefetch is enabled, at which time they become cache line reads.

Thanks,
Scott
Stefan Reinauer - 2011-05-02 18:34:42
* Sven Schnelle <svens@stackframe.org> [110502 16:13]:
> Add an option to make compression of ramstage configurable. Right now
> it is always compressed. On my Thinkpad, the complete boot to grub takes
> 4s, with around 1s required for decompressing ramstage. This is probably
> caused by the fact the decompression does a lot of single byte/word/qword
> accesses, which are really slow on SPI buses. So give the user the option
> to store ramstage uncompressed, if he has enough memory.

Hi Sven,

Can you check whether your ThinkPad boots faster if you enable SPI
prefetching in src/southbridge/intel/i82801gx/bootblock.c?

i.e.

static void enable_spi_prefetch(void)
{
        u8 reg8;
        device_t dev;

        dev = PCI_DEV(0, 0x1f, 0);

        reg8 = pci_read_config8(dev, 0xdc);
        reg8 &= ~(3 << 2);
        reg8 |= (2 << 2); /* Prefetching and Caching Enabled */
        pci_write_config8(dev, 0xdc, reg8);
}

static void bootblock_southbridge_init(void)
{
        ...
        enable_spi_prefetch();
        ...
}
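
The read-modify-write in enable_spi_prefetch() can be tried out on its own,
without PCI access. The following is only a minimal sketch: the helper name
set_spi_prefetch_mode() and the macro names are hypothetical, and the field
position (bits 3:2) and the value 2 are taken from the snippet above, not
from a datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as used in the snippet above: bits 3:2 of the byte read
 * from config offset 0xdc. The macro names are made up for this sketch. */
#define SPI_PREFETCH_SHIFT 2
#define SPI_PREFETCH_MASK  (3u << SPI_PREFETCH_SHIFT)

/* Return reg8 with the two-bit field replaced by mode.
 * All bits outside the field are left unchanged. */
uint8_t set_spi_prefetch_mode(uint8_t reg8, uint8_t mode)
{
        assert(mode <= 3);                              /* field is two bits wide */
        reg8 &= (uint8_t)~SPI_PREFETCH_MASK;            /* clear bits 3:2 */
        reg8 |= (uint8_t)(mode << SPI_PREFETCH_SHIFT);  /* insert new mode */
        return reg8;
}
```

With mode 2 ("Prefetching and Caching Enabled" in the snippet above), an
input of 0xff becomes 0xfb and an input of 0x00 becomes 0x08.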

Stefan
Peter Stuge - 2011-05-02 18:49:38
Sven Schnelle wrote:
> +++ b/src/Kconfig
..
> +	help
> +	  Compress ramstage to save memory in the flash image. Note
> +	  that decompression might slow down booting if the BIOS flash
> +	  is connected through a slow Link (i.e. SPI)

Please write "boot flash" since there may not be any BIOS.


//Peter
Eric W. Biederman - 2011-05-02 20:00:29
Sven Schnelle <svens@stackframe.org> writes:

> Add an option to make compression of ramstage configurable. Right now
> it is always compressed. On my Thinkpad, the complete boot to grub takes
> 4s, with around 1s required for decompressing ramstage. This is probably
> caused by the fact the decompression does a lot of single byte/word/qword
> accesses, which are really slow on SPI buses. So give the user the option
> to store ramstage uncompressed, if he has enough memory.

Odd. Historically this has been solved by simply putting an MTRR over
the compressed area, so that you still get full cache block transfers
during the decompression.  I am fuzzy about the appropriate mode.
Write-protect, I think.

Have you tried setting up an MTRR over the area that will be
decompressed?  That should result in something that is even faster
than copying non-compressed data.
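
For illustration, the base/mask pair for one variable-range MTRR could be
computed roughly as below. This is a sketch under assumptions, not coreboot's
code: the struct and function names are hypothetical, it only computes the
two values and does not write any MSR, and the region size and physical
address width in the usage note are assumed values for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MTRR_TYPE_WRPROT     5u            /* write-protect memory type encoding */
#define MTRR_PHYS_MASK_VALID (1ull << 11)  /* valid bit in the mask register */

/* Hypothetical container for the two values of one variable-range MTRR. */
struct var_mtrr {
        uint64_t base;  /* region base address | memory type */
        uint64_t mask;  /* address mask | valid bit */
};

/* Compute base/mask for a power-of-two sized, naturally aligned region.
 * phys_bits is the CPU's physical address width (an input assumption). */
struct var_mtrr make_var_mtrr(uint64_t addr, uint64_t size,
                              unsigned type, unsigned phys_bits)
{
        struct var_mtrr m;
        uint64_t addr_mask;

        assert(phys_bits >= 32 && phys_bits < 64);
        addr_mask = (1ull << phys_bits) - 1;

        assert(size >= 4096 && (size & (size - 1)) == 0); /* power of two */
        assert((addr & (size - 1)) == 0);                 /* naturally aligned */
        assert((addr & ~addr_mask) == 0);                 /* fits address width */

        m.base = addr | type;
        m.mask = (~(size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID;
        return m;
}
```

Assuming a 4 MiB flash window directly below 4G and a 36-bit physical
address width, make_var_mtrr(0xffc00000, 4 << 20, MTRR_TYPE_WRPROT, 36)
yields base 0xffc00005 and mask 0xfffc00800.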

Eric
Stefan Reinauer - 2011-05-02 23:02:19
* Eric W. Biederman <ebiederm@xmission.com> [110502 22:00]:
> Sven Schnelle <svens@stackframe.org> writes:
> 
> > Add an option to make compression of ramstage configurable. Right now
> > it is always compressed. On my Thinkpad, the complete boot to grub takes
> > 4s, with around 1s required for decompressing ramstage. This is probably
> > caused by the fact the decompression does a lot of single byte/word/qword
> > accesses, which are really slow on SPI buses. So give the user the option
> > to store ramstage uncompressed, if he has enough memory.
> 
> Odd. Historically this has been solved by simply putting an MTRR over
> the compressed area, so that you still get full cache block transfers
> during the decompression.  I am fuzzy about the appropriate mode.
> Write-protect, I think.
> 
> Have you tried setting up an MTRR over the area that will be
> decompressed?  That should result in something that is even faster
> than copying non-compressed data.

The problem is that the code hard-coded those values, assuming coreboot
lives in the first 1MB, which is no longer the case since we have SMM
handlers.

Stefan
Eric W. Biederman - 2011-05-02 23:26:12
Stefan Reinauer <stefan.reinauer@coreboot.org> writes:

> * Eric W. Biederman <ebiederm@xmission.com> [110502 22:00]:
>> Sven Schnelle <svens@stackframe.org> writes:
>> 
>> > Add an option to make compression of ramstage configurable. Right now
>> > it is always compressed. On my Thinkpad, the complete boot to grub takes
>> > 4s, with around 1s required for decompressing ramstage. This is probably
>> > caused by the fact the decompression does a lot of single byte/word/qword
>> > accesses, which are really slow on SPI buses. So give the user the option
>> > to store ramstage uncompressed, if he has enough memory.
>> 
>> Odd. Historically this has been solved by simply putting an MTRR over
>> the compressed area, so that you still get full cache block transfers
>> during the decompression.  I am fuzzy about the appropriate mode.
>> Write-protect, I think.
>> 
>> Have you tried setting up an MTRR over the area that will be
>> decompressed?  That should result in something that is even faster
>> than copying non-compressed data.
>
> The problem is that the code hard-coded those values, assuming coreboot
> lives in the first 1MB, which is no longer the case since we have SMM
> handlers.

Was that a destination hard code?  The code itself should come out of
the last couple of megabytes before 4G.

Regardless the performance penalty for not caching is huge fractions
of the boot time so whatever small practical issues exist we should
figure them out.

Eric
Stefan Reinauer - 2011-05-03 01:23:05
* Eric W. Biederman <ebiederm@xmission.com> [110503 01:26]:
> Was that a destination hard code?  The code itself should come out of
> the last couple of megabytes before 4G.
 
Yes. Only the lower 1MB of the destination memory was cached, while
coreboot's ram stage is now copied to 1MB.

My other mail to the list shows how to fix it.
Sven Schnelle - 2011-05-03 08:20:44
Hi Stefan, hi Eric,

Stefan Reinauer <stefan.reinauer@coreboot.org> writes:

> * Eric W. Biederman <ebiederm@xmission.com> [110503 01:26]:
>> Was that a destination hard code?  The code itself should come out of
>> the last couple of megabytes before 4G.
>  
> Yes. Only the lower 1MB of the destination memory was cached, while
> coreboot's ram stage is now copied to 1MB.

thanks for all your help. I've made both changes (enabling SPI prefetch
and setting the MTRRs right). Boot time has now decreased to 1.8s (with only
1s spent in coreboot). Decompression time is now about 100ms, which is
much better than what we had before (1.9s only for ramstage loading).

Thanks,

Sven.

Patch

diff --git a/Makefile.inc b/Makefile.inc
index 6c4a16a..6267539 100644
--- a/Makefile.inc
+++ b/Makefile.inc
@@ -85,7 +85,11 @@  cbfs-files-handler= \
 
 #######################################################################
 # a variety of flags for our build
+CBFS_COMPRESS_FLAG:=
+ifeq ($(CONFIG_COMPRESS_RAMSTAGE),y)
 CBFS_COMPRESS_FLAG:=l
+endif
+
 CBFS_PAYLOAD_COMPRESS_FLAG:=
 CBFS_PAYLOAD_COMPRESS_NAME:=none
 ifeq ($(CONFIG_COMPRESSED_PAYLOAD_LZMA),y)
diff --git a/src/Kconfig b/src/Kconfig
index 76e77f8..a782315 100644
--- a/src/Kconfig
+++ b/src/Kconfig
@@ -98,6 +98,14 @@  config USE_OPTION_TABLE
 	  Enable this option if coreboot shall read options from the "CMOS"
 	  NVRAM instead of using hard coded values.
 
+config COMPRESS_RAMSTAGE
+	bool "Compress ramstage with LZMA"
+	default y
+	help
+	  Compress ramstage to save memory in the flash image. Note
+	  that decompression might slow down booting if the BIOS flash
+	  is connected through a slow Link (i.e. SPI)
+
 endmenu
 
 source src/mainboard/Kconfig