Patchwork xcshell: serial shell for early debugging

Submitter Michael Gold
Date 2009-07-14 09:23:11
Message ID <20090714092310.GA17981@iria.rilmarder.org>
Permalink /patch/31/
State Not Applicable, archived

Comments

Michael Gold - 2009-07-14 09:23:11
I've been working on some code to ensure I can debug a system even if
coreboot fails to boot an OS.  The attached patch adds this code to the
tree:
  - serialprobe watches the serial port for about a second, and boots
    the system normally unless 32 consecutive null bytes are seen.
  - xcshell allows the CPU to be controlled via the serial port, using
    a binary protocol.
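
The serialprobe trigger rule can be modelled on the host side. The Python sketch below is an illustration only and is not part of the patch: `probe_triggered` is a hypothetical helper, and it treats each poll of the port as either a received byte or `None` (nothing available), which mirrors the countdown done in serialprobe.inc.

```python
from typing import Iterable, Optional

TRIGGER = 32  # consecutive null bytes required (CONFIG_SERIALPROBE_TRIGGER default)


def probe_triggered(polls: Iterable[Optional[int]], trigger: int = TRIGGER) -> bool:
    """Return True if `trigger` null bytes arrive in a row.

    Each item in `polls` models one polling attempt: an int is a received
    byte, None means no byte was available on that attempt.
    """
    needed = trigger
    for byte in polls:
        if byte is None:
            continue          # nothing received; the count is left alone
        if byte != 0:
            needed = trigger  # any non-null byte restarts the count
            continue
        needed -= 1
        if needed == 0:
            return True       # enter the shell
    return False              # attempts exhausted: boot normally
```

A controller that wants the shell would therefore send at least 32 null bytes shortly after reset and nothing else in between.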

The code assembles to about 1 KB, so it can be left in an image for
emergency use.  It's also modular in case someone wants to use it with
something other than a serial port.

xcsinterp.inc describes the protocol, and xccmd.inc describes the
command set.

Signed-off-by: Michael Gold <mgold@ncf.ca>
---
It should be possible to jump to the 'xcshell' or 'serialprobe' symbol
(with the return address in ESP) as soon as the serial port is
configured and the CPU is executing 32-bit code (e.g., after
console_init in auto.c).  I'm using these lines in my mainboard's
Config.lb to include the code:
  mainboardinit arch/i386/xcshell/jmp_xccmd_end.inc
  mainboardinit arch/i386/xcshell/serialprobe.inc
  mainboardinit arch/i386/xcshell/io_serial.inc
  mainboardinit arch/i386/xcshell/xcsinterp.inc
  mainboardinit arch/i386/xcshell/xccmd.inc

I'm not sure whether there's anything that needs to be added, but
suggestions are welcome.  I'm hoping the command set is sufficient to
initialise RAM, though so far I've only tested cache-as-RAM mode
(which just seems to work for data, not code).
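
For reference, the request layout documented in xcsinterp.inc (sequence byte, command byte, two little-endian operands, checksum) can be assembled on the controlling host roughly as follows. This is a sketch rather than the author's tooling: `build_request` and its parameters are hypothetical names, and only the case where both operands are sent is shown.

```python
import struct


def build_request(seq: int, cmd: int, op1: int, op2: int, want_acc: bool = False) -> bytes:
    """Frame one xcshell request that carries both operands."""
    if not 1 <= seq <= 0x3F:
        raise ValueError("sequence ID must be in 1..63")
    if not 0 <= cmd <= 0x3F:
        raise ValueError("command must be in 0..63")
    seq_byte = seq | (0x40 if want_acc else 0x00)   # bits 7..6 = 01: print accumulator
    cmd_byte = cmd                                  # bits 7..6 = 00: both operands follow
    body = struct.pack("<BBII", seq_byte, cmd_byte,
                       op1 & 0xFFFFFFFF, op2 & 0xFFFFFFFF)
    checksum = 0xFF ^ (sum(body) & 0xFF)            # 0xFF XORed with the byte sum
    return body + bytes([checksum])
```

With this framing, the byte values of a valid 11-byte request always sum to 0xFF modulo 256, which is the condition the interpreter checks before running the command.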

-- Michael
ron minnich - 2009-07-14 15:38:15
I like it. We talked about something like this in v3. How would you
feel about putting in emergency flash upgrade support :-)

Acked-by: Ronald G. Minnich <rminnich@gmail.com>
Stefan Reinauer - 2009-07-14 16:20:56
On 14.07.2009 11:23, Michael Gold wrote:
> I've been working on some code to ensure I can debug a system even if
> coreboot fails to boot an OS.  The attached patch adds this code to the
> tree:
>   - serialprobe watches the serial port for about a second, and boots
>     the system normally unless 32 consecutive null bytes are seen.
>   - xcshell allows the CPU to be controlled via the serial port, using
>     a binary protocol.
>
> The code assembles to about 1 KB, so it can be left in an image for
> emergency use.  It's also modular in case someone wants to use it with
> something other than a serial port.
>
> xcsinterp.inc describes the protocol, and xccmd.inc describes the
> command set.
>
>   

Hi,

out of curiosity..

Would it be possible to enhance the already existing "llshell" with your
features? I would want to prevent more than one assembler written shell
of this type floating around in the coreboot tree.

Best wishes,

Stefan
Michael Gold - 2009-07-14 16:55:20
On Tue, Jul 14, 2009 at 08:38:15 -0700, ron minnich wrote:
> I like it. We talked about something like this in v3. How would you
> feel about putting in emergency flash upgrade support :-)

I like the flash upgrade idea, but I'm not sure how to do it yet (in a
generic way).  Any suggestions?

I think it might be possible without any changes by marking the part of
the ROM containing xcshell as cacheable, and then using pokes to rewrite
the rest of the chip.  It would be slow, but a small flash upgrade
program could be uploaded that would do the rest of the work more
quickly.

My original idea was that one would set up CAR, upload code, and jump to
it; this would allow the system to be extended without having to include
much code.  It doesn't work, at least on my Pentium 3, but the same
thing could be done with RAM once it's initialised (or with flash for
occasional use).

-- Michael
Michael Gold - 2009-07-14 17:31:56
On Tue, Jul 14, 2009 at 18:20:56 +0200, Stefan Reinauer wrote:
> Hi,
> 
> out of curiosity..
> 
> Would it be possible to enhance the already existing "llshell" with your
> features? I would want to prevent more than one assembler written shell
> of this type floating around in the coreboot tree.

I considered extending it, but I didn't like how closely the command
implementations are tied to the protocol.  xcshell could be easily
extended to execute opcodes from RAM (or CAR), for instance, which would
be difficult with llshell; or, someone may want to define a new protocol
that works with a USB debug device.

The llshell protocol is also designed for direct human use, which would
make it more difficult for a remote script to control it and ensure the
commands are being received/executed properly.  Programmatic control of
xcshell is simple, and the sequence numbers and checksums allow errors
to be detected and recovered from.
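
The error detection mentioned here can be sketched from the response format documented in xcsinterp.inc: one status byte, optionally followed by the accumulator and its checksum. The helper name `parse_response` and its return shape are assumptions made for illustration, not an existing tool.

```python
import struct
from typing import Optional, Tuple


def parse_response(data: bytes, seq: int, want_acc: bool) -> Tuple[bool, Optional[int]]:
    """Decode one xcshell response.

    Returns (ok, accumulator). `accumulator` is None unless printing was
    requested and the shell accepted the request.
    """
    if not data:
        raise ValueError("empty response")
    status = data[0]
    if status & 0x80 == 0:
        raise ValueError("reserved flag bits in response")
    if status & 0x3F != seq & 0x3F:
        raise ValueError("sequence ID mismatch")
    if status & 0x40:
        return False, None          # flags 11: protocol error reported by the shell
    if not want_acc:
        return True, None
    if len(data) < 6:
        raise ValueError("truncated accumulator payload")
    payload, checksum = data[1:5], data[5]
    if (sum(payload) + checksum) & 0xFF != 0xFF:
        raise ValueError("accumulator checksum mismatch")
    return True, struct.unpack("<I", payload)[0]
```

A controller could resend the request with the same sequence ID whenever this returns `(False, None)` or raises, since the shell does not execute a command whose checksum failed.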

I also noticed that xcshell implements all the commands needed for
SerialICE; I haven't tried using them together, but I probably will at
some point.

-- Michael
ron minnich - 2009-07-14 18:03:06
Ah, well, stefan, sorry if I messed up.

I will agree that we don't want to go the linux path of fifty slightly
different versions of the same capability. It's bad enough in a kernel
but far worse in coreboot. So, maybe we can take it on ourselves to
figure out how to get one thing that does what we want?

ron
Carl-Daniel Hailfinger - 2009-07-14 19:51:51
On 14.07.2009 18:55, Michael Gold wrote:
> On Tue, Jul 14, 2009 at 08:38:15 -0700, ron minnich wrote:
>   
>> I like it. We talked about something like this in v3. How would you
>> feel about putting in emergency flash upgrade support :-)
>>     
>
> I like the flash upgrade idea, but I'm not sure how to do it yet (in a
> generic way).  Any suggestions?
>   

If you have RAM, downloading a flashrom binary linked against libpayload
would be best. The good news is that flashrom itself is written in a way
that allows you to put it on top of any backend, even a backend which
does nothing but perform remote control of xcshell. I think that would
be easiest.

> I think it might be possible without any changes by marking the part of
> the ROM containing xcshell as cacheable, and then using pokes to rewrite
> the rest of the chip.  It would be slow, but a small flash upgrade
> program could be uploaded that would do the rest of the work more
> quickly.
>   

If you have a generic "write X to location y" interface, flashrom can
deal with that, even remotely.

> My original idea was that one would set up CAR, upload code, and jump to
> it; this would allow the system to be extended without having to include
> much code.  It doesn't work, at least on my Pentium 3, but the same
> thing could be done with RAM once it's initialised (or with flash for
> occasional use).
>   

As long as RAM works, we could even upload a complete flashrom binary
over zmodem.

Regards,
Carl-Daniel
Carl-Daniel Hailfinger - 2009-07-14 19:54:30
On 14.07.2009 11:23, Michael Gold wrote:
> I've been working on some code to ensure I can debug a system even if
> coreboot fails to boot an OS.  The attached patch adds this code to the
> tree:
>   - serialprobe watches the serial port for about a second, and boots
>     the system normally unless 32 consecutive null bytes are seen.
>   - xcshell allows the CPU to be controlled via the serial port, using
>     a binary protocol.
>
> The code assembles to about 1 KB, so it can be left in an image for
> emergency use.  It's also modular in case someone wants to use it with
> something other than a serial port.
>   

How does this relate to RemoteBIOS? AFAIK RemoteBIOS is explicitly
designed to be machine controllable even if CAR does not work.
http://linuxupc.upc.es/~urbez/RemoteBIOS.eng.html

Regards,
Carl-Daniel
Rudolf Marek - 2009-07-14 21:08:03
> As long as RAM works, we could even upload a complete flashrom binary
> over zmodem.

You don't need RAM, because if you just make remote calls via flashrom you will
write directly to the SPI controller, for example ;)

Rudolf
Michael Gold - 2009-07-14 22:39:09
On Tue, Jul 14, 2009 at 21:54:30 +0200, Carl-Daniel Hailfinger wrote:
> How does this relate to RemoteBIOS? AFAIK RemoteBIOS is explicitly
> designed to be machine controllable even if CAR does not work.
> http://linuxupc.upc.es/~urbez/RemoteBIOS.eng.html

I may have seen that page before, but I'm not really familiar with the
project.  The command set looks pretty similar.  Like the SerialICE
shell, though, it appears to require 64 KB of ROM space, which means
it's unlikely to be included in an image when it's not being actively
used.  I've already had some trouble with image size when including both

-- Michael
Michael Gold - 2009-07-14 23:17:40
On Tue, Jul 14, 2009 at 21:51:51 +0200, Carl-Daniel Hailfinger wrote:
> > I think it might be possible without any changes by marking the part of
> > the ROM containing xcshell as cacheable, and then using pokes to rewrite
> > the rest of the chip.  It would be slow, but a small flash upgrade
> > program could be uploaded that would do the rest of the work more
> > quickly.
> >   
> 
> If you have a generic "write X to location y" interface, flashrom can
> deal with that, even remotely.

That interface exists.  I guess you'd need to execute xcshell from RAM,
or use the --estart and --eend options to exclude the portion of the ROM
it's executing from.
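
As an illustration of such a "write X to location Y" backend, the sketch below turns a byte buffer into a stream of pokeb (0x17) requests. The name `pokeb_requests` is hypothetical, the framing follows the protocol comment in xcsinterp.inc, and the code only frames raw writes: any chip-specific programming sequence would still have to come from the caller.

```python
import struct
from typing import Iterator

CMD_POKEB = 0x17  # MEM[OP1] := OP2 (byte), per the command list in xccmd.inc


def pokeb_requests(addr: int, data: bytes) -> Iterator[bytes]:
    """Yield one framed pokeb request per byte of `data`, starting at `addr`."""
    for offset, value in enumerate(data):
        seq = (offset % 0x3F) + 1   # cycle through 1..63; 0 is not a valid ID
        body = struct.pack("<BBII", seq, CMD_POKEB,
                           (addr + offset) & 0xFFFFFFFF, value)
        yield body + bytes([0xFF ^ (sum(body) & 0xFF)])
```

Each yielded request would be written to the serial port in turn, with the one-byte response checked before the next one is sent.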

> 
> > My original idea was that one would set up CAR, upload code, and jump to
> > it; this would allow the system to be extended without having to include
> > much code.  It doesn't work, at least on my Pentium 3, but the same
> > thing could be done with RAM once it's initialised (or with flash for
> > occasional use).
> >   
> 
> As long as RAM works, we could even upload a complete flashrom binary
> over zmodem.

I don't have zmodem support currently.  It would be easy to add a
command that writes both 32-bit operands to RAM, using the accumulator
as the pointer (auto-incremented afterwards).  Then only 3 of 11 bytes
would be protocol overhead, which should be fine if the binary isn't too
large.  Adding an automatic increment to peekl and pokel might be useful
as well.
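
As a rough check on the overhead figure: an 11-byte request carrying two 32-bit operands moves 8 payload bytes. The sketch below estimates the upload time for the proposed command; the 115200 baud 8N1 line (10 bits on the wire per byte) and the 64 KiB image size are assumptions chosen for illustration, and response latency is ignored.

```python
import math

REQUEST_BYTES = 11   # sequence, command, two 4-byte operands, checksum
PAYLOAD_BYTES = 8    # both operands carry data in the proposed command


def upload_seconds(image_bytes: int, baud: int = 115200, bits_per_byte: int = 10) -> float:
    """Estimate the time to push `image_bytes` through the proposed command."""
    requests = math.ceil(image_bytes / PAYLOAD_BYTES)
    wire_bytes = requests * REQUEST_BYTES
    return wire_bytes * bits_per_byte / baud


efficiency = PAYLOAD_BYTES / REQUEST_BYTES   # about 0.73
```

Under these assumptions `upload_seconds(64 * 1024)` comes to a little under 8 seconds, so the protocol overhead matters far less than the line speed.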

-- Michael

Patch

Index: src/arch/i386/xcshell/serialprobe.inc
===================================================================
--- src/arch/i386/xcshell/serialprobe.inc	(revision 0)
+++ src/arch/i386/xcshell/serialprobe.inc	(revision 0)
@@ -0,0 +1,73 @@ 
+/*
+ * This file is part of the coreboot project.
+ *
+ * (C) 2009 Michael Gold <mgold@ncf.ca>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ * MA 02110-1301 USA
+ */
+
+/*
+ * Watch for CONFIG_SERIALPROBE_TRIGGER '\0' characters in a row.
+ * If this occurs within CONFIG_SERIALPROBE_WAIT tries, execute the code
+ * immediately following this file; otherwise return to ESP.
+ *
+ * Clobbers EBX, ESI, EBP.
+ */
+
+#ifndef CONFIG_SERIALPROBE_WAIT
+#define CONFIG_SERIALPROBE_WAIT 0x22aa99
+#endif
+#ifndef CONFIG_SERIALPROBE_TRIGGER
+#define CONFIG_SERIALPROBE_TRIGGER 32
+#endif
+
+.globl serialprobe
+
+serialprobe:
+	mov %esp, %ebp
+	mov %eax, %esi
+	mov $CONFIG_SERIALPROBE_WAIT, %ebx
+_sprobe_restart:
+	/* AH tracks the number of consecutive null bytes still needed */
+	mov $CONFIG_SERIALPROBE_TRIGGER, %ah
+
+	/* wait for a character */
+_sprobe_attempt:
+	dec %ebx
+	/* give up after CONFIG_SERIALPROBE_WAIT attempts */
+	js _sprobe_out
+
+	mov $1f, %esp
+	jmp getchar_try  /* clobbers the top 16 bits of EAX */
+1:	jz _sprobe_attempt
+
+	/* got a character; if it wasn't '\0', restart the count */
+	test $0xff, %al
+	jnz _sprobe_restart
+
+	/* got '\0'; we can declare success when AH reaches 0 */
+	dec %ah
+	jnz _sprobe_attempt
+
+_sprobe_out:
+	test $0xff, %ah
+	/* restore original register values */
+	mov %ebp, %esp
+	mov %esi, %eax
+	jz __serialprobe_end    /* triggered */
+	jmp *%esp               /* not triggered */
+
+__serialprobe_end:
Index: src/arch/i386/xcshell/jmp_xccmd_end.inc
===================================================================
--- src/arch/i386/xcshell/jmp_xccmd_end.inc	(revision 0)
+++ src/arch/i386/xcshell/jmp_xccmd_end.inc	(revision 0)
@@ -0,0 +1,2 @@ 
+__jmp_xccmd_end:
+	jmp __xccmd_end
Index: src/arch/i386/xcshell/xccmd.inc
===================================================================
--- src/arch/i386/xcshell/xccmd.inc	(revision 0)
+++ src/arch/i386/xcshell/xccmd.inc	(revision 0)
@@ -0,0 +1,532 @@ 
+/*
+ * This file is part of the coreboot project.
+ *
+ * (C) 2009 Michael Gold <mgold@ncf.ca>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ * MA 02110-1301 USA
+ */
+
+/*
+ * This file contains the xcode_exec function and its supported commands.
+ */
+
+.globl xcode_init
+.globl xcode_exec
+
+__xccmd_start:
+	jmp __xccmd_end
+
+/* 0x01: Jump to OP1; if OP2 is nonzero, swap it with ESP first
+ *       (i.e., ECX will provide the return address). */
+xccmd_call:
+	test %ecx, %ecx
+	jz 1f
+	xchg %ecx, %esp
+1:	jmp *%edx
+
+/* 0x02: ACC := MEM[OP1] (dword) */
+xccmd_peekl:
+	mov (%edx), %edi
+	jmp *%esp
+
+/* 0x03: MEM[OP1] := OP2 (dword) */
+xccmd_pokel:
+	mov %ecx, (%edx)
+	jmp *%esp
+
+/* 0x04: PCICONF[OP1] := OP2 */
+xccmd_pokepci:
+	/* TODO: translate the request to the alternate PCI access method,
+	 *       if necessary */
+	xchg %edx, %eax
+
+	/* write OP1 to 0xcf8 */
+	mov $0xcf8, %dx
+	out %eax, (%dx)
+
+	/* write OP2 to 0xcfc */
+	xchg %ecx, %eax
+	mov $0xfc, %dl
+	out %eax, (%dx)
+	xchg %ecx, %eax
+
+	xchg %eax, %edx
+	jmp *%esp
+
+/* 0x05: ACC := PCICONF[OP1] */
+xccmd_peekpci:
+	xchg %edx, %eax
+
+	/* write OP1 to 0xcf8 */
+	mov $0xcf8, %dx
+	out %eax, (%dx)
+
+	/* read 0xcfc to accumulator */
+	xchg %eax, %edi  /* save OP1 in EDI */
+	mov $0xfc, %dl
+	in (%dx), %eax
+	xchg %eax, %edi  /* store the result in ACC */
+
+	xchg %eax, %edx  /* restore OP1 */
+	jmp *%esp
+
+/* 0x06: ACC := (ACC & OP1) | OP2 */
+xccmd_andor:
+	and %edx, %edi
+	or %ecx, %edi
+	jmp *%esp
+
+/* 0x0a: Swap ACC with ACC2 */
+xccmd_swap_a1a2:
+	xchg %esi, %edi
+	jmp *%esp
+
+/* 0x0b: Swap ACC with OP1, ACC2 with OP2 */
+xccmd_swap_ao:
+	xchg %edi, %edx
+	xchg %esi, %ecx
+	jmp *%esp
+
+/* 0x0c: Execute CPUID with EAX=OP1, ECX=OP2 */
+xccmd_cpuid:
+	xchg %edx, %eax
+	mov %ebx, %esi   /* save old EBX value */
+	cpuid
+	/* Store EAX,EBX,ECX,EDX in ACC,ACC2,OP2,OP1 (EDI,ESI,ECX,EDX) */
+	xchg %ebx, %esi
+	xchg %eax, %edi
+	jmp *%esp
+
+/* 0x10: ACC2 := (ACC2 & OP1) | OP2 */
+xccmd_andor_a2:
+	and %edx, %esi
+	or %ecx, %esi
+	jmp *%esp
+
+/* 0x11: PORT[OP1] (byte) := OP2 */
+xccmd_outb:
+	mov %cl, %al
+	out %al, (%dx)
+	jmp *%esp
+
+/* 0x12: ACC := PORT[OP1] (byte) */
+xccmd_inb:
+	in (%dx), %al
+	movzx %al, %edi
+_xc_ret_esp:
+	jmp *%esp
+
+/* 0x07: Execute opcode OP1 with operands OP2, ACC */
+xccmd_indirect:
+	test $0xffffff00, %edx
+	jnz _xc_ret_esp  /* opcode out of range */
+
+	mov %ecx, %edx   /* set OP1=OP2 */
+	mov %edi, %ecx   /* set OP2=ACC */
+
+/* fall through to xcode_exec *****************************************/
+
+xcode_exec:
+	/* Execute the given command.
+	 *  AL = command ID (0-63)
+	 *  EDX = operand 1
+	 *  ECX = operand 2
+	 *  ESP = return address
+	 * (EDI = accumulator, ESI = accumulator 2)
+	 *
+	 * EAX will be 0 when the command is executed.
+	 *
+	 * (This code is located in the middle of the file so short jumps
+	 * can be used as much as possible.)
+	 */
+
+	/* determine the command */
+	/* using EAX instead of AL reduces the size of the DEC instruction */
+	and $0x3f, %eax
+	dec %eax
+	jz xccmd_call         /*0x01*/
+	dec %eax
+	jz xccmd_peekl        /*0x02*/
+	dec %eax
+	jz xccmd_pokel        /*0x03*/
+	dec %eax
+	jz xccmd_pokepci      /*0x04*/
+	dec %eax
+	jz xccmd_peekpci      /*0x05*/
+	dec %eax
+	jz xccmd_andor        /*0x06*/
+	dec %eax
+	jz xccmd_indirect     /*0x07*/
+	/* 0x08(BNE), 0x09(BRA) not implemented */
+	sub $(0x0a-0x07), %al
+	jz xccmd_swap_a1a2    /*0x0a*/
+	dec %eax
+	jz xccmd_swap_ao      /*0x0b*/
+	dec %eax
+	jz xccmd_cpuid        /*0x0c*/
+	dec %eax
+	jz xccmd_rdmsr        /*0x0d*/
+	dec %eax
+	jz xccmd_wrmsr        /*0x0e*/
+	dec %eax
+	jz xccmd_rdpmc        /*0x0f*/
+	dec %eax
+	jz xccmd_andor_a2     /*0x10*/
+	dec %eax
+	jz xccmd_outb         /*0x11*/
+	dec %eax
+	jz xccmd_inb          /*0x12*/
+	dec %eax
+	jz xccmd_outw         /*0x13*/
+	dec %eax
+	jz xccmd_inw          /*0x14*/
+	dec %eax
+	jz xccmd_outl         /*0x15*/
+	dec %eax
+	jz xccmd_inl          /*0x16*/
+	dec %eax
+	jz xccmd_pokeb        /*0x17*/
+	dec %eax
+	jz xccmd_pokew        /*0x18*/
+	sub $(0x1b-0x18), %al
+	jz xccmd_invd         /*0x1b*/
+	dec %eax
+	jz xccmd_reset        /*0x1c*/
+	sub $(0x2a-0x1c), %al
+	jz xccmd_memscan_eq   /*0x2a*/
+	dec %eax
+	jz xccmd_memscan_ne   /*0x2b*/
+	dec %eax
+	jz xccmd_memread      /*0x2c*/
+	dec %eax
+	jz xccmd_memset       /*0x2d*/
+	dec %eax
+	jz xccmd_exit         /*0x2e (a.k.a. 0xee)*/
+	dec %eax
+	jz xccmd_regmove      /*0x2f*/
+	/* 0x00 and 0x3f are NOPs, along with any unknown commands */
+	jmp *%esp
+/**************/
+
+/* 0x0d: Read MSR OP1 to ACC2:ACC */
+xccmd_rdmsr:
+	xchg %edx, %ecx
+	xchg %esi, %edx
+	xchg %edi, %eax
+	rdmsr            /* EDX:EAX = MSR[ECX] */
+_xcmsr_out:
+	xchg %eax, %edi
+	xchg %edx, %esi
+	xchg %ecx, %edx
+	jmp *%esp
+
+/* 0x0e: Write ACC2:ACC to MSR OP1 */
+xccmd_wrmsr:
+	xchg %edx, %ecx
+	xchg %esi, %edx
+	xchg %edi, %eax
+	wrmsr            /* MSR[ECX] = EDX:EAX */
+	jmp _xcmsr_out
+
+/* 0x0f: Read PMC OP1 to ACC2:ACC */
+xccmd_rdpmc:
+	xchg %edx, %ecx
+	xchg %esi, %edx
+	xchg %edi, %eax
+	rdpmc            /* EDX:EAX = PMC[ECX] */
+	jmp _xcmsr_out
+
+/* 0x13: PORT[OP1] (word) := OP2 */
+xccmd_outw:
+	mov %ecx, %eax
+	out %ax, (%dx)
+	jmp *%esp
+
+/* 0x14: ACC := PORT[OP1] (word) */
+xccmd_inw:
+	in (%dx), %ax
+	movzx %ax, %edi
+	jmp *%esp
+
+/* 0x15: PORT[OP1] (dword) := OP2 */
+xccmd_outl:
+	mov %ecx, %eax
+	out %eax, (%dx)
+	jmp *%esp
+
+/* 0x16: ACC := PORT[OP1] (dword) */
+xccmd_inl:
+	in (%dx), %eax
+	xchg %eax, %edi
+	jmp *%esp
+
+/* 0x17: MEM[OP1] := OP2 (byte) */
+xccmd_pokeb:
+	mov %cl, (%edx)
+	jmp *%esp
+
+/* 0x18: MEM[OP1] := OP2 (word) */
+xccmd_pokew:
+	mov %cx, (%edx)
+	jmp *%esp
+
+/* 0x1b: invalidate caches; write back to memory first if OP1 is nonzero */
+xccmd_invd:
+	test %edx, %edx
+	jnz 1f
+	invd
+1:	wbinvd
+	jmp *%esp
+
+/* 0x1c: reset (by causing a triple fault) */
+xccmd_reset:
+	lidt %cs:1f
+	int3
+	jmp *%esp        /* shouldn't be executed */
+1:	.word 0
+
+/* 0x2e: exit the interpreter loop */
+xccmd_exit:
+	/* the interpreter will exit when it sees ESP=0xffffffff */
+	xchg %esp, %eax  /* set ESP=0 */
+	dec %esp
+	jmp *%eax
+
+/* 0x2a: Scan memory block in dwords, until the data is equal to ACC;
+ *       sets ACC2=ACC, and ACC to the last dword read.
+ *   OP1 - starting address (at exit, points to the next address that
+ *         would have been read)
+ *   OP2 - bits 31..25: reserved; must be 0
+ *         bit 24: direction flag (0=forwards, 1=backwards)
+ *         bits 23..0: unsigned dword count (0 = 2**24)
+ */
+xccmd_memscan_eq:
+	or $0x10, %al
+	/* fall through to memscan_ne */
+
+/* 0x2b: Scan memory block in dwords, as long as the data is equal to ACC;
+ *       sets ACC2=ACC, and ACC to the last dword read.
+ *       See memscan_eq for parameters. */
+xccmd_memscan_ne:
+	or $0x80, %al
+	/* fall through to memread */
+
+/* 0x2c: Read memory block in dwords, setting ACC to the last dword read.
+ *       See memscan_eq for parameters. */
+xccmd_memread:
+	or $0x40, %al
+	/* fall through to memset */
+
+/* 0x2d: Write ACC across memory block.  See memscan_eq for parameters. */
+xccmd_memset:
+	/* EAX is 0 unless coming through memread/memscan */
+	/* set EDX=EAX, ESI=OP1(addr), EDI=ACC2, EAX=ACC */
+	xchg %eax, %edx
+	xchg %eax, %esi
+	xchg %eax, %edi
+
+	rol $8, %ecx
+	or %ecx, %edx     /* top 24 bits of EDX were 0 */
+	test $0xfe, %cl
+	jnz _xcmem_out    /* reserved bits were set */
+
+	shr $8, %ecx
+	jnz 1f
+	/* count is zero; change to maximum value */
+	bts $24, %ecx
+1:
+	cld
+	test $0x01, %dl
+	jz 1f
+	std
+1:
+	test $0xc0, %dl
+	js _xcmem_scan
+	jnz _xcmem_read
+_xcmem_write:
+	xchg %edi, %esi
+	rep stosl %eax, %es:(%edi)
+	xchg %edi, %esi
+	/* ECX=0; fall through since read won't do anything */
+_xcmem_read:
+	rep lodsl %ds:(%esi), %eax
+	jmp _xcmem_out
+_xcmem_scan:
+	xchg %eax, %edi  /* set ACC2 to original ACC value */
+	test $0x10, %dl
+	jnz 2f
+1:	lodsl %ds:(%esi), %eax   /* memscan_ne */
+	cmp %eax, %edi
+	loopel 1b
+	jmp _xcmem_out
+2:	lodsl %ds:(%esi), %eax   /* memscan_eq */
+	cmp %eax, %edi
+	loopnel 2b
+_xcmem_out:
+	cld
+	and $0x01, %dl
+	ror $8, %edx
+	mov %edx, %ecx   /* restore original OP2 value */
+	xchg %eax, %edi  /* ACC */
+	xchg %eax, %esi  /* ACC2 */
+	xchg %eax, %edx  /* OP1 (now the ending address) */
+	jmp *%esp
+
+
+/* 0x2f: Load register OP1 into ACC, then store ACC into register OP2 */
+xccmd_regmove:
+	/* If -1, no load/store is performed; otherwise,
+	 *  0x00..0x07 = {EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI}
+	 *  0x08..0x0f = { ES,  CS,  SS,  DS,  FS,  GS, res, GDTR (indirect)}
+	 *  0x10..0x14 = {CR0, res, CR2, CR3, CR4}
+	 */
+
+	/*** part 1: load EDI ***/
+
+	xchg %ecx, %eax  /* set EAX=ECX, ECX=0 */
+	or %edx, %ecx
+
+	jnz 1f
+	mov %eax, %edi  /* 0x00 */
+1:
+	loopl 1f
+	mov %ecx, %edi  /* 0x01 */
+1:
+	loopl 1f
+	mov %edx, %edi  /* 0x02 */
+1:
+	loopl 1f
+	mov %ebx, %edi  /* 0x03 */
+1:
+	loopl 1f
+	mov %esp, %edi  /* 0x04 */
+1:
+	loopl 1f
+	mov %ebp, %edi  /* 0x05 */
+1:
+	loopl 1f
+	mov %esi, %edi  /* 0x06 */
+1:
+	dec %ecx        /* 0x07 (EDI) */
+	loopl 1f
+	mov %es, %edi   /* 0x08 */
+1:
+	loopl 1f
+	mov %cs, %edi   /* 0x09 */
+1:
+	loopl 1f
+	mov %ss, %edi   /* 0x0a */
+1:
+	loopl 1f
+	mov %ds, %edi   /* 0x0b */
+1:
+	loopl 1f
+	mov %fs, %edi   /* 0x0c */
+1:
+	loopl 1f
+	mov %gs, %edi   /* 0x0d */
+1:
+	dec %ecx        /* 0x0e reserved */
+	loopl 1f
+	lgdt (%edi)     /* 0x0f: GDTR is loaded from the descriptor at (EDI) */
+1:
+	loopl 1f
+	mov %cr0, %edi  /* 0x10 */
+1:
+	dec %ecx        /* 0x11: reserved */
+	loopl 1f
+	mov %cr2, %edi  /* 0x12 */
+1:
+	loopl 1f
+	mov %cr3, %edi  /* 0x13 */
+1:
+	loopl 1f
+	mov %cr4, %edi  /* 0x14 */
+1:
+
+	/*** part 2: store EDI ***/
+
+	xor %ecx, %ecx
+	or %eax, %ecx   /* set ECX=OP2 */
+
+	jnz 1f
+	/* need to store into EAX, which contains the original ECX value */
+	xchg %eax, %ecx
+	mov %edi, %eax  /* 0x00 */
+	jmp *%esp
+1:
+	loopl 1f
+	mov %edi, %ecx  /* 0x01 */
+	jmp *%esp       /* exit without decrementing ECX */
+1:
+	loopl 1f
+	mov %edi, %edx  /* 0x02 */
+1:
+	loopl 1f
+	mov %edi, %ebx  /* 0x03 */
+1:
+	loopl 1f
+	xchg %eax, %ecx
+	xchg %esp, %eax
+	mov %edi, %esp  /* 0x04 (ESP) */
+	jmp *%eax
+1:
+	loopl 1f
+	mov %edi, %ebp  /* 0x05 */
+1:
+	loopl 1f
+	mov %edi, %esi  /* 0x06 */
+1:
+	dec %ecx        /* 0x07 (EDI) */
+	loopl 1f
+	mov %edi, %es   /* 0x08 */
+1:
+	dec %ecx        /* 0x09 (CS - must use far jmp to load) */
+	loopl 1f
+	mov %edi, %ss   /* 0x0a */
+1:
+	loopl 1f
+	mov %edi, %ds   /* 0x0b */
+1:
+	loopl 1f
+	mov %edi, %fs   /* 0x0c */
+1:
+	loopl 1f
+	mov %edi, %gs   /* 0x0d */
+1:
+	dec %ecx        /* 0x0e reserved */
+	loopl 1f
+	sgdt (%edi)     /* 0x0f: GDTR is stored to the buffer at (EDI) */
+1:
+	loopl 1f
+	mov %edi, %cr0  /* 0x10 */
+1:
+	dec %ecx        /* 0x11: reserved */
+	loopl 1f
+	mov %edi, %cr2  /* 0x12 */
+1:
+	loopl 1f
+	mov %edi, %cr3  /* 0x13 */
+1:
+	loopl 1f
+	mov %edi, %cr4  /* 0x14 */
+1:
+	/* done, restore OP2 */
+	xchg %eax, %ecx
+	jmp *%esp
+
+
+__xccmd_end:
Index: src/arch/i386/xcshell/xcsinterp.inc
===================================================================
--- src/arch/i386/xcshell/xcsinterp.inc	(revision 0)
+++ src/arch/i386/xcshell/xcsinterp.inc	(revision 0)
@@ -0,0 +1,221 @@ 
+/*
+ * This file is part of the coreboot project.
+ *
+ * (C) 2009 Michael Gold <mgold@ncf.ca>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ * MA 02110-1301 USA
+ */
+
+/*
+ * xcshell is a shell that can be executed before memory is initialised,
+ * and which communicates with a controlling process over the serial port
+ * (or another device with compatible getchar/putchar functions).
+ *
+ * It uses a binary protocol based on the Xbox "xcodes"#, but lacks
+ * flow control and sandboxing.
+ *  # www.xbox-linux.org/wiki/The_Hidden_Boot_Code_of_the_Xbox#The_Xcodes
+ *
+ * Clobbers EBX, ESI, EBP (and possibly others, if commands are executed).
+ * This file contains the main loop, and xccmd.inc contains the commands.
+ */
+
+/* Register usage:
+ *  EAX = temporary (can be clobbered by commands)
+ *  ECX = command operand 2 (usually preserved by commands)
+ *  EDX = command operand 1 (usually preserved by commands)
+ *  EBX = reserved for interpreter loop (temporary)
+ *  ESP = temporary (for return addresses)
+ *  EBP = reserved for interpreter loop (saved ESP)
+ *  ESI = accumulator 2
+ *  EDI = accumulator 1
+ */
+
+/* Protocol:
+ *  Request (controller -> shell)
+ *    1 byte: sequence ID
+ *            bits 7..6: 00 = execute command without printing
+ *                       01 = execute command and print accumulator
+ *                       10, 11 reserved
+ *            bits 5..0: sequence ID (arbitrary, but must be nonzero)
+ *    1 byte: command and flags
+ *            bits 7..6: 00 = read both operands
+ *                       01 = no operands
+ *                       10 = read OP2 only
+ *                       11 = read OP1 only
+ *            bits 5..0: command (see xcode_exec for the command list)
+ *    4 bytes: operand 1 (little endian)
+ *    4 bytes: operand 2 (little endian)
+ *    1 byte: checksum (0xFF XORed with the sum of all byte values)
+ *
+ * (null bytes outside of requests are ignored)
+ *
+ *  Response:
+ *    1 byte: sequence ID and result flags
+ *            bits 7..6: 00, 01 reserved
+ *                       10 = command received successfully
+ *                       11 = protocol error (bad checksum or sequence ID)
+ *            bits 5..0: sequence ID (as specified in the request;
+ *                                    commands are executed in order)
+ *    If printing was requested:
+ *      4 bytes: accumulator value after executing command (little endian)
+ *      1 byte: checksum for the preceding 4 bytes only (calculated as above)
+ */
+
+.globl xcshell
+
+/* interpreter entry point */
+xcshell:
+	mov %eax, %esi  /* store original EAX value */
+	mov %esp, %ebp
+
+/* main interpreter loop */
+xcloop:
+	/* exit if ESP=0xffffffff */
+	inc %esp
+	jnz _xcl_read
+
+_xcl_exit:
+	mov $1f, %esp
+	jmp flushchar
+1:	mov %esi, %eax
+	mov %ebp, %esp
+	jmp *%esp
+
+_xcl_read:
+	/* read sequence ID into BH */
+	mov $2f, %esp
+1:	jmp getchar
+2:	test $0xff, %al
+	jz 1b         /* skip null bytes */
+	test $0x80, %al
+	jnz 1b        /* upper bit must be 0 */
+	mov %al, %ah  /* checksum */
+	mov %al, %bh
+
+	/* read command into BL */
+	mov $1f, %esp
+	jmp getchar
+1:	add %al, %ah  /* checksum */
+
+	mov %al, %bl
+	test $0x40, %bl
+	jz 1f
+	xor $0x80, %bl
+1:
+
+_xcl_read_op1:
+	/* read operand 1 (32 bits) into EDX (if provided) */
+	test $0x80, %bl
+	jnz _xcl_read_op2
+	mov $0xff000000, %edx
+	mov $2f, %esp
+1:	jmp getchar
+2:	add %al, %ah  /* checksum */
+	test $0xff, %dl
+	mov %al, %dl
+	ror $8, %edx
+	jz 1b         /* ZF set by test */
+
+_xcl_read_op2:
+	/* read operand 2 (32 bits) into ECX (if provided) */
+	test $0x40, %bl
+	jnz _xcl_read_chk
+	mov $0xff000000, %ecx
+	mov $2f, %esp
+1:	jmp getchar
+2:	add %al, %ah  /* checksum */
+	test $0xff, %cl
+	mov %al, %cl
+	ror $8, %ecx
+	jz 1b         /* ZF set by test */
+
+_xcl_read_chk:
+	/* read checksum */
+	mov $1f, %esp
+	jmp getchar
+1:	add %al, %ah  /* checksum */
+	mov %bh, %al  /* load sequence ID into AL */
+	or $0xc0, %al
+
+	/* verify the checksum: it should be 0xff now */
+	inc %ah
+	jnz 1f
+	and $0xbf, %al   /* checksum was good */
+1:
+_xcl_run:
+	/* write the response */
+	mov $1f, %esp
+	jmp putchar
+1:
+	/* abort if the checksum was bad */
+	test $0x40, %al
+	jnz xcloop
+
+	/* run the command */
+	mov $xcloop, %esp
+	test $0x40, %bh  /* print result afterwards? */
+	jz 1f
+	mov $xcprint, %esp
+1:
+	mov %bl, %al     /* load command ID into AL */
+	jmp xcode_exec
+
+
+/* print the accumulator and a checksum */
+xcprint:
+	xor %al, %al
+	inc %esp
+	jnz 1f
+	/* we'll need to exit; set a flag to indicate this */
+	or $0x80, %al
+1:
+	xchg %edi, %ebx
+	or $4, %al
+	mov $2f, %esp
+	/* transmit each byte; use the low 3 bits of AL as the loop counter */
+1:	xchg %bl, %al
+	jmp putchar
+2:	xchg %bl, %al
+	ror $8, %ebx
+	dec %al
+	test $7, %al
+	jnz 1b
+
+_xcp_chk:
+	/* add up the transmitted bytes */
+	xor %ah, %ah
+	or $4, %al
+1:	add %bl, %ah
+	ror $8, %ebx
+	dec %al
+	test $7, %al
+	jnz 1b
+
+_xcp_out:
+	/* clean up */
+	xchg %edi, %ebx
+	mov $xcloop, %esp
+	/* determine whether to exit after sending the checksum */
+	test $0x80, %al
+	jz 1f
+	mov $_xcl_exit, %esp
+1:
+	/* adjust and send the checksum */
+	mov $0xff, %al
+	xor %ah, %al
+	jmp putchar
+
+__xcsinterp_end:
Index: src/arch/i386/xcshell/io_serial.inc
===================================================================
--- src/arch/i386/xcshell/io_serial.inc	(revision 0)
+++ src/arch/i386/xcshell/io_serial.inc	(revision 0)
@@ -0,0 +1,119 @@ 
+/*
+ * This file is part of the coreboot project.
+ *
+ * (C) 2009 Michael Gold <mgold@ncf.ca>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ * MA 02110-1301 USA
+ */
+
+/*
+ * Provides input/output functions for xcshell that communicate on COM1,
+ * which must be preconfigured.
+ */
+
+#ifndef CONFIG_TTYS0_BASE
+#define CONFIG_TTYS0_BASE 0x3f8
+#endif
+
+.globl getchar
+.globl getchar_try
+.globl putchar
+
+__io_start:
+	jmp __io_end
+
+/* Wait for a character, then read it into AL and jump to ESP.
+ * Clobbers the top 16 bits of EAX. */
+getchar:
+	/* store DX temporarily */
+	rol $16, %eax
+	mov %dx, %ax
+	rol $16, %eax
+
+	/* wait for a character */
+	mov $CONFIG_TTYS0_BASE+5, %dx
+1:	in (%dx), %al
+	test $1, %al
+	jz 1b
+
+	/* read the character */
+_getchar_nocheck:  /* getchar_try jumps here, so don't touch ZF */
+	mov $(CONFIG_TTYS0_BASE & 0xff), %dl
+	in (%dx), %al
+
+	/* done; restore DX and return */
+_serio_out:
+	rol $16, %eax
+	mov %ax, %dx
+	rol $16, %eax
+	jmp *%esp
+
+/* Try to read a character.  If one is available, set ZF=0 and store it
+ * in AL; otherwise set ZF=1 and AL=0.  Returns to ESP in either case.
+ * Clobbers the top 16 bits of EAX. */
+getchar_try:
+	/* store DX temporarily */
+	rol $16, %eax
+	mov %dx, %ax
+	rol $16, %eax
+
+	/* check whether a character is available */
+	mov $CONFIG_TTYS0_BASE+5, %dx
+	in (%dx), %al
+	and $1, %al
+	jnz _getchar_nocheck  /* read a character */
+	jmp _serio_out        /* exit without reading */
+
+
+/* Write the character in AL, then return to ESP.
+ * Clobbers the top 24 bits of EAX. */
+putchar:
+	/* store DX temporarily */
+	rol $16, %eax
+	mov %dx, %ax
+	rol $16, %eax
+	mov %al, %ah
+
+	/* wait for the transmitter to be ready */
+	mov $CONFIG_TTYS0_BASE+5, %dx
+1:	in (%dx), %al
+	test $0x20, %al
+	jz 1b
+
+	/* transmit the character */
+	mov $(CONFIG_TTYS0_BASE & 0xff), %dl
+	mov %ah, %al
+	out %al, (%dx)
+	jmp _serio_out
+
+/* Wait until the transmit buffer is empty.
+ * Clobbers the top 24 bits of EAX. */
+flushchar:
+	/* store DX temporarily */
+	rol $16, %eax
+	mov %dx, %ax
+	rol $16, %eax
+	mov %al, %ah
+
+	/* wait while the character is transmitted */
+	mov $CONFIG_TTYS0_BASE+5, %dx
+1:	in (%dx), %al
+	test $0x40, %al
+	jz 1b
+	mov %ah, %al
+	jmp _serio_out
+
+__io_end: