Patchwork Fast flash read speed with FTDI FT2232H programmer

login
register
about
Submitter Boris Baykov
Date 2015-01-18 05:50:57
Message ID <74681267.20150118085057@borisbaykov.com>
Download mbox | patch
Permalink /patch/4273/
State New
Headers show

Comments

Boris Baykov - 2015-01-18 05:50:57
The patch increases flash read speed via FTDI FT2232H programmer.
New ft2232_spi_read function used instead of spi_read_chunked
to read by 64k chunks instead of 256b chunks.

According to 3-7 ms FTDI delay for each flash instruction
this patch makes read operation faster in about 10 times.

To have stable transfers with 64k buffer SPI frequency should
be decreased to 5 MHz. I don't know why but higher freqs
aren't working correct. DEFAULT_DIVISOR is changed to 12.

Signed-off-by: Boris Baykov <dev@borisbaykov.com>, Russia, Jan 2015
---
The patch increases flash read speed via FTDI FT2232H programmer.
New ft2232_spi_read function used instead of spi_read_chunked
to read by 64k chunks instead of 256b chunks.

According to 3-7 ms FTDI delay for each flash instruction
this patch makes read operation faster in about 10 times.

To have stable transfers with 64k buffer SPI frequency should
be decreased to 5 MHz. I don't know why but higher freqs
aren't working correct. DEFAULT_DIVISOR is changed to 12.

Signed-off-by: Boris Baykov <dev@borisbaykov.com>, Russia, Jan 2015
---
Stefan Tauner - 2015-01-18 10:36:37
On Sun, 18 Jan 2015 08:50:57 +0300
Boris Baykov <dev@borisbaykov.com> wrote:

> 
> The patch increases flash read speed via FTDI FT2232H programmer.
> New ft2232_spi_read function used instead of spi_read_chunked
> to read by 64k chunks instead of 256b chunks.
> 
> According to 3-7 ms FTDI delay for each flash instruction
> this patch makes read operation faster in about 10 times.
> 
> To have stable transfers with 64k buffer SPI frequency should
> be decreased to 5 MHz. I don't know why but higher freqs
> aren't working correct. DEFAULT_DIVISOR is changed to 12.

Hi,

thanks for the patch. I see two major problems with it though. First of
all the divisor problem has to be investigated. I presume you have only
tested with a single hardware setup, right? I'd like to see results from
different dongles/setups... and/or a conclusive theory why the problems
happens at higher frequencies.

The other thing is the incompatibility of the chunk size... yes, a
majority of chips support this, but flashrom prioritizes other things:
stability and compatibility. I don't care for the slowness if that's
the price for it to work universally with all chips in the wild.

The chunk size problem can be fixed in the long run when we come up
with a better infrastructure for this kind of problem (e.g. storing the
maximum read chunk size per chip). However, this is not going to happen
soon. We have more pressing problems and way too many outstanding
patches with a higher priority than speed problems. Therefore I will
not commit anything like this soonish and recommend to keep this patch
or similar ones applied locally if you need it. This is not very
satisfying for anybody but the way it is currently, sorry.
Boris Baykov - 2015-01-18 17:01:30
>> The patch increases flash read speed via FTDI FT2232H programmer.
>> New ft2232_spi_read function used instead of spi_read_chunked
>> to read by 64k chunks instead of 256b chunks.
>>
>> According to 3-7 ms FTDI delay for each flash instruction
>> this patch makes read operation faster in about 10 times.
>>
>> To have stable transfers with 64k buffer SPI frequency should
>> be decreased to 5 MHz. I don't know why but higher freqs
>> aren't working correct. DEFAULT_DIVISOR is changed to 12.
>
> Hi,
>
> thanks for the patch.
Thanks Stefan!

> I see two major problems with it though. First of all the divisor
problem has
> to be investigated. I presume you have only tested with a single hardware
> setup, right? I'd like to see results from different dongles/setups...
and/or
> conclusive theory why the problems happens at higher frequencies.

This is not exactly right. I've really tested it with a single hardware,
however this hardware is a laptop and I tried to switch it to very
Low-Power mode and run several videos simultaniously with Linux VM that is
used to run flashrom. In the low-power mode with CPU freq 800 instead of
3200 I see no difference with SPI freq. It's still working excellent with
5 MHz on FTDI. So, I suppose that it's not an issue with CPU performance.
I'll move this VM to another PCs soon and will check this patch there. Now
I see the following, Flashrom is working correct all times on 5 MHz but on
6 MHz and higher on EVERY try I see many wrong bytes with FF or 00 in
output file after read. I agree with your thought that this situation
should be further investigated.

> The other thing is the incompatibility of the chunk size... yes, a
> majority of chips support this, but flashrom prioritizes other things:
> stability and compatibility. I don't care for the slowness if that's
> the price for it to work universally with all chips in the wild.

I understand this too. I see and appreciate that you're thinking about
stability and reliability. However it's not difficult to add
max_read_chunk param to flashchips chips definitions. Also this data could
be got by SFDP probably. So, there is no real issue to support 64k read
with enough stability.

In serprog.c we already have 64k supported for all chips. I've added same
support to ft2232_spi.c With this thing I can't see big issues comparing
to unknown SPI freq problems above. 10 times speed gain isn't a bad thing
;-) So, imho, it's a good challenge to solve these small troubles and
apply the patch at the end of investigation and development.
Stefan Tauner - 2015-01-18 18:13:03
On Sun, 18 Jan 2015 20:01:30 +0300
"Boris Baykov" <dev@borisbaykov.com> wrote:

> > I see two major problems with it though. First of all the divisor
> problem has
> > to be investigated. I presume you have only tested with a single hardware
> > setup, right? I'd like to see results from different dongles/setups...
> and/or
> > conclusive theory why the problems happens at higher frequencies.
> 
> This is not exactly right. I've really tested it with a single hardware,
> however this hardware is a laptop and I tried to switch it to very
> Low-Power mode and run several videos simultaniously with Linux VM that is
> used to run flashrom. In the low-power mode with CPU freq 800 instead of
> 3200 I see no difference with SPI freq. It's still working excellent with
> 5 MHz on FTDI. So, I suppose that it's not an issue with CPU performance.
> I'll move this VM to another PCs soon and will check this patch there. Now
> I see the following, Flashrom is working correct all times on 5 MHz but on
> 6 MHz and higher on EVERY try I see many wrong bytes with FF or 00 in
> output file after read. I agree with your thought that this situation
> should be further investigated.

I was referring to the other part of the hardware setup: the dongle,
the wiring to the chip and the flash chip itself. Testing different
setups would hopefully give us more insight into what's the real
problem. There must be something odd going on... because in general it
should work AFAIK.

> > The other thing is the incompatibility of the chunk size... yes, a
> > majority of chips support this, but flashrom prioritizes other things:
> > stability and compatibility. I don't care for the slowness if that's
> > the price for it to work universally with all chips in the wild.
> 
> I understand this too. I see and appreciate that you're thinking about
> stability and reliability. However it's not difficult to add
> max_read_chunk param to flashchips chips definitions. Also this data could
> be got by SFDP probably. So, there is no real issue to support 64k read
> with enough stability.

Not in general, no, it is certainly possible to support it correctly.
Just not with this quick hack.

> 
> In serprog.c we already have 64k supported for all chips. I've added same
> support to ft2232_spi.c With this thing I can't see big issues comparing
> to unknown SPI freq problems above. 10 times speed gain isn't a bad thing
> ;-) So, imho, it's a good challenge to solve these small troubles and
> apply the patch at the end of investigation and development.

The loop in serprog is a bug not a feature. I have just not removed it
because serprog is not used much by people not aware of the problem (or
to be more precise... not by many people at all :)
I certainly want to fix this in the future but not in the near future.
Boris Baykov - 2015-01-18 19:45:43
> I was referring to the other part of the hardware setup: the dongle,
> the wiring to the chip and the flash chip itself. Testing different
> setups would hopefully give us more insight into what's the real
> problem. There must be something odd going on... because in general it
> should work AFAIK.

Yes, sure, we have to test on a different external hardware too. Probably
I'll do this later or someone else will do.

> Not in general, no, it is certainly possible to support it correctly.
> Just not with this quick hack.

Unfortunately I hadn't said this before but this patch is not for release.
It's just the first try to support the discussion about performance. It
could be better to post it under that thread.

> The loop in serprog is a bug not a feature. I have just not removed it
Ok :-)

Patch

diff -U 5 -N ./flashrom/ft2232_spi.c ./flashrom.fast_ftdi/ft2232_spi.c
--- ./flashrom/ft2232_spi.c	2015-01-08 15:21:54.000000000 +0300
+++ ./flashrom.fast_ftdi/ft2232_spi.c	2015-01-18 08:44:26.000000000 +0300
@@ -27,10 +27,11 @@ 
 #include <ctype.h>
 #include "flash.h"
 #include "programmer.h"
 #include "spi.h"
 #include <ftdi.h>
+#include "chipdrivers.h"
 
 /* This is not defined in libftdi.h <0.20 (c7e4c09e68cfa6f5e112334aa1b3bb23401c8dc7 to be exact).
  * Some tests indicate that his is the only change that it is needed to support the FT232H in flashrom. */
 #if !defined(HAVE_FT232H)
 #define TYPE_232H	6
@@ -73,11 +74,14 @@ 
 	{OLIMEX_VID, OLIMEX_ARM_TINY_H_PID, OK, "Olimex", "ARM-USB-TINY-H"},
 
 	{0},
 };
 
-#define DEFAULT_DIVISOR 2
+/* Default divisor has been changed to 12 to provide 5 MHz frequency.
+ * This frequency is maximum stable frequency for operate with 64k buffer.
+ */
+#define DEFAULT_DIVISOR 12
 
 #define BITMODE_BITBANG_NORMAL	1
 #define BITMODE_BITBANG_SPI	2
 
 /* Set data bits low-byte command:
@@ -144,17 +148,20 @@ 
 static int ft2232_spi_send_command(struct flashctx *flash,
 				   unsigned int writecnt, unsigned int readcnt,
 				   const unsigned char *writearr,
 				   unsigned char *readarr);
 
+static int ft2232_spi_read(struct flashctx *flash, uint8_t *buf,
+			    unsigned int start, unsigned int len);
+
 static const struct spi_master spi_master_ft2232 = {
 	.type		= SPI_CONTROLLER_FT2232,
 	.max_data_read	= 64 * 1024,
 	.max_data_write	= 256,
 	.command	= ft2232_spi_send_command,
 	.multicommand	= default_spi_send_multicommand,
-	.read		= default_spi_read,
+	.read		= ft2232_spi_read,
 	.write_256	= default_spi_write_256,
 	.write_aai	= default_spi_write_aai,
 };
 
 /* Returns 0 upon success, a negative number upon errors. */
@@ -294,10 +301,15 @@ 
 			msg_perr("Error: Invalid SPI frequency divisor specified: \"%s\".\n"
 				 "Valid are even values between 2 and 131072.\n", arg);
 			free(arg);
 			return -2;
 		} else {
+			if (temp < DEFAULT_DIVISOR) {
+				msg_pinfo("\nWarning: SPI frequency is too high for FT2232H with 64k buffer.\n"
+					  "Some data from SPI may be lost. To ensure stability please\n"
+					  "encrease the divisor value at least to %d.\n\n", DEFAULT_DIVISOR);
+			}
 			divisor = (uint32_t)temp;
 		}
 	}
 	free(arg);
 
@@ -486,6 +498,53 @@ 
 		msg_perr("send_buf failed at end: %i\n", ret);
 
 	return failed ? -1 : 0;
 }
 
+/* FIXME: This function is optimized so that it does not split each transaction
+ * into chip page_size long blocks unnecessarily like spi_read_chunked. This has
+ * the advantage that it is much faster for most chips, but breaks those with
+ * non-continuous reads. When spi_read_chunked is fixed this method can be removed. */
+static int ft2232_spi_read(struct flashctx *flash, uint8_t *buf,
+			    unsigned int start, unsigned int len)
+{
+	int ret;
+	unsigned int i, cur_len;
+	const unsigned int max_read = spi_master_ft2232.max_data_read;
+	int show_progress = 0;
+	unsigned int percent_last = 0, percent_current = 0;
+
+	/* progress visualizaion init */
+	if(len >= MIN_LENGTH_TO_SHOW_READ_PROGRESS) {
+		msg_cinfo(" "); /* only this space will go to logfile but
+				 all strings with \b wont. */
+		msg_cinfo("\b 0%%");
+		percent_last = percent_current = 0;
+		show_progress = 1; /* enable progress visualizaion */
+	}
+
+	for (i = 0; i < len; i += cur_len) {
+		cur_len = min(max_read, (len - i));
+		ret = (flash->chip->feature_bits & FEATURE_4BA_SUPPORT) == 0
+			? spi_nbyte_read(flash, start + i, buf + i, cur_len)
+			: flash->chip->four_bytes_addr_funcs.read_nbyte(flash,
+						 start + i, buf + i, cur_len);
+		if (ret)
+			break;
+
+		if(show_progress) {
+			percent_current = (unsigned int)
+			    (((unsigned long long)(i + cur_len)) * 100 / len);
+			if(percent_current != percent_last) {
+				msg_cinfo("\b\b\b%2d%%", percent_current);
+				percent_last = percent_current;
+			}
+		}
+	}
+
+	if(show_progress && !ret)
+		msg_cinfo("\b\b\b\b"); /* remove progress percents from the screen */
+
+	return ret;
+}
+
 #endif