Mon, 30 Oct 2006
HTTP Performance
This fascinating article about HTTP performance caught my eye on Slashdot today. It parallels the W3C paper about HTTP, keep-alives and pipelining that I used as a reference while working on zsync. zsync uses HTTP 1.1 with keep-alive and pipelining, and certainly performs terribly without them.
The most interesting thing is how the whole article is really about problems
caused by not having pipelining enabled by default in web browsers.
The recommendation is to run 4 different hostnames, raising the browser-based limit on parallel connections to the same site (a limit actually mandated by the HTTP RFC) from 2 to 8. That is effectively saying that the system as-is is broken. The RFC limits clients to 2 connections at a time because there is a tragedy of the commons here: every individual user gains by opening as many parallel connections as they can, but this hurts overall network performance (HTTP keep-alives are designed specifically to avoid having one HTTP connection overhead per HTTP request). Having lots of connections reduces latency, but it creates a lot of TCP connection overhead on the network and for the server.
The article is up-front about the fact that the performance problems are all solved by enabling pipelining. The discussion of upstream bandwidth is interesting, and I hadn't considered the effect of asymmetric Internet connections in this light before. But the upstream bandwidth wouldn't affect the latency (except for the first request) if pipelining were used; zsync works fine on ADSL, because it uses pipelining to send the next request upstream while the current response is still being received. Upstream bandwidth could still affect latency, but far less than the article shows for servers without pipelining; and making multiple connections would certainly do no better.
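To make the idea concrete, here is a minimal sketch of what pipelining looks like on the wire: several GET requests are written to one connection back-to-back, without waiting for each response. This is only an illustration, not zsync's code; the function names, the example host and the paths are hypothetical, and adding "Connection: close" to the last request is just a convenience so the sketch can read until the server hangs up.

```python
import socket


def build_pipelined_requests(host, paths):
    """Return the bytes of several HTTP/1.1 GET requests, concatenated.

    Hypothetical helper for illustration: the requests are sent in one go,
    so the later ones are already on their way while the first response
    is still coming down.
    """
    lines = []
    for i, path in enumerate(paths):
        lines.append(f"GET {path} HTTP/1.1\r\n")
        lines.append(f"Host: {host}\r\n")
        if i == len(paths) - 1:
            # Ask the server to close after the final response, so the
            # reader below can simply read until end-of-stream.
            lines.append("Connection: close\r\n")
        lines.append("\r\n")
    return "".join(lines).encode("ascii")


def fetch_pipelined(host, paths, port=80):
    """Send all requests on ONE connection and return the raw responses.

    The responses arrive in request order; splitting them apart (by
    Content-Length or chunked encoding) is left out of this sketch.
    """
    payload = build_pipelined_requests(host, paths)
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

With this shape, only the first request pays the full upstream delay before any data flows; the remaining requests travel up while earlier responses are coming down, and everything shares a single TCP connection.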
Since the owners of broken servers are free to disable HTTP 1.1 keep-alives, or just put a proxy in front, I second the idea of just turning on pipelining in the mainstream browsers. HTTP 1.1 already solved this problem. The only reason we aren't benefitting from the solution is that it is disabled due to broken servers; but they will never be fixed if the option is never turned on.
[20:59] | [/computers/zsync] | #
Sun, 06 Aug 2006
zsync 0.5
The main feature of this release is that large file support is now enabled on
systems (Linux/i386 in particular) where it needs to be explicitly selected.
As I do most of the development on Linux/x86_64 now, I was blissfully ignorant
of this problem until Robert Lemmen forwarded a complaint about it.
I have also fixed some compilation problems on MacOS X and Solaris. There is also a substitute for getaddrinfo provided for systems that need it, which someone emailed me about.
Finally, I have made the source code repository for zsync available online.
This, and the new release, are available from the download page.
[11:36] | [/computers/zsync] | #
Sat, 08 Jul 2006
zsync 0.4.3
I have had this sitting around for a while, so it is about time that I released it. No big changes in this release; I have tidied up the program output, so the program is more silent with -s now. I have also added HTTP basic authentication support ‒ this makes zsync usable in places where you can't have everybody accessing your downloads. Get it from the download page.
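For anyone unfamiliar with it, HTTP basic authentication just means sending the username and password, joined by a colon and base64-encoded, in an Authorization header. A minimal sketch of how such a header is formed (the function name is hypothetical, and this is not taken from the zsync source):

```python
import base64


def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header line (illustrative helper)."""
    # Credentials are "user:password", base64-encoded; note that this is
    # an encoding, not encryption.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


print(basic_auth_header("Aladdin", "open sesame"))
```

Since the credentials are only encoded, this keeps casual visitors out of a download area but is not a substitute for an encrypted connection.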
[11:55] | [/computers/zsync] | #
Sat, 24 Jun 2006
Succinct
I have not seen this before — some of the Ubuntu people are experimenting
with methods based off of rsync/zsync for doing package updates (link).
In other news, I have frozen a copy of the zsync technical paper, for reference purposes. It has been ages since I updated it anyway, so I am keeping a copy as-is before I update it with some of my current ideas. It is important that I get a comparison in the paper with what I am calling structured patching systems: like the new differential Package list updating that Debian have implemented.
[18:01] | [/computers/zsync] | #
Sun, 09 Oct 2005
zsync 0.4.2
Just a minor update, fixing a few bugs. Download.
[22:21] | [/computers/zsync] | #
Tue, 12 Jul 2005
zsync 0.4.1
This is just a bugfix release. Someone noticed that zsync would be vulnerable
to CAN-2005-2096, due to its use of zlib code. So the patch for that is now done.
This release also includes some HTTP protocol fixes, as there have been some complaints/observations on how zsync failed to correctly implement some details from the standard. I still have some of these to do, and the fixes in this release have not been heavily tested yet, so let me know if there are any problems.
Get it from the download page.
[21:40] | [/computers/zsync] | #
Sun, 08 May 2005
zsync 0.4.0
Nothing much has changed, but I thought it was about time for another release. The only significant change in this release is some fixes to the progress bar display. As zsync is quite stable now, and bug reports have dried up for the moment, I am declaring it to be a beta instead of an alpha now; I expect the file format and command-line interface to be fairly stable from now on.
[17:58] | [/computers/zsync] | #
Sun, 10 Apr 2005
zsync Progress
0.3.3 has made its way into Debian testing. FreeBSD still only has 0.2.2 in the ports, though. I see there are RPMs starting to appear, and it is included in Mandrake cooker.
I haven't touched it for two weeks now. No bug reports, which might be a good sign, or might not be. I think I will make the next release a beta instead of an alpha, and just do any small fixes that I think of by then.
[16:57] | [/computers/zsync] | #
Sat, 26 Mar 2005
zsync 0.3.3
Here is 0.3.3. The major new feature is the optimised gzip compressor, which I described previously. This, implemented by the new -z option, is the recommended way of distributing files that are not already compressed.
I have also added the -k option, for keeping the zsync file on your local computer. This is useful if you just want to run zsync from a cron job, and have it download only when there is something newer on the server. Also new is the -e option, which causes zsyncmake to abort if it would not give the client an exact copy of the target file; and -C to disable the recompression support (which restores the 0.3.1 and earlier behaviour; sorry about that unannounced usage change in the previous version).
Apart from that, there are some bug fixes, for crashes that could occur in unusual cases. As usual, you can get the latest version from the download page. I am increasingly happy with the way zsync is working, and assuming the latest changes all settle down well, I am thinking of declaring a beta (instead of alpha) quality release. So I am interested in feedback on how people are getting on with it.
[21:32] | [/computers/zsync] | #
Downloading source tarballs with zsync
Another interesting application for zsync: downloading updated source tarballs. For regular software updaters, this has to be a big win. I wrote a small wrapper script for the fetch operation in the FreeBSD ports system, and was able to cut the total download for updating XFCE by roughly a third.
% make FETCH_CMD="/home/cph/src/zsync-fetch -ARr"
===> Vulnerability check disabled, database not found
===> Found saved configuration for xfce4-wm-4.2.1
=> xfwm4-4.2.1.tar.gz doesn't seem to exist in /usr/ports/distfiles/xfce4.
=> Attempting to fetch from http://www.us.xfce.org/archive/xfce-4.2.1/src/.
Downloading xfwm4-4.2.1.tar.gz; old versions ./xfce4/xfwm4-4.0.6.tar.gz found.
#################### 100.0% 21.2 kBps DONE
Read /usr/ports/distfiles/./xfce4/xfwm4-4.0.6.tar.gz. Target 27.4% complete.
downloading from http://www.us.xfce.org/archive/xfce-4.2.1/src/xfwm4-4.2.1.tar.gz:
#################### 100.0% 47.3 kBps DONE
verifying download...checksum matches OK
used 1681408 local, fetched 857709
===> Extracting for xfce4-wm-4.2.1
=> Checksum OK for xfce4/xfwm4-4.2.1.tar.gz.
…
=> libxfcegui4-4.2.1.tar.gz doesn't seem to exist in /usr/ports/distfiles/xfce4.
=> Attempting to fetch from http://www.us.xfce.org/archive/xfce-4.2.1/src/.
Downloading libxfcegui4-4.2.1.tar.gz; old versions ./xfce4/libxfcegui4-4.0.6.tar.gz found.
#################### 100.0% 15.2 kBps DONE
Read /usr/ports/distfiles/./xfce4/libxfcegui4-4.0.6.tar.gz. Target 33.8% complete.
downloading from http://www.us.xfce.org/archive/xfce-4.2.1/src/libxfcegui4-4.2.1.tar.gz:
#################### 100.0% 48.8 kBps DONE
verifying download...checksum matches OK
used 1582080 local, fetched 548199
===> Extracting for libxfce4gui-4.2.1
=> Checksum OK for xfce4/libxfcegui4-4.2.1.tar.gz.
…
In total, a 6.1MB download was reduced to 4.0MB. You can see the benefits of the recompression support added in 0.3.2 now — it reconstructs the original .gz file, allowing it to pass the checksum applied by the ports framework. This was on a fairly large update, from 4.0.x to 4.2.x, so a smaller update would probably see a larger benefit; this has to be a big application of zsync, I think.
For the moment I am running a CGI on moria.org.uk which will automatically generate .zsync files for FreeBSD package sources. And I am providing the zsync-fetch wrapper script which you see in use above. Feel free to try it out (note that you should upgrade to zsync-0.3.3 at least before using it, though, as 0.3.2 had some glitches in the recompression support) — it obviously only kicks in if you have an older version of a file to update from, and it only works for ports with gzip-compressed source tarballs (it just falls back to fetch for .tbz, .tar.bz, and files that moria.org.uk rejects or is unable to download). This is highly experimental, but I hope gives a good idea of the potential that zsync has.
[20:28] | [/computers/zsync] | #
Fri, 25 Mar 2005
Breakthrough!
Committed revision 399.
cph@athlon zsync/c% svn log -r399 makegz.c
-----------------------------------------------------------
r399 | cph | 2005-03-25 09:36:36 +0000 (Fri, 25 Mar 2005)
Built in gzip compressor which optimises for zsync.
I have been thinking about the problem of compression with zsync. Strangely, the code for looking inside gzipped files (written as a workaround, to help kickstart zsync when there is little rsync-able content already available) is currently the most efficient way of transferring most files. I was sure that there had to be something more efficient than compressing stuff with gzip --best and then doing elaborate hacks in zlib to enable us to decompress mid-file.
I had been intending to look at Transfer-Encoding: gzip, using mod_deflate/mod_gzip, to see if this could get us compression without the nasty hacks. But I was sceptical that this could take off, because it puts some load back on the server, and these modules are far from ubiquitous. What we want is the individual blocks to be stored compressed on the server, so it does not have to do any compression when clients connect.
Now we have it. I have imported the deflate code from zlib into zsync, and written a small gzip program (actually built into zsyncmake) which optimises the compressed file for zsync. It starts a new deflate block in the output at the start of every zsync block (so every 1024, 2048 or whatever bytes). Initial tests are very promising: on my main test case, of updating a 12MB Debian Packages file with 1 week of changes, the total transfer drops from 140KB to 107KB, taking us amazingly close to rsync -z's best result of 82KB (in fact, the difference between zsync and rsync now is almost precisely the size of the Z-Map, the map of the .gz file, at 23KB).
I have updated the technical paper with the theory and the results.
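As a rough illustration of the block-aligned idea, the sketch below compresses data so that every fixed-size input block begins at a known offset in the compressed output, and then decompresses one block from the middle without touching what came before. It is not the zsyncmake code: it uses Python's zlib module with Z_FULL_FLUSH as a stand-in (which also resets the compression dictionary at each boundary, possibly costing more than the real compressor does), and the function names and the offsets list, a crude analogue of the Z-Map, are hypothetical.

```python
import zlib


def compress_in_blocks(data, blocksize=2048):
    """Raw-deflate `data`, ending the deflate block at every `blocksize` bytes.

    Returns (compressed_bytes, offsets), where offsets[i] is the position in
    the compressed stream at which input block i starts.
    """
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # -15: raw deflate, no header
    out = bytearray()
    offsets = []
    for start in range(0, len(data), blocksize):
        offsets.append(len(out))
        out += comp.compress(data[start:start + blocksize])
        # Finish the current deflate block on a byte boundary, so the next
        # input block can be decoded on its own.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), offsets


def read_block(stream, offsets, index, blocksize=2048):
    """Decompress just one input block, starting mid-stream."""
    decomp = zlib.decompressobj(-15)
    return decomp.decompress(stream[offsets[index]:], blocksize)


if __name__ == "__main__":
    sample = b"".join(
        b"Package: pkg%d\nVersion: 1.%d\n" % (i, i) for i in range(2000)
    )
    stream, offsets = compress_in_blocks(sample, 1024)
    assert read_block(stream, offsets, 3, 1024) == sample[3 * 1024:4 * 1024]
```

The point of the exercise is that a client which already has most blocks locally can fetch only the compressed byte ranges for the blocks it is missing, while the server simply serves a static file.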
[11:29] | [/computers/zsync] | #
Tue, 22 Mar 2005
Application to Gentoo emerge
Some discussion about possibly using zsync to speed up emerge downloads for Gentoo. The idea of building the data into an ISO so the download can be immediately mounted for use is certainly interesting. It's nice to see that zsync is holding up fairly well against rsync; but more importantly, being built on HTTP it has simpler requirements and can benefit from proxies, web caching etc.
[23:02] | [/computers/zsync] | #
zsync 0.3.2 is here
I have added some better progress indication to downloads now, which should
make it easier to see how fast zsync is going, how near it is to completion
etc. I have also fixed a SEGV that you could run into using locally downloaded
.zsync files. So I hope this version is a little more user-friendly.
The big feature in this version is recompression. zsync will now compress a
file after you download it if the original was compressed; and it does its best
to make the gzip file identical with the original (fixing the timestamp,
filenames etc in the gzip header). I think this opens up a whole new area of
applications, but I'll defer talking about that until I can put up some kind of
demonstration.
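To show why the header details matter, here is a small sketch using Python's gzip module (not zsync's own code; the function name, filename and timestamp are made up for the example). The gzip header stores a modification time and the original filename, so two compressions of identical data produce different files unless those fields are pinned to the same values.

```python
import gzip
import io
import struct


def gzip_with_header(payload, name, mtime):
    """Compress `payload` with a fixed filename and timestamp in the header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


if __name__ == "__main__":
    data = b"hello zsync\n" * 100
    first = gzip_with_header(data, "demo.tar", 1111111111)
    second = gzip_with_header(data, "demo.tar", 1111111111)
    later = gzip_with_header(data, "demo.tar", 1111111112)

    print(first == second)   # same header fields, same bytes
    print(first == later)    # only the timestamp differs
    # Bytes 4..7 of a gzip file hold the timestamp, little-endian.
    print(struct.unpack("<I", first[4:8])[0])
```

Matching the header is only part of the job: the compressed data itself also has to come out the same, which depends on the compressor and its settings, so a file written this way may not be byte-identical to one produced by the gzip command.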
[22:09] | [/computers/zsync] | #
Sat, 19 Mar 2005
zsync in Action
- % ssh moria.org.uk
- cph@moria% wget http://www.mirrorservice.org/sites/ftp.x.org/pub/X11R6.8.2/src/X11R6.8.2-src3.tar.gz
- cph@moria% ~/zsync-0.3.1/zsyncmake -b 2048 -u http://www.mirrorservice.org/sites/ftp.x.org/pub/X11R6.8.2/src/X11R6.8.2-src3.tar.gz ./X11R6.8.2-src3.tar.gz
- cph@moria% mv X11R6.8.2-src3.tar.zsync /var/www/zsync/s/
- cph@moria% logout
- % wget http://zsync.moria.org.uk/s/X11R6.8.2-src3.tar.zsync
- % zcat /usr/ports/distfiles/xorg/X11R6.8.1-src3.tar.gz | zsync -i /dev/stdin X11R6.8.2-src3.tar.zsync
- reading seed file /dev/stdin: ****************************************************
- downloading from http://www.mirrorservice.org/sites/ftp.x.org/pub/X11R6.8.2/src/X11R6.8.2-src3.tar.gz:..
- downloading from http://www.mirrorservice.org/sites/ftp.x.org/pub/X11R6.8.2/src/X11R6.8.2-src3.tar.gz:
- hashhit 898738, weakhit 29601, checksummed 42053, stronghit 21757
- verifying download...checksum matches OK
- used 44537856 local, fetched 2389527
10MB download reduced to 0.29MB .zsync file and 2.39MB transfer to update from
the previous version. But there are far too many steps here; there's no way
this is convenient to use yet. Top of the todo list:
- Automatically recognise compressed local file data.
- rsh/ssh client support
[17:03] | [/computers/zsync] | #
Thu, 17 Mar 2005
zsync 0.3.1 released
Just minor cleanups this time, fixing things that broke in the last release.
Most importantly, some uncompressed streams would fail to download fully with
0.3.0 — this is fixed now.
[21:08] | [/computers/zsync] | #
Sun, 13 Mar 2005
0.2.x in Debian testing
Finally it is in. I have now upgraded the .zsync streams for the sarge and unstable package lists to 0.2+ format; you cannot use zsync-0.1.x with them anymore.
The 0.2+ format is much more compact, reducing the size of the .zsync to download by around 60%.
[16:17] | [/computers/zsync] | #