jfr.im git - z_archive/twitter.git/commit

author	Mike Verdone <redacted>
	Wed, 12 Feb 2014 14:06:23 +0000 (15:06 +0100)
committer	Mike Verdone <redacted>
	Wed, 12 Feb 2014 14:06:23 +0000 (15:06 +0100)
commit	786b9a4f7ca4dc2c7d6001ff094910918b306841
tree	f8d98ee2aaae746969230120f37eebf140ebc89d
parent	99407dab6c2455664a46e0c171972cfd7b61f43b
parent	90ec27595dcf2c362b3a47e5deb6416c7e0a3439

Merge pull request #199 from adonoho/use-bytearray-buffer

Reduce memory usage by writing directly into byte array buffer.

Gentlefolk,

The whole point of using a `bytearray`, as opposed to concatenating successive `bytes` reads, was to reduce memory usage. This pull request takes that strategy to its logical conclusion by writing the balance of the chunk directly into the `bytearray`. This avoids creating a temporary `bytes` object of up to 8 KiB.

It does this by creating a `memoryview` of the `bytearray`. While this is still an allocation, the `memoryview`s are much smaller and are, presumably, reclaimed faster.
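For readers unfamiliar with the technique, here is a minimal sketch of filling a preallocated `bytearray` through a `memoryview`. It is not the code from this pull request: the helper name `read_exactly` is hypothetical, and an in-memory `io.BytesIO` stands in for the network connection. With a real socket, the standard library's `socket.recv_into` plays the role that `readinto` plays here.

```python
import io


def read_exactly(stream, size):
    """Read exactly `size` bytes from `stream` into a new bytearray.

    Hypothetical helper for illustration; `stream` is assumed to be any
    binary file-like object that provides `readinto`.
    """
    buf = bytearray(size)  # one allocation for the whole chunk
    filled = 0
    with memoryview(buf) as view:
        while filled < size:
            # Slicing a memoryview copies no data: the slice refers to the
            # unfilled tail of `buf`, and `readinto` writes straight into
            # it, so no temporary `bytes` object is created per read.
            n = stream.readinto(view[filled:])
            if not n:
                raise EOFError(
                    "stream ended after %d of %d bytes" % (filled, size))
            filled += n
    return buf


chunk = read_exactly(io.BytesIO(b"hello, chunked world"), 12)
```

Each slice such as `view[filled:]` is still a small object allocation, which matches the trade-off described above: a short-lived view replaces a temporary buffer the size of the data read.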

This patch has been running for over 24 hours and has processed over 4 MTw under Python v3.3.3 on OS X 10.8.5. It has also been run for about 10 minutes on Python v2.7.6 on a similar machine. `memoryview` does not appear to have been backported to Python v2.6.*. Hence, this pull request is incompatible with that platform.

Anon,
Andrew

P.S. The changes in this pull request are larger than strictly necessary for adding this functionality. I chose to improve the naming of my variables and move some of them closer to where they are used.