ATARI depacking benchmark


leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Post by leonard » Sun Apr 02, 2017 6:40 pm

Hi,
I'm always curious about packing techniques and depacking speed. Recently I looked at "Shrinkler", which is a really interesting packer for small data. It uses arithmetic coding, so it's quite slow. So I decided to make a benchmark to compare the speed of various Atari depackers.
I attached the benchmark. It's an MSA file + an HdLoad.prg (you can run it from hard drive or from floppy). Warning: you need a machine with 2 MiB of RAM.
I attached the benchmark result from an Atari STE.
I'm interested if someone could post benchmark results from other machines such as MegaSTE, TT, Falcon, etc.
(Warning: when running from floppy, the screen stays white for a while, the time it takes to load the complete disk, and then the benchmark numbers start to appear.)

Some explanation of the numbers:
The first column is the size in bytes, the second column "(xxx)" is the size in bytes of the depacking routine, and the third one (50hz:) is the number of 50 Hz ticks (it uses a 50 Hz timer, so it should also work on TT or Falcon).

Three files from the "We Were @" demo are used: a small one (kernel), a medium one with gfx data (loader) and a large one containing animation data.
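
To read the third column as a time, divide the tick count by 50. A minimal sketch in Python, assuming the unpacked size of the file is known (the post does not say whether the benchmark screen shows it). The function names and the sample numbers are made up for illustration:

```python
TICK_HZ = 50  # the benchmark counts ticks of a 50 Hz timer

def ticks_to_seconds(ticks, hz=TICK_HZ):
    """Convert a tick count from the benchmark into seconds."""
    return ticks / hz

def depack_rate_kb_per_s(unpacked_bytes, ticks, hz=TICK_HZ):
    """Depacking throughput in KB/s (1 KB = 1024 bytes) of unpacked data."""
    return unpacked_bytes / 1024 / ticks_to_seconds(ticks, hz)

# Hypothetical sample: 102400 unpacked bytes depacked in 250 ticks
print(ticks_to_seconds(250))              # 5.0
print(depack_rate_kb_per_s(102400, 250))  # 20.0
```

A rate computed this way can be compared directly with the KB/sec figures quoted later in the thread.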

Packers tested:
- am4: ArjBeta, -m4 mode
- am7: ArjBeta, -m7 mode
- l77: lz77 packer
- ice: Ice 2.40
- pft: PackFire, Tiny mode
- pfl: PackFire, Large mode
- shk: Shrinkler


test_000_SC1425.png
Leonard/OXYGENE.

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Sun Apr 02, 2017 7:19 pm

leonard wrote: So I decided to make a benchmark to compare the speed of various Atari depackers.


The first screen in {Closure} is 123464 bytes unpacked, 12663 packed, includes all common code for all the screens in the demo and is perceived as starting instantly from boot sector .. ;)

(The depacking is _done_ when the first small logo dist appears. Actually, might be even earlier, I think I'm waiting for the fade to finish, would need to check)

/Troed

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Sun Apr 02, 2017 7:48 pm

Out of curiosity, what packer are you using in {Closure}?
Leonard/OXYGENE.

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Tue Apr 04, 2017 7:59 pm

Would anyone like to run my depacking test MSA file on a machine other than an STE? I'm curious to see the timings on a Falcon, for instance.
Leonard/OXYGENE.

AtariZoll
Fuji Shaped Bastard
Posts: 2759
Joined: Mon Feb 20, 2012 4:42 pm

Re: ATARI depacking benchmark

Post by AtariZoll » Tue Apr 04, 2017 8:44 pm

Most of the packers used on the ST are very slow at depacking, including those in commercial software, menu disks... They are slow even for floppy: depacking is slower than reading from floppy (which is about 20 KB/sec). I repacked many of them just to make the software run faster.
I use my own simple packer, where the depacking code is very short, some 60 bytes. It can do 300 KB/sec (depacked data length). I have not seen a faster one on Atari.
UPX is a little slower, but has a better compression ratio, and its code is still pretty short.
I will run the test on a Falcon tomorrow.
English language is like bad boss on workplace: it expecting from you to strictly follow all, numerous rules, but self bending rules as much likes :mrgreen:

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Tue Apr 04, 2017 8:55 pm

AtariZoll wrote: Most of the packers used on the ST are very slow at depacking


Some of them are really good and depack as fast as the floppy reads (LH5 and ArjBeta -m7). I use these packers in my demos, and the depacking time is free (I mean it depacks while the floppy is being read, and depacking uses almost the same bandwidth).

I'm interested to see benchmarks of PackFire large mode and Shrinkler, because both use MUL instructions (because of arithmetic coding). I guess a Falcon could run MUL much faster than an ST.
Leonard/OXYGENE.

AtariZoll
Fuji Shaped Bastard
Posts: 2759
Joined: Mon Feb 20, 2012 4:42 pm

Re: ATARI depacking benchmark

Post by AtariZoll » Wed Apr 05, 2017 11:23 am

Here is what I got on a Falcon at 16 MHz, caches on (default):
DeptFalc.png

It does not seem that the CPU is much more efficient.
May I ask: why a non-standard floppy format, and actually why a floppy image at all instead of nice regular files?
I certainly will not steal any of these slow depackers :D

Cyprian
Atari God
Posts: 1398
Joined: Fri Oct 04, 2002 11:23 am
Location: Warsaw, Poland

Re: ATARI depacking benchmark

Post by Cyprian » Wed Apr 05, 2017 12:27 pm

leonard wrote: I'm interested if someone could post benchmark results from other machines such as MegaSTE, TT, Falcon, etc.

I could check it on MSTE/TT tomorrow.
Is it an autostart floppy disk?
If yes, then the figures for the MSTE might not be correct, because it starts in STE mode: 8 MHz, no cache.
Also, does it use TT-RAM, or should the user somehow set the TT-RAM flag?
Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
SDrive / PAK68/3 / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net
Hatari / Aranym / Steem / Saint
http://260ste.appspot.com/

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Wed Apr 05, 2017 8:16 pm

AtariZoll wrote: Here is what I got on a Falcon at 16 MHz, caches on (default)


Thanks for the benchmark! I thought the Falcon would be faster for shk and pfl, really strange.

AtariZoll wrote: May I ask: why a non-standard floppy format, and actually why a floppy image at all instead of nice regular files?


I have been doing all my Atari stuff with my own PC toolchain for a while, and my tools directly produce an MSA file with my own demo system. I agree a PRG file would be better for that kind of thing, but I don't even have a tool to do that :)

AtariZoll wrote: I certainly will not steal any of these slow depackers :D


arj -m7 is currently the best compromise I have found on the ST (packing ratio vs depacking speed). It depacks as fast as the floppy reads. What kind of depacker are you thinking about? If it's really faster, then I doubt the packing ratio will be good (for instance, lz77 here is really fast but does not pack very well).
Leonard/OXYGENE.

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Wed Apr 05, 2017 8:35 pm

Cyprian wrote: Is it an autostart floppy disk?


It's a trackload demo system with a boot sector etc. On the MegaSTE I activate the cache and set the CPU to 16 MHz (I use move.b #$ff,$ffff8e21.w).

For the TT, I don't use TT-RAM at all (I didn't even know it existed :))
Leonard/OXYGENE.

AtariZoll
Fuji Shaped Bastard
Posts: 2759
Joined: Mon Feb 20, 2012 4:42 pm

Re: ATARI depacking benchmark

Post by AtariZoll » Thu Apr 06, 2017 8:44 am

leonard wrote: ...
arj -m7 is currently the best compromise I have found on the ST (packing ratio vs depacking speed). It depacks as fast as the floppy reads. What kind of depacker are you thinking about? If it's really faster, then I doubt the packing ratio will be good (for instance, lz77 here is really fast but does not pack very well).

Klapauzius also used UPX, because it has a very good packing ratio and its depacking speed is much better than floppy speed: 150-200 KB/sec. The only "little" problem is that it works only on executables. I made some simple tools which add a TOS header, so it can pack any file. After packing, you need to extract only the packed data from the SFX; again, there is a batch tool for that. All of this is not very handy. I contacted the UPX team, so maybe they will be able to make a packer for ordinary files using NRV2 packing. That will not be the best possible (which is closed source), but still at least as good as ICE in packing ratio, while the depacking speed is way better. Some discussion about UPX here: http://www.atari-forum.com/viewtopic.php?f=4&t=1813
Another good thing about NRV, besides speed, is that the depacker code is very short: some 150 bytes for each of the 3 usable methods.

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Thu Apr 06, 2017 10:01 am

Indeed, and {Closure} uses UPX. I wrote a toolchain to automate the building and disk creation and updated some outdated things I've found on this forum to work with the latest UPX versions. As AtariZoll also did, I remove headers etc. The depacker fits in the boot sector, and the depack speed is so fast that I can get away with mostly no background loading in the demo - it's much faster to load and depack than to load the unpacked data.
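
A back-of-the-envelope check of that claim, as a sketch only: it takes the {Closure} sizes quoted earlier in the thread (123464 bytes unpacked, 12663 packed) and assumes AtariZoll's ballpark figures of about 20 KB/sec for floppy reads and 150 KB/sec for UPX depacking. The rates are rough assumptions rather than measurements, and the function names are invented for the example.

```python
KB = 1024

def load_seconds(size_bytes, floppy_kb_per_s=20):
    """Time to read size_bytes from floppy at the assumed rate."""
    return size_bytes / (floppy_kb_per_s * KB)

def depack_seconds(unpacked_bytes, depack_kb_per_s=150):
    """Time to depack, with the rate expressed in unpacked KB per second."""
    return unpacked_bytes / (depack_kb_per_s * KB)

UNPACKED = 123464  # first {Closure} screen, unpacked
PACKED = 12663     # same screen, packed

raw = load_seconds(UNPACKED)
packed = load_seconds(PACKED) + depack_seconds(UNPACKED)
print(f"load unpacked:        {raw:.2f} s")
print(f"load packed + depack: {packed:.2f} s")
```

With these assumed rates the packed path comes out at roughly 1.4 seconds against roughly 6 seconds for the raw load.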

My pack-commandline is: upx --nrv2b --small infile -o outfile

/Troed

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 11:57 am

troed wrote:The depacker fits in the boot sector, and the depack speed is so fast that I can get away with mostly no background loading in the demo -


Oh UPX, ok. According to https://sourceforge.net/p/upx/discussion/6805/thread/e7c6e993/ --nrv2b means the LZO packing algorithm. LZO is really fast at decompression, but has a poor packing ratio. I will add UPX to my benchmark, it could be interesting (but I guess it will be very close to lz77).

troed wrote:it's much faster to load and depack than to load the unpacked data.


You could get even faster using arj -m7, because the depacking bandwidth is close to floppy, and packing ratio is better, so you have less data to load :)
Leonard/OXYGENE.

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Thu Apr 06, 2017 12:40 pm

leonard wrote: Oh UPX, ok. According to https://sourceforge.net/p/upx/discussion/6805/thread/e7c6e993/ --nrv2b means the LZO packing algorithm.


Maybe :) Let's test.

The NRV2* ones are similar to LZO, but they compress better and decompress slightly slower


/Troed

Orion_
Captain Atari
Posts: 333
Joined: Sat Jan 10, 2004 12:20 pm
Location: France

Re: ATARI depacking benchmark

Post by Orion_ » Thu Apr 06, 2017 3:28 pm

I didn't know about "Shrinkler", it seems like a better compromise than PackFire Tiny:
faster depacking and about as good a compression ratio (and not limited to tiny files).

ggn
Atari God
Posts: 1131
Joined: Sat Dec 28, 2002 4:49 pm

Re: ATARI depacking benchmark

Post by ggn » Thu Apr 06, 2017 4:29 pm

Thanks to Leonard for taking the initiative to start a thread like this - packing is always fun :). Here are the results from my TT, MSTE and Falcon (3 machines, 3 different monitors, 1 phone ghetto cam so bear with me!)

TT:
depacktt.jpg


MSTE:
depackmste.jpg


Falcon:
depackfalc.jpg


For many years I've been using arj mode 7 - for speed vs pack ratio it's unbeatable IMO. lz77 is also used when depack speed is essential. But since nobody has mentioned it yet, I'd like to point out NRV2e, which is not that fast but also has quite a good pack ratio. Thanks to Insane/tSCc for pointing it out a few years ago (or maybe it was Defjam, I'm not 100% sure!). You can find the packer in dml's agtools repository (packer, depack source). It has definitely been used in his YM Gradius demo to squeeze the audio data down to quite a good size. Hopefully Leonard will add this one to his tester too - I think it's worth it :).
is 73 Falcon patched atari games enough ? ^^

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Thu Apr 06, 2017 5:37 pm

ggn wrote: But since nobody has mentioned it yet, I'd like to point out NRV2e, which is not that fast but also has quite a good pack ratio. Thanks to Insane/tSCc for pointing it out a few years ago (or maybe it was Defjam, I'm not 100% sure!). You can find the packer in dml's agtools repository (packer, depack source). It has definitely been used in his YM Gradius demo to squeeze the audio data down to quite a good size. Hopefully Leonard will add this one to his tester too - I think it's worth it :).


I did not know about that depacker! UPX can use nrv2e which would create another multi platform option.

$ upx --nrv2e --small infile -o outfile

However, I just did a quick test (note - this is with the header still there):

-rw-r--r-- 1 troed staff 123464 Apr 4 21:01 CDIST2.ORG
-rw-r--r-- 1 troed staff 13144 Apr 4 21:01 CDIST2.NRV2B.PRG
-rw-r--r-- 1 troed staff 13084 Apr 4 21:01 CDIST2.NRV2E.PRG

-rw-r--r-- 1 troed staff 200066 Dec 25 2015 FAERY.ORG
-rw-r--r-- 1 troed staff 120912 Dec 25 2015 FAERY.NRV2B.PRG
-rw-r--r-- 1 troed staff 121128 Dec 25 2015 FAERY.NRV2E.PRG

(CDIST2 = the first screen in Closure + all the glue, music and common code for the demo. FAERY = the PhotoChrome 6.2 Faery image screen)

I think the nrv2b option in UPX still holds up, but I'd really like it to be benchmarked against the other options on Leonard's disk.

/Troed

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 7:17 pm

I just made a test with upx --nrv2b, it's close to arj -m7. Here are size results:

-a---- 01/04/2017 18:32 5560 kernel.l77
-a---- 01/04/2017 18:32 3885 kernel.am4
-a---- 06/04/2017 20:59 3836 kernel.upx
-a---- 01/04/2017 18:32 3347 kernel.am7
-a---- 01/04/2017 18:32 3310 kernel.ice
-a---- 02/04/2017 20:26 3226 kernel.sp3
-a---- 01/04/2017 18:32 3214 kernel.pft
-a---- 02/04/2017 20:03 3186 kernel.atm
-a---- 30/03/2017 22:58 3156 kernel.ztd
-a---- 01/04/2017 18:32 3092 kernel.pfl
-a---- 01/04/2017 18:32 2908 kernel.shk

-a---- 01/04/2017 18:32 117875 3dmorph.l77
-a---- 01/04/2017 18:32 101886 3dmorph.am4
-a---- 01/04/2017 18:32 100300 3dmorph.ice
-a---- 28/03/2017 00:12 100194 lines_b.ztd
-a---- 02/04/2017 20:05 94280 3dmorph.atm
-a---- 02/04/2017 20:27 93718 3dmorph.SP3
-a---- 01/04/2017 18:32 92660 3dmorph.am7
-a---- 06/04/2017 20:28 90692 3dmorph.upx
-a---- 28/03/2017 00:12 87135 3dmorph.ztd
-a---- 01/04/2017 18:32 79684 3dmorph.shk
-a---- 01/04/2017 18:32 73903 3dmorph.pfl

-a---- 01/04/2017 18:32 43011 loader.l77
-a---- 01/04/2017 18:32 35289 loader.ice
-a---- 01/04/2017 18:32 34855 loader.am4
-a---- 02/04/2017 20:04 33706 loader.atm
-a---- 02/04/2017 20:26 33388 loader.sp3
-a---- 06/04/2017 21:00 33276 loader.upx
-a---- 01/04/2017 18:32 31585 loader.am7
-a---- 01/04/2017 18:32 30170 loader.ztd
-a---- 01/04/2017 18:32 28992 loader.shk
-a---- 01/04/2017 18:32 27817 loader.pfl

I packed my files by first turning them into PRG files (adding a 28-byte header).
The upx files are packed with --nrv2b --small.
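
For readers who want to try the same trick, here is a minimal sketch of building such a 28-byte header in Python. The field layout is the standard TOS program header; everything else is an assumption: the helper names are made up, the absolute flag is set to non-zero so that no relocation table is expected, and the thread does not confirm that this is exactly what leonard's tool writes or that UPX accepts the result unchanged.

```python
import struct

TOS_MAGIC = 0x601A  # first word of every TOS executable

def tos_header(text_size, data_size=0, bss_size=0):
    """Build the 28-byte TOS program header (big-endian, as on the 68000)."""
    return struct.pack(
        ">HIIIIIIH",
        TOS_MAGIC,
        text_size,  # TEXT segment length
        data_size,  # DATA segment length
        bss_size,   # BSS segment length
        0,          # symbol table length
        0,          # reserved
        0,          # program flags
        1,          # assumption: non-zero means no relocation table follows
    )

def wrap_as_prg(payload: bytes) -> bytes:
    """Prefix raw data with a header so it looks like a PRG (hypothetical helper)."""
    return tos_header(len(payload)) + payload

print(len(tos_header(1234)))  # 28
```

The whole payload is declared as the TEXT segment here, which is the simplest choice for data that will never actually be executed.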

btw .ztd is very promising stuff: it could become the new universal packing format ( http://facebook.github.io/zstd/ ). I don't have 68k depacking code for it, but it's a classic LZxx algorithm so there is no reason for it to be slower than arj. I should look into that.
Last edited by leonard on Thu Apr 06, 2017 7:24 pm, edited 1 time in total.
Leonard/OXYGENE.

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Thu Apr 06, 2017 7:23 pm

leonard wrote:I just made a test with upx --nrv2b, it's close to arj -m7. Here are size results:
I packed my files by transforming the files into PRG (adding a 28 bytes header).
upx files are packed with --nrv2b --small


Thanks!

The header plus the included depack code in those packed PRGs is ~480 bytes in size, so I think that should be subtracted when comparing with the other test cases.
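
Applying that correction to the sizes leonard listed gives a fairer comparison. A rough sketch, assuming the ~480 bytes figure applies equally to all three files and that the .am7 sizes are packed data only (neither is confirmed in the thread):

```python
UPX_OVERHEAD = 480  # approximate header + depacker size, per the post above

sizes = {
    # file: (upx --nrv2b PRG size, arj -m7 size), from leonard's listing
    "kernel":  (3836, 3347),
    "3dmorph": (90692, 92660),
    "loader":  (33276, 31585),
}

def adjusted(upx_prg_size, overhead=UPX_OVERHEAD):
    """Estimate the size of the packed data alone inside a UPX-packed PRG."""
    return upx_prg_size - overhead

for name, (upx, am7) in sizes.items():
    data_only = adjusted(upx)
    winner = "upx" if data_only < am7 else "am7"
    print(f"{name}: upx data ~{data_only}, am7 {am7} -> {winner}")
```

Even after the correction, arj -m7 stays ahead on kernel and loader, while UPX wins on 3dmorph.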

/Troed

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 7:25 pm

troed wrote:The header+included depack code in those packed PRGs is ~480 bytes in size so should be subtracted I think compared to the other test cases.


You're right! btw do you know how I can extract only the packed data (offset & size) and the depacking routine?
Leonard/OXYGENE.

troed
Atari God
Posts: 1182
Joined: Mon Apr 30, 2012 6:20 pm
Location: Sweden

Re: ATARI depacking benchmark

Post by troed » Thu Apr 06, 2017 7:34 pm

leonard wrote:
troed wrote:The header+included depack code in those packed PRGs is ~480 bytes in size so should be subtracted I think compared to the other test cases.


you're right! btw do you know where can I extract the packed data only (offset & size) and depacking routine?


See PM

/Troed

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 8:19 pm

Thanks! I will have a careful look. I started to disassemble the UPX PRG bootstrap: the depacker is at the end of the PRG and seems really small. It does not contain any Huffman-like decoding, which is why it's so fast. The packing ratio is quite good for a non-Huffman offset-length coding.
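
To illustrate why plain offset-length coding depacks quickly, here is a toy decoder in Python. It is a sketch of the general LZ77 idea only: the token format is invented for readability and is not the NRV2B bitstream or the routine found in the UPX PRG.

```python
def lz_decode(tokens):
    """Decode a toy LZ77 stream.

    tokens: iterable of ("lit", bytes) or ("match", offset, length), where
    offset counts back from the current end of the output.
    """
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out += token[1]
        else:
            _, offset, length = token
            start = len(out) - offset
            # Copy byte by byte so a match may overlap its own output
            for i in range(length):
                out.append(out[start + i])
    return bytes(out)

print(lz_decode([("lit", b"ab"), ("match", 2, 4)]))  # b'ababab'
```

The inner loop is nothing but copies, with no code tables to rebuild and no multiplies, which is the property the post points at.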

I really think ZSTD could be better than any other Atari packer (except arithmetic packers such as PackFire large or Shrinkler, but those are too slow). But I have the feeling the ZSTD depacking routine will be big.
Leonard/OXYGENE.

AtariZoll
Fuji Shaped Bastard
Posts: 2759
Joined: Mon Feb 20, 2012 4:42 pm

Re: ATARI depacking benchmark

Post by AtariZoll » Thu Apr 06, 2017 8:54 pm

In the case of TOS executables, UPX will try all 3 methods (NRV2B, D and E) and save whichever is shortest (when no method is given on the command line). The depackers are all very similar, and about 150 bytes long. Basically we do the same thing as was done a long time ago: add a TOS header, then strip out only the packed data from the SFX + depacker. It would be best if someone could trace UPX in Windows and use only the packer part of it (NRV2B-E methods), without bothering with the TOS header and SFX making, just because it uses the best, non-public NRV. There are sources for the NRV methods on the UPX site, dating from 2004. Their packing ratio is a little worse, but still excellent. I guess that pack2e is compiled from them: my tests indicate about a 1-2% worse packing ratio.
In my experience, after packing some 500 files with UPX, NRV2B is the most used. So one should try all 3 methods, not only E.
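
Trying all three methods explicitly can be scripted. A hedged sketch in Python: the command line is modelled on the one troed posted, the --nrv2d spelling is assumed by analogy with --nrv2b and --nrv2e, the helper names are invented, and the packing function needs a upx binary on the PATH and output files that do not exist yet.

```python
import os
import subprocess

METHODS = ("nrv2b", "nrv2d", "nrv2e")

def upx_command(method, infile, outfile):
    """Command line modelled on the one troed posted in this thread."""
    return ["upx", f"--{method}", "--small", infile, "-o", outfile]

def pick_smallest(sizes):
    """Return the (method, size) pair with the smallest size."""
    return min(sizes.items(), key=lambda item: item[1])

def pack_with_all_methods(infile):
    """Run upx once per method and report the smallest output."""
    sizes = {}
    for method in METHODS:
        outfile = f"{infile}.{method}"
        subprocess.run(upx_command(method, infile, outfile), check=True)
        sizes[method] = os.path.getsize(outfile)
    return pick_smallest(sizes)

# Sizes troed reported for CDIST2 (nrv2d was not measured in the thread)
print(pick_smallest({"nrv2b": 13144, "nrv2e": 13084}))  # ('nrv2e', 13084)
```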

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 9:28 pm

Here is a speed test of nrv2b. The speed is quite impressive for the packing ratio. The packing is not as good as arj -m7 in 2 cases out of 3, but the depacking is 1.5x faster. That's cool for Atari demos shipped as PRG files, because there the total time is load_time + depack_time. But for a demo with a trackloader it is not useful, because the total time is max( load_time, depack_time ).
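
The two timing models can be written down directly. This is only a sketch: the tick counts below are hypothetical values chosen to mimic "slightly bigger file, about 1.5x faster depack", not numbers taken from the benchmark screenshot.

```python
def total_time_prg(load_time, depack_time):
    """Load the whole file first, then depack: the two times add up."""
    return load_time + depack_time

def total_time_trackloader(load_time, depack_time):
    """Depack while the floppy is still reading: the slower side dominates."""
    return max(load_time, depack_time)

# Hypothetical timings in 50 Hz ticks for one file packed two ways
arj = {"load": 200, "depack": 200}  # better ratio, depacks at floppy speed
nrv = {"load": 220, "depack": 135}  # bigger file, about 1.5x faster depack

for name, t in (("arj -m7", arj), ("nrv2b", nrv)):
    print(name,
          "PRG:", total_time_prg(t["load"], t["depack"]),
          "trackloader:", total_time_trackloader(t["load"], t["depack"]))
```

With these made-up figures nrv2b wins in the PRG case (355 vs 400 ticks) and loses with a trackloader (220 vs 200), because there only the load time is left to pay.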

DepackB_001_SC1425.png
Leonard/OXYGENE.

leonard
Moderator
Posts: 640
Joined: Thu May 23, 2002 10:48 pm

Re: ATARI depacking benchmark

Post by leonard » Thu Apr 06, 2017 10:18 pm

Latest news: I modified the "We Were @" demo installer to use upx --nrv2b, and the first floppy no longer fits! All screens are bigger than with arj -m7! I'm keeping arjbeta -m7 for my demos :)

With UPX:
25642
32762
536708
10736
42576
32978
7920
20240
152788

With ARJ -m7
24924
31462
507354
10704
40284
32172
7790
19784
138860
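
Summing the two lists above makes the gap concrete. A small sketch in Python, assuming the rows of both lists correspond to the same screens in the same order (the post lists them without names); the function name is made up:

```python
upx = [25642, 32762, 536708, 10736, 42576, 32978, 7920, 20240, 152788]
arj = [24924, 31462, 507354, 10704, 40284, 32172, 7790, 19784, 138860]

def summarize(a, b):
    """Return (total_a, total_b, extra_bytes, extra_percent) for a versus b."""
    total_a, total_b = sum(a), sum(b)
    extra = total_a - total_b
    return total_a, total_b, extra, 100 * extra / total_b

total_upx, total_arj, extra, pct = summarize(upx, arj)
print(total_upx, total_arj, extra)           # 862350 813334 49016
print(f"{pct:.1f}% larger with UPX")
print(all(u > a for u, a in zip(upx, arj)))  # True
```

That is about 48 KB of extra data on the disk with UPX, and every single screen is larger.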
Leonard/OXYGENE.

