4096 colors
Re: 4096 colors
What is the highest-colour 'gaming' screen possible? Or, to put it another way, what is the lowest version of PCS/Spectrum that could be used for a regular game?
Glad to hear it's all coming together, d.m.l.
Atari STFM 512 / STe 4MB / Mega ST+DSP / Falcon 4MB 16Mhz 68882 - DVD/CDRW/ZIP/DAT - FDI / Jaguar / Lynx 1&2 / 7800 / 2600 / XE 130+SD Card // Sega Dreamcast / Mega2+CD2 // Apple G4
http://soundcloud.com/nativ ~ http://soundcloud.com/nativ-1 ~ http://soundcloud.com/knot_music
http://soundcloud.com/push-sounds ~ http://soundcloud.com/push-records
Re: 4096 colors
dml wrote: TBH my memory is beginning to come back - and it's likely I didn't try it on the ST/e machines at all. It would have been the Falcon, and for a plasma zoomer thingy (changing colour 0 only). I'm sure I still have the test somewhere here on the 030's HD.
Ok, doing hard-sync on the Falcon would be close to hopeless, I guess, with all the different DMA things going on, not to mention the 31 kHz hsync (VGA), which would make the colour changes twice as long. But the real killer is the Videl bug that produces noise on the screen while updating the palette.
dml wrote: If I had done it on the ST for PCS, I think it would have caused extra problems with colour count per line (line ending half way through a palette, needing multiple blits or different colour alloc rules per scanline) which I would definitely remember.
Yes, 56 colours not dividing evenly by 16 is a little problem. We'd need to feed 48 colours with the blitter (during visible scanlines) and the remaining 8 colours with the CPU in the borders. Setting the source and destination X/Y increments correctly should also take away most of the blitter setup for each line (remember the blitter increments can be negative, so it will "loop" through the colour registers without any manual correction). The colour data would need to be organised in two buffers, one for the blitter and one for the CPU, to avoid having to correct the modulo.
Simple calc:
move.w d0,(a0) ;8 cycles (a0=$ffff8a38 (blitter rows))
move.b d1,(a1) ;8 cycles + 8 cycles (a1=$ffff8a3c (start blitter (Mega STe is 8+12)))
48 * 8 = 384 cycles for the blitter pass
4*move.l (a2)+,(a3)+ = 80 cycles (a2=colour data, a3=$ffff8240 (CPU pass))
move.l a4,a3 ;4 cycles (reset a3 to $ffff8240)
Total: 492 cycles (STe) and 496 on MSTe.
So unless I did a brainfart (I often do) it looks doable, at least with 199 lines.
The lower border should be able to fit in 16 cycles if done perfectly (mixed in with the CPU code), and hence we are at exactly 512 cycles on MSTe and 508 on STe. Wehoo!
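The tally above can be re-checked mechanically. This is only a minimal Python sketch of the arithmetic: the per-instruction cycle counts are the ones quoted in the post, while the function and variable names are made up for illustration.

```python
# Cycle budget for one scanline: 48 colours via blitter, 8 via CPU.
# All cycle counts are taken from the post; the names are illustrative only.
LINE_CYCLES = 512  # cycles per scanline, the figure the post aims for


def scanline_cycles(blitter_startup: int, lower_border: int = 0) -> int:
    set_rows = 8                      # move.w d0,(a0)
    start_blit = 8 + blitter_startup  # move.b d1,(a1) plus blitter start-up
    blit_pass = 48 * 8                # 48 colour words at 8 cycles each
    cpu_pass = 4 * 20                 # 4 x move.l (a2)+,(a3)+ = 80 cycles
    reset_ptr = 4                     # move.l a4,a3
    return (set_rows + start_blit + blit_pass
            + cpu_pass + reset_ptr + lower_border)


ste = scanline_cycles(blitter_startup=8)     # STe: start-up quoted as 8
mste = scanline_cycles(blitter_startup=12)   # Mega STe: quoted as 12
ste_lower = scanline_cycles(8, lower_border=16)
mste_lower = scanline_cycles(12, lower_border=16)

print(ste, mste, ste_lower, mste_lower)
```

Both lower-border totals stay within the 512-cycle line, which is the point being made.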
dml wrote: I'm already looking at a portrait/landscape overscan mode for v5, which may work better now with the new colour allocation rules. I tried landscape mode years ago but the reproduction was poor because of the lower colour change density, with simplistic colour allocation causing streaks to appear in the image, and it gets worse as the palette bit depth is increased and competition increases. That should all be dealt with this time round.
Cool, looking forward to it.
dml wrote: A full overscan mode would be nice too but bottom border could be a pain - I can't remember where it occurs in the display frame and it probably requires a special ruleset to map colours on that line. I may try it but it will be much later - I'll see how the other modes go first. If it isn't going well I may change my mind
The lower border is mixed in with the left border killing. In my case I've decided to waste a little more CPU on the borders to save address registers, and a scanline looks like this on ST (each move is 12 cycles):
Code:
; D7 = 2
; NORMAL SCANLINE
;left border
move.b d7,$ffff8260.w
move.w d7,$ffff8260.w
dcb.w 88,$4e71
;right border
move.w d7,$ffff820a.w
move.b d7,$ffff820a.w
dcb.w 11,$4e71
;stabilizer
move.b d7,$ffff8260.w
move.w d7,$ffff8260.w
dcb.w 11,$4e71
;LOWER BORDER SCANLINE
;left and bottom
move.w d7,$ffff820a.w ; this instruction is set last on the line before
move.b d7,$ffff8260.w
move.w d7,$ffff8260.w
move.b d7,$ffff820a.w
dcb.w 85,$4e71
;right border
move.w d7,$ffff820a.w
move.b d7,$ffff820a.w
dcb.w 11,$4e71
;stabilizer
move.b d7,$ffff8260.w
move.w d7,$ffff8260.w
dcb.w 11,$4e71
That also reduces the linewidth from the odd 230 bytes to a more even 224 bytes.
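As a sanity check on the two scanline templates above, the cycles can be summed in a few lines of Python. This is a sketch under stated assumptions: 12 cycles per switch (as quoted) and 4 cycles per nop ($4e71), and the first `move.w` of the lower-border line is assumed to be accounted for on the previous line, as its comment suggests.

```python
MOVE = 12  # each resolution/frequency switch, as quoted in the post
NOP = 4    # $4e71


def cycles(moves: int, nops: int) -> int:
    """Cycles for a run of switches followed by nop padding."""
    return moves * MOVE + nops * NOP


# Normal scanline: left border, right border, stabilizer.
normal = cycles(2, 88) + cycles(2, 11) + cycles(2, 11)

# Lower-border scanline: the leading move.w is assumed to sit at the end
# of the previous line, so only three switches are counted before the nops.
lower = cycles(3, 85) + cycles(2, 11) + cycles(2, 11)

print(normal, lower)
```

Both templates come out at one full 512-cycle line under these assumptions.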
I drew up a little page before to explain overscan and linewidths to some guys, might be useful:
http://ae.dhs.nu/overscan/
Also, wasting a few address registers can shave off more cycles. I've yet to try timings on that to know exactly where we end up, but it's probably around 4 cycles for each border and the stabilizer, because you need to pad with some nops as the switches get too fast otherwise.
Re: 4096 colors
dml wrote: So are you saying that an additional / alternate form of sync is required before starting an external device, versus normal CPU bus cycles? and that it's due to the *internal* clock?
cheers
If you want writes to the palette registers to be perfectly timed so that they correspond to particular positions on the screen, and you're using HB interrupts, then yes, you'll have to take into account the E clock.
Simply reloading the palette registers at the end of a scanline on a HB interrupt with the blitter isn't much of a problem. Small variations in timing don't matter. But suppose you desire to reload the registers at a precise point on a scanline? The variation in interrupt latency is going to make it difficult to start the blitter at the precise moment.
But by precisely timing blits to palette registers, you can get color changes in the middle of a scanline exactly where you want them. And by properly setting up the destination increment registers, it's even possible to continuously write to all the palette registers over the entire length of a scanline.
Another trick is to set up the blitter to write to one particular palette register over and over again by using a destination X increment of -2. This creates a pseudo low-resolution background, 40 pixels wide with 40 colors per scanline (not 64, as I first wrote), and the remaining 15 colors can be used for sprites on top of this background. You might even try some other combinations, like updating the same 4 palette registers over and over for a 2 bitplane background and using the remaining 2 bitplanes for sprites.
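The corrected figure of 40 falls out of simple arithmetic. A rough Python sketch, assuming 8 cycles per blitter word (the figure used in the budget earlier in the thread) and one low-resolution pixel per 8 MHz cycle; the constant names are invented here.

```python
CYCLES_PER_BLIT_WORD = 8  # one word copied by the blitter, as used earlier
PIXELS_PER_CYCLE = 1      # assumption: low resolution, one pixel per cycle
VISIBLE_PIXELS = 320      # standard low-resolution line width

pixels_per_change = CYCLES_PER_BLIT_WORD * PIXELS_PER_CYCLE
changes_per_line = VISIBLE_PIXELS // pixels_per_change

print(pixels_per_change, changes_per_line)
```

So a register rewritten on every blitter word changes once per 8 pixels, giving 40 distinct background "pixels" across a 320-pixel line.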
Re: 4096 colors
evil wrote: Ok, doing hard-sync on Falcon would be close to hopeless I guess with all the different DMA things going on, not to mention the 31 kHz hsync (VGA) which would make the colour changes twice as long. But the real killer is the videl bug that produces noise on the screen while updating the palette.
It would have been composite/TV mode I was using at the time for demo stuff (it was from a time before CRTs became quite dead and people still had them). Still not quite clear on the details but I'll update if/when I find the test and figure out what it was doing and where it was going wrong, just in case it was an ST experiment after all.
(back to CPU version for a bit...)
I notice that today's experiments with STEEM show the familiar 1-pixel vertical lines in PCS images when the colour allocation rules don't exclude recently changed palette indices. These lines are stable under STEEM but in my original experiments (old Mega4) this line would be noisy and subject to machine startup conditions. There's a 1-clock 'window of uncertainty' involved. I have attached a pic (excuse the terrible quality - just a simplified test).
[attached image: without sync correction rule]
[attached image: with sync correction rule]
(My vague recollection of the 'early blitter experiment' I referred to earlier resulted in something similar - but the window just appeared to be bigger and more noisy).
evil wrote: Yes 56 colours being uneven by 16 is a little problem. We'd need to feed 48 colours with the blitter (during visible scanlines) and the remaining 8 colours with CPU in the borders.
I can see various trades involved with selecting which occurs first - CPU or blit. Reg reloading is obviously best done inside the border - as early as possible, but starting a blit in the middle of a scan will take longer than starting a preloaded movem. Preloading registers then starting the blit early (similar to your example) is likely better. OTOH, leaving some registers free could help reduce this delay, at the cost of dropping movem or using a smaller movem. Not using a movem for the CPU part at all means *slightly* less change density for that section (more density is good, since you want most of the changes within the scanline if possible).
Anyway I'd probably start with your example because you have the timings there already - and improvements on that aren't likely to be big, if anything
evil wrote: So unless I did a brainfart (I often do) it looks doable at least with 199 lines.
Looks decent to me. It's worth trying. I'm still working on the new convertor, but after some fuss porting the image library (eek) I have it basically working now so I might shift back to display stuff for a bit once I'm happy most of the 'old functionality' is in.
I'm looking to refactor the display devices into handlers, such that new handlers can be built for new devices and contain all the code specific to that device. So it should be easier to add modes for machine variants, overscan modes, cpu speeds etc. for anyone with the lifespan and determination to fill that matrix out
evil wrote: The lower border should be able to fit in 16 cycles if done perfect (mixed in with the CPU code) and hence we are at exact 512 cycles on MSTe and 508 on STe. Wehoo!
Yes that's neat - almost evil
evil wrote: The lower border is mixed in with the left border killing. In my case I've decided to waste a little more CPU on the borders to save address registers, and a scanline looks like this on ST (each move is 12 cycles):
Thanks for all the samples, it will definitely speed things along I'm sure - I'll refer/mention your input in the next version when I get that stuff working. I'll also link the relevant DHS notes writeups from the site.
evil wrote: I drew up a little page before to explain overscan and linewidths to some guys, might be useful:
http://ae.dhs.nu/overscan/
Brilliant.
evil wrote: Also, wasting a few address registers can shave up more cycles, but I've yet to try timings on that to know exactly where we end up at, but probably around 4 cycles for each border and stabilizer, that's because you need to pad with some nops as the switches gets too fast otherwise.
In order to avoid getting myself into a mess, I'd probably start with just getting a reliable reference working before shaving cycles here and there. It's likely to involve testing on real machines - which means waiting for that floppy emulator to arrive (to help with the cross-dev). I might try the overscan stuff in Steem for a laugh but I'm not holding out a lot of hope that it will either work or do the same thing as a real machine. I'm impressed with it so far but this may be pushing it
I also still need to sort out an assembler to use along with gcc - either vasm or just put up with gas (yuck). I haven't checked the object format for vasm but if it's compatible with the gcc linker I'll just use it. Having to use devpac with cross-dev is uncomfortable. It's ok when you're working exclusively on the native machine but painful from the PC and my Ataris are still being put back together...
cheers.
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: 4096 colors
...so I have uploaded an early version of the new convertor v5 which currently implements only the 'photochrome' (STE) mode with a fixed version of the original 'hatched' dithering.
It's still using a very simplistic colour allocator but I have now sorted out the nasty strobing issue. The dual dithering in the old version looks like it was actually wrong and I didn't notice. So the average intensity of the frames was not exactly equal and it therefore flickered more than necessary. gaaah. So the DHS 'complaint' was accurate but not for the reasons assumed!
Anyway a 68k test build can be found here: http://www.leonik.net/dml/sec_pcs.py
It's pretty slow on an 8 MHz machine and uses tons of RAM - partly due to the FreeImage file format library being massive and bloating my executable (it's especially slow if your input image is not exactly 320x200!). Experimenters may want to consider running it on something bigger (like an emulator) to get images out until I do something about that.
I may drop FreeImage if I can't prune the plugins down and make it more efficient, it just seemed like a handy way to get access to tons of file formats. This library is well used by games devs but it's easy to forget just how fat some of this stuff is compared with what we had 20 years ago!
I'm not putting up an x86 build yet because there are likely still some Intel/Motorola endian issues to fix. I did my best while writing the code but haven't tested it.
CORRECTION - the x86 build did actually work so it's now on the site too. You might need to install cygwin tho - I was in a hurry and didn't check.
Anyway, overall not bad I think for a few evenings' work, from scratch. I still have a few things to sort out before I start on the improvements but I think the output is already better than v4...
Last edited by dml on Mon Aug 27, 2012 8:39 pm, edited 1 time in total.
d:m:l
Re: 4096 colors
Wow, I really like topics like this.
Dal, pls share the source code for that converter.
About a year ago I was struggling with a tool like this, but I got stuck on color quantization for Spectrum 512 color allocation (48 colors, changed every 12 pixels).
Actually I tried to implement a blitter Spectrum mode - a color change every 8 pixels. The asm routine was done but I had no way to convert any image to that format...
Lynx I / Mega ST 1 / 7800 / Portfolio / Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
DDD HDD / AT Speed C16 / TF536 / SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net
Hatari / Steem SSE / Aranym / Saint
http://260ste.atari.org
Re: 4096 colors
Cyprian wrote: wow I really like a such topics
It's nice to be back in the Atari community.
Cyprian wrote: Dal pls share source code for that converter.
The source will definitely be made public. I'm holding it just long enough to make it modular, so it can be extended properly. It's not quite ready yet but in a few days or a week it will be open.
Cyprian wrote: More or les year ago I was struggling with a such tool but I died on color quantization for spectrum512 color allocation (48 colors, every changed 12 pixels)
It is a difficult problem. I have looked at different solutions to it, but it is effectively an NP-complete problem, or at least NP-hard. An optimal solution isn't obvious and perhaps not possible.
The old PCS (implemented in 68k asm - which is why it was quick, but also a bit too simplistic) used reverse-carry indexing to subdivide spans of pixels progressively, to make colour allocation more 'fair'. But it was really just sidestepping a very difficult problem.
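For readers who haven't met the term, "reverse-carry" ordering visits indices in bit-reversed order, so each new index lands in the middle of the largest gap left so far. The Python sketch below only shows that ordering for a 16-entry span; how the old PCS actually applied it to pixel spans is not described here, so treat the function name and the 16-entry example as illustrative assumptions.

```python
def bit_reverse(i: int, bits: int) -> int:
    """Return i with its lowest `bits` bits mirrored (bit 0 <-> top bit)."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out


# Visiting 0..15 in bit-reversed order splits the span progressively:
# first the ends and middle, then the quarters, then the eighths.
order = [bit_reverse(i, 4) for i in range(16)]
print(order)
```

The first few entries are 0, 8, 4, 12: the start, the halfway point, then the two quarter points, which is the "progressive subdivision" behaviour the post refers to.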
I have a new algorithm to try, which uses weighted bins with every colour allocated to every palette index, and progressively re-balances colours so they get allocated to bins where they can be merged with the minimum error. This is similar to the cube-splitting colour reduction algorithm but better at dealing with these overlapping palettes....
I have some other improvements to make which are perception-oriented, such as managing colours which exist only at edges or are isolated, versus colours within fine gradients. There is also the issue of error diffusion and how best to manage the error. Lots of stuff to try.
Cyprian wrote: Actually I was tried to implement blitter spectrum - change color every 8 pixels. Asm routine was done but I had no possibility to convert any image to that format...
Well hopefully this tool will make that easier. Just implement your own display code, and provide sync tables for the colour change timing. Most of the rest will be solved. If different colour reduction, dithering or device layouts are needed those can be replaced too.
d:m:l
- DarkLord
- Ultimate Atarian
- Posts: 5537
- Joined: Mon Aug 16, 2004 12:06 pm
- Location: Prestonsburg, KY - USA
- Contact:
Re: 4096 colors
Crap - have to wait until the full version is out, or go dig out my Mega STe. I've got
it packed away in a closet right now (shame, I know, but no room ATM).
Thanks for the update(s) Doug!
Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
- calimero
- Fuji Shaped Bastard
- Posts: 2624
- Joined: Thu Sep 15, 2005 10:01 am
- Location: Serbia
- Contact:
Re: 4096 colors
Just to ask the same question as Nativ:
So is there no CPU time left if you make 3 palette switches (48 colors) per scanline?
It should be something like 3 times * 16 * move.w (one palette entry is 9 bits so it is a word?) per scanline, right?
using Atari since 1986. ・ http://wet.atari.org ・ http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
Re: 4096 colors
calimero wrote: Just to ask same question as Nativ:
so there are no CPU time left if you make 3 palete switch (48 colors) per scanline?
There are 48 (3 x full set of 16) colour changes per scan, yes. However there are more like 50-60 colours *available* on each scanline (I didn't calculate the true number out of laziness) because each scan refers to part of the last palette on the previous scan. So there are approx 4 palettes per scanline used in the display - a little less than that, because of this borrowing effect.
In terms of data transferred however - that's 3 palettes per line.
calimero wrote: it should be like 3 times * 16 * move.w (one palete entry is 9 bits so it is a word?) per scanline, right?
Something like that. There are different ways to load the colours but it amounts to the same process overall. It is words, yes. 7 (or 4) bits are wasted depending on ST/STE.
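To make the "7 (or 4) bits wasted" remark concrete, here is a small Python sketch of how a colour packs into one 16-bit palette word. The ST uses 3 bits per channel; the STe adds a fourth bit per channel, stored in bit 3 of each nibble as the least significant bit so that ST palettes stay compatible. The function names are invented for this example.

```python
def st_word(r: int, g: int, b: int) -> int:
    """Pack 3-bit channels (0..7) into an ST palette word: 0RRR 0GGG 0BBB."""
    return (r << 8) | (g << 4) | b


def ste_nibble(v: int) -> int:
    """Encode a 4-bit level (0..15): upper 3 bits go to bits 0-2, LSB to bit 3."""
    return (v >> 1) | ((v & 1) << 3)


def ste_word(r: int, g: int, b: int) -> int:
    """Pack 4-bit channels (0..15) into an STe palette word."""
    return (ste_nibble(r) << 8) | (ste_nibble(g) << 4) | ste_nibble(b)


st_white = st_word(7, 7, 7)
ste_white = ste_word(15, 15, 15)
st_unused_bits = 16 - bin(st_white).count("1")
ste_unused_bits = 16 - bin(ste_white).count("1")

print(hex(st_white), hex(ste_white), st_unused_bits, ste_unused_bits)
```

Either way a full word is moved per colour, which is why the transfer cost per palette entry is the same on both machines.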
There is actually some time left to load more colours but not an entire palette - a partial palette. I think some people do this to get more colours overall but this can cause the colour changes to spread out into the borders and reduce the 'change density' during display time.
As for doing 'other things' with the time - that is also possible, but seriously difficult. If the top and bottom border is unused (as with most of the display routs) then you have that time to do other things. Maybe 15% CPU approx (somebody will have better figures).
If you want to use cycles within the scanline - that's the domain of demo programming. Whatever you do, it has to be extremely tight, very fine grained (to fit in small cycle windows) and have constant time overhead (no wobble). No multiplications, shifts or other variable-time operations as these would require re-synchronizing the CPU with the display 'timer'.
I once considered implementing a small 'virtual cpu' inside the scanline of an overscan display rout, in order to execute code in a regulated way, from inside that. But it's pretty hard, and you're effectively implementing a new cpu or virtual machine, and then have to write code for that somehow. Painful and not recommended. Might be quite confusing for coders watching it though.
d:m:l
Re: 4096 colors
dml wrote: I once considered implementing a small 'virtual cpu' inside the scanline of an overscan display rout, in order to execute code in a regulated way, from inside that. But it's pretty hard, and you're effectively implementing a new cpu or virtual machine, and then have to write code for that somehow. Painful and not recommended. Might be quite confusing for coders watching it though.
In a couple of the recent Atari Arcade > Atari ST fixes Klaz turned up some 6502 'cores' that were left to run the game logic, I believe.
If you used a 'widescreen' left and right border (perhaps containing the score?), would this still be a ***full colour screen***?
Have you ever seen Gauntlet IV on the MegaDrive? not sure how many colours it uses, but there's a funky overscan full screen to the game!
Re: 4096 colors
Thanks for the reply, dml.
So you essentially load new colors into the color tables as the scanline progresses (e.g. you will have one new color every xx pixels)?
Would it be possible to reserve e.g. two color registers for a simple sprite/cursor over a static image (would there be enough CPU time to draw/mask the cursor)?
And if you use the blitter for setting the color tables, will it free some more CPU?
I ask all this to see if it is possible to write a point and click adventure with 512 colors...
Re: 4096 colors
calimero wrote: so you essentially load new colors in color tables as scanline progress (e.g. you will have one new color every xx pixels)?
Yes. The palettes are 'skewed' because one change occurs every xx pixels, and it's a different index changing each time. Some pixels can see into more than one palette at a time, but never more than 16 colours available at any time per pixel of course.
The code used to do the transfer decides the spacing between colour changes.
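The 'skew' can be pictured with a toy model: start from the palette in force at the left edge and apply every register change whose position has already been passed. The Python sketch below is purely illustrative. The 6-pixel spacing, the colour values and the function name are invented, and real timings come from the display code (the sync tables mentioned earlier).

```python
def palette_at(start_palette, changes, x):
    """Return the 16 colours visible at pixel x.

    `changes` is a list of (change_x, register_index, colour) tuples in
    left-to-right order; a change is visible from change_x onwards.
    """
    pal = list(start_palette)
    for change_x, index, colour in changes:
        if change_x <= x:
            pal[index] = colour
    return pal


SPACING = 6            # hypothetical pixels between colour changes
start = [0x000] * 16   # palette carried over from the previous line
# 48 changes per line, cycling through registers 0..15 three times.
changes = [(SPACING * i, i % 16, 0x001 + i) for i in range(48)]

at_5 = palette_at(start, changes, 5)      # only the first change has landed
at_6 = palette_at(start, changes, 6)      # register 1 has just changed
at_100 = palette_at(start, changes, 100)  # register 0 is on its second colour

print(at_5[:2], at_6[:2], at_100[0])
```

At any one pixel there are still exactly 16 entries, but neighbouring pixels see different mixes of old and new values, which is what a converter has to plan around.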
calimero wrote: would it be possible to reserve e.g. two color registers for simple sprite/cursor over static image (would there be enough CPU time to draw/mask cursor)?
Yes you can reserve colour registers (if you want to mask sprites or other graphics on top) and you can probably reserve a plane or two if required (if you want to draw without masks) - but the more you reserve the more damage you'll do to the image.
I'll see if I can incorporate the 'reserved colours & planes' idea into PCS v5, in case it helps with projects like this. It should be reasonably straightforward (parameter + new conversion tables) although a specific display routine will be needed for each permutation -
CORRECTION - that's not really true: a standard display rout would work too, but you'll get more colours on screen if the bandwidth isn't wasted transferring the same colours to the same regs over and over, and that means a custom routine. Best to work with a standard display first and pay the price of a few colours, then claim some colours back with a better version later.
calimero wrote: and if you use blitter for settings color tables? will it free some cpu more?
No, because the blitter hogs the bus while it is working, locking out the CPU. It just takes less time to do its work. It's not really practical to try to 'buy back' time within scanlines used in this way (for overscan, or palette boosting). At best you can reserve colours or change colours at different rates. The CPU time available is based on how many scanlines you 'own' for this task, and how many remain free.
e.g. an image with left/right overscan will use the same amount of cpu as a normal image (but fewer colours possible per scanline, because some time needed to do overscan). Free CPU will be the same.
calimero wrote: I ask all this to see if it is possible to write point and click adventure with 512 colors...
I think it is likely you can do this, yes. You'll need to write efficient code so it can execute in the remaining 10-20% CPU time without feeling unresponsive. Processes which update the screen would be best done incrementally / as changes, rather than drawing something every single frame (except the mouse cursor of course!). Even updating text should be done gradually a character at a time, 'threaded' with other work. A little work scheduler would be a good idea.
I wish you success with your plan
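As a rough illustration of the 'little work scheduler' idea, here is a cooperative round-robin sketch in Python using generators. It is only a model of the approach: on a real ST the budget would be measured in cycles or scanlines rather than slices, and `type_text`, `Scheduler` and `run_frame` are hypothetical names.

```python
from collections import deque


def type_text(text, screen):
    """Hypothetical job: draw one character per slice, then yield."""
    for ch in text:
        screen.append(ch)
        yield  # hand the remaining frame time back to the scheduler


class Scheduler:
    def __init__(self):
        self.jobs = deque()

    def add(self, job):
        self.jobs.append(job)

    def run_frame(self, max_slices):
        """Run at most max_slices job slices, round-robin, once per frame."""
        for _ in range(max_slices):
            if not self.jobs:
                break
            job = self.jobs.popleft()
            try:
                next(job)
            except StopIteration:
                continue  # job has finished, so it is not re-queued
            self.jobs.append(job)


screen = []
sched = Scheduler()
sched.add(type_text("HELLO", screen))
sched.add(type_text("ab", screen))
for _frame in range(4):
    sched.run_frame(max_slices=2)

print("".join(screen))
```

Each frame does a small, bounded amount of work, so text appears a character at a time while other jobs keep making progress.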
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: 4096 colors
I just took apart the old 'tobias richter' slideshow of mine to find out what the overscan/colour boost stuff was doing. It works quite differently from PCS and the image quality isn't great. The images are double-height interlaced (must be around 400 or 500 lines) and there is only one palette change per scan. It doesn't look like I was short of cycles for colour changes - it looks more like I didn't produce a sophisticated enough colour reduction routine to handle skewed palettes at the time (must have been before PCS then). So there is indeed plenty of dead time to load more colours in overscan mode.
The overscan code used there is not very compact - there wasn't any optimisation required to load one palette - but amazingly it does work in Steem (!?). That's one solid emulator...
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: 4096 colors
calimero wrote: Just to ask same question as Nativ:
so there are no CPU time left if you make 3 palete switch (48 colors) per scanline?
it should be like 3 times * 16 * move.w (one palete entry is 9 bits so it is a word?) per scanline, right?
There is some time left. One of the projects I still want to do is writing the most colorful game for the ST ever. The game is planned with a maximum of 44 colors per scanline. I have written a proof-of-concept code part. Before I start this game I want to finish another game that has been sitting on my harddisk for more than a year. Then I have to write some very specific color quantization routines to create the graphics for the game. I think this will be a special research project for me into color quantization. How to make the most of the limited colors available is not a trivial task. Once this difficult but interesting hurdle is cleared comes the most difficult task: creating beautiful graphics and putting the game together.
The answer to your question is yes, I think it is possible to make a game for the ST using 44 colors per scanline.
Hans Wessels
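To put rough numbers on the question above, here is a minimal sketch of the per-scanline cycle budget. It assumes (these figures are not stated in the posts) a 512-cycle scanline at 50 Hz and the standard 68000 timings of 12 cycles for move.w (a0)+,(a1)+ and 20 cycles for move.l (a0)+,(a1)+. The function name is made up for illustration.

```python
# Rough per-scanline cycle budget for CPU palette reloads on a 68000 ST.
# Assumed figures (not taken from the thread):
CYCLES_PER_SCANLINE = 512  # one 50 Hz scanline
MOVE_W_CYCLES = 12         # move.w (a0)+,(a1)+ copies one palette entry
MOVE_L_CYCLES = 20         # move.l (a0)+,(a1)+ copies two palette entries


def cycles_left(colours, use_long_moves=True):
    """Return the cycles remaining on a scanline after copying `colours` entries."""
    if use_long_moves:
        pairs, single = divmod(colours, 2)
        used = pairs * MOVE_L_CYCLES + single * MOVE_W_CYCLES
    else:
        used = colours * MOVE_W_CYCLES
    return CYCLES_PER_SCANLINE - used


if __name__ == "__main__":
    for n in (44, 48):
        print(n, "colours:", cycles_left(n), "left with move.l,",
              cycles_left(n, use_long_moves=False), "left with move.w")
```

Under these assumptions word moves cannot fit 48 colours into one line, while long moves leave a few dozen cycles, and dropping to 44 colours frees a little more for game logic.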
Re: 4096 colors
dml wrote: I notice that today's experiments with STEEM show the familiar 1-pixel vertical lines in PCS images when the colour allocation rules don't exclude recently changed palette indices. These lines are stable under STEEM but in my original experiments (old Mega4) this line would be noisy and subject to machine startup conditions. There's a 1-clock 'window of uncertainty' involved. I have attached a pic (excuse the terrible quality - just a simplified test).
I'm on OS X so using Hatari for cross-dev. Hatari has been rock-solid for timings in fullscreen, blitter and Spectrum pictures. Plus it's still developed very actively by Mr.Styckx who himself is an expert on ST lowlevel coding (see for example the great No Cooper demo). I can't but recommend giving it a go, even if the UI is still a bit boring.
As a bonus, it will run Apex for you
dml wrote: Looks decent to me. It's worth trying. I'm still working on the new convertor, but after some fuss porting the image library (eek) I have it basically working now so I might shift back to display stuff for a bit once I'm happy most of the 'old functionality' is in.
Well I did forget one tiny little thing; the blitter source needs updating as well, and it will be a problem, not just by cycles, but to start the blitter in "mid palette" will fork up the source x/y inc.
The solution is probably a giant blitter-pass from the first scanline down to the lower border scanline. The blitter will fill out 64 colours per line (the palette data has padded space for the extra 8 colours). 64 colours will be exactly 512 cycles so it fits well. The Zerkman blitter mode works like that (in Antiques demo). Once done the first 228 lines of a big blitter-pass, the lower border needs taken care of real good. Preloaded CPU registers with 8 colours, so they can be movem'ed out (40 cycles), starting blitter twice (once for the intermediate lower-border line, and once for starting up a big pass for the remaining 44 lines).
I'll think about this some more, but for sure 228 lines shouldn't be a problem, the Zerkman rout should already be able to do that if he had killed the top border.
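The arithmetic behind the big blitter pass can be sketched as below. The 8 cycles per blitted word is inferred from the "64 colours = 512 cycles" figure above, and the movem.l cost of 8 + 8n cycles is the standard 68000 register-to-memory timing; the function and constant names are hypothetical.

```python
# Back-of-envelope numbers for one big blitter pass over padded palette data.
CYCLES_PER_SCANLINE = 512
BLITTER_CYCLES_PER_WORD = 8  # inferred from "64 colours = 512 cycles"
COLOURS_PADDED = 64          # 56 used colours + 8 padding words per line
BYTES_PER_COLOUR = 2         # one palette entry is one 16-bit word


def blitter_pass(lines):
    """Return word count, buffer size and cycle cost of a pass over `lines` scanlines."""
    words = lines * COLOURS_PADDED
    return {
        "words": words,
        "bytes": words * BYTES_PER_COLOUR,
        "cycles": words * BLITTER_CYCLES_PER_WORD,
        "cycles_per_line": COLOURS_PADDED * BLITTER_CYCLES_PER_WORD,
    }


def movem_l_store_cycles(registers):
    """68000 timing for movem.l <regs>,(An): 8 + 8 cycles per register."""
    return 8 + 8 * registers


if __name__ == "__main__":
    print("upper pass:", blitter_pass(228))
    print("lower pass:", blitter_pass(44))
    # 8 colours are 8 words, i.e. 4 longword registers
    print("movem of 8 colours:", movem_l_store_cycles(4), "cycles")
```

With these numbers each blitted line costs exactly one scanline, and the 228-line pass needs a palette buffer of a little under 30 KB.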
dml wrote: I also still need to sort out an assembler to use along with gcc - either vasm or just put up with gas (yuck). I haven't checked the object format for vasm but if it's compatible with the gcc linker I'll just use it. Having to use devpac with cross-dev is uncomfortable. It's ok when you're working exclusively on the native machine but painful from the PC and my Ataris are still being put back together...
With vasm comes vbcc and vlink as well, the object formats between the vbcc/vasm are of course compatible with vlink
Example makefile for a one-object assembler project (I'm a strange person who never learned C but does everything in assembler instead..):
Code:
# Toolchain paths as installed on my machine
PATH := $(PATH):/usr/local/bin:/opt/vbcc/bin
CC = /opt/vbcc/bin/vc
ASM = /opt/vbcc/bin/vasm
LD = /opt/vbcc/bin/vlink
CFLAGS = -cpu=68000 -O1
ASMFLAGS = -m68000 -Felf -noesc -nosym -quiet -no-opt
LDFLAGS = -bataritos -tos-flags 7
LOADLIBES =
LDLIBS =
PRG = main.tos
OBJ = main.o
SRC = main.s
.PHONY: main.s # always rebuild target
all : $(PRG)
# copy the finished program to the e: drive with mtools
install : all
	mcopy -o $(PRG) e:$(PRG)
	sync
# link the single object into a TOS executable
$(PRG): $(OBJ)
	$(LD) $< $(LDFLAGS) -o $@
.c.o:
	$(CC) -c $(CFLAGS) $<
.s.o:
	$(ASM) $(ASMFLAGS) $< -o $@
main.o: $(SRC)
clean:
	rm -f $(PRG) $(OBJ)
Moved from Xcode to Eclipse to become a little more platform independent.
dml wrote: It's still using a very simplistic colour allocator but I have now sorted out the nasty strobing issue :-z. The dual dithering in the old version looks like it was actually wrong and I didn't notice. So the average intensity of the frames was not exactly equal and therefore flickered more than necessary. gaaah So the DHS 'complaint' was accurate but not for the reasons assumed!
Cool, looking forward to the improved colour allocator. It would also be very neat if it could handle any Y resolution (can even easily hardscroll that on ST with 8 scans at a time), then with a good dither.. Wow all I need to do then is to run the source 24-bit image one time through Photochrome and be done with it. What an improvement! Might be a good time for making a slideshow then
- calimero
- Fuji Shaped Bastard
- Posts: 2624
- Joined: Thu Sep 15, 2005 10:01 am
- Location: Serbia
Re: 4096 colors
Nyh wrote: There is some time left. One of the projects I still want to do is writing the most colorful game for the ST ever. The game is planned with a maximum of 44 colors per scanline. I have written a proof of concept code part. Before I start this game I want to finish another game that has been sitting on my harddisk for more than a year. Then I have to write some very specific color quantization routines to create the graphics for the game. I think this will be a special research project for me into color quantization. How to make the most of the limited colors available is not a trivial task. Once this difficult but interesting hurdle is cleared comes the most difficult task: creating beautiful graphics and putting the game together.
The answer to your question is yes, I think it is possible to make a game for the ST using 44 colors per scanline.
Hans Wessels
I have in mind: wet.atari.org - I already wrote the game engine in GFA basic, what is left is to cut/convert the complete graphics from PC.
for a Spectrum 512-like version of the game everything should be rewritten in pure asm anyway... :/
maybe Elansar could be made with the Spectrum 512 technique
either way, I would suggest conversion of a PC/Mac game for this project.
using Atari since 1986. ・ http://wet.atari.org ・ http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
Re: 4096 colors
calimero wrote: I have in mind: wet.atari.org - I already wrote game engine in GFA basic, what is left is ....
yeah was waiting for that game to come out.... for sometime now.
My Stuff: FB/Falcon CT63 CTPCI ATI RTL8139 USB 512MB 30GB HDD CF HxC_SD/ TT030 68882 4+32MB 520MB Nova/ 520STFM 4MB Tos206 SCSI
Shared SCSI Bus:ScsiLink ethernet, 9GB HDD,SD-reader @ http://phsw.atari.org
My Atari stuff that are no longer for sale due to them over 30 years old - click here for list
Re: 4096 colors
Well the other night I was playing with the reworked convertor and I managed to do a few new things with it.
1) Change the pixel reduction order from recursive subdivision of each scanline (but otherwise still scanline order) to using a linear congruential generator, sampling the full image once in pseudo-random order. The error is better distributed because subsequent scanlines aren't suffering from palette choices already fixed at the end of the previous line.
2) Switch the error calculation into CIELAB space (reference white = D65), instead of RGB space, in an attempt to make error less perceptible to the eye, versus just going for the minimum numerical error in the palette values themselves. This conversion is expensive and stops an ST dead - but I'm testing on a PC so that's ok.
3) Fix the dithering code - separating the dither step into flicker-management and error-dither steps which are orthogonal to each other. I haven't finalised dithering yet but this version is at least functionally correct.
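Point 1 above can be illustrated with a minimal sketch of a full-period LCG that visits every pixel exactly once in scrambled order. The multiplier and increment are the well-known Numerical Recipes constants, chosen here only as an assumption; the actual generator used in the convertor is not given in the post, and the function name is hypothetical.

```python
def lcg_pixel_order(width, height, a=1664525, c=1013904223, seed=0):
    """Yield every (y, x) of a width*height image exactly once, in LCG order.

    The modulus m is the next power of two >= width*height. With c odd and
    a % 4 == 1 the Hull-Dobell conditions hold, so the generator has full
    period m; states that fall outside the image are simply skipped.
    """
    n = width * height
    m = 1
    while m < n:
        m <<= 1
    x = seed % m
    for _ in range(m):
        if x < n:
            yield divmod(x, width)  # (row, column)
        x = (a * x + c) % m


if __name__ == "__main__":
    order = list(lcg_pixel_order(320, 200))
    print(len(order), "pixels visited, first few:", order[:5])
```

Because each pixel is sampled once and neighbours in the sequence are far apart in the image, no scanline gets stuck with palette choices already made at the end of the line before it.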
I still need to write the new reduction/bin-rebalancing algorithm but I have it figured out, will try when I get some time. It will need a lot of memory at runtime and probably quite expensive but it should do a very good job I think.
First test output image here:
https://dl.dropbox.com/u/12947585/TEST2.PCS (will only look correct using an STE or at least STE emulation)
I don't have the original reference PCS handy but I'll generate it later and edit this post. I'll update the tool soon to emit error analysis images and metrics to help figure out which settings/methods work better.
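For reference, the CIELAB error measure from point 2 could look roughly like the sketch below. It assumes palette values are interpreted as sRGB (an assumption, the post does not say how RGB is interpreted), uses the D65 white point mentioned above, and measures error as plain Euclidean distance in Lab (CIE76). All names are hypothetical.

```python
import math

D65_WHITE = (0.95047, 1.0, 1.08883)  # reference white, Y normalised to 1


def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in 0..1."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_lab(r, g, b):
    """Convert sRGB components in 0..1 to CIELAB (L*, a*, b*)."""
    r, g, b = (srgb_to_linear(v) for v in (r, g, b))
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), D65_WHITE))
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def delta_e(rgb1, rgb2):
    """CIE76 colour difference between two sRGB colours (components in 0..1)."""
    return math.dist(rgb_to_lab(*rgb1), rgb_to_lab(*rgb2))


if __name__ == "__main__":
    # Two neighbouring STE levels (4 bits per channel) in the dark and bright range
    print(delta_e((1 / 15, 0, 0), (2 / 15, 0, 0)))
    print(delta_e((14 / 15, 0, 0), (15 / 15, 0, 0)))
```

The cube roots and the transfer curve per pixel per candidate colour are what make this expensive compared with a squared RGB difference.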
@evil - I'll reply to your post when I get home!
d:m:l
Re: 4096 colors
dml wrote: First test output image here:
https://dl.dropbox.com/u/12947585/TEST2.PCS (will only look correct using an STE or at least STE emulation)
It looks like this: [attached image not available]
Hans Wessels
Re: 4096 colors
dml wrote: Well the other night I was playing with the reworked convertor and I managed to do a few new things with it.
https://dl.dropbox.com/u/12947585/TEST2.PCS (will only look correct using an STE or at least STE emulation)
Nice one. The flicker is close to none, and the interlaced image is really getting there. Looking forward to see the original 24-bit to compare
Here's the pic two frames and combined (as it would look interlaced):
I'd say it's getting time for a trickier picture with more sideway colours
Update: Note to myself: always reload page before doing a new post.
Last edited by evil on Wed Aug 29, 2012 4:58 pm, edited 1 time in total.
Re: 4096 colors
evil wrote: I'm on OS X so using Hatari for cross-dev.
I have done quite a bit of work on the mac but mostly PC based. I'll probably have a powerbook soon if work pays for it, and Frank B is going to test my makefile under OSX in the meantime.
evil wrote: Hatari has been rock-solid for timings in fullscreen, blitter and Spectrum pictures. Plus it's still developed very activly by Mr.Styckx who himself is an expert on ST lowlevel coding (see for example the great No Cooper demo).
I've actually been using Hatari for my Falcon cross-dev experiments. I can boot a copy of my old Falcon in aranym, and build files to a share, and hatari looks into the share, also configured as a Falcon.
(I can probably drop aranym for most things now that I have access to compilers and assemblers on other host platforms)
evil wrote: As a bonus, it will run Apex for you
Does it emulate the DSP morphing code properly? It runs DSPBENCH ok, which is host-port oriented so I guess that's a good sign.
evil wrote: Well I did forget one tiny little thing; the blitter source needs updating as well, and it will be a problem, not just by cycles, but to start the blitter in "mid palette" will fork up the source x/y inc.
I had another thought about this - might be some value in setting up a huge blit, but not starting it in hog mode. Restart it where necessary but don't reload the registers, and put it to sleep for CPU palette updating (if required). Doesn't solve the mid-palette problem but it might work well with a hybrid blit/cpu mixture. I can't remember if the blitter can be paused like this in nice mode or if it just resets the counters too.
evil wrote: the first 228 lines of a big blitter-pass, the lower border needs taken care of real good. Preloaded CPU registers with 8 colours, so they can be movem'ed out (40 cycles), starting blitter twice (once for the intermediate lower-border line, and once for starting up a big pass for the remaining 44 lines).
I'll think about this some more, but for sure 228 lines shouldn't be a problem, the Zerkman rout should already be able to do that if he had killed the top border.
Do you work out the timings in your head? I think I did some sort of stupid 'ruler' bitmap for the original PCS and then watched where the colour changes affected the ruler divisions. hehe. But later on I did it a bit more properly.
BTW if the scanline for lower border removal needs special colour reduction rules that shouldn't be a problem. I'm planning to allow specialisation per scan for this purpose.
evil wrote: Example makefile for a one-object assembler project (I'm a strange person who never learned C but does everything in assembler instead..):
Cool. It might be interesting to build an object file convertor for the gcc linker - so this stuff can be linked into C based tools etc. Not at the top of my list, but I've done a lot with ELF and gcc so it might not be that hard to do.
evil wrote: Moved from Xcode to Eclipse to become a little more platform independent.
I'm not that keen on Xcode - will probably adopt eclipse if/when I hop to mac.
evil wrote: Cool, looking forward to the improved colour allocator. It would also be very neat if it could handle any Y resolution (can even easily hardscroll that on ST with 8 scans at a time), then with a good dither.. Wow all I need to do then is to run the source 24-bit image one time through Photochrome and be done with it. What an improvement! Might be a good time for a making a slideshow then
I'll do an earlier release with height as an argument - maybe width too. But the plan is to make a profile syntax so you can describe the format you want: colour changes per line, planes, reserved colours, the default ruletable and overloads for specific scans, and so on. Some will be bundled/builtin but you can craft your own. More advanced stuff may involve changing the code but in most cases just adding new modules. We'll see how it goes.
Will try the new colour reduction stuff and will tidy it up a bit for another test release and then work on the profiles. I'll make the code available as soon as it looks like the basic layout won't change too much.
The silly reduction algorithm is 90% written - everything except the bin-pruning step. It is very memory and cpu greedy (I didn't add it up but it must be something like 20mb for a 320x200 image already haaaha!). If it makes a better image though, that will be interesting.
cheers again for the input!
d:m:l
Re: 4096 colors
Got the new reduction algorithm working, albeit not complete. Wasn't too successful merging colour bins in CIELAB space - I think the space is strangely curved and merging the bins can result in illegal colours. Or my conversion has a bug which only shows when merging colours in that space. Or something related.
Anyway in RGB space it works fine. First results look pretty good. Still some issues though - namely:
1) not happening in the favoured colour space - since I'm probably generating illegal colours, I will change it to compute error in CIE but merge/average final bins in RGB - that should be near enough
2) scanline-to-scanline coherence is slightly worse than the other algorithms, because the palette solutions are even more local than before. I'm sure this can be solved in a nice way by managing which pixels the bins look at (colour-related pixels outside the scan in question) (not yet done but this algo makes that practical to do).
3) haven't implemented any of the other colour-perception tricks yet, which I think will help with efficient allocation
4) it's still very very memory greedy
Having said all that, the palette solving *within* each scanline looks excellent, better than the others I tried. Here's a sample from the unfinished code:
https://dl.dropbox.com/u/12947585/TEST3.PCS
UPDATED: One more version with the CIE bug (!) fixed and dithering back on.
https://dl.dropbox.com/u/12947585/TEST4.PCS
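The workaround in point 1 (measure error in CIE, but merge and average the final bins in RGB) can be sketched as follows for the merge half. A bin here is just a running RGB sum plus a pixel count, which is an assumed representation and not necessarily what the convertor uses. The result is snapped to a 4-bit STE level per channel and packed with the STE's documented bit order, where the least significant bit of each level is stored in bit 3 of its nibble.

```python
def merge_bins(bin_a, bin_b):
    """Merge two colour bins of the form (sum_r, sum_g, sum_b, count), 8-bit sums."""
    return tuple(a + b for a, b in zip(bin_a, bin_b))


def bin_to_ste_levels(colour_bin):
    """Average a bin in RGB and snap each channel to a 4-bit STE level (0..15)."""
    sum_r, sum_g, sum_b, count = colour_bin
    return tuple(
        min(15, max(0, round(s / count * 15 / 255))) for s in (sum_r, sum_g, sum_b)
    )


def ste_palette_word(r4, g4, b4):
    """Pack three 4-bit levels into an STE palette register value."""
    def nibble(level):
        # bits 3..1 of the level go to nibble bits 2..0, bit 0 goes to nibble bit 3
        return ((level >> 1) & 7) | ((level & 1) << 3)
    return (nibble(r4) << 8) | (nibble(g4) << 4) | nibble(b4)


if __name__ == "__main__":
    dark = (0, 0, 0, 3)           # three black pixels
    bright = (255, 255, 255, 1)   # one white pixel
    merged = merge_bins(dark, bright)
    levels = bin_to_ste_levels(merged)
    print(levels, hex(ste_palette_word(*levels)))
```

Averaging in RGB always lands inside the RGB cube, so the merged colour is a legal palette entry, which is not guaranteed when averaging Lab coordinates and converting back.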
d:m:l
Re: 4096 colors
Right after quite a few changes and a few false starts (my last test I think was bogus, there was a bug in the compressor which was aborting the save file step, so I don't know what stale thing I actually uploaded in the end) - I have something I think I'm happy with. For now anyway.
- new exhaustive, iterative bin-balancing algorithm, working entirely in RGB space (I'm not really sure CIE space is that helpful when you have colour quantization in the mix - I think it can encourage false duplicate colours, which complicates reduction, and the image suffers. Will return to it another time)
- added a visual perception filter mask, to attenuate error computation for edges, so gradients get more attention from the allocator. the impact of this is quite significant. happy with that.
- emit some diagnostic images containing error information, the filter mask and separate/combined image fields
The perception filter makes the new reduction algorithm worthwhile. It sorts out the scan-to-scan coherence problem which makes the image look streaky because (being kind of similar to a sobel/edgedetect filter) it makes each scan slightly aware of the scans above and below, so the colours can't end up too far apart. The whole image is therefore being solved simultaneously, instead of behaving like one very long scanline. That's nice.
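As a rough illustration of such a mask (not the actual filter, whose details are not given here), the sketch below runs a 3x3 Sobel operator over a luminance image and turns the gradient magnitude into a weight in 0..1 that shrinks at edges. The weighting formula 1 / (1 + strength * gradient) and the `strength` parameter are assumptions made for the example.

```python
def edge_mask(lum, strength=1.0):
    """Return per-pixel error weights: 1.0 in flat areas, smaller at edges.

    lum is a list of rows of luminance values in 0..1. The 3x3 Sobel kernels
    read the rows above and below, so each scanline's weights depend on its
    neighbours. Borders are handled by clamping coordinates.
    """
    h, w = len(lum), len(lum[0])

    def px(y, x):
        return lum[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    mask = []
    for y in range(h):
        row = []
        for x in range(w):
            gx = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y, x - 1) - px(y + 1, x - 1))
            gy = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y - 1, x) - px(y - 1, x + 1))
            gradient = (gx * gx + gy * gy) ** 0.5
            row.append(1.0 / (1.0 + strength * gradient))
        mask.append(row)
    return mask


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # hard vertical edge
    for row in edge_mask(image):
        print(["%.2f" % v for v in row])
```

Multiplying each pixel's quantization error by its weight before it reaches the allocator would make smooth gradients count for more than hard edges, which is the effect described above.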
Without the perception filter, the new reduction algorithm actually looks worse than the LCG reduction algorithm, which is immensely simpler and cheaper to run.
The LCG is still probably the best all-rounder because it gives great results at hardly any cost. It's probably my favourite solution. The bin-balancing thing is a behemoth algorithm but it has the edge for image quality in the end. The RGB error measurements prove it.
Anyway I'm about burned out with this problem for now - going to stop fiddling with it and tidy up the prog. I'll add the y/height parameter and do another test release, then get back to refactoring for fancy display profiles.
I might even get to try a new display routine using some of evil's suggestions with any luck!
Here's the most recent render, this time with reference image and the rest...
https://dl.dropbox.com/u/12947585/TEST5.zip
d:m:l