I have worked on this. The most I could achieve is 80 colors per scanline, but that was with a special cartridge adapter and a transfer speed of some 3.6 MB/sec.
Normally, you can copy at most about 2 MB/sec on an STE. That means about 50 colors per line.
With 200 lines that would be 200x50=10000 color slots per screen, so yes, you can have all 4096 colors on screen at once.
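
Just to spell out that arithmetic, here is a rough sketch in Python (used only for illustration, it has nothing to do with the Atari code itself). The function and constant names are made up for this example. The figures are the approximate ones from above, and 4096 is the STE's 12-bit color range (4 bits per channel).

```python
# Approximate figures from the post; names are hypothetical, for illustration only.
VISIBLE_LINES = 200          # scanlines in a standard low-res picture
COLORS_PER_LINE = 50         # rough number of palette writes per line on a stock STE
STE_PALETTE_SIZE = 16 ** 3   # 4 bits each for red, green, blue = 4096


def palette_writes_per_frame(lines, colors_per_line):
    """Total number of color slots available in one frame."""
    return lines * colors_per_line


def max_distinct_colors(lines, colors_per_line, palette_size=STE_PALETTE_SIZE):
    """Upper bound on distinct colors: capped by the slots and by the hardware range."""
    return min(palette_writes_per_frame(lines, colors_per_line), palette_size)


print(palette_writes_per_frame(VISIBLE_LINES, COLORS_PER_LINE))  # 10000
print(max_distinct_colors(VISIBLE_LINES, COLORS_PER_LINE))       # 4096
```

So the number of slots is more than twice what the hardware can distinguish, and the limit becomes the 4096-color range rather than the copy speed.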
You should search here for dml's posts about his Photochrome format. There are several versions, and even custom formats are possible.
Then there is a mode with 2 alternating screens whose palettes differ slightly, so the visual perception is that there are more than 4096 colors.
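
A rough way to see why alternating screens give extra shades: if the eye averages the two frames, a channel that flips between two neighboring levels looks like a half step between them. The Python sketch below only counts those in-between levels. It assumes simple linear averaging and frames that differ by at most one step per channel. Both are my simplifications, not a description of how any particular format picks its colors, and the function names are hypothetical.

```python
def perceived_level(level_a, level_b):
    """Average of two 4-bit channel levels (0..15) shown on alternating frames."""
    return (level_a + level_b) / 2


def perceived_levels_per_channel(max_level=15):
    """Distinct averages when the two frames differ by at most one step."""
    levels = set()
    for a in range(max_level + 1):
        for b in (a, a + 1):          # same level, or the next one up
            if b <= max_level:
                levels.add(perceived_level(a, b))
    return sorted(levels)


levels = perceived_levels_per_channel()
print(len(levels))        # 31 apparent levels per channel instead of 16
print(len(levels) ** 3)   # 29791 apparent colors under these assumptions
```

In practice the result also depends on the monitor and on how visible the flicker is, so take the count as an upper estimate for this simple model only.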
I used some of them, and wrote the display code myself in ASM. It is hardly possible in C, because it must be cycle accurate.
Now I am using cyg's system. It is basically very similar, just without the alternating mode (which of course takes twice as much space on disk), but the conversion tool is better for my needs: it uses dithering for a better visual impression.
And it is possible to combine it with overscan, which gives multi-color pictures at a resolution of 416x273, for instance. I even used them in a movie player.
I can give you the ASM source for displaying. But you need to add a user interface and a loader if you want to make a viewer.
Your question is already partially answered above. The whole thing is about updating the shifter palette registers at an exact moment within the scanline. So the delays must be very accurate; that is the cycle accuracy I mentioned.
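
To make the idea concrete, here is a small Python model of one scanline. It is not Atari code: the real thing is done in ASM by counting CPU cycles, while this sketch simply says "the write takes effect from pixel x on". The function name and the data layout are made up for illustration.

```python
def render_scanline(pixels, palette, changes):
    """Return the color shown for each pixel on one scanline.

    pixels  : list of palette indices (0..15), one per pixel
    palette : list of 16 colors in effect at the start of the line
    changes : list of (x, index, color); the write takes effect from pixel x on
    """
    current = list(palette)        # work on a copy, leave the caller's palette alone
    pending = sorted(changes)      # apply the writes in left-to-right order
    out = []
    i = 0
    for x, idx in enumerate(pixels):
        # Apply every palette write that has happened by the time pixel x is shown.
        while i < len(pending) and pending[i][0] <= x:
            _, reg, color = pending[i]
            current[reg] = color
            i += 1
        out.append(current[idx])
    return out


palette = [0x000] * 16
palette[1] = 0xF00                       # register 1 starts as red
pixels = [1, 1, 1, 1]                    # every pixel uses register 1
changes = [(2, 1, 0x0F0)]                # register 1 becomes green from pixel 2 on
line = render_scanline(pixels, palette, changes)
print([f"{c:03X}" for c in line])        # ['F00', 'F00', '0F0', '0F0']
```

The same register shows two different colors on one line. If the write lands one pixel too early or too late, the wrong pixels get the new color, and that is why the timing has to be exact.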
Not to forget Spectrum 512, which was the first to change the palette "on the fly". It has its own viewer PRG too.
I really don't know whether any other high-color picture viewer has been made for Atari. I have a viewer/converter to BMP for the older Photochrome format, as Windows SW: http://atari.8bitchip.info/floimgd.php