I'm probably joining this late, and some/all of this might be unnecessary repetition, but I might as well summarise my experiences with this...
On ST and Falcon it should be valid to start the blitter in non-hog mode and leave it running, without restarting it by hand. It shares the bus in alternating blocks of 64 bus cycles and restarts itself after each block it gives up, until the blit is complete.
In this state you can either force-restart it until complete, or leave it alone and poll it for completion (or use the blitter-done interrupt to do the same) before starting a new blit. This implies some kind of parallel behaviour and some associated coding hazards.
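As a rough sketch of those two options (poll, or force-restart), something like the following C could be used. The bit position is an assumption taken from the usual ST register documentation (control byte at 0xFFFF8A3C, bit 7 = BUSY), the function names are made up for illustration, and the loop limits are only there so the code can be exercised against a fake register off-target.

```c
#include <stdint.h>

/* Assumed layout: bit 7 of the blitter control byte is BUSY. */
#define BLIT_BUSY 0x80u

/* Option 1: leave the blitter alone and poll until BUSY clears.
   Returns 1 when the blit has finished, 0 if max_polls ran out. */
int blit_wait_done(volatile uint8_t *ctrl, unsigned long max_polls)
{
    while (max_polls-- > 0) {
        if ((*ctrl & BLIT_BUSY) == 0)
            return 1;
    }
    return 0;
}

/* Option 2: force-restart. While BUSY is still set, write it back
   so the blitter is asked to take the bus again.
   Returns 1 when the blit has finished, 0 if max_restarts ran out. */
int blit_restart_until_done(volatile uint8_t *ctrl, unsigned long max_restarts)
{
    while (max_restarts-- > 0) {
        uint8_t v = *ctrl;              /* one volatile read */
        if ((v & BLIT_BUSY) == 0)
            return 1;
        *ctrl = (uint8_t)(v | BLIT_BUSY); /* one volatile write */
    }
    return 0;
}
```

On real hardware the pointer would be something like `(volatile uint8_t *)0xFFFF8A3CUL`, assuming that address is right for your setup. Note that the separate read and write in the second function is not the same thing as a single `bset` in hand-written asm, so the generated code is worth checking before trusting it.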
The main difference between ST and Falcon is the kind of parallelism - on ST it's closer to time-slicing, as the 68000 is bus-bound for most operations. On Falcon a varying number of instructions/transactions can execute while the blitter runs, thanks to the caches - potentially a lot of code, if arranged so. The delay before the blitter actually starts is also different: on Falcon more instructions and some transactions can complete in that window, where on ST only trivial instructions can complete before the blitter starts. This is a further hazard on Falcon. (The Falcon blitter also yields more irregularly to other hardware on the bus, but that shouldn't produce any noticeable interactions on the CPU side - I have measured the timing wobble recently, so I can see it does happen, but it doesn't matter here.)
So code used to poll or restart the blitter needs to be carefully implemented, especially on Falcon, because of the weird concurrency effects even upon starting. I'm not familiar with using this older compiler (PureC?) and haven't looked closely at the provided code, but the asm generated for the volatile register transactions should be checked carefully to make sure it's doing what is expected, with no reordering etc. - confusing Falcon-only bugs could ensue otherwise. It's definitely the kind of thing I'd inspect when using GCC, for example, in case it does something funny/unexpected.
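To illustrate the reordering concern, here is a minimal sketch of a start routine, assuming a GCC-style compiler. The bit positions (bit 7 = BUSY, bit 6 = HOG) are assumptions from the usual ST documentation and the names are hypothetical. `volatile` keeps the register accesses themselves in order, but a compiler is still allowed to move ordinary (non-volatile) memory writes, such as filling the source buffer, across them; the empty asm with a "memory" clobber is the common GCC idiom to stop that. PureC will need its own equivalent, if it has one.

```c
#include <stdint.h>

/* Assumed layout: bit 7 = BUSY, bit 6 = HOG in the control byte. */
#define BLIT_BUSY 0x80u
#define BLIT_HOG  0x40u

/* GCC/Clang compiler barrier: emits no instructions, but stops the
   compiler moving memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")

/* Start the blitter in non-hog (shared bus) mode, keeping the
   other bits of the control byte as they were. */
void blit_start_shared(volatile uint8_t *ctrl)
{
    COMPILER_BARRIER();   /* earlier buffer writes are emitted before the start */
    *ctrl = (uint8_t)((*ctrl & ~BLIT_HOG) | BLIT_BUSY);
    COMPILER_BARRIER();   /* later reads of the destination are not hoisted above it */
}
```

This only constrains the compiler. It does nothing about the hardware-level effects described above (instructions still completing on Falcon before the blitter gets going), so a wait on BUSY is still needed before touching the destination or reprogramming the registers.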
But that aside, the 1-word corner case looks like it could well be undefined behaviour. I can't say I've noticed a difference between machines in this case, but it seems reasonable to me. I have actually had some past trouble with FXSR emulation in Steem, for example, with narrow sprites almost fully clipped out on the left edge, where the same code works OK on hardware. It's not quite the same situation as described earlier here, but it might be a clue that these flags are relatively rarely used for narrow blits, and that some undefined behaviour may be hiding there.