Re: [hatari-devel] undefined behaviour fixes

[ Thread Index | Date Index | More lists.tuxfamily.org/hatari-devel Archives ]



> Am 06.01.2025 um 12:10 schrieb Thomas Huth <th.huth@xxxxxxxxx>:
> 
> Am Mon, 6 Jan 2025 11:12:55 +0100
> schrieb Andreas Grabher <andreas_g86@xxxxxxxxxx>:
> 
>>> Am 06.01.2025 um 10:45 schrieb Nicolas Pomarède <npomarede@xxxxxxxxxxxx>:
>>> 
>>> Le 06/01/2025 à 10:39, Andreas Grabher a écrit :  
>>>>> Am 05.01.2025 um 12:39 schrieb Andreas Grabher <andreas_g86@xxxxxxxxxx>:
>>>>> 
>>>>> 
>>>>>> Am 04.01.2025 um 18:07 schrieb Andreas Grabher <andreas_g86@xxxxxxxxxx>:
>>>>>> 
>>>>>> 
>>>>>>> Am 31.12.2024 um 18:24 schrieb Andreas Grabher <andreas_g86@xxxxxxxxxx>:
>>>>>>> 
>>>>>>> Hello Toni,
>>>>>>> 
>>>>>>> here is a little patch that should fix some undefined behaviour (getting these from runtime checking). Please review before applying the patch.
>>>>>>> 
>>>>>>> There are many more warnings from cpuemu_XX files. But I don’t know how to fix those.
>>>>>>> 
>>>>>>> Kind regards,
>>>>>>> Andreas
>>>>>>> 
>>>>>>> <cpu_warnings.txt>  
>>>>>> 
>>>>>> I found some more. An updated patch is appended.
>>>>>> 
>>>>>> <cpu_warnings2.txt>  
>>>>> 
>>>>> Another patch is appended. Now I also fixed the problems in the cpuemu_XX files. Please note that I only use cpuemu_31 and cpuemu32, so others are untested. It seems that all is working normally. But anyway I think some heavy review is needed, especially for the changes in gencpu.
>>>>> 
>>>>> I really think these issues need to be fixed, because with the current code undefined behaviour can occur in many critical functions.
>>>>> 
>>>>> Good news for Hatari: Although I tested only with Previous, no undefined behaviour was not detected in any file that was derived from Hatari, including all DSP code.
>>>>> 
>>>>> <cpu_warnings3.txt>  
>>>> I’ve run into problems with my last patch. So here is a new one. There is quite some type chaos inside the CPU code. I get lots of undefined signed integer overflow, shifting with values greater than type size, shifting negative values, shifting with negative shifts, etc. Not sure this fixes all of them, but at least I no longer run into one of these immediately.  
>>> 
>>> Hi
>>> 
>>> I'll let Toni comment on this (now that he committed his new chipset emulation for WinUAE 6.x he might have more time to look into this :) )
>>> 
>>> but just for the information, what compiler / flags are you using ? I'm using the very latest GCC and never saw these warnings. I don't see them either on our cirrus-ci jobs.
>>> 
>>> Nicolas
>>> 
>> 
>> I am not very used to debugging. I am using the latest version of Xcode with Runtime Undefined Behavior Sanitizer activated.
> 
> The undefined behavior sanitizer is activated by the -fsanitize=undefined
> compiler switch. For example, run "configure" of Hatari like this:
> 
> CFLAGS="-O2 -fsanitize=address -fsanitize=undefined" ./configure
> 
> (that reminds me that we should likely remove the CPU_CAN_ACCESS_UNALIGNED
> stuff in our maccess.h nowadays, compilers should be smart enough nowadays
> to optimize this properly without this stuff ... and it only causes trouble
> with fsanitize=undefined otherwise)
> 
> With regards to your patch, Andreas:
> 
> diff -ru a/src/cpu/cpummu030.c b/src/cpu/cpummu030.c
> --- a/src/cpu/cpummu030.c	2024-10-27 15:30:07
> +++ b/src/cpu/cpummu030.c	2025-01-05 16:50:11
> @@ -910,7 +910,7 @@
>         mmu030.translation.table[i].shift = shift;
>         /* Build the mask */
>         for (j = 0; j < TI_bits[i]; j++) {
> -            mmu030.translation.table[i].mask |= (1<<(mmu030.translation.table[i].shift + j));
> +            mmu030.translation.table[i].mask |= ((uae_u32)1<<(mmu030.translation.table[i].shift + j));
>         }
>         /* Update until reaching the last table */
>         mmu030.translation.last_table = i;
> 
> I think that's easier to be written with a "U" suffix:
> 
>    (1U<<(mmu030.translation.table[i].shift + j));
> 
> All those other casts for left-shifting signed values look rather
> cumbersome, since all modern CPUs nowadays should just get this right
> (AFAIK). Maybe it's better to compile with "-fwrapv" and call it a day?
> 
> Thomas
> 

I prefer casting to the correct type because 1U might have different size on different platforms. 

I’m not sure we can ignore the shifting and overflow issues. There is no „right“. Different platforms/CPUs might behave differently. I already run into such an issue when running Previous on ARM for the first time. There was a very hard to debug difference in handling overflow while casting from float to int.




Mail converted by MHonArc 2.6.19+ http://listengine.tuxfamily.org/