Re: [hatari-devel] DSP Mandelbrot bug



This is a quick copy (prog2) of the program. I captured it using default parameters, so it may differ slightly from what is executed with the altered parameters in my previous test. It has no disassembly text. Maybe it is useful this way too; otherwise I'll try to get one with disassembly later this week. Need some sleep now  ;-)

I appended a second copy (prog1). That one is loaded before running the actual Mandelbrot calculation (while waiting for the user to press "Run" or optionally change parameters).

prog2:
0000: 0BF080
0001: 000047
0002: 0AF080
0003: 000051
0004: 0BF080
0005: 000059
0006: 0BF080
0007: 000059
0008: 0BF080
0009: 003B75
000A: 0BF080
000B: 003B78
000C: 0BF080
000D: 003B21
000E: 0BF080
000F: 003B21
0010: 0BF080
0011: 003B21
0012: 0BF080
0013: 003B21
0014: 0D000A
0015: 000000
0016: 0D000A
0017: 000000
0018: 0D000A
0019: 000000
001A: 0D000A
001B: 000000
001C: 0BF080
001D: 003B21
001E: 0BF080
001F: 000059
0020: 0853EB
0021: 000000
0022: 0BF080
0023: 003C00
0024: 0BF080
0025: 003CCB
0026: 0BF080
0027: 003C76
0028: 0BF080
0029: 003CC0
002A: 0BF080
002B: 003CE8
002C: 000000
002D: 000000
002E: 000000
002F: 000000
0030: 000000
0031: 000000
0032: 000000
0033: 000000
0034: 300000
0035: 08F4AB
0036: 4C0034
0037: 08D03E
0038: 08D028
0039: 073510
003A: 073710
003B: 073A10
003C: 073B10
003D: 073C10
003E: 54F400
003F: 0BF080
0040: 07000C
0041: 54F400
0042: 000047
0043: 07010C
0044: 0AA323
0045: 0AA503
0046: 0C0067
0047: 08F068
0048: 00003B
0049: 07373C
004A: 07B93C
004B: 07B83D
004C: 08F07F
004D: 00003A
004E: 07BC39
004F: 000000
0050: 000004
0051: 073417
0052: 67F400
0053: 00FFFF
0054: 073517
0055: 07B417
0056: 07343B
0057: 0500BB
0058: 0C0065
0059: 087069
005A: 000035
005B: 08706B
005C: 000034
005D: 087069
005E: 000037
005F: 08706B
0060: 000036
0061: 087068
0062: 00003B
0063: 07383D
0064: 07393C
0065: 08707F
0066: 00003A
0067: 0AA020
0068: 08F4BF
0069: 000C00
006A: 08F4A8
006B: 00001C
006C: 0002F8
006D: 00FEB8
006E: 0AA980
006F: 00006E
0070: 08706B
0071: 00007D
0072: 0AA980
0073: 000072
0074: 08706B
0075: 00007E
0076: 0AA9A4
0077: 00007B
0078: 0AA981
0079: 000078
007A: 0C007D
007B: 0AA980
007C: 00007B
007D: 000000
007E: 000000
007F: 0C0067
0080: 0AA824
0081: 60F400
0082: 000002
0083: 000000
0084: 61D800
0085: 71D800
0086: 66D800
0087: 76D800
0088: 56D800
0089: 65D800
008A: 200003
008B: 0AF0AF
008C: 0000AA
008D: 06CE00
008E: 0000A9
008F: 44D900
0090: 46D100
0091: 044911
0092: 208541
0093: 20C759
0094: 06D500
0095: 0000A7
0096: 21C498
0097: 200080
0098: 200018
0099: 0AF0A5
009A: 0000A1
009B: 22AE00
009C: 0444BF
009D: 200044
009E: 00008C
009F: 0AF080
00A0: 0000A9
00A1: 2000D8
00A2: 20003A
00A3: 20003A
00A4: 200078
00A5: 21E696
00A6: 200032
00A7: 200060
00A8: 22AE00
00A9: 564E00
00AA: 000000
00AB: 0AF080
00AC: 003FF4
00AD: 0000BF
00AE: 20000D
00AF: 0AF0AA
00B0: 0000C8
00B1: 0AF0AA
00B2: 0000CE
00B3: 06C400
00B4: 0000B7
00B5: 0BF080
00B6: 00011D
00B7: 545800
00B8: 0C0096
00B9: 06C400
00BA: 0000BD
00BB: 0BF080
00BC: 00011D
00BD: 5C5800
00BE: 0C0096
00BF: 06C400
00C0: 0000C6
00C1: 0BF080
00C2: 00011D
00C3: 546000
00C4: 0BF080
00C5: 00011D
00C6: 5C5800
00C7: 0C0096
00C8: 06C400
00C9: 0000CC
00CA: 0BF080
00CB: 00011D
00CC: 07588C
00CD: 0C0096
00CE: 0BF080
00CF: 000104
00D0: 240000
00D1: 250000
00D2: 46F400
00D3: 000006
00D4: 0AA984
00D5: 0000EC
00D6: 60F400
00D7: 002000
00D8: 61F400
00D9: 001B20
00DA: 06D100
00DB: 0000DC
00DC: 075886
00DD: 300200
00DE: 31F400
00DF: 06D100
00E0: 0000E1
00E1: 445800
00E2: 300200
00E3: 31F400
00E4: 06D100
00E5: 0000E6
00E6: 4C5800
00E7: 30F600
00E8: 310A00
00E9: 06D100
00EA: 0000EB
00EB: 425800
00EC: 60F400
00ED: 0000FF
00EE: 318000
00EF: 060580
00F0: 0000F2
00F1: 07D88E
00F2: 07598E
00F3: 0500BB
00F4: 05F43C
00F5: 003D5F
00F6: 0500BD
00F7: 000000
00F8: 44F400
00F9: 0BF080
00FA: 070004
00FB: 44F400
00FC: 000047
00FD: 070104
00FE: 000004
00FF: 24A613
0100: 0BF080
0101: 003B71
0102: 0AF080
0103: 003D5D
0104: 05F43A
0105: 000006
0106: 380100
0107: 231900
0108: 231A00
0109: 231B00
010A: 231C00
010B: 231D00
010C: 231E00
010D: 231F00
010E: 057020
010F: 003E74
0110: 057020
0111: 003E75
0112: 44F400
0113: 000080
0114: 447000
0115: 003E68
0116: 0BF080
0117: 00015A
0118: 08F4BF
0119: 00243C
011A: 0500BD
011B: 000000
011C: 00000C
011D: 0AA980
011E: 00011D
011F: 56F000
0120: 00FFEB
0121: 00000C
0122: 0AA981
0123: 000122
0124: 607000
0125: 00FFEB
0126: 00000C
0127: 0AA824
0128: 66F400
0129: 002000
012A: 44F400
012B: 002000
012C: 46F400
012D: 000001
012E: 20001B
012F: 06C400
0130: 000153
0131: 200058
0132: 300100
0133: 556600
0134: 45E600
0135: 20006D
0136: 0AF0A2
0137: 000157
0138: 300200
0139: 45F400
013A: 555555
013B: 456600
013C: 56E600
013D: 200065
013E: 0AF0A2
013F: 000157
0140: 300300
0141: 45F400
0142: AAAAAA
0143: 456600
0144: 56E600
0145: 200065
0146: 0AF0A2
0147: 000157
0148: 300400
0149: 5EE600
014A: 200065
014B: 0AF0A2
014C: 000157
014D: 300500
014E: 07E685
014F: 200065
0150: 0AF0A2
0151: 000157
0152: 250000
0153: 075E85
0154: 300000
0155: 0AA804
0156: 00000C
0157: 0AA804
0158: 00008C
0159: 00000C
015A: 44F400
015B: 003EA0
015C: 447000
015D: 003E90
015E: 44F400
015F: 003EA1
0160: 447000
0161: 003E91
0162: 63F400
0163: 003EFF
0164: 053FA3
0165: 44F400
0166: 666666
0167: 4C5300
0168: 637000
0169: 003E69
016A: 637000
016B: 003E6A
016C: 44F400
016D: 777777
016E: 4C7000
016F: 003EC0
0170: 00000C
0171: 000000

prog1:
0000: 0BF080
0001: 000080
0002: 0AF080
0003: 000051
0004: 0BF080
0005: 000059
0006: 0BF080
0007: 000059
0008: 0BF080
0009: 003B75
000A: 0BF080
000B: 003B78
000C: 0BF080
000D: 003B21
000E: 0BF080
000F: 003B21
0010: 0BF080
0011: 003B21
0012: 0BF080
0013: 003B21
0014: 0D000A
0015: 000000
0016: 0D000A
0017: 000000
0018: 0D000A
0019: 000000
001A: 0D000A
001B: 000000
001C: 0BF080
001D: 003B21
001E: 0BF080
001F: 000059
0020: 0853EB
0021: 000000
0022: 0BF080
0023: 003C00
0024: 0BF080
0025: 003CCB
0026: 0BF080
0027: 003C76
0028: 0BF080
0029: 003CC0
002A: 0BF080
002B: 003CE8
002C: 000000
002D: 000000
002E: 000000
002F: 000000
0030: 000000
0031: 000000
0032: 000000
0033: 000000
0034: 300000
0035: 08F4AB
0036: 4C0034
0037: 08D03E
0038: 08D028
0039: 073510
003A: 073710
003B: 073A10
003C: 073B10
003D: 073C10
003E: 54F400
003F: 0BF080
0040: 07000C
0041: 54F400
0042: 000047
0043: 07010C
0044: 0AA323
0045: 0AA503
0046: 0C0067
0047: 08F068
0048: 00003B
0049: 07373C
004A: 07B93C
004B: 07B83D
004C: 08F07F
004D: 00003A
004E: 07BC39
004F: 000000
0050: 000004
0051: 073417
0052: 67F400
0053: 00FFFF
0054: 073517
0055: 07B417
0056: 07343B
0057: 0500BB
0058: 0C0065
0059: 087069
005A: 000035
005B: 08706B
005C: 000034
005D: 087069
005E: 000037
005F: 08706B
0060: 000036
0061: 087068
0062: 00003B
0063: 07383D
0064: 07393C
0065: 08707F
0066: 00003A
0067: 0AA020
0068: 08F4BF
0069: 000C00
006A: 08F4A8
006B: 00001C
006C: 0002F8
006D: 00FEB8
006E: 0AA980
006F: 00006E
0070: 08706B
0071: 00007D
0072: 0AA980
0073: 000072
0074: 08706B
0075: 00007E
0076: 0AA9A4
0077: 00007B
0078: 0AA981
0079: 000078
007A: 0C007D
007B: 0AA980
007C: 00007B
007D: 000000
007E: 000000
007F: 0C0067
0080: 000000
0081: 000000
0082: 05F439
0083: 000300
0084: 0AA820
0085: 0AA822
0086: 0AA803
0087: 0AA804
0088: 0AA020
0089: 0BF080
008A: 000122
008B: 0AA103
008C: 0AA323
008D: 0AA503
008E: 08F4BE
008F: 000000
0090: 0AA983
0091: 0000CE
0092: 300000
0093: 0BA9A4
0094: 000127
0095: 20001B
0096: 0BF080
0097: 000122
0098: 0BF080
0099: 00011D
009A: 200003
009B: 0AF0AA
009C: 0000CE
009D: 21CF00
009E: 0BF080
009F: 00011D
00A0: 219000
00A1: 0BF080
00A2: 00011D
00A3: 218400
00A4: 2C0100
00A5: 2C020D
00A6: 0AF0AA
00A7: 0000B3
00A8: 2C030D
00A9: 0AF0AA
00AA: 0000B9
00AB: 2C040D
00AC: 0AF0AA
00AD: 0000BF
00AE: 20000D
00AF: 0AF0AA
00B0: 0000C8
00B1: 0AF0AA
00B2: 0000CE
00B3: 06C400
00B4: 0000B7
00B5: 0BF080
00B6: 00011D
00B7: 545800
00B8: 0C0096
00B9: 06C400
00BA: 0000BD
00BB: 0BF080
00BC: 00011D
00BD: 5C5800
00BE: 0C0096
00BF: 06C400
00C0: 0000C6
00C1: 0BF080
00C2: 00011D
00C3: 546000
00C4: 0BF080
00C5: 00011D
00C6: 5C5800
00C7: 0C0096
00C8: 06C400
00C9: 0000CC
00CA: 0BF080
00CB: 00011D
00CC: 07588C
00CD: 0C0096
00CE: 0BF080
00CF: 000104
00D0: 240000
00D1: 250000
00D2: 46F400
00D3: 000006
00D4: 0AA984
00D5: 0000EC
00D6: 60F400
00D7: 002000
00D8: 61F400
00D9: 001B20
00DA: 06D100
00DB: 0000DC
00DC: 075886
00DD: 300200
00DE: 31F400
00DF: 06D100
00E0: 0000E1
00E1: 445800
00E2: 300200
00E3: 31F400
00E4: 06D100
00E5: 0000E6
00E6: 4C5800
00E7: 30F600
00E8: 310A00
00E9: 06D100
00EA: 0000EB
00EB: 425800
00EC: 60F400
00ED: 0000FF
00EE: 318000
00EF: 060580
00F0: 0000F2
00F1: 07D88E
00F2: 07598E
00F3: 0500BB
00F4: 05F43C
00F5: 003D5F
00F6: 0500BD
00F7: 000000
00F8: 44F400
00F9: 0BF080
00FA: 070004
00FB: 44F400
00FC: 000047
00FD: 070104
00FE: 000004
00FF: 24A613
0100: 0BF080
0101: 003B71
0102: 0AF080
0103: 003D5D
0104: 05F43A
0105: 000006
0106: 380100
0107: 231900
0108: 231A00
0109: 231B00
010A: 231C00
010B: 231D00
010C: 231E00
010D: 231F00
010E: 057020
010F: 003E74
0110: 057020
0111: 003E75
0112: 44F400
0113: 000080
0114: 447000
0115: 003E68
0116: 0BF080
0117: 00015A
0118: 08F4BF
0119: 00243C
011A: 0500BD
011B: 000000
011C: 00000C
011D: 0AA980
011E: 00011D
011F: 56F000
0120: 00FFEB
0121: 00000C
0122: 0AA981
0123: 000122
0124: 607000
0125: 00FFEB
0126: 00000C
0127: 0AA824
0128: 66F400
0129: 002000
012A: 44F400
012B: 002000
012C: 46F400
012D: 000001
012E: 20001B
012F: 06C400
0130: 000153
0131: 200058
0132: 300100
0133: 556600
0134: 45E600
0135: 20006D
0136: 0AF0A2
0137: 000157
0138: 300200
0139: 45F400
013A: 555555
013B: 456600
013C: 56E600
013D: 200065
013E: 0AF0A2
013F: 000157
0140: 300300
0141: 45F400
0142: AAAAAA
0143: 456600
0144: 56E600
0145: 200065
0146: 0AF0A2
0147: 000157
0148: 300400
0149: 5EE600
014A: 200065
014B: 0AF0A2
014C: 000157
014D: 300500
014E: 07E685
014F: 200065
0150: 0AF0A2
0151: 000157
0152: 250000
0153: 075E85
0154: 300000
0155: 0AA804
0156: 00000C
0157: 0AA804
0158: 00008C
0159: 00000C
015A: 44F400
015B: 003EA0
015C: 447000
015D: 003E90
015E: 44F400
015F: 003EA1
0160: 447000
0161: 003E91
0162: 63F400
0163: 003EFF
0164: 053FA3
0165: 44F400
0166: 666666
0167: 4C5300
0168: 637000
0169: 003E69
016A: 637000
016B: 003E6A
016C: 44F400
016D: 777777
016E: 4C7000
016F: 003EC0
0170: 00000C
0171: 000000

On 22.11.2015 at 23:34, Laurent Sallafranque <laurent.sallafranque@xxxxxxx> wrote:

Yes, just print it from the DSP RAM; that's the easiest way to get the source code.

I think I should write a DSP code extractor one day (some code that would extract DSP code from memory in a nearly recompilable style).
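As a very small first step toward such an extractor, the "ADDR: WORD" dump format used for prog1 and prog2 in this message can be read into a dictionary and the two programs compared word by word. This is only a sketch: `parse_dump` and `diff_dumps` are hypothetical helper names, and the two excerpts are the first three lines of each dump.

```python
import re

# One dump line looks like "0096: 21C498": a hex address, a colon, a 24-bit hex word.
LINE_RE = re.compile(r"^([0-9A-Fa-f]{4}):\s*([0-9A-Fa-f]{6})$")

def parse_dump(text):
    """Return {address: word} for every well-formed dump line; skip the rest."""
    memory = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            memory[int(match.group(1), 16)] = int(match.group(2), 16)
    return memory

def diff_dumps(first, second):
    """Return sorted addresses whose words differ or exist in only one dump."""
    addresses = set(first) | set(second)
    return sorted(a for a in addresses if first.get(a) != second.get(a))

# First three lines of prog2 and prog1 as posted in this thread.
prog2_excerpt = "0000: 0BF080\n0001: 000047\n0002: 0AF080"
prog1_excerpt = "0000: 0BF080\n0001: 000080\n0002: 0AF080"
changed = diff_dumps(parse_dump(prog2_excerpt), parse_dump(prog1_excerpt))
print([f"{a:04X}" for a in changed])
```

Run on the full dumps, the same comparison would list every address at which prog1 and prog2 differ. Turning the words into mnemonics would still need a real disassembler.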

Regards

Laurent


On 22/11/2015 at 23:29, Andreas Grabher wrote:
Laurent and Douglas, you are of course right. This should be checked on real hardware. I will try to find someone to test it for me. Until then, given the perfectionism evident in all NeXT stuff, I think it is quite unlikely they shipped it with such an obvious bug. There are also obvious issues (a black line in the middle) if the default parameters are used.
I'll report back as soon as I have information from real hardware.

Laurent, I tried to answer your questions below.

Douglas, your optimizations are interesting. I think NeXT would have been interested in them, because the main purpose of the Mandelbrot demo was to demonstrate the performance advantage of the DSP over the CPU for certain tasks  ;-)
I'll see whether I can isolate the complete DSP application. It seems to be embedded in the binary. Maybe I'll just print it from the DSP RAM ...


On 22.11.2015 at 12:06, Laurent Sallafranque <laurent.sallafranque@xxxxxxx> wrote:

Hi Andreas,

First, are you sure the resulting bad pixels don't appear on the real computer too?
(Just to be sure.)



Looking more closely at the trace, I read:


; Previous loop
p:0099  0af0a5 0000a1  (06 cyc)  jec p:$00a1                 


; New loop iteration that seems to be buggy

p:00a1  2000d8         (02 cyc)  mpy +y0,x0,b                                     
    Reg: b   $00:4a0b9a:44a2c4 -> $00:2355a0:579062
    Reg: sr  $8040 -> $8050
p:00a2  20003a         (02 cyc)  asl b                                            
    Reg: b   $00:2355a0:579062 -> $00:46ab40:af20c4
    Reg: sr  $8050 -> $8040


; Is the next one correct? (we have b = $00:8...)
I think this is all correct. It gets back to 00:7... later, just before the jec.

p:00a3  20003a         (02 cyc)  asl b                                            
    Reg: b   $00:46ab40:af20c4 -> $00:8d5681:5e4188
    Reg: sr  $8040 -> $8060


; What is the value of y1 just below? (I haven't found it in the trace)
; The problem may be here
y1 is 0x2c28f8; it gets set up at the beginning and does not change during the calculation.

p:00a4  200078         (02 cyc)  add y1,b                                         
    Reg: b   $00:8d5681:5e4188 -> $00:ae4c44:5e4188


; Just below, we get the "overflow" ($7fffff) in y0
; The problem seems to be that the program copies $00:ae4c44:5e4188 into y0, which sets it to $7fffff
I think that overflow is normal behavior, but I'm not sure about it.

p:00a5  21e696         (02 cyc)  mac -y0,y0,a b,y0                                
    Reg: y0  $4e71df -> $7fffff
    Reg: a   $00:19f86d:2f6242 -> $ff:e9e540:1a21c0
    Reg: sr  $8060 -> $8058
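The saturation of y0 in this step can be sketched as follows, assuming the DSP56001's documented limiting behavior with no scaling mode active: when an accumulator whose extension is in use is read as a 24-bit move source, the moved word is clamped to $7fffff or $800000. The function names are hypothetical and the sketch models only this one aspect of the move.

```python
MASK56 = (1 << 56) - 1

def extension_in_use(acc):
    """True when bits 55..47 of a 56-bit accumulator are not all equal."""
    top = (acc & MASK56) >> 47          # the 9 uppermost bits
    return top not in (0x000, 0x1FF)

def limited_move(acc):
    """24-bit word produced when the accumulator is read as a move source."""
    acc &= MASK56
    if extension_in_use(acc):
        negative = acc >> 55            # sign bit of the 56-bit value
        return 0x800000 if negative else 0x7FFFFF
    return (acc >> 24) & 0xFFFFFF       # the middle 24 bits (A1/B1)

b = 0x00AE4C445E4188                    # b before p:00a5 in the trace
print(f"{limited_move(b):06X}")         # prints 7FFFFF
```

Under these assumptions the traced change of y0 to $7fffff is what the limiter would be expected to produce for this value of b.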


p:00a6  200032         (02 cyc)  asl a                                            
    Reg: a   $ff:e9e540:1a21c0 -> $ff:d3ca80:344380
    Reg: sr  $8058 -> $8059
p:00a7  200060         (02 cyc)  add x1,a                                         
    Reg: a   $ff:d3ca80:344380 -> $ff:fff376:344380
    Reg: sr  $8059 -> $8058
p:0096  21c498         (02 cyc)  mpy +y0,y0,b a,x0                                
    Reg: x0  $39a7ef -> $fff376
    Reg: b   $00:ae4c44:5e4188 -> $00:7ffffe:000002
    Reg: sr  $8058 -> $8040
p:0097  200080         (02 cyc)  mpy +x0,x0,a                                     
    Reg: a   $ff:fff376:344380 -> $00:000001:3a74c8
    Reg: sr  $8040 -> $8050
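As a cross-check of the multiplications in this trace, here is a minimal model of a signed fractional 24x24 multiply: an integer product shifted left by one bit into a 56-bit accumulator. It is a sketch with hypothetical names and is not taken from the emulator's dsp_mul56. The operands are the register values shown in the trace.

```python
def to_signed24(word):
    """Interpret a 24-bit word as a two's-complement integer."""
    word &= 0xFFFFFF
    return word - (1 << 24) if word & 0x800000 else word

def mpy(s1, s2):
    """Signed fractional 24x24 multiply, returned as a 56-bit accumulator value."""
    product = to_signed24(s1) * to_signed24(s2)
    return (product << 1) & ((1 << 56) - 1)   # fractional format: shift left one bit

def fmt(acc):
    """Format a 56-bit value the way the trace prints it: ext:high:low."""
    return f"{acc >> 48:02x}:{(acc >> 24) & 0xFFFFFF:06x}:{acc & 0xFFFFFF:06x}"

print(fmt(mpy(0xFFF376, 0xFFF376)))   # x0 * x0 at p:0097 -> 00:000001:3a74c8
print(fmt(mpy(0x7FFFFF, 0x7FFFFF)))   # y0 * y0 at p:0096 -> 00:7ffffe:000002
print(fmt(mpy(0x4E71DF, 0x39A7EF)))   # y0 * x0 at p:00a1 -> 00:2355a0:579062
```

For these three operand pairs the model reproduces the traced results exactly. That only shows these particular products are arithmetically consistent. It does not rule out a problem elsewhere.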

p:0098  200018         (02 cyc)  add a,b                                          
    Reg: b   $00:7ffffe:000002 -> $00:7fffff:3a74ca                    <-- Is this where it goes wrong?
    Reg: sr  $8050 -> $8040
The problem seems to be that, unlike with the "good" pixels, a is too small to push b up to 00:8...
p:0099  0af0a5 0000a1  (06 cyc)  jec p:$00a1                                    



Laurent



On 22/11/2015 at 10:24, Andreas Grabher wrote:
Update:

Looking at the values of b at the time of the final jec, it seems that it might also be some kind of rounding issue:

00:800d9f:248aca <--- 3 pixels before
00:80064e:53e34a
00:8001b2:34bbf4
00:7fffff:3a74ca <--- bad pixel
00:80016c:07e922
00:800631:562b04
00:800e89:ff27ca <--- 3 pixels after
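To see how close the bad pixel is to the threshold, the listed values can be converted to signed fractions. The sketch assumes the usual DSP56001 accumulator layout, with the binary point directly below bit 47, so that 00:800000:000000 represents 1.0. `acc_to_float` is a hypothetical helper name.

```python
def acc_to_float(text):
    """Convert 'ee:hhhhhh:llllll' (56-bit, hex) to its signed fractional value."""
    raw = int(text.replace(":", ""), 16)
    if raw & (1 << 55):                 # negative in two's complement
        raw -= 1 << 56
    return raw / (1 << 47)              # binary point sits right below bit 47

for b in ("00:8001b2:34bbf4", "00:7fffff:3a74ca", "00:80016c:07e922"):
    print(b, f"{acc_to_float(b):.9f}")
```

Read this way, the two neighbouring pixels land slightly above 1.0 and the bad pixel lands slightly below it, by less than one part in a million.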

Begin forwarded message:

From: Andreas Grabher <andreas.grabher@xxxxxxxxxxxx>
Subject: [hatari-devel] DSP Mandelbrot bug
Date: 22 November 2015 09:53:17 CET

Hello Hatari Community,

I am experiencing a hard-to-find DSP bug here with Previous. Luckily it can be made "visible" using NeXTstep's included Mandelbrot demo. It might also be responsible for some distorted audio in other applications.

I appended a screenshot of the Mandelbrot application where the effect of the bug is clearly visible. I pointed to one failing pixel. I also appended some debugging output containing the calculation of the failing pixel and of the pixels immediately before and after it. The last file I appended contains an overview of the variables during the calculation.

Short overview on the calculation:
Every pixel is calculated separately. The visible pixel color is derived from the remaining loop count of some calculation. The higher the remaining count, the lower is the output value of the function. The loop exits using a jec instruction (check extension bit, exit if false).
For the "good" pixels it exits after the second run, because the upper 9 bits of b are no longer all 0. For the "bad" pixel it does not exit, because these bits are still all 0. The most suspect part of the calculation seems to be mpy +x0,x0,a at p:0097. The value of a after the third call of that instruction seems to not fit into the pattern.
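The exit test described here can be sketched as below, assuming no scaling mode is active: the extension flag E is clear only while the upper 9 bits of the accumulator (bits 55..47) are all 0 or all 1, and jec ("jump if extension clear") branches while E is clear. The function names are hypothetical, and the two sample values are values of b reported elsewhere in this thread for a good and a bad pixel.

```python
def extension_bit(acc):
    """E flag for a 56-bit accumulator: 1 unless bits 55..47 are all 0 or all 1."""
    top9 = (acc >> 47) & 0x1FF
    return 0 if top9 in (0x000, 0x1FF) else 1

def jec_taken(acc):
    """jec ("jump if extension clear") branches only while E is 0."""
    return extension_bit(acc) == 0

good = 0x0080016C07E922   # b for a neighbouring "good" pixel
bad = 0x007FFFFF3A74CA    # b for the "bad" pixel
print(jec_taken(good), jec_taken(bad))   # prints: False True
```

With these values the branch is not taken for the good pixel, so the loop ends. It is taken for the bad pixel, so the loop runs on.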

Can someone with more DSP experience see the bug? It might or might not be in dsp_mul56.

Any help is greatly appreciated!

Andreas













