Re: [AD] Color convertors

Here are some results (K6-III 400 MHz, MMX enabled, Matrox G200):

Win95 OSR2, 800x600x16 desktop:

640x480x8 window:

  SOLID results:       old code        new code      new code
                    (4-aligned w)  (4-aligned w)   (any width)

    putpixel()      - 13924            12488         12464
    hline()         - 32795            31645         30063
    vline()         - 24546            23485         27887
    line()          - 509              457           429
    rectfill()      - 3290             2959          2467
    circle()        - 336              300           303
    circlefill()    - 604              580           552
    ellipse()       - 347              311           302
    ellipsefill()   - 602              581           565
    arc()           - 583              533           522
    triangle()      - 620              587           586

Other functions:

    textout()                    - 2346     2133    2100
    vram->vram blit()            - 5647     5090    4302
    aligned vram->vram blit()    - 5843     5268    5263
    blit() from memory           - 688      617     611
    aligned blit() from memory   - 691      618     616
    vram->vram masked_blit()     - 5436     4884    4194
    masked_blit() from memory    - 679      605     606
    draw_sprite()                - 682      617     614
    draw_rle_sprite()            - 688      613     611
    draw_compiled_sprite()       - 568      615     617
    draw_trans_sprite()          - 679      606     605
    draw_trans_rle_sprite()      - 683      615     606
    draw_lit_sprite()            - 677      608     605
    draw_lit_rle_sprite()        - 686      612     615



640x480x32 window:

  SOLID results:       old code        new code      new code
                    (4-aligned w)  (4-aligned w)   (any width)

    putpixel()      - 11985             11933        12101
    hline()         - 30793             29897        28194
    vline()         - 23243             22363        26904
    line()          - 430               411          437
    rectfill()      - 2681              2721         2058
    circle()        - 277               276          286
    circlefill()    - 563               586          530
    ellipse()       - 279               277          279
    ellipsefill()   - 565               567          544
    arc()           - 482               474          515
    triangle()      - 586               587          560

Other functions:

    textout()                    - 1944     1959    1990
    vram->vram blit()            - 4700     4663    3560
    aligned vram->vram blit()    - 4866     4837    4864
    blit() from memory           - 571      561     571
    aligned blit() from memory   - 570      560     576
    vram->vram masked_blit()     - 4366     4190    3346
    masked_blit() from memory    - 569      562     574
    draw_sprite()                - 559      555     566
    draw_rle_sprite()            - 559      558     570
    draw_compiled_sprite()       - 564      562     573
    draw_trans_sprite()          - 544      538     550
    draw_trans_rle_sprite()      - 552      546     557
    draw_lit_sprite()            - 540      541     554
    draw_lit_rle_sprite()        - 550      541     556


Win95 OSR2, 800x600x24 desktop:

640x480x32 window:
 MMX code: bad colors if width 4-aligned, crash otherwise
 non-MMX code: crash


Win95 OSR2, 800x600x32 desktop:

640x480x8 window:


  SOLID results:       old code        new code      new code
                    (4-aligned w)  (4-aligned w)   (any width)

    putpixel()      - 8340              7971          7996
    hline()         - 26514             26007         25818
    vline()         - 18594             18222         27160
    line()          - 267               236           251
    rectfill()      - 1739              1664          1680
    circle()        - 173               159           159
    circlefill()    - 503               503           503
    ellipse()       - 170               161           164
    ellipsefill()   - 521               511           513
    arc()           - 291               289           270
    triangle()      - 532               524           534

Other functions:

    textout()                    - 1259     1202     1216
    vram->vram blit()            - 3018     2864     2887
    aligned vram->vram blit()    - 3132     2989     3052
    blit() from memory           - 343      324      331
    aligned blit() from memory   - 345      327      332
    vram->vram masked_blit()     - 2947     2831     2864
    masked_blit() from memory    - 341      324      328
    draw_sprite()                - 343      327      329
    draw_rle_sprite()            - 343      326      331
    draw_compiled_sprite()       - 343      326      311
    draw_trans_sprite()          - 341      325      329
    draw_trans_rle_sprite()      - 343      325      331
    draw_lit_sprite()            - 341      323      330
    draw_lit_rle_sprite()        - 342      327      331


Some thoughts:
- the overhead of the additional code (essentially tests and jumps) is
not too high: between 0 and 11% when comparing old and new code on
4-aligned widths,
- alignment is another story: for operations with very low overhead
(vram->vram blit), which measure the raw performance of the convertors on
block blitting, the loss is 24% on a 16-bit desktop (32-bit window, 4663
vs. 3560). It is far smaller on a 32-bit desktop, but I expect it to be high
on 24-bit desktops too once the convertors have been fixed.
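
The percentages above can be rechecked from the tables with something like
the following. This is only a minimal sketch: it assumes that a higher score
is better, and the helper name loss_percent is hypothetical, not something
from the Allegro tree.

```c
/* Relative loss, in percent, of a score against a baseline score.
 * Assumption: higher scores are better, so a positive result means
 * the second configuration is slower than the baseline. */
static double loss_percent(double baseline, double score)
{
    return (baseline - score) * 100.0 / baseline;
}
```

For instance, loss_percent(4663, 3560) compares the two "new code" columns
for the vram->vram blit in the 32-bit window on the 16-bit desktop, and
loss_percent(13924, 12488) compares old and new code for putpixel() in the
8-bit window on the same desktop.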

I'm not sure I want to drop the 'width multiple of 4' condition: it does a
very good job of ensuring that the rectangles passed to the convertors are
aligned on a 4-byte boundary (at least), which obviously speeds things up.
And the only restriction visible to the user (the screen width must be a
multiple of 4) is not prohibitive.
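
To illustrate the arithmetic behind that condition: if the width is a
multiple of 4 pixels, the length of a row in bytes is a multiple of 4 at
every color depth (1, 2, 3 or 4 bytes per pixel). A rough sketch follows;
the function names are made up for the example and this is not the actual
check used in the library.

```c
#include <stddef.h>

/* Length in bytes of one row of 'width' pixels, tightly packed. */
static size_t row_bytes(size_t width, size_t bytes_per_pixel)
{
    return width * bytes_per_pixel;
}

/* Nonzero if 'n' is a multiple of 4, i.e. its two low bits are clear. */
static int is_multiple_of_4(size_t n)
{
    return (n & 3) == 0;
}
```

With a width of 640, is_multiple_of_4(row_bytes(640, bpp)) holds for every
bpp from 1 to 4. With a width of 637 it fails at 1, 2 and 3 bytes per pixel
and holds only at 4.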

---
Eric Botcazou
ebotcazou@xxxxxxxxxx


