Sunday, June 15, 2014

Here is an excerpt from profiling results for the bounce demo, using the Python profile module on Android, showing YAPYG functions only:

         452745 function calls (452034 primitive calls) in 30.027 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   ...

       72    0.020    0.000   25.779    0.358 widget.py:57(on_timer)
       34    0.007    0.000   25.715    0.756 widget.py:74(redraw)
       34    0.110    0.003   23.216    0.683 movers/__init__.py:102(run)
      850    3.641    0.004   20.050    0.024 collisions.py:294(run)
    19629    3.045    0.000    7.889    0.000 collisions.py:232(get_collision_shapes)
    18779    1.182    0.000    6.141    0.000 collisions.py:272(_is_collision)
    18587    2.226    0.000    4.696    0.000 fixpoint.py:782(is_circle_circle_collision)
    73265    3.260    0.000    3.260    0.000 fixpoint.py:49(mul)
      850    0.308    0.000    3.055    0.004 movers/physical.py:92(run)
    43953    1.954    0.000    2.704    0.000 entities.py:92(get)
    23160    1.027    0.000    2.456    0.000 entities.py:101(get_pos)


The PC results are similar, though with smaller absolute times.
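
For reference, a listing like the one above can be produced with the standard profile and pstats modules. This is only a minimal sketch, not the code the demo uses: busy_work and profile_call are hypothetical names, and busy_work merely stands in for the real game loop. The regular expression passed to print_stats restricts the report to matching entries, which is one way to show only the functions of interest.

```python
import io
import profile
import pstats


def busy_work(n):
    """Hypothetical stand-in for the code being measured."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under the profiler; return (result, report text)."""
    profiler = profile.Profile()
    result = profiler.runcall(func, *args)

    # Collect the report in a string instead of printing to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)

    # Sort by cumulative time and keep only entries matching the regex.
    stats.sort_stats("cumulative").print_stats(r"busy_work")
    return result, stream.getvalue()


result, report = profile_call(busy_work, 1000)
print(report)
```

The cProfile module offers the same interface with less overhead, where it is available on the target platform.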

With profiling enabled the program runs terribly slowly: in the 30 seconds this run lasted, the cluster of balls moved only a dozen pixels or so. Even without profiling, though, the engine runs far too slowly, which needs to be fixed.

Collision detection is evidently the biggest bottleneck: collisions.py:294(run) accounts for about 20 of the 30 seconds. Within it, the largest single chunk of time is spent in get_collision_shapes, the function that computes the positions of each object's shapes relative to its current position, closely followed by the actual collision test, _is_collision. This part is the main candidate for optimization.

The function get_collision_shapes is fairly easy to optimize since its results can be cached, provided the cache is invalidated whenever the entity moves. The change was simple and quick, and leads to the following profiling results:

         432081 function calls (431519 primitive calls) in 29.598 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       89    0.028    0.000   24.990    0.281 widget.py:57(on_timer)
       50    0.012    0.000   24.909    0.498 widget.py:74(redraw)
       50    0.137    0.003   21.521    0.430 movers/__init__.py:102(run)
     1250    4.450    0.004   17.257    0.014 collisions.py:294(run)
    24610    1.455    0.000    7.981    0.000 collisions.py:272(_is_collision)
    24130    2.714    0.000    5.681    0.000 fixpoint.py:782(is_circle_circle_collision)
   101769    4.212    0.000    4.212    0.000 fixpoint.py:49(mul)
     1250    0.414    0.000    4.125    0.003 movers/physical.py:92(run)
       50    0.201    0.004    3.286    0.066 sprites.py:143(draw)
     1251    0.194    0.000    3.247    0.003 entities.py:129(add_pos)
     1550    1.491    0.001    2.976    0.002 sprites.py:183(_draw_sprite)
     1275    0.065    0.000    2.854    0.002 entities.py:73(_call_pos_listeners)
     1275    0.111    0.000    2.789    0.002 collisions.py:63(entity_pos_listener)
     1306    0.370    0.000    2.703    0.002 collisions.py:195(_update_hash)
     3831    0.990    0.000    2.607    0.001 collisions.py:112(_get_hash_area)

    25860    1.430    0.000    1.737    0.000 collisions.py:233(get_collision_shapes)

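Now get_collision_shapes is much faster: under the profiler its cumulative time drops from roughly 0.40 ms to roughly 0.07 ms per call, about a sixfold improvement. This leaves the rest of the collision module as the bottleneck, which will probably require approaches other than caching to speed up.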
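
The caching idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual YAPYG code: the Entity class, its attributes, and the compute_count counter are hypothetical, and the shapes are simplified to circles given as (dx, dy, radius) relative to the entity position. The profile suggests the real engine invalidates through a position listener; here the setter clears the cache directly.

```python
class Entity:
    """Hypothetical entity with circle shapes relative to its position."""

    def __init__(self, pos, relative_shapes):
        self._pos = pos
        self._relative_shapes = relative_shapes  # list of (dx, dy, radius)
        self._cached_shapes = None  # None means "needs recomputing"
        self.compute_count = 0      # for demonstration only

    def set_pos(self, pos):
        self._pos = pos
        # Moving the entity makes the cached absolute shapes stale.
        self._cached_shapes = None

    def get_collision_shapes(self):
        """Return shapes in absolute coordinates, recomputing only after a move."""
        if self._cached_shapes is None:
            self.compute_count += 1
            x, y = self._pos
            self._cached_shapes = [
                (x + dx, y + dy, radius)
                for dx, dy, radius in self._relative_shapes
            ]
        return self._cached_shapes


ball = Entity((10, 20), [(1, 1, 5)])
first = ball.get_collision_shapes()
second = ball.get_collision_shapes()  # served from the cache
ball.set_pos((30, 40))                # invalidates the cache
third = ball.get_collision_shapes()   # recomputed for the new position
```

The trade-off is that every position change now pays a small invalidation cost, which is worthwhile only as long as shapes are queried more often than entities move.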
