The genetic algorithm provides an endless stream of candidate sequences, which the animat must evaluate as behaviors. Executing a sequence is just a matter of starting a timer. The rest of the AI proceeds normally, but whenever the next time step passes the time offset of an action, that action must be executed.
Instead of scanning the sequence for actions that need to be executed, we can sort the sequence by time offset for efficiency, copying the old sequence first if the genetic algorithm requires it intact. For additional efficiency, we can request a callback at exactly the execution time of the next action. This approach requires separating sequence execution from the rest of the AI, but it proves to be a much more elegant design.
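The execution scheme above can be sketched as follows. This is a minimal illustration in Python (a real game AI would likely live in C++); the `SequencePlayer` name and the representation of actions as (time offset, callback) pairs are assumptions for the example, not part of the original design.

```python
class SequencePlayer:
    """Executes a time-offset action sequence without scanning every tick.

    Actions are (time_offset, callback) pairs. Sorting them once up front
    means only the head of the queue ever needs to be checked.
    """

    def __init__(self, actions):
        # Copy and sort, so the genetic algorithm's sequence stays intact.
        self._queue = sorted(actions, key=lambda a: a[0])
        self._index = 0

    def next_action_time(self):
        """Time at which the AI should request its next callback."""
        if self._index < len(self._queue):
            return self._queue[self._index][0]
        return None

    def update(self, now):
        """Execute every action whose time offset has been reached."""
        while (self._index < len(self._queue)
               and self._queue[self._index][0] <= now):
            self._queue[self._index][1]()
            self._index += 1

    def finished(self):
        return self._index >= len(self._queue)
```

The `next_action_time()` query is what allows the callback-based design: instead of polling on every AI update, the engine can schedule a single wake-up at that time.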
To prevent any problems, we can force the first action to happen at the start of the sequence by shifting all the action offsets so the earliest one falls at time 0 (if the genetic algorithm doesn't already do this during evolution). We must also guarantee that sequences always finish or are explicitly terminated, no matter what happens to the animat. (Even death must cause the AI to abort the sequence.)
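The offset normalization amounts to a one-line shift; a hedged sketch in Python, with the actions again assumed to be (offset, action) pairs:

```python
def normalize_offsets(actions):
    """Shift all action offsets so the earliest action fires at time 0.

    Leaves the original list untouched, in case the genetic algorithm
    still needs the unshifted sequence.
    """
    if not actions:
        return []
    start = min(offset for offset, _ in actions)
    return [(offset - start, action) for offset, action in actions]
```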
Testing the animats is much easier than using the contraptions. Indeed, rocket jumping and dodging fire can be performed anywhere on the terrain. There's no need for any particular game environment; all we need to do is give the bots rocket launchers and let them fire away.
As usual, it's possible for the animats to learn the behaviors while respecting the rules of embodiment. However, making each of them invulnerable (while still aware of normal damage) during learning can save a lot of time. As for combining the fitness values and learning phases, this should get the animat to rocket jump away from the impact points (assuming the rocket jumps cause less damage than incoming projectiles).
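One way to realize that assumption in a combined fitness function is to weight self-inflicted rocket-jump damage less than incoming projectile damage, so that jumping clear of an impact point nets a better score than standing still. This is only an illustrative sketch; the weight value and function name are assumptions, not the book's actual fitness function.

```python
def combined_fitness(self_damage, incoming_damage):
    """Score a trial by total damage taken, penalizing incoming
    projectile hits more than self-inflicted rocket-jump damage.

    SELF_WEIGHT < 1 encodes the assumption that rocket jumps hurt
    less than the projectiles they help the animat escape.
    """
    SELF_WEIGHT = 0.3  # assumed weighting, tune per game balance
    return -(SELF_WEIGHT * self_damage + incoming_damage)
```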
Finally, it's best to separate the learning for rocket jumping and dodging fire. (The behaviors are conditionally independent anyway.) There are two ways to achieve this separation: use distinct learning phases with one unique fitness function, or manually split the fitness according to the behavior and learn both behaviors at the same time. Both achieve the same result, but in different ways.
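The two separation strategies can be contrasted in a few lines. In this sketch (names and the trial representation are assumptions), each trial is tagged with the behavior that was active when it was scored:

```python
def fitness_by_phase(trials, phase):
    """Strategy 1: distinct learning phases, one fitness function.
    Only trials from the current phase contribute to the score."""
    return sum(score for behavior, score in trials if behavior == phase)

def fitness_split(trials):
    """Strategy 2: learn both behaviors at once, crediting each
    trial's score to the behavior that produced it."""
    totals = {}
    for behavior, score in trials:
        totals[behavior] = totals.get(behavior, 0.0) + score
    return totals
```

Either way, each behavior's fitness is computed only from its own trials, which is what makes the separation work.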