In addition, different methodologies can increase the likelihood of success in projects based on learning:
Incremental learning allows separate components to learn in separate stages. Simple behaviors are learned first, and then they are frozen while more complex components learn.
To deal with adaptation, learn as many components as possible offline. Only the changes to the default behaviors need to be learned online.
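The staged approach described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: each component is a one-weight linear unit trained by a basic error-correction rule, and the names Component, train, simple, and complex_part are hypothetical rather than taken from any particular system.

```python
class Component:
    """A one-weight linear unit: output = weight * input (illustrative only)."""

    def __init__(self, weight=0.0):
        self.weight = weight
        self.frozen = False

    def predict(self, x):
        return self.weight * x

    def learn(self, x, target, rate=0.1):
        if self.frozen:
            return  # frozen components ignore further training
        error = target - self.predict(x)
        self.weight += rate * error * x


def train(component, samples, epochs=200):
    """Repeatedly present (input, target) pairs to one component."""
    for _ in range(epochs):
        for x, target in samples:
            component.learn(x, target)


# Stage 1: the simple behavior learns y = 2x, then is frozen.
simple = Component()
train(simple, [(1.0, 2.0), (2.0, 4.0)])
simple.frozen = True

# Stage 2: the more complex behavior learns z = 3y on top of the frozen output.
complex_part = Component()
stage2_samples = [(simple.predict(x), 3.0 * simple.predict(x)) for x in (0.5, 1.0)]
train(complex_part, stage2_samples)
```

The same split could serve the offline/online distinction: the frozen component stands in for the default behavior learned offline, and only the second component would keep learning online.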
Precautions can also be taken to prevent overfitting and other forms of degenerate learning:
Learn only from examples whose results are unacceptable, adapting the representation until the result becomes acceptable. Loosening the criteria that define a valid result makes the system less accurate, but allows it to generalize better.
Decrease the learning rate over time to maintain consistency. This cooling schedule can take into account the system's performance.
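Both precautions can be combined in one training loop. The sketch below is a minimal example under stated assumptions: a single-weight linear model, a hypothetical tolerance that defines an "acceptable" result, and a hypothetical cooled_rate function whose decay constant and floor are arbitrary choices made for illustration.

```python
def cooled_rate(initial_rate, step, recent_error, decay=0.01, floor=0.2):
    """Learning rate that cools with time and with improving performance.

    The time factor shrinks as more updates are made; the performance factor
    shrinks as the recent error drops, but never below `floor`.
    """
    time_factor = 1.0 / (1.0 + decay * step)
    performance_factor = max(floor, min(1.0, recent_error))
    return initial_rate * time_factor * performance_factor


def train(samples, tolerance=0.05, initial_rate=0.5, epochs=100):
    """Fit output = weight * x, learning only from unacceptable examples."""
    weight = 0.0
    updates = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - weight * x
            if abs(error) <= tolerance:
                continue  # result is acceptable: skip this example
            rate = cooled_rate(initial_rate, updates, abs(error))
            weight += rate * error * x
            updates += 1
    return weight, updates


weight, updates = train([(1.0, 2.0), (0.5, 1.0)])
```

Raising tolerance makes the loop stop adapting sooner, which trades accuracy for stability in the way the text describes.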
Generally, the designer has the most control over learning through the trade-off between exploration and exploitation. Adjusting the policy accordingly may improve the system's performance and reliability.
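One common way to expose this trade-off as a single tunable parameter is an epsilon-greedy policy. It is named here as an illustrative technique, not as the method the text prescribes; the function choose_action and its parameters are assumptions made for the sketch.

```python
import random


def choose_action(values, epsilon, rng):
    """Pick an action index from a list of estimated action values.

    With probability epsilon a random action is tried (exploration);
    otherwise the action with the highest estimate is used (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda action: values[action])


rng = random.Random(0)
estimates = [0.1, 0.9, 0.3]
greedy_choice = choose_action(estimates, 0.0, rng)  # epsilon = 0: always exploit
random_choice = choose_action(estimates, 1.0, rng)  # epsilon = 1: always explore
```

Setting epsilon closer to 0 favors reliable, known behavior, while values closer to 1 favor trying alternatives. A designer could also lower epsilon over time, in the same spirit as a cooling schedule.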