- Default the evaluation decision to None when either the agent or the evaluator LLM fails. This fixes accuracy calculations when errors occur.
- Fix the color shown for a decision of True.
- Add argument flags to specify the output results file paths.
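
As a rough illustration of the first item, the sketch below shows one way a None decision could keep failed runs from distorting accuracy. The helper name `accuracy` and the choice to leave None entries out of the denominator are assumptions made for this example, not details taken from the change itself; the actual code may count failed runs differently.

```python
from typing import Optional, Sequence


def accuracy(decisions: Sequence[Optional[bool]]) -> Optional[float]:
    """Hypothetical helper: fraction of True decisions among completed runs.

    Assumption: a decision of None marks a run where the agent or the
    evaluator LLM failed, and such runs are excluded from the denominator.
    """
    # Keep only runs that actually produced a True/False decision.
    completed = [d for d in decisions if d is not None]
    if not completed:
        return None  # no run produced a decision, so accuracy is undefined
    return sum(1 for d in completed if d) / len(completed)


# One failed run (None) among four; only the three completed runs are scored.
print(accuracy([True, False, None, True]))
```

Using None rather than False for a failed run keeps "the evaluator said no" distinct from "no evaluation happened", which is what lets the two cases be counted separately.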