Multi Agent: Custom Automation Engine – Solution Accelerator is an open-source GitHub repository that enables users to solve complex tasks using multiple agents. The accelerator is designed to be generic across business tasks. The user enters a task, and a planning LLM formulates a plan to complete that task. The system then dynamically generates agents that can complete the task. The system also allows the user to create actions that agents can take (for example, sending emails or scheduling orientation sessions for new employees). These actions are taken into account by the planner, and dynamically created agents may be empowered to take them.
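To illustrate the idea of user-defined actions that a planner can take into account, here is a minimal sketch. All names (`Action`, `register_action`, `send_welcome_email`) are hypothetical and not taken from the repository; a real deployment would expose actions through the accelerator's own function-calling interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Action:
    """A capability an agent may be granted (hypothetical shape)."""
    name: str
    description: str          # text the planner can read when forming a plan
    handler: Callable[..., str]

# Registry of actions the planner could consult when formulating a plan.
ACTIONS: Dict[str, Action] = {}

def register_action(action: Action) -> None:
    ACTIONS[action.name] = action

def send_welcome_email(employee: str) -> str:
    # Stub: a real handler would call an email service.
    return f"Email queued for {employee}"

register_action(Action(
    name="send_welcome_email",
    description="Send an onboarding email to a new employee",
    handler=send_welcome_email,
))

# An agent empowered with this action can invoke it by name.
result = ACTIONS["send_welcome_email"].handler("Jordan")
```

The key design point is that each action carries a human-readable description, so a planning LLM can decide when the action is relevant to the user's task.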
The solution accelerator is designed to replace and enhance enterprise workflows and processes with intelligent automation. Agents can specialize in various functions and work together to achieve an objective specified by the user. The accelerator is designed to integrate with existing systems and to scale according to the needs of the customer. The system allows users to review, reorder, and approve the steps generated in a plan, ensuring human oversight. The system uses function calling with LLMs to perform actions; users can approve or modify these actions.
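The review–reorder–approve workflow described above can be sketched as a small data structure. This is an illustrative model only, with hypothetical names (`Step`, `Plan`), not the accelerator's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    description: str
    approved: bool = False

@dataclass
class Plan:
    steps: List[Step] = field(default_factory=list)

    def reorder(self, order: List[int]) -> None:
        # 'order' is the desired permutation of current step indices.
        self.steps = [self.steps[i] for i in order]

    def approve(self, index: int) -> None:
        self.steps[index].approved = True

    def ready(self) -> bool:
        # Execution proceeds only once every step is human-approved.
        return all(s.approved for s in self.steps)

plan = Plan([Step("Draft welcome email"), Step("Schedule orientation")])
plan.reorder([1, 0])   # the human moves scheduling first
plan.approve(0)
plan.approve(1)
```

Gating execution on `ready()` is what keeps a human in the loop: no step runs until every step in the plan has been explicitly approved.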
This repository is to be used only as a solution accelerator, following the open-source license terms listed in the GitHub repository. The example scenario's intended purpose is to demonstrate how users can analyze and process audio files and call transcripts to help them work more efficiently and streamline their human-made decisions.
How was Multi Agent: Custom Automation Engine – Solution Accelerator evaluated? What metrics are used to measure performance?
The evaluation process includes human review of the outputs and tuning of LLM prompts to produce relevant responses. It is worth noting that the system is designed to be highly customizable and can be tailored to fit specific business needs and use cases. As such, the metrics used to evaluate the system's performance may vary depending on the specific use case and business requirements.
What are the limitations of Multi Agent: Custom Automation Engine – Solution Accelerator? How can users minimize the impact of Multi Agent: Custom Automation Engine – Solution Accelerator's limitations when using the system?
The system allows users to review, reorder, and approve the steps generated in a plan, ensuring human oversight. The system uses function calling with LLMs to perform actions; users can approve or modify these actions. Users of the accelerator should review the system prompts provided and update them per their organizational guidance. Users should run their own evaluation flow, either using the guidance provided in the GitHub repository or their choice of evaluation methods. Note that the Multi Agent: Custom Automation Engine – Solution Accelerator relies on the AutoGen multi-agent framework. AutoGen has published its own list of limitations and impacts.
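To make the approve-or-modify mitigation concrete, here is a minimal sketch of a review gate placed between an LLM-proposed function call and its execution. The call format, tool names, and `review` helper are hypothetical, not the accelerator's API.

```python
def execute_tool(name, args):
    # Toy tool table standing in for real function-calling handlers.
    tools = {"send_email": lambda to, subject: f"sent '{subject}' to {to}"}
    return tools[name](**args)

# A proposed call, shaped roughly as a planner/LLM might emit it.
proposed = {"name": "send_email",
            "args": {"to": "new.hire@example.com", "subject": "Welcome"}}

def review(call, approve=True, overrides=None):
    # The human can veto the call or modify its arguments before it runs.
    if not approve:
        return None
    args = {**call["args"], **(overrides or {})}
    return execute_tool(call["name"], args)

# The reviewer edits the subject line, then lets the call proceed.
result = review(proposed, overrides={"subject": "Welcome aboard"})
```

Routing every proposed action through such a gate is one way to minimize the impact of planner mistakes: nothing with side effects executes without an explicit human decision.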
What operational factors and settings allow for effective and responsible use of Multi Agent: Custom Automation Engine – Solution Accelerator?
Effective and responsible use of the Multi Agent: Custom Automation Engine – Solution Accelerator depends on several operational factors and settings. The system is designed to perform reliably and safely across the range of business tasks it was evaluated for. Users can customize certain settings, such as the planning language model used by the system, the types of tasks that agents are assigned, and the specific actions that agents can take (e.g., sending emails or scheduling orientation sessions for new employees). However, it is important to note that these choices may affect the system's behavior in real-world scenarios. For example, selecting a planning language model that is not well suited to the complexity of the tasks may result in lower accuracy and performance. Similarly, assigning tasks that are outside the system's intended scope may lead to errors or incomplete results.

Users can choose an LLM that is optimized for responsible use. The default LLM is GPT-4o, which inherits the existing RAI mechanisms and filters from the LLM provider. Caching is enabled by default to increase reliability and control cost. We encourage developers to review OpenAI's Usage policies and Azure OpenAI's Code of Conduct when using GPT-4o. To ensure effective and responsible use of the accelerator, users should carefully consider their choices and use the system within its intended scope.
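As a rough illustration of the settings discussed above, the sketch below models a configuration with a default planning model and caching enabled by default. The setting names and `make_config` helper are hypothetical; the repository's own configuration files are the authoritative reference.

```python
# Hypothetical configuration shape, not the accelerator's actual schema.
DEFAULT_CONFIG = {
    "planner_model": "gpt-4o",   # default model; inherits provider RAI filters
    "enable_cache": True,        # on by default for reliability and cost control
    "allowed_actions": ["send_welcome_email", "schedule_orientation"],
}

def make_config(**overrides):
    """Return a config with overrides applied, rejecting unknown settings."""
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise ValueError(f"Unknown settings: {sorted(unknown)}")
    return {**DEFAULT_CONFIG, **overrides}

# A user swaps in a different planning model but keeps the safe defaults.
cfg = make_config(planner_model="gpt-4o-mini")
```

Validating overrides against a known set of settings is a small safeguard that keeps customization within the system's intended scope.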