Autopilot was created as a way to have frontier LLMs respond to Discord chat messages on your behalf, taking conversations completely on autopilot. It has since expanded to include multimodal features, tool calls for web browsing, autonomous management of important memories, and a vector database backed by an advanced memory management system that stores and retrieves past conversations.
The advanced memory management uses three tiers of memory. Short-term memory is simply the last few messages, kept in plain text. Medium-term memory is created periodically by summarizing the short-term memory, embedding the summary, and storing it in the vector database (persisted in the memory JSON file). Once a limit is reached, the same process runs again to produce long-term memory. This cascading memory system follows principles similar to a human mind: we remember the recent past very clearly, but context and detail fade over time. Most importantly, in its default state the plugin captures the emotions that are contextually relevant to each memory. Finally, a semantic search runs on every incoming message, requesting relevant medium-term and long-term memories. Much like a real human mind, it retrieves what is contextually relevant based on the message itself and the emotion it conveys, and that context is made available to the LLM for a personalized response.
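A minimal sketch of how such a three-tier flow could look, assuming OpenAI embeddings and a flat JSON file as the vector store; names like `MemoryStore` and `memory.json` are illustrative, not the plugin's actual identifiers:

```python
import json
import math
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def embed(text: str) -> list[float]:
    """Embed text with OpenAI so it can be compared semantically later."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class MemoryStore:
    def __init__(self, path: str = "memory.json"):
        self.path = path
        self.short_term: list[str] = []    # tier 1: recent raw messages
        self.medium_term: list[dict] = []  # tier 2: summarized + embedded
        self.long_term: list[dict] = []    # tier 3: summaries of summaries

    def add_message(self, message: str, limit: int = 10):
        self.short_term.append(message)
        if len(self.short_term) >= limit:
            self._consolidate()

    def _consolidate(self):
        # Summarize the short-term buffer (including its emotional tone),
        # embed the summary, and promote it to medium-term memory.
        summary = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content":
                       "Summarize this conversation, noting its emotional tone:\n"
                       + "\n".join(self.short_term)}],
        ).choices[0].message.content
        self.medium_term.append({"text": summary, "embedding": embed(summary)})
        self.short_term.clear()
        # Once medium-term memory grows past its own limit, the same cascade
        # would merge it into long-term memory (omitted here for brevity).
        self._save()

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Semantic search over medium- and long-term memories on every message.
        q = embed(query)
        pool = self.medium_term + self.long_term
        ranked = sorted(pool, key=lambda m: cosine(q, m["embedding"]), reverse=True)
        return [m["text"] for m in ranked[:k]]

    def _save(self):
        with open(self.path, "w") as f:
            json.dump({"medium": self.medium_term, "long": self.long_term}, f)
```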
Web searches are delegated to Perplexity's API. Once the tool call completes and Perplexity returns the results, that context is injected into the conversational chain, and the LLM responds as if it had found the information by doing the search itself. This mirrors how a human works: we delegate our searches to Google, take in the information, and then respond with it. The LLM follows a similar pattern here.
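A hedged sketch of that delegation step, using Perplexity's OpenAI-compatible chat completions endpoint; the model name `sonar` and the injection format are assumptions, not the plugin's exact values:

```python
import os
from openai import OpenAI

perplexity = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],
    base_url="https://api.perplexity.ai",
)


def web_search(query: str) -> str:
    """Run the search through Perplexity and return its answer as plain text."""
    resp = perplexity.chat.completions.create(
        model="sonar",
        messages=[{"role": "user", "content": query}],
    )
    return resp.choices[0].message.content


def inject_search_result(history: list[dict], query: str) -> list[dict]:
    # The search result is appended to the conversational chain so the main
    # LLM can answer as if it had performed the search itself.
    result = web_search(query)
    return history + [{"role": "system",
                       "content": f"Web search results for '{query}':\n{result}"}]
```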
Multimodal image processing is handled by OpenAI. Images are simply retrieved from the received Discord message and uploaded along with the prompt.
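A small sketch of that path, assuming the image arrives as a Discord attachment URL and is passed to an OpenAI vision-capable model (the model choice and prompt are illustrative):

```python
from openai import OpenAI

client = OpenAI()


def describe_image(image_url: str, prompt: str = "Describe this image.") -> str:
    """Send the attachment URL to a vision-capable model and return its reply."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return resp.choices[0].message.content
```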
The LLM can also choose to add important memories into its context. For example, if it determines that your name or your favorite color is important, it will make the appropriate tool call.
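As a rough illustration, such a tool could be declared with OpenAI function calling along these lines; the tool name and schema here are assumptions, not the plugin's actual definition:

```python
# Declares a tool the model can call when it decides a detail is worth keeping.
save_memory_tool = {
    "type": "function",
    "function": {
        "name": "save_important_memory",
        "description": "Store a fact about the user that should persist, "
                       "such as their name or favorite color.",
        "parameters": {
            "type": "object",
            "properties": {
                "fact": {"type": "string", "description": "The fact to remember."},
            },
            "required": ["fact"],
        },
    },
}

# Passed alongside the conversation; when the model emits this tool call, the
# plugin handles it by writing the fact to local storage for future context.
# client.chat.completions.create(model="gpt-4o", messages=history,
#                                tools=[save_memory_tool])
```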
For transparency, messages can also be watermarked so that other users know the responder is currently using AI. The code can be edited to support other LLMs; by default, only OpenAI models are supported.
Memory management and storage are done locally, but embedding calls and message processing go through OpenAI.
This plugin was made for educational purposes. I do not personally condone breaking any terms of service.
Overall, this project can be used as a standalone AI agent that acts like a real person, or it can take over your conversations for you when you need to step away.
This project is open source and has been in development since November 2024, starting with a small prototype.