My previous blog marked the kickoff of a series about Extreme AI Expert, which launched in July 2024 as a phased technical preview for Extreme employees and partners. AI Expert enables IT teams to identify and resolve issues faster so they can run, optimize, and monitor the network more easily. The tool leverages GenAI to analyze our extensive database of technical documents and provide precise, tailored responses to network-related issues.
The launch of AI Expert also marked the beginning of an exciting journey to rigorously assess the tool's effectiveness and efficiency in meeting user needs. With no established blueprint for how users would interact with the AI systems we are building, our aim has been to gather insights from a growing user base, which we expect to reach thousands as the rollout continues. From the outset, we established a set of metrics and KPIs that we wanted to hit: overall adoption (the number of users with activity versus the entire enabled audience), intensity of usage (weekly usage, number of interactions and questions), quality of interaction (implicit and explicit feedback), and, for the interactions themselves, the accuracy, completeness, and clarity of responses as well as their harmlessness.
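To make these KPIs concrete, here is a minimal sketch of how they could be computed from an interaction log. The record schema, field names, and figures are illustrative assumptions for this post, not how we actually instrument AI Expert.

```python
from collections import defaultdict
from datetime import date

# Illustrative interaction log; the schema and sample data are assumptions,
# not AI Expert's actual data model.
interactions = [
    {"user": "alice", "day": date(2024, 7, 1), "feedback": "positive"},
    {"user": "bob", "day": date(2024, 7, 3), "feedback": None},
    {"user": "alice", "day": date(2024, 7, 8), "feedback": "neutral"},
]
enabled_audience = 1000  # total users given access (hypothetical figure)

# Overall adoption: users with any activity vs. the entire enabled audience.
active_users = {i["user"] for i in interactions}
adoption = len(active_users) / enabled_audience

# Intensity of usage: interactions bucketed per ISO week.
weekly = defaultdict(int)
for i in interactions:
    year, week, _ = i["day"].isocalendar()
    weekly[(year, week)] += 1

# Quality of interaction: share of interactions with explicit feedback.
feedback_rate = sum(1 for i in interactions if i["feedback"]) / len(interactions)

print(f"adoption={adoption:.1%} feedback_rate={feedback_rate:.1%} weeks={dict(weekly)}")
```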
In the first month of the technical preview, we introduced a powerful conversational interface that transforms the user experience for knowledge acquisition, delivering hyper-personalized responses to any network question asked. Additionally, the system helps guide the user toward the most favorable outcome in any given interaction. Gone are the days of sifting through endless documents or relying on keyword searches. After adding hundreds of thousands of topics to the knowledge base, users can simply start a conversation and get tailored, precise, and actionable answers from our vast repository of technical documentation.
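For readers curious what sits behind a conversational answer like this, here is a deliberately tiny sketch of the retrieval-augmented pattern such systems typically follow: score documents against the question, keep the best few, and hand them to a generative model as context. The corpus, the naive overlap scoring, and the prompt format are all invented for illustration and say nothing about AI Expert's internals.

```python
# Tiny retrieval-augmented sketch: rank documents against a question and build
# a context-grounded prompt. Everything here is illustrative, not our pipeline.
corpus = {
    "vlan-config": "To configure a VLAN create the VLAN first then assign ports to it",
    "ospf-basics": "OSPF is a link-state routing protocol enabled per area",
}

def overlap(question: str, doc: str) -> int:
    """Naive relevance score: number of shared lowercase tokens."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def build_prompt(question: str, top_k: int = 1) -> str:
    ranked = sorted(corpus.values(), key=lambda d: overlap(question, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    # A real system would send this prompt to a GenAI model for the final answer.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("how do I configure a vlan"))
```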
One of our initial challenges was determining how to encourage user engagement without biasing behavior or usage patterns. We considered prompting users to test specific topics, but quickly realized that this wouldn't reflect how the system would be used in real-world scenarios, so we abandoned the idea. Ultimately, we adopted a gamification strategy to drive engagement while preserving the authenticity of user interactions. While many users were motivated by the system's new capabilities alone, offering daily and monthly participation rewards, such as t-shirts and other fun prizes, added an extra layer of excitement. Additionally, inter-regional competition and a leaderboard highlighting power users (those who not only use the system but also provide feedback after each interaction) resulted in consistently high levels of activity during the first few weeks. This engagement was reinforced by a marketing communication plan that kept the organization and participants updated with the latest news and competition winners.
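As a toy illustration of how a leaderboard like this might weight feedback above raw usage, consider the sketch below; the point values and event format are entirely hypothetical.

```python
from collections import Counter

USE_POINTS = 1       # awarded per interaction (hypothetical value)
FEEDBACK_POINTS = 3  # bonus for rating the response afterwards (hypothetical)

# (user, provided_feedback) events; invented sample data.
events = [("alice", True), ("bob", False), ("alice", True), ("bob", True)]

scores = Counter()
for user, gave_feedback in events:
    scores[user] += USE_POINTS + (FEEDBACK_POINTS if gave_feedback else 0)

# Power users, those who engage and give feedback, naturally rise to the top.
for user, points in scores.most_common():
    print(user, points)
```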
From the very first day of the preview, we saw substantial participation, with users eager to explore the system's capabilities. A remarkable 87% of the enabled audience engaged with the system during the first month, exceeding our goal of 75%. This initial surge in usage was encouraging, but what truly stood out was the consistent engagement we observed in the following weeks. About 50% of users were classified as active, using the system at least once a week, which met our expectations. On weekdays, roughly 8% of active users engaged with the system on any given day. This steady interaction reflects not just curiosity but, in our view, a growing reliance on the tool in daily work as users become increasingly comfortable and confident in its ability to provide accurate, timely information, even though we are still in the technical preview stage.
A primary focus of the first month was to gather and evaluate comprehensive feedback on the accuracy and quality of the responses. Our goal was to receive explicit feedback on at least 10% of all interactions; we exceeded it significantly at 16%. We believe the gamification contributed to this success, since users scored higher when they not only engaged but also provided feedback. While the initial results showed a high level of accuracy and satisfaction, with 92.2% of all interactions rated neutral or positive, we understand that reaching our goal of 98%+ will require ongoing improvement, especially with a generative AI system, where responses evolve and improve through iterative interaction and continuous feedback.
As seen in the figure below, to ensure we captured these insights effectively, we implemented detailed rating mechanisms that empower users to assess the quality of each response. These ratings go beyond simple satisfaction scores; they provide structured, actionable feedback that highlights specific strengths and weaknesses in the output. By leveraging this feedback, we can make targeted improvements: fine-tuning the system's algorithms, such as how documents are retrieved and ranked, enhancing its contextual understanding, and ultimately boosting its overall accuracy. This iterative process is essential for building a system that not only meets but exceeds user expectations as it evolves over time.
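As one way to picture what structured, actionable feedback can look like as data, here is a minimal sketch. The rating dimensions mirror the criteria mentioned earlier (accuracy, completeness, clarity, harmlessness), but the schema itself is an assumption, not our actual rating model.

```python
from dataclasses import dataclass

@dataclass
class ResponseRating:
    """Per-response feedback; the dimensions are assumed for this sketch."""
    interaction_id: str
    accurate: bool
    complete: bool
    clear: bool
    harmful: bool
    comment: str = ""

ratings = [
    ResponseRating("q-101", accurate=True, complete=False, clear=True,
                   harmful=False, comment="Missing CLI steps for stack members."),
    ResponseRating("q-102", accurate=True, complete=True, clear=True, harmful=False),
]

# Aggregating specific weaknesses tells us where to tune retrieval and ranking.
incomplete = [r.interaction_id for r in ratings if not r.complete]
print("responses flagged incomplete:", incomplete)
```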
Our feedback collection wasn’t just about validating correct answers. Equally important was identifying instances where the responses were unclear, incomplete, incorrect, or even harmful. This type of feedback is invaluable, as it provides us with real-world insights into how the system performs under a wide variety of user inquiries and scenarios. With this feedback, we can pinpoint areas where the system might misinterpret queries, lack contextual understanding, or produce answers that need further clarification.
While technical documentation has provided a strong foundation, it is only the beginning. The true power of the system is being unlocked in the next phase of the preview, which has begun to incorporate operational data. By integrating data from real-time network environments into the knowledge base, we are significantly enhancing the system's ability to deliver context-specific recommendations that go beyond static technical manuals.
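To sketch what going beyond static manuals might mean in practice, the snippet below attaches a compact operational summary to a question before it is answered; the telemetry snapshot and its fields are invented purely for illustration.

```python
# Invented telemetry snapshot; field names are illustrative only.
telemetry = {"device": "switch-17", "port_errors": {"1/12": 342}, "uptime_days": 97}

def contextualize(question: str, snapshot: dict) -> str:
    """Prepend live operational facts so the answer can be environment-specific."""
    facts = "; ".join(f"{key}={value}" for key, value in snapshot.items())
    return f"[environment: {facts}]\n{question}"

print(contextualize("Why is port 1/12 reporting errors?", telemetry))
```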
As I write this blog, we have already surpassed the thousand-user mark and are expanding further, incorporating a broader variety of users into the process. This growing, diverse audience will provide even more valuable insights as we continue to refine and enhance the system. Additionally, we expect expanded capabilities and knowledge to keep participation at consistently high levels.
Let’s see whether our predictions become reality and what lessons we learn and can share on the journey we embarked upon back in July.