Five months after performing a test that put the smart speakers of multiple companies in the spotlight to determine how well they performed in various categories, Loup Ventures is back today with an IQ test focused entirely on digital AI assistants. To get the necessary results, the researchers asked Siri, Google Assistant, Alexa, and Cortana 800 questions each on a smartphone, and compared their findings to a previous AI test held in April 2017.
For Siri in the new test, Apple’s AI helper understood 99 percent of the queries and answered 78.5 percent of them correctly. That’s an improvement on a similar AI-focused test from April 2017 (66.1 percent of 800 questions answered correctly). While Loup Ventures looked at similar methodologies when testing smart speakers in February, the researchers explain that it’s “not worthwhile to compare” the results across these tests since “the use cases differ greatly between digital assistants and smart speakers.”
This is particularly true for Siri on HomePod, which performs well in certain areas but is largely limited in the number of actions it can perform on the speaker itself. This relegated Apple’s HomePod to the “bottom of the totem pole” in an AI assistant performance test during Loup Ventures’ smart speaker research in February, with Siri answering 52.3 percent of 782 total questions correctly across the same five categories as the new test.
Loup Ventures grades each digital assistant on two metrics: “Did it understand what was being asked?” and “Did it deliver a correct response?” Questions came from five categories: Local (example: “Where is the nearest coffee shop?”), Commerce (“Can you order me more paper towels?”), Navigation (“How do I get to uptown on the bus?”), Information (“Who do the Twins play tonight?”), and Command (“Remind me to call Steve at 2pm today”).
Questions were asked of Siri on an iPhone running iOS 11.4, Google Assistant on a Pixel XL, Alexa on the iOS app, and Cortana on the iOS app. Siri’s best category was Command (90 percent of questions answered correctly), outperforming all rivals when asked to control aspects of the iPhone, smart home products, Apple Music, and more. Following Command, Siri performed well in Local (87 percent) and Navigation (83 percent), then dipped in Information (70 percent) and Commerce (60 percent).
“Google Assistant has the edge in every category except Command. Siri’s lead over the Assistant in this category is odd, given they are both baked into the OS of the phone rather than living on a 3rd party app (as Cortana and Alexa do). We found Siri to be slightly more helpful and versatile (responding to more flexible language) in controlling your phone, smart home, music, etc. Our question set also includes a fair amount of music-related queries (the most common action for smart speakers). Apple, true to its roots, has ensured that Siri is capable with music on both mobile devices and smart speakers.”
Google Assistant was the top digital assistant in all categories except Command, with Loup Ventures particularly liking Google’s “featured snippets” feature, which reads off search results for voice queries and is often “exactly what you’re looking for.” Both Alexa and Cortana were lesser performers in the test because the iOS app for each limits what the assistant can do on an iPhone, unlike Siri, which can perform tasks all over iOS and not just in one app.
In total, Google Assistant answered 85.5 percent of the 800 questions asked correctly and understood all of them, compared to Siri’s 78.5 percent answered correctly and 11 misunderstood. Alexa correctly answered 61.4 percent and misunderstood 13, while Cortana was the “laggard” and correctly answered 52.4 percent and misunderstood 19.
Over the 15-month period since April 2017, Siri improved by 13 percentage points, with Loup Ventures pointing out that it was “impressed with the speed at which the technology is advancing” for most of the assistants. The researchers went on to explain that many of the issues they had last year were erased by “improvements to natural language processing and inter-device connectivity.”
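For readers who want to cross-check the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the approximate counts of correct answers are derived from the reported percentages rather than published by Loup Ventures.

```python
TOTAL_QUESTIONS = 800  # questions asked of each assistant in the new test

# Figures quoted in the article:
# (percent answered correctly, number of questions misunderstood)
RESULTS = {
    "Google Assistant": (85.5, 0),
    "Siri": (78.5, 11),
    "Alexa": (61.4, 13),
    "Cortana": (52.4, 19),
}


def understood_percent(misunderstood, total=TOTAL_QUESTIONS):
    """Share of questions understood, as a percentage."""
    return 100 * (total - misunderstood) / total


def approx_correct_count(percent_correct, total=TOTAL_QUESTIONS):
    """Approximate number of correct answers implied by a percentage (derived, not reported)."""
    return round(percent_correct / 100 * total)


def point_change(old_percent, new_percent):
    """Change between two percentages, in percentage points."""
    return round(new_percent - old_percent, 1)


for name, (pct_correct, misunderstood) in RESULTS.items():
    print(
        f"{name}: understood {understood_percent(misunderstood):.1f}%, "
        f"about {approx_correct_count(pct_correct)} of {TOTAL_QUESTIONS} correct"
    )

# Siri's change between the April 2017 test (66.1) and the new test (78.5)
print("Siri change:", point_change(66.1, 78.5), "percentage points")
```

By this arithmetic, Siri’s 11 misunderstood questions work out to roughly 98.6 percent understood, in line with the 99 percent reported, and the gap between the two quoted Siri scores is 12.4 percentage points, close to the 13 points cited.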