Apple is all set to host its ‘Let Loose’ event on Tuesday, May 7, where it is expected to introduce new iPad Air and iPad Pro models and a new Apple Pencil. However, the Worldwide Developers Conference (WWDC) 2024, which begins on June 10, could be the event to look out for, as it could change the company’s approach to its devices, particularly the iPhone. The Cupertino-based tech giant is said to be planning to unveil its artificial intelligence (AI) strategy and introduce new features with iOS 18. Papers published by Apple researchers offer a glimpse of the company’s vision behind that strategy.
A report by The Verge examined the research papers Apple has published recently and highlighted that the company is focused on creating a smarter and more efficient Siri, the virtual assistant on the iPhone. While that is one big possibility, there are enough clues to suggest that the AI features could entirely reshape the way users interact with the iPhone. It is unlikely that all of the new capabilities will arrive at once; the changes could instead be introduced incrementally over the next few years.
Most of Apple’s published research papers focus on small language models (SLMs) that can run entirely on a device. For example, the company published a paper on an AI model dubbed ReALM, which is short for Reference Resolution As Language Modeling. The paper describes the model as being able to carry out tasks that are prompted using contextual language. That description has led to the belief that the model could be used to upgrade Siri.
Another research paper describes Ferret-UI, a multimodal AI model that is “designed to execute precise referring and grounding tasks specific to UI screens, while adeptly interpreting and acting upon open-ended language instructions.” In essence, it can read your screen and perform actions on any interface, be it the Home Screen or an app. This could make it much more intuitive to use an iPhone with verbal commands rather than finger gestures.
Then there is Keyframer, which is said to generate animations from static images, and another AI model that can edit images. These capabilities could significantly improve the Photos app and allow users to perform complex edits in simple steps, similar to what DALL-E and Midjourney offer.
However, it should be noted that these speculations are based on research papers published by Apple, and there is no guarantee that any of this work will be turned into a feature. Apple’s vision for AI will become clearer after the keynote session at WWDC 2024.