This Week In AI: (February 10th - 16th 2025)

This week in AI, we saw Apple and Meta explore further into humanoid and anthropomorphic robots, the Airbnb CEO dismiss AI being used for trip planning/recommendations, and OpenAI add file support to the o1 and o3 models. In research, discriminative models were repurposed for image synthesis to save massive amounts of computation, LLMs reasoned about and improved existing algorithms, and researchers from MetaFAIR explored whether transformers can plan!
Meta Focuses on Humanoid Robots
Meta is making significant investments and creating a new team within their Reality Labs unit focusing on humanoid robots. The team will focus on developing robots capable of assisting with physical tasks, particularly household chores. This initiative aims to leverage Meta's existing AI technologies, notably its Llama AI models, to enhance the functionality of these humanoid robots. Leading this new robotics division is Marc Whitten, the former CEO of Cruise, who brings extensive experience in autonomous systems. Meta's Chief Technology Officer, Andrew Bosworth, emphasized that the company's advancements in AI and augmented reality are complementary to the developments needed for robotics, suggesting a strategic integration of their existing technologies into this new venture.
Airbnb Not Ready for AI
Airbnb CEO Brian Chesky has expressed a cautious approach towards integrating AI into trip planning services. During the company's fourth-quarter earnings call on February 13, 2025, Chesky stated that while AI holds significant potential to transform the travel industry, the technology is "still in its early days" and "not quite ready for prime time". Instead, Airbnb plans to implement AI to enhance its customer support system. Later this year, the company intends to roll out AI-powered customer service capable of offering multilingual support and efficiently handling the vast number of customer contacts it manages annually.
AI has long played a pivotal role in customer service, streamlining operations through chatbots, virtual assistants, and automated support systems that enhance response times and efficiency. Airbnb itself has leveraged AI-driven tools to manage bookings, answer customer queries, and even detect fraudulent activity. Given this successful track record, it's surprising to see skepticism about AI's potential in trip planning. AI-powered tools like Google's Bard (now Gemini) and ChatGPT have already demonstrated their ability to curate personalized itineraries, suggest accommodations, and optimize travel routes based on user preferences. With advancements in large language models and real-time data integration, AI is not just a promising tool for the future of travel—it's already proving its worth.
ChatGPT o1 Adds File Support
In smaller news from this week, OpenAI has added file support to the o1 and o3 models. Previously, the 4o model had been the most powerful model that could analyze documents directly, so users will now have access to models that can perform complex reasoning on their files too!
Apple's New Pixar-style Robot
Apple's AI research team has unveiled a new project featuring a robot reminiscent of Pixar's iconic character (seen in the GIF below). This initiative explores how expressive movements in non-humanoid robots can enhance user engagement and interaction. The robot lamp responds to voice commands and gestures with lifelike behaviors, such as dancing to music and displaying emotions, aiming to make technology more relatable and functional in everyday settings.

Apple has long had a knack for making technology feel more human, a philosophy deeply embedded in its design ethos since the days of Steve Jobs. From the skeuomorphic interfaces of early iPhones, where digital objects mimicked their real-world counterparts, to the friendly, conversational tone of Siri, Apple has consistently blurred the line between machine and personality. This tradition continues today with its new Pixar-like robotic lamp, which moves expressively, responding to gestures and voice commands with almost lifelike behavior. While Apple has never ventured fully into humanoid robotics, its approach to AI and automation has always leaned toward making technology more intuitive, personal, and emotionally engaging—suggesting that the line between assistant and companion may continue to blur in the years ahead.
Direct Ascent Synthesis
A new paper argues that standard discriminative models, like CLIP, inherently contain rich generative capabilities. While discriminative models effectively map inputs to high-dimensional representations, inverting this process to generate meaningful images often results in adversarial noise rather than coherent visuals.

To tackle this, the authors propose Direct Ascent Synthesis (DAS), a novel optimization approach that leverages multi-resolution decomposition. Instead of directly optimizing pixels, DAS breaks down image generation into multiple spatial scales, guiding the synthesis process away from adversarial artifacts and towards natural-looking images. This method does not require additional generative training and enables high-quality image synthesis, style transfer, and reconstruction with only a fraction of the computational resources needed by traditional generative models.
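To make the multi-resolution idea concrete, here is a minimal toy sketch of that parameterization. To keep it self-contained, a fixed random linear map with orthonormal rows stands in for a real discriminative encoder like CLIP, and a simple squared-error loss on the embedding stands in for the paper's similarity objective—so this illustrates only the "optimize components at several spatial scales instead of raw pixels" idea, not DAS itself:

```python
import numpy as np

def upsample(x, f):
    # nearest-neighbor upsampling by an integer factor f
    return np.repeat(np.repeat(x, f, axis=0), f, axis=1)

def downsample_mean(g, f):
    # average-pool the image gradient back down to a coarse component
    h, w = g.shape[0] // f, g.shape[1] // f
    return g.reshape(h, f, w, f).mean(axis=(1, 3))

rng = np.random.default_rng(0)
size = 16

# Stand-in "discriminative model": a fixed linear map with orthonormal rows.
W = np.linalg.qr(rng.standard_normal((size * size, 32)))[0].T

def encode(img):
    return W @ img.ravel()

target = encode(rng.standard_normal((size, size)))  # embedding to match

# Multi-resolution parameterization: the image is a sum of components,
# one per spatial scale, rather than a single grid of free pixels.
factors = [1, 2, 4, 8]
components = [np.zeros((size // f, size // f)) for f in factors]

lr = 0.2
for _ in range(300):
    img = sum(upsample(c, f) for c, f in zip(components, factors))
    grad_img = (W.T @ (encode(img) - target)).reshape(size, size)
    for c, f in zip(components, factors):
        c -= lr * downsample_mean(grad_img, f)  # update every scale at once

img = sum(upsample(c, f) for c, f in zip(components, factors))
final_err = np.linalg.norm(encode(img) - target)
```

Because every scale is updated jointly, coarse components soak up low-frequency structure while the fine component handles detail—the intuition behind steering the optimization away from pixel-level adversarial noise.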
Improving Algorithms with LLMs
The paper "Improving Existing Optimization Algorithms with LLMs" explores how Large Language Models (LLMs), specifically GPT-4o, can enhance existing combinatorial optimization algorithms by suggesting novel heuristics and implementation improvements. The authors focus on the Construct, Merge, Solve & Adapt (CMSA) algorithm, a hybrid metaheuristic used for solving the Maximum Independent Set (MIS) problem, an NP-hard problem in graph theory. See below for a high-level overview of the chatbot and how it is used.

The proposed approach involves using LLMs to analyze, modify, and enhance CMSA's heuristics and C++ implementation. By iteratively engaging with the model, researchers discovered that LLMs could generate new heuristics—such as leveraging the "age" parameter to guide solution construction—and provide optimized C++ code to improve efficiency. Experimental results show that the LLM-generated heuristics outperform expert-designed ones, especially on larger and more complex graph instances. This study highlights LLMs' potential as research assistants, enabling experts to refine existing algorithms and discover novel optimization strategies with minimal manual effort. More than anything, I think this highlights modern LLMs' ability to reason about code and tasks when prompted correctly.
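For context on the kind of constructive heuristic being improved, here is a tiny sketch of a greedy MIS builder. The minimum-degree rule is a classic textbook heuristic, and the age-based tie-break is my own illustrative stand-in for CMSA-style "age" bookkeeping—neither is the paper's or the LLM's actual heuristic:

```python
def greedy_mis(adj, age=None):
    """Greedily build an independent set: repeatedly pick an eligible
    vertex of minimum remaining degree, then delete it and its
    neighbors. `age` optionally breaks ties toward vertices that have
    been absent from recent solutions (an illustrative, CMSA-flavored
    bias, not the paper's exact rule)."""
    age = age or {v: 0 for v in adj}
    remaining = set(adj)
    solution = set()
    while remaining:
        v = min(
            remaining,
            key=lambda u: (sum(1 for w in adj[u] if w in remaining), -age[u]),
        )
        solution.add(v)
        remaining.discard(v)
        remaining -= set(adj[v])  # neighbors can no longer be chosen
    return solution

# 5-cycle: the maximum independent set has size 2
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
s = greedy_mis(adj)  # an independent set of size 2
```

An LLM-in-the-loop workflow like the paper's would repeatedly propose edits to a scoring rule like the one in `min(...)`, benchmark the result, and keep whichever variant wins.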
Transformers for Planning?
This new paper from MetaFAIR labs investigates whether decoder-only transformers can truly reason and plan or if they are just pattern-matching. To explore this, the authors train transformer models from scratch to predict the shortest paths in simple, connected graphs and then analyze the internal representations and attention mechanisms within them.
The key finding is that two-layer transformers can successfully learn to predict shortest paths in graphs with up to 10 nodes. Instead of traditional graph algorithms like Dijkstra’s or Bellman-Ford, the models learn a spectral embedding of the line graph, which correlates with the graph’s spectral decomposition. This insight leads to the discovery of a novel approximate shortest-path algorithm called Spectral Line Navigation (SLN), which selects the next step in a path based on spectral embeddings rather than direct distance calculations.
The results show that the transformers achieve 99.42% accuracy in predicting shortest paths, and the SLN algorithm achieves 99.32% accuracy, suggesting that transformers can discover novel algorithmic solutions beyond brute-force memorization. This work not only contributes to understanding how transformers process graph-based reasoning but also highlights their potential to uncover new, interpretable heuristics for complex computational problems.
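To give a rough feel for spectral navigation, here is a loose sketch in the same spirit: embed nodes via the Laplacian eigendecomposition (a commute-time-style embedding) and greedily step toward the neighbor whose embedding is closest to the destination's. Note the paper's SLN operates on a spectral embedding of the line graph, so this node-graph version is a simplified stand-in, not the authors' algorithm:

```python
import numpy as np

def spectral_embedding(adj):
    """Commute-time-style embedding from the graph Laplacian's
    eigendecomposition (one simple choice of spectral embedding)."""
    n = len(adj)
    A = np.zeros((n, n))
    for u, nbrs in adj.items():
        for v in nbrs:
            A[u, v] = 1.0
    L = np.diag(A.sum(axis=1)) - A
    w, V = np.linalg.eigh(L)          # eigenvalues in ascending order
    return V[:, 1:] / np.sqrt(w[1:])  # drop the trivial zero eigenvalue

def spectral_route(adj, src, dst, max_steps=50):
    """Greedy walk: at each node, step to the neighbor whose embedding
    is closest to the destination's embedding."""
    emb = spectral_embedding(adj)
    path = [src]
    while path[-1] != dst and len(path) <= max_steps:
        cur = path[-1]
        nxt = min(adj[cur], key=lambda u: np.linalg.norm(emb[u] - emb[dst]))
        path.append(nxt)
    return path

# path graph 0-1-2-3-4: the greedy spectral walk recovers the shortest path
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
route = spectral_route(adj, 0, 4)
```

Squared distances in this embedding equal effective resistances, so on simple graphs like the path above the greedy step always moves toward the target—hinting at why a learned spectral representation can approximate shortest paths without running Dijkstra-style search.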