Cross-posted from Bounded Regret.
To understand neural networks, researchers often use similarity metrics to measure how similar or different two networks are. For instance, such metrics have been used to compare vision transformers to convnets, to understand transfer learning, and to explain the success of standard training practices for deep models. Below is an example visualization using the popular CKA similarity metric to compare two transformer models across different layers:
Figure 1. CKA (Centered Kernel Alignment) similarity between two networks trained identically except for random initialization. Lower values (darker colors) are more similar. CKA suggests that the two networks have similar representations.
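To make the metric in the figure concrete, here is a minimal sketch of the linear variant of CKA, assuming each network's activations are stacked into an (examples × features) matrix; the figure itself may well have been produced with a kernel or minibatch variant, so treat this as illustrative rather than the authors' exact implementation:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices of shape (n_examples, n_features).

    Columns are mean-centered first; the result lies in [0, 1], with 1 meaning
    the representations are identical up to orthogonal transforms and scaling.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Squared Frobenius norm of the cross-covariance, normalized by
    # the self-covariance norms of each representation.
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))          # activations of layer i, network 1
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # a random orthogonal transform

print(linear_cka(X, X))       # identical representations score 1.0
print(linear_cka(X, X @ Q))   # invariant to orthogonal transforms: still 1.0
```

Computing this score for every pair of layers across the two networks yields a similarity grid like the one in Figure 1.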
Many experimental works have observed that generalization in deep RL appears to be difficult: although RL agents can learn to perform very complex tasks, they don't seem to generalize across diverse task distributions as well as the excellent generalization of supervised deep nets might lead us to expect. In this blog post, we will aim to explain why generalization in RL is fundamentally harder, not just in practice but even in theory.
We will show that attempting to generalize in RL induces implicit partial observability, even when the problem we are trying to solve is a standard fully observed MDP. This induced partial observability can significantly complicate the types of policies needed to generalize well, requiring counterintuitive strategies such as information-gathering actions, recurrent non-Markovian behavior, or randomized strategies. None of these is necessary in a fully observed MDP, yet all of them can become necessary once we aim to generalize from a finite training set, even when every training task is itself fully observed. This blog post will walk through why partial observability can implicitly arise, what it means for the generalization performance of RL algorithms, and how methods can account for partial observability to generalize well.
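The value of information-gathering under this induced partial observability can be seen in a tiny toy model. Suppose that after training on a finite set of tasks, the agent is left with a 50/50 epistemic belief over two possible worlds, and that it can either commit to a guess or pay a small cost to resolve its uncertainty first. All of the actions, rewards, and the peek cost below are hypothetical numbers chosen for illustration, not anything from the post:

```python
def value(policy, peek_cost=0.1):
    """Expected return of a policy under a 50/50 epistemic belief over two worlds.

    - 'guess_a' pays +1 in world A and -1 in world B (committed Markovian policy).
    - 'peek_then_guess' pays a small cost to observe which world it is in,
      then guesses correctly (an information-gathering policy).
    """
    returns = []
    for world in ("A", "B"):
        if policy == "guess_a":
            returns.append(1.0 if world == "A" else -1.0)
        elif policy == "peek_then_guess":
            # Peeking reveals the world, so the subsequent guess is always right.
            returns.append(1.0 - peek_cost)
        else:
            raise ValueError(policy)
    return sum(returns) / len(returns)

print(value("guess_a"))          # 0.0: committing blindly averages out to nothing
print(value("peek_then_guess"))  # 0.9: gathering information first wins
```

In each individual world the peek action is wasted effort, which is exactly the point: the information-gathering behavior is suboptimal in every fully observed training MDP, but optimal under the epistemic uncertainty that generalization induces.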