Artificial Intelligence

Unsolved ML Safety Problems

Along with researchers from Google Brain and OpenAI, we are releasing a paper on Unsolved Problems in ML Safety. In response to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address. As a preview of the paper, this post considers a subset of the paper’s directions: withstanding hazards (“Robustness”), identifying hazards (“Monitoring”), and steering ML systems (“Alignment”).

Robustness research aims to build systems that are less vulnerable to extreme hazards and to adversarial threats. Two open problems in this area are robustness to long-tail events and robustness to adversarial examples.
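To make the adversarial-examples problem concrete, here is a minimal sketch of the fast gradient sign method, which perturbs an input in the direction that increases a classifier's loss. This is a generic illustration rather than anything from the paper; `model`, `image`, `label`, and the `epsilon` budget are assumed placeholders for a trained PyTorch classifier and a batched, correctly classified input.

```python
# Minimal FGSM sketch (illustrative only, not from the paper).
# Assumes: `model` is a trained PyTorch classifier, `image` is a batched
# input tensor with values in [0, 1], and `label` holds the class indices.
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.03):
    """Return an adversarially perturbed copy of `image`."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel by epsilon in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

A robust classifier should keep roughly the same prediction on the perturbed input as on the original one.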

Long Tails

Examples of long tail events. Top row, left: an ambulance in front of a green light. Top row, middle: birds on the road. Top row, right: a reflection of a pedestrian. Bottom row, left: a group of people cosplaying. Bottom row, middle: a foggy road. Bottom row, right: a person partly occluded by a board on their back. (Source)


Distilling neural networks into wavelet models using interpretations

Fig 1. A wavelet adapting to new data.

Recent deep neural networks (DNNs) often achieve very high predictive accuracy, but at the cost of interpretability and computational efficiency. Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted or where interpretation is the goal itself. Moreover, interpretable models are concise, which often makes them computationally efficient as well.
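As a rough illustration of why a wavelet representation can be both concise and cheap to compute, the sketch below thresholds a multi-level wavelet decomposition with PyWavelets and reconstructs the signal from the few coefficients that remain. This is a generic example, not the adaptive wavelet distillation method developed in the post; the toy signal and the 5% keep-fraction are arbitrary assumptions.

```python
# Generic wavelet-compression sketch (illustrative only, not the post's method).
import numpy as np
import pywt

# Toy signal: a sine wave plus noise (arbitrary assumption for illustration).
signal = np.sin(np.linspace(0, 8 * np.pi, 1024)) + 0.1 * np.random.randn(1024)

# Multi-level discrete wavelet transform with Daubechies-4 wavelets.
coeffs = pywt.wavedec(signal, "db4", level=5)

# Keep only the largest 5% of coefficients (by magnitude); zero out the rest.
threshold = np.quantile(np.abs(np.concatenate(coeffs)), 0.95)
sparse_coeffs = [pywt.threshold(c, threshold, mode="hard") for c in coeffs]

# Reconstruct an approximation from the small, human-inspectable coefficient set.
reconstruction = pywt.waverec(sparse_coeffs, "db4")
```

The retained coefficients form a compact representation that a person can inspect directly, which is the kind of conciseness the paragraph above refers to.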
