08:00 08:30
Prof. Piotr Dudek

Introduction

Abstract

TBD

08:30 09:00
Prof. Kwabena Boahen

Retinomorphic Vision Sensors

Abstract

Event cameras achieve higher (effective) sampling rate and shorter latency than frame cameras. Events from a Dynamic Vision Sensor (DVS) report temporal contrast (changes in log-luminance), but coherent optical flow produces incoherent events. For instance, when a DVS camera views a cluttered scene from a moving platform (e.g., a drone), an abrupt edge triggers events with much shorter latency than a smooth edge, and the contrast required to trigger an event is much higher for the latter. Unlike DVS, a Retinomorphic Vision Sensor (RVS) takes the ratio between local and surrounding luminance to report spatial contrast, adapts locally to optical flow (speed) to preserve temporal coherence across space, and detects globally coherent activity to suppress background motion. In the past, incorporating this processing on the sensor (~40 transistors/pixel) sacrificed fill-factor and pixel count. Only recently did it become possible to combine efficient photo-transduction with dense mixed-signal processing by hybrid-bonding a Back-side Illuminated (BI) CMOS Image Sensor (CIS) wafer with a Mixed-Signal (MS) CMOS wafer fabricated in a finer process.
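
As a rough illustration of the two contrast measures named in this abstract, the sketch below models a single DVS-style pixel that emits an event whenever log-luminance moves by a fixed threshold, and a spatial-contrast measure taken as the ratio of a pixel to its surround. The function names, the threshold value, the one-dimensional two-neighbour surround, and the sample data are all assumptions made for illustration; none of this is the speaker's circuit or algorithm.

```python
import math


def dvs_events(samples, threshold=0.2):
    """Return (sample_index, polarity) events for one pixel.

    Assumed model: an event fires each time log-luminance departs from
    the stored reference by at least `threshold`; the reference then
    steps by one threshold in the direction of the change.
    """
    events = []
    ref = math.log(samples[0])
    for i, lum in enumerate(samples[1:], start=1):
        delta = math.log(lum) - ref
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append((i, polarity))
            ref += polarity * threshold
            delta = math.log(lum) - ref
    return events


def spatial_contrast(row, i):
    """Ratio of pixel i to the mean of its two neighbours (1-D surround, assumed)."""
    surround = (row[i - 1] + row[i + 1]) / 2.0
    return row[i] / surround


# The same total luminance change (1.0 to 2.0), seen as an abrupt or a smooth edge.
abrupt = [1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
smooth = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]

abrupt_events = dvs_events(abrupt)
smooth_events = dvs_events(smooth)

# Spatial contrast is a ratio, so a global illumination change cancels out.
dim = [1.0, 2.0, 1.0]
bright = [10.0, 20.0, 10.0]
```

In this toy model the abrupt edge fires all of its events at the first sample after the change, while the smooth edge fires later and spreads its events over several samples, which is the kind of timing mismatch the abstract describes. The ratio-based measure returns the same value for the dim and bright rows.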

09:00 09:30
Prof. Xuan 'Silvia' Zhang

Beyond Pixels: Co-Designing What On-Sensor Vision Emits, Hides, and Trades Off

Abstract

Modern edge vision systems are bottlenecked not by computation, but by the data the sensor must read out and transmit. Raw pixels are simultaneously an energy bottleneck, dominated by ADC and off-chip communication, and an information bottleneck for downstream tasks. The right sensor output is rarely an image; deciding what it should be is an algorithm–hardware co-design problem spanning optics, pixel circuits, and downstream models. This talk organizes our recent work into three threads. LeCA, BlissCam, and SnapPix push compression into the pixel array, turning video into information-rich coded outputs and cutting edge energy by up to an order of magnitude. PrivateEye and HoloCode use diffractive optics and metasurfaces to separate task-relevant from sensitive features before the photodiode, enabling near-zero-energy privacy preservation. CamJ and our quantitative modeling framework let us reason about energy, autonomy, and task accuracy as a single co-design problem.

09:30 10:00
Dr. Mika Laiho

On-Sensor Computer Vision: Heterogeneous Architectures for Low-Latency Perception

Abstract

On-sensor computing is emerging as a powerful approach to reducing latency, energy consumption, and off-chip data transfers in modern vision systems. By integrating sensing and computation on the same chip, it enables real-time perception directly at the source of data. However, this integration also introduces new design constraints, including limited flexibility, shared manufacturing technology, and tight area budgets, which require a fundamental rethinking of both hardware architectures and their coupling to vision algorithms. In this talk, I will explore heterogeneous on-sensor computer vision architectures, focusing on latency-critical applications such as collision avoidance in autonomous drones, robots, and vehicles. I will discuss how efficient hardware–algorithm co-design and careful dataflow planning can eliminate bottlenecks and reduce intermediate storage. As a case study, I will present a prototype heterogeneous on-sensor vision chip (RECER S1), which integrates a 640×480 pixel array with multiple specialized computing cores, including pitch-matched column-parallel processing units, a 2D cellular neural network, associative memory, and an embedded RISC-V processor. I will explain how its key architectural choices and dataflow strategy enable low-latency processing directly on the sensor.

10:00 11:15
11:15 11:45
Prof. Gordon Wetzstein

Towards Agentic Computational Photography

Abstract

Neural networks and advanced image processing algorithms excel in a wide variety of computer vision applications, but their high performance often comes at a steep computational and bandwidth cost. In this talk, we explore a shift from passive capture to agentic computational photography, a paradigm in which imaging systems dynamically adapt their acquisition strategy to the task at hand. We first discuss hybrid optical-digital co-design strategies that outsource intensive computations to the optical domain, enabling processing at the speed of light with minimal power. Building on this foundation, we introduce task-aware foveated imaging systems that treat sensor acquisition as a learned attention policy. By leveraging dual-stream architectures and closing the perception-acquisition loop, these systems intelligently allocate bandwidth to critical regions of interest in real time. This convergence of optical computing and adaptive acquisition opens new frontiers for intelligent imaging systems capable of high-performance perception under strict power and latency constraints.

11:45 12:15
Dr. Mina Khoei

Speck: Where Sparse Sensing Meets Sparse Neural Processing

Abstract

This talk provides an overview of Speck, a smart vision sensor that integrates a Dynamic Vision Sensor (DVS) with a Spiking Neural Network (SNN) processor. Speck processes data in a fully asynchronous fashion, preserving the sparsity of recorded light changes through spike-based processing. As a result, it offers a low-power, low-latency solution built on spiking CNNs (sCNNs). The talk will cover the sensor’s architecture and highlight some implemented applications.

12:15 12:45
Dr. Richard Newcombe

Invited Speaker Talk

Abstract

TBD

12:45

Lunch