As the principal investigator for underwater human-robot teaming at MIT Lincoln Laboratory, Madeline Miller bridges the gap between biological intuition and mechanical precision. With a background in developing advanced undersea systems, she leads research that enables autonomous underwater vehicles (AUVs) to act as intelligent partners to human divers in the world’s most challenging environments. Her work focuses on high-stakes scenarios, such as repairing critical subsea infrastructure or clearing mines, where visibility is low and the cost of failure is high. By integrating purpose-built hardware with sophisticated algorithms, she is redefining how we secure the undersea domain.
This interview explores the logistical and technical hurdles of underwater collaboration, specifically addressing how humans and machines navigate together without GPS. We discuss the complexities of teaching AI to “see” through murky water using sonar, the ingenious ways researchers compress data for low-bandwidth acoustic modems, and the evolution of testing from surface boats to real-world dives. Finally, we look toward the future of maritime security and the critical role of human-machine teaming in protecting global infrastructure.
Robots excel at endurance and speed, while humans possess superior dexterity for tasks like repairing cables or deactivating mines. How do you decide which specific tasks to delegate to the AUV versus the diver, and what are the primary hurdles in synchronizing these two very different types of actors?
The division of labor is driven by the physical limitations of the underwater environment. We delegate the “search and survey” phase to the AUV because it can process complex computations and maintain high-speed mobility for hours without fatigue, whereas a human is limited by oxygen and physical exertion. For example, in a power cable repair mission, the AUV maps the line and pinpoints a fault, a task that would take a human far longer and yield less spatial accuracy. Once the fault is located, the human takes over because even the most advanced remotely operated vehicles lack the agility for skilled manipulation, such as deactivating a mine or splicing a cable. The primary hurdle in synchronization is the “blind” transition; once the diver is submerged, there is often no communication with the surface, making it incredibly difficult to coordinate motion. We have to plan missions with extreme care to ensure the two actors don’t collide, as they don’t yet truly “collaborate” in real time.
Underwater environments lack GPS and visible landmarks, often leaving divers dependent on compasses and fin-kick counts. When ocean currents increase the complexity of spatial tracking, what specific hardware adjustments or algorithmic optimizations are necessary to keep the diver and vehicle from losing each other?
When we moved from calm surface simulations to the open ocean, we realized that basic range-only navigation wasn’t enough. Initially, the vehicle only needed to calculate the distance to the diver at regular intervals to estimate positions, but real ocean forces cause that optimization problem to “blow up” or fail. To fix this, we developed a specialized “tube-let”—a tube-shaped prototype tablet—that the diver carries, equipped with a pressure sensor, a depth sensor, and an inertial measurement unit to track relative motion. These extra data points are fed into our navigation algorithms to stabilize the position estimates despite the currents. In our field tests near Portsmouth, New Hampshire, using research vessels like the Gulf Surveyor, we found that having this multi-sensor payload on the diver is the only way to maintain a lock on their position when the water column is pushing both the human and the AUV in different directions.
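The value of those extra sensors can be illustrated with a small sketch. This is not the Laboratory’s actual navigation code, just a hypothetical example of the underlying idea: a diver’s depth reading (from the pressure sensor) collapses each slant range measured by the AUV into a horizontal range, turning an ill-posed 3-D problem into a well-behaved 2-D trilateration that a linear least-squares solve handles cleanly.

```python
import numpy as np

def estimate_diver_xy(auv_positions, ranges, diver_depth):
    """Estimate the diver's horizontal (x, y) position.

    Hypothetical sketch: the AUV measures slant ranges to the diver
    from several known fixes; the diver's reported depth reduces each
    slant range to a horizontal range, so the circles intersect in 2-D.
    Depths are positive-down; units are meters.
    """
    auv = np.asarray(auv_positions, dtype=float)   # shape (N, 3): x, y, depth
    r = np.asarray(ranges, dtype=float)            # shape (N,): slant ranges
    # Remove the vertical component of each slant range
    dz = auv[:, 2] - diver_depth
    h = np.sqrt(np.maximum(r**2 - dz**2, 0.0))     # horizontal ranges
    # Subtract the first circle equation from the rest to linearize:
    # 2(xi - x0)x + 2(yi - y0)y = h0^2 - hi^2 + (xi^2 + yi^2) - (x0^2 + y0^2)
    A = 2.0 * (auv[1:, :2] - auv[0, :2])
    b = (h[0]**2 - h[1:]**2
         + np.sum(auv[1:, :2]**2, axis=1)
         - np.sum(auv[0, :2]**2))
    xy, *_ = np.linalg.lstsq(A, b, rcond=None)
    return xy
```

Without the depth term, the same range set admits a whole family of 3-D solutions, which is one way the pure range-only optimization can “blow up” when currents push both actors around.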
Sonar images lack color and detail, and AI often struggles with objects obscured by biological growth or structural damage. Since large sonar datasets are rare, how can optical sensor data be used to train sonar classifiers, and what role does human feedback play in refining these classifications during a mission?
The lack of labeled sonar data is a massive bottleneck, so we are pioneering “knowledge transfer” where optical classifiers help train sonar classifiers. Because optical cameras provide high-res color detail that sonar lacks, we use those clear images to teach the AI what an object, like a tire or a downed aircraft, looks like before it becomes obscured by mussels or structural damage. During the mission, we use a human-in-the-loop approach where the AI identifies a potential target and sends a bounding box to the diver. The diver can then verify the finding or tell the AI to “look over here” to improve its classification on the fly. This turns the diver into a real-time teacher, helping the AI navigate the visual “noise” of the deep ocean.
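The workflow described above, pseudo-labeling sonar data with an optical classifier and then letting the diver correct it, can be sketched in a few lines. This is an illustrative toy, not the Laboratory’s system: a nearest-centroid model stands in for the sonar classifier, and all class names and features are invented.

```python
import numpy as np

class SonarPseudoLabelTrainer:
    """Toy sketch of optical-to-sonar knowledge transfer.

    Paired optical/sonar captures of the same scene let a trusted
    optical classifier supply pseudo-labels for sonar feature vectors.
    A diver correction ("that contact is actually a tire") overrides
    a pseudo-label, and predictions immediately reflect the fix.
    """

    def __init__(self):
        self.feats, self.labels = [], []

    def add_pseudo_labels(self, sonar_feats, optical_labels):
        # Labels come from the optical classifier, not a human annotator
        self.feats.extend(sonar_feats)
        self.labels.extend(optical_labels)

    def diver_correction(self, index, true_label):
        # Human-in-the-loop override of a single pseudo-label
        self.labels[index] = true_label

    def predict(self, sonar_feat):
        X = np.asarray(self.feats, dtype=float)
        y = np.asarray(self.labels)
        centroids = {c: X[y == c].mean(axis=0) for c in set(y)}
        return min(centroids,
                   key=lambda c: np.linalg.norm(sonar_feat - centroids[c]))
```

The design point is that the diver never labels raw pixels; they confirm or reject whole detections, which is cheap to transmit and cheap to act on mid-mission.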
Sending uncompressed images via acoustic modems can take ten minutes or more due to low bandwidth and high latency. To facilitate real-time human-robot communication, what compression strategies or data-sharing protocols are most effective for ensuring the diver receives actionable information?
In the underwater domain, bandwidth is so scarce that we have to be extremely selective about what we transmit. State-of-the-art acoustic rates are so slow that a single raw image would take 10 minutes to arrive, which is useless for a diver in a fast-moving situation. We focus on compressing data into the “minimum amount to be useful,” often sending metadata or simplified bounding boxes rather than the full image. By using commercial off-the-shelf hardware with low power requirements, we prioritize transmission speed over aesthetic quality. The trade-off is clear: the diver doesn’t need a beautiful photo; they need a low-resolution “hint” or a coordinate that allows them to make a split-second decision.
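To make the “minimum amount to be useful” idea concrete, here is a hedged sketch of what a compact detection message might look like. The field layout is an assumption for illustration, not a real acoustic-modem protocol: one class ID, one confidence byte, and a bounding box quantized to a fixed 16-bit grid, for ten bytes total versus megabytes for a raw sonar frame.

```python
import struct

# Hypothetical wire format: class id (uint8), confidence in percent
# (uint8), and a bounding box quantized to a 0-65535 grid (4x uint16),
# big-endian -> 10 bytes per detection.
DETECTION_FMT = ">BBHHHH"

def pack_detection(cls_id, conf, x, y, w, h, img_w, img_h):
    """Quantize a pixel-space box to the 16-bit grid and pack it."""
    q = lambda v, s: min(65535, round(v * 65535 / s))
    return struct.pack(DETECTION_FMT, cls_id, round(conf * 100),
                       q(x, img_w), q(y, img_h), q(w, img_w), q(h, img_h))

def unpack_detection(payload, img_w, img_h):
    """Recover the approximate pixel-space box on the diver's tablet."""
    cls_id, conf, x, y, w, h = struct.unpack(DETECTION_FMT, payload)
    d = lambda v, s: v * s / 65535
    return (cls_id, conf / 100.0,
            d(x, img_w), d(y, img_h), d(w, img_w), d(h, img_h))
```

At acoustic rates of a few hundred bits per second, a ten-byte message arrives in well under a second, which is the difference between a hint the diver can act on and a photo that shows up after the moment has passed.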
Testing often transitions from surface-level surrogates, like kayaks or skiffs, to actual human divers using specialized tools like “tube-lets.” What were the most surprising differences in system performance when moving from controlled surface simulations to realistic open-ocean conditions?
The most surprising factor was how much the scale of the “surrogate” affects the data quality. When we used large research vessels to pretend to be divers, we couldn’t accurately mimic the slow, rhythmic motion of a human swimmer. Moving to a small skiff on the Charles River actually provided better data because we could match the relative motion of a diver and an AUV more closely. However, once we got into the Great Lakes with real human divers, the lack of a real-time feedback interface became a glaring challenge. The divers were holding the tube-lets, but without a way to talk back to the AUV, we had to rely entirely on the pre-planned motion paths. This transition highlighted that “realistic conditions” aren’t just about the water; they’re also about the unpredictable behavior of the human in the loop.
The global economy relies heavily on undersea telecommunication and power cables, which are increasingly vulnerable to both natural failures and external interference. Given the rising density of autonomous systems in contested waters, how can human-machine teaming improve the security of this infrastructure?
Human-machine teaming is the only way to scale the protection of thousands of miles of undersea cables. AUVs can provide a persistent “eyes-on” presence that humans cannot sustain, scanning for interference or damage across vast distances. For a harbor entry mission, we define success by the system’s ability to operate in an environment with no prior maps, relying on satellite data for the surface but using the AUV to build an underwater map in real time. By combining AI’s ability to monitor these vast networks with the human’s ability to perform surgical repairs, we create a defensive layer that is much faster and more resilient than traditional ship-based ROV deployments. This synergy is what will allow us to maintain a strategic advantage as the undersea domain becomes more contested by other nations.
What is your forecast for autonomous maritime systems?
I believe we are moving toward a future where the distinction between “human missions” and “robotic missions” will disappear entirely. Within the next decade, AUVs will no longer be seen as just tools, but as true teammates that can anticipate a diver’s needs, much like a seasoned dive partner would. We will see the deployment of “heterogeneous swarms”—groups of specialized robots that can communicate with each other and a human leader to secure entire harbors or inspect hundreds of miles of cable in a single mission. As the undersea domain becomes the primary battlefield for global economic security, the ability to merge human intuition with machine endurance will be the defining factor in who controls the depths.
