AI’s Achilles’ heel: Addressing latency as part of a National AI Action Plan
Bowling Green, Ky. (April 24, 2025) - In the 2002 horror movie "The Ring," a haunted videocassette delivers its horrifying climax when the undead creature shown on the video, the mere sight of which kills anyone who watches it, steps out of the screen to pursue a victim. Over the next few years, we're likely to see AI step out of the screens of our smartphones and computers to start doing things, and help us do things, in the real world. But thankfully, this is not a horror movie.
While some people are rightfully cautious about AI and what it means for our society, there are a lot more reasons for hope than fear. The real danger is being left out of this revolution. Unfortunately, AI operations in real space threaten to create a new Digital Divide if a lot of places don’t step up with some smart infrastructure investment.
For AI to do work in the physical world, adapting to its environment appropriately and inferring from inputs, it requires ultra-low latency connectivity, or in human terms, fast reflexes. A variety of factors contribute to that, but a particularly stubborn one is the long distances that data sometimes need to travel for every interaction with the internet.
When that data journey is hundreds or thousands of miles, it can still happen counterintuitively fast (e.g., tens to hundreds of milliseconds) because the data generally travels at nearly the speed of light. Nonetheless, in highly time-sensitive applications, like communications between autonomous vehicles (AVs) in hazardous traffic situations, detect-and-avoid behavior by beyond-visual-line-of-sight (BVLOS) delivery drones, or adding realistic holograms to a person's visual field for an augmented reality (AR) experience, milliseconds matter.
Achieving ultra-low latency depends principally on having a nearby internet exchange point (IXP) where networks exchange data traffic locally, and data centers that house powerful compute capabilities to process information. Much of the U.S. landmass currently lacks both. So, whole regions are on track to get less from cutting-edge AI because of the added latency.
An accelerating nationwide fiber buildout, soon to be boosted by the Broadband Equity, Access, and Deployment (BEAD) program, is making bandwidth so abundant as to be negligible as a source of lag time in using the internet. That leaves latency as the main barrier to faster connectivity. And while that doesn't matter for training AI models, which happens centrally, it matters for consumer-facing generative and inference AI.
It will matter even more as AI bears fruit in robots and smart tools that can interact with their physical environments, leveraging cloud-based artificial brainpower that operates at the network edge. Some overdue investment in a more distributed IXP infrastructure can ensure that places don’t get left behind. Federal policymakers mulling the responses to a recent Request for Information (RFI) on the development of a National AI Action Plan should think about that.
The evolution of AI: From cyberspace to real space
AI-powered applications are quickly evolving from cyberspace to physical space. Driverless cars, or more formally autonomous vehicles (AVs), are just breaking out as the first mass use case of AI in a physical-space application. For a decade, they struggled to surmount technical, regulatory, and public trust hurdles. Then AI breakthroughs fast-forwarded AV progress, and commercial driverless taxi services are now operating in San Francisco, Phoenix, and a few other pioneer cities. AVs currently comprise only a minuscule share of the global automotive fleet. But expect them to scale from thousands to millions over the next few years.
Driving is actually an unusual example, in that new technology will directly substitute for an essential, daily human task. More often, technology does things that are more original than that. Telephones weren’t really the new messenger boys. Lightbulbs weren’t exactly the new candles. TV wasn’t quite the new theater. And the internet wasn’t just the new Yellow Pages.
New tech meets old needs but also opens up new possibilities. Even in the case of driving, traffic patterns when the automotive fleet is fully self-driving are likely to look very different from what we’re used to, in ways that probably no one alive today could imagine. By the same token, while some impressive humanoid robots are emerging, AI may have more impact by empowering humans to use smart tools with the help of augmented reality (AR) headsets.
All prediction is risky. At the same time, extrapolation from trends can be effective. Here are three trends that are likely to continue:
- Demand for internet speed has escalated annually for many years at a pace far above the general growth rate of the economy. The reasons why have varied as computers and the internet have evolved, but the pattern is always more.
- IT keeps getting more ubiquitous and getting incorporated into more things to make them “smart.”
- More and more “compute” functionality keeps getting moved into the cloud, which can handle overhead and scale better.
AI can meld with these trends by making human-machine interfaces more intuitive and making machines more adaptable. It will take a spontaneous whole-of-society discovery process to dream up and then operationalize all the new potentialities that it creates, as has happened with previous “general purpose technologies” like printing, steam power, electricity, and cars.
But the movement of AI from cyberspace to real space will require good machine “reflexes,” fast real-time adaptation to a physical environment, often leveraging the memory and computing power of the cloud. That, in turn, will require an internet infrastructure that can deliver ultra-low latency.
Why latency is starting to matter more than bandwidth: The basic math
How fast is your internet connection? The question actually requires multiple answers. To show what I mean, I just ran an online speed test; you can do it, too. It returned four numbers, which in this example mean:
- 280 megabits per second (Mbps) is the download bandwidth;
- 9.8 Mbps is the upload bandwidth;
- 33 milliseconds (ms) is the “unloaded” latency that the network can provide;
- 52 ms is the "loaded" latency when local delaying factors, such as ongoing traffic from other apps and devices, are taken into account.
All this matters because it affects the total lag time of every interaction with the internet.
For example, suppose I want to check Facebook, and there’s one megabyte (1 MB), or 8 megabits (1 byte = 8 bits), of content on the page.
Latency and bandwidth are like the “length” and “diameter” of the “pipe” by which that 1 MB gets to me. The total lag time is driven by both.
First, because of latency, the data takes 52 ms even to begin to arrive.
Then the constrained bandwidth means that it takes 8 megabits / (280 megabits per second) = 29 ms more for the data to finish arriving so that I can view the page.
That adds up to just 81 ms of delay in loading Facebook from both dimensions of network speed combined, or less than a tenth of a second, which is barely noticeable and on a par with the fastest recorded human reflexes.
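For concreteness, here's a minimal Python sketch of that first-order arithmetic, using the illustrative speed-test numbers above (real page loads involve multiple round trips, DNS lookups, and other overhead that this ignores):

```python
def total_lag_ms(latency_ms: float, payload_megabits: float, bandwidth_mbps: float) -> float:
    """First-order lag model: latency plus raw transfer time.

    A floor, not a forecast: it ignores the multiple round trips and
    protocol overhead that inflate real-world page loads.
    """
    transfer_ms = payload_megabits / bandwidth_mbps * 1_000  # seconds -> ms
    return latency_ms + transfer_ms

# 1 MB page = 8 megabits; loaded latency 52 ms; download bandwidth 280 Mbps
print(f"{total_lag_ms(52, 8, 280):.0f} ms")  # ~81 ms (52 ms latency + ~29 ms transfer)
```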
If most people's lag times are barely noticeable, it makes sense that national broadband policy has heretofore largely ignored latency as a quality measure, since the majority already enjoy satisfactory connections, and has focused instead on helping people who lack access to service altogether.
For people with typical internet speeds today, long lag times aren't a major pain point, at least not yet. But all the trends point to internet speeds that are adequate today becoming inadequate in the future, as technology keeps adapting to bandwidth abundance.
And if we do care about speed, the basic math also shows why latency can matter more than bandwidth, as in the example above. As computing for applications becomes centralized, lowering latency will become more and more important. This is particularly true as AI begins to do work in the physical realm, as it will give rise to growing numbers of tasks where rapid responsiveness matters, and human reflexes aren’t in the loop to be the bottleneck. That’s when ultra-low latency will really pay off.
Meanwhile, latency’s relative importance will increase in the future as trends in deployment kill bandwidth as a pain point. Fiber overbuilding is sweeping the country, with over 10 million homes passed in 2024, and BEAD is on track to deliver mostly end-to-end fiber to hitherto deprived locations.
For end-to-end fiber internet connections, gigabit symmetric service is routine. I showed how latency already contributes more than bandwidth to my lag time in loading a 1 MB page. If I had a gigabit symmetric connection, bandwidth would contribute just 8 ms, but latency would still add 52 more, comprising 87% of the lag time.
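Plugging the gigabit case into the same back-of-the-envelope formula makes the point (again, illustrative numbers only):

```python
# Same first-order model: lag = latency + payload / bandwidth
latency_ms = 52
transfer_ms = 8 / 1_000 * 1_000   # 8 megabits at 1,000 Mbps -> 8 ms
total_ms = latency_ms + transfer_ms
print(f"{total_ms:.0f} ms total, {latency_ms / total_ms:.0%} latency")  # 60 ms, 87%
```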
Fiber deployers have long argued that the bandwidth offered by fiber is "future-proof," and the claim seems justified: the gigabit symmetric connections routinely offered via end-to-end fiber-optic cable (which don't even max out fiber's capacity) vastly exceed the normal requirements even of ultra-high-resolution video streaming. As bandwidth becomes so abundant that it no longer matters at the margin, latency is next in line as the impediment to best-in-class connectivity.
Internet infrastructure: IXPs, latency, and the new Digital Divide
Your bandwidth depends on the infrastructure that physically delivers the internet to your house, but latency depends on where you sit in the larger global architecture of the internet. In other words, bandwidth is only as high as the weakest link in the chain from where you are to the data you're fetching, and that's usually the last mile.
By contrast, latency has little to do with the technical specifications of the last-mile pipe to your home. It’s about the journey, and at best, that’s the distance to the nearest peering point where internet traffic is exchanged — the IXP.
Thus, in the case of internet service delivered by satellites in geostationary orbit (GEO), every interaction with the internet sends a request roughly 22,000 miles up to the satellite and back, and then the response makes the same trip. Even at the speed of light, those four legs add nearly half a second of delay. That doesn't matter if you're watching a movie, but it's likely a problem for a Zoom call, and it will be increasingly problematic in the age of real-time AI-driven applications.
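The back-of-the-envelope version, under the simplifying assumption of a straight line-of-sight path at the vacuum speed of light (real systems add ground-station routing and processing delays on top):

```python
C_VACUUM_MILES_PER_SEC = 186_282   # speed of light in vacuum
GEO_ALTITUDE_MILES = 22_236        # geostationary orbit altitude

# Request up and down, then response up and down: four ~22,000-mile legs.
path_miles = 4 * GEO_ALTITUDE_MILES
delay_ms = path_miles / C_VACUUM_MILES_PER_SEC * 1_000
print(f"{delay_ms:.0f} ms")  # ~477 ms of propagation delay alone
```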
For other technologies, although the first leg of the data journey over copper wiring is slower, the vast majority of the trip happens at nearly the speed of light, since data soon gets onto the fiber-optic highways that form the internet's backbone.
So, the main latency driver is how far the light has to travel. But to where?
We think of internet search in global terms, and it's true that the information may originate anywhere. But content delivery networks (CDNs) distribute most of the data people regularly access to local IXPs to speed response times and economize on long-distance data transport. Efficiently routed data packets therefore may not need to travel farther than one of the hundreds of IXPs scattered all over the world.
But as mentioned above, that IXP may be hundreds or even thousands of miles away, as is the case in Alaska. Without a local IXP, every interaction with the internet must follow a "trombone" route along many miles of middle-mile fiber to fetch a response and carry it back to the end user.
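To put a rough number on the trombone effect, here's a sketch assuming a hypothetical community whose nearest IXP is 800 fiber-route miles away, with light traveling through glass at roughly two-thirds of its vacuum speed (both figures are assumptions for illustration):

```python
C_FIBER_MILES_PER_SEC = 186_282 * 2 / 3  # light in glass: ~124,000 mi/s
MILES_TO_IXP = 800                       # hypothetical middle-mile distance

# Every interaction pays the full round trip: out to the IXP and back.
round_trip_ms = 2 * MILES_TO_IXP / C_FIBER_MILES_PER_SEC * 1_000
print(f"~{round_trip_ms:.0f} ms added per interaction")  # ~13 ms
```

And that is a floor: switching and queuing along the route add more, and the penalty recurs on every round trip an application makes.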
The emerging geography of AI: Learning at the center, generative and inference at the edge
Human beings have long congregated spatially (e.g., at colleges and universities) for efficient acquisition of knowledge, and then distributed geographically for the application of knowledge to economic uses in production and decision-making. That is: they go to college, then go out and get jobs.
It is much the same with AI. Large language models (LLMs) get trained centrally on massive datasets swept up from a variety of sources. The data centers are like colleges in this metaphor. Then the LLMs are put to work in the economy through deployments that are disaggregated and widely distributed to interface with users and process generative and inference tasks on demand.
Where should those generative and inference tasks take place? It depends.
If the use case isn’t time-sensitive, the task might as well be sent to a faraway data center, located where massive compute power has been installed and power is cheap. And while a faster response is always appreciated, there’s a “compute” latency that may swamp the “network” latency and render it immaterial.
If you ask Copilot or Gemini a question, and it thinks for a second or two, then even a data journey of thousands of miles that takes, say, 1/10 of a second, hardly matters.
If the use case is highly time-sensitive, to the point where a fraction of a second makes a difference, then the task should occur at the network edge. AI models will be increasingly deployed at the network edge in regional data centers. Non-time-sensitive applications might as well be deployed there, too, if they’re not so compute-intensive that they require the resources of a major data center.
Table 1 shows how the optimal geographic allocation of tasks depends on how compute-intensive and how time-sensitive they are:
Table 1: The geography of AI

|  | Ultra Compute-Intensive | Less Compute-Intensive |
| --- | --- | --- |
| Ultra Time-Sensitive | Difficult | Best at the network edge |
| Less Time-Sensitive | Best at a central data center | Either centralized or at the network edge will work |
For the many tasks that fall in the lower-right corner of Table 1 (neither ultra time-sensitive nor ultra compute-intensive), where to perform the task is more of a business decision than a technical one. But the emerging pattern involves training AI models centrally, then running generative and inference AI locally.
Above all, if AI-powered smart machines are to physically interact with their environments in real time, they will need to minimize all sources of latency, both network and compute, so they can react instantaneously.
AVs mitigate the need to minimize network latency by carrying ample compute power onboard, which can add up to 50 pounds of weight, occupy a square foot or two of space, and add thousands of dollars of cost. That's OK for a car, and even necessary: we can't let AVs become unable to drive if connectivity is disrupted.
But smart tools with a smaller form factor will need to rely on centralized compute more and more. And that makes performance depend on the distance to the nearest IXP; IXPs exist in most major metro areas but are absent across much of the U.S. landmass.
A bright future, for some
We’re approaching the point where AI steps out beyond the computer screen and begins impacting things in the real world. With all due caution, AI has the potential to move humanity forward in a dramatic and positive way.
Not only will some robot, sensor, app, or smart tool take care of the pesky little tasks that clutter your life and get in the way of doing what you want to do; human capability, augmented by AI, will also succeed in solving some of the world's most pressing problems, like cancer and chronic disease.
But in many places, that future is currently on track to be bottlenecked by infrastructure gaps that have not yet been addressed. We must do more to materially reduce latency everywhere, so that everyone, everywhere, can benefit from the innovations AI will bring.
About the Authors:
Nathan Smith is Connected Nation's Director of Economics and Policy. Dr. Smith monitors federal broadband policy, writes public comments for federal agencies that request advice on broadband policy implementation, and helps with business development and proposals.
Brent Legg is the Executive Vice President of Government Affairs for Connected Nation. Brent has primary responsibility for leading Connected Nation’s government affairs and public policy team, working with federal, state, and local officials to advance legislation and/or executive actions that will improve access to, and the quality of, broadband service and related technologies. Brent has more than ten years of experience working at the intersection of technology, politics, and public policy, and has provided expert testimony on broadband issues before the U.S. Congress and the legislatures of 12 states. Mr. Legg also has extensive experience working on education technology and school connectivity issues, and is a recognized expert on the federal E-rate program. Contact Brent at blegg@connectednation.org.