I want to talk about two very different approaches to street-level imagery — because I think the comparison reveals something interesting about how architectural choices shape what a system can and cannot do.
The 50 images cycling above were captured by a Bee driving through downtown Los Angeles — south along Spring Street past City Hall and the DTLA courthouse district. A contributor drove this route on a Tuesday evening, and the device recorded geotagged keyframes over several blocks: crosswalks, traffic lights, construction barriers, the angular facades of Civic Center buildings. And this is just one of nearly 300 Bee drive sequences captured in this single square kilometer of downtown LA, totaling over 1,500 frames.
This is, broadly speaking, the same thing Google StreetView has been doing since 2007. But the architectural decisions underlying the two systems are so different that they end up producing meaningfully different outcomes. I think it is worth walking through why.
The scale of what Google built
It is easy to take Google StreetView for granted. Since launching in 2007 with five U.S. cities, the project has captured over 220 billion images across more than 110 countries, driven enough road miles to circle the Earth over 400 times, and deployed cameras on cars, trikes, boats, backpacks, and even camels. By any measure, it is one of the most ambitious data collection programs ever undertaken — and billions of people use the result every day to preview destinations, verify addresses, and explore places they have never been.
Two architectures, two goals
Here is where things get interesting. Google and the Bee are not trying to solve the same problem. Google StreetView is building an immersive, explorable visual record of the world — optimized for completeness and the experience of virtually visiting a place. The Bee is building a fresh, machine-readable map layer — optimized for update frequency, API accessibility, and AI-driven feature detection. These are related but distinct goals, and they lead to very different architectural choices.
Google chose what I would call a centralized fleet architecture. A fleet of company-operated vehicles — cars, trikes, boats, backpack rigs — drive predetermined routes carrying roof-mounted camera arrays, LIDAR scanners, GPS, and inertial navigation systems. The raw data is physically transported to one of three regional processing hubs, where it is stitched into 360-degree panoramas, geolocated through sensor fusion, run through automated blurring, and eventually published. The whole pipeline from capture to availability typically takes weeks to months.
The Bee takes what I would describe as a distributed contributor architecture. The Bee is not a regular dashcam — it is a purpose-built mapping device engineered from the ground up for high-quality, geolocated street-level imagery. It packs a wide-angle lens, precise GPS, an IMU, and onboard processing into a compact form factor that any driver mounts behind their windshield. Contributors drive their normal daily routes — commutes, errands, deliveries — and the device captures both geotagged keyframes and continuous video. There are no predetermined routes, no fleet operations center, no regional processing hubs. Each Bee uploads directly, and AI models scan every frame for map features: road signs, lane markings, traffic signals, crosswalks, speed limits, construction zones. The video stream also feeds an AI Events system that automatically detects and clips driving events — harsh braking, swerving, stop sign violations — complete with GPS tracks and IMU data. Here is an example — a swerving event automatically detected and clipped on a road in Ibaraki Prefecture, Japan:

Try this API query:
curl "https://beemaps.com/api/developer/aievents/699258e9e834f725d21fcd89?includeGnssData=true&includeImuData=true" \
  -H "Authorization: Basic <your-api-key>"
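If you want to do more than eyeball the response, you can pipe it through jq. A minimal sketch, assuming the GNSS and IMU samples come back under fields named gnssData and imuData (hypothetical names; check the API reference for the actual schema):

# Sketch only: gnssData / imuData are assumed field names, not confirmed schema
curl -s "https://beemaps.com/api/developer/aievents/699258e9e834f725d21fcd89?includeGnssData=true&includeImuData=true" \
  -H "Authorization: Basic <your-api-key>" \
  | jq '{gnssPoints: (.gnssData | length), imuSamples: (.imuData | length)}'

The GPS track and accelerometer trace are what make a clip like the swerve above useful beyond the video itself.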
These are genuinely different architectural choices, not just different scales of the same approach. And like most architectural choices, the downstream consequences compound over time.
Where the architectures diverge
I think the most useful way to compare the two systems is along several concrete dimensions. Here is a summary:
| | Google StreetView | The Bee |
|---|---|---|
| Update frequency | Every 1-8 years | Continuous — weekly or daily |
| Collection model | Centralized fleet | Distributed contributor network |
| Time to publish | Weeks to months | Minutes to hours |
| Data access | Paid API; no derived products | Unrestricted license; train models, build products |
| Video | No continuous video | Continuous video with AI event detection |
| On-demand coverage | No mechanism to request | Honey Bursts — task any area |
| Edge compute | Closed platform | Open — deploy your own AI on the device |
Freshness
This is probably the most consequential difference. Google StreetView refreshes major metro areas every 1-2 years, suburban areas every 2-3 years, and rural areas every 3-8 years. Some locations still show imagery from the early 2010s. This is not a criticism of Google — it is a structural consequence of the centralized fleet model. Re-driving millions of miles with specialized vehicles is expensive, and there are only so many vehicles.
The Bee network collects continuously. Because every contributor is mapping as they drive, popular roads get re-mapped weekly or even daily. A new speed limit sign, a road closure, or a freshly painted crosswalk can show up in the data within minutes to hours of a contributor driving past it. The world changes faster than any centralized fleet can re-photograph it.
Coverage economics
Google must physically drive every mile it wants to map with its own vehicles and contractors. Dense urban areas with high search traffic get refreshed frequently; suburban neighborhoods, small towns, and rural roads may wait years.
The Bee model inverts this. Coverage follows organic driving patterns — wherever people drive, the map updates. And when organic coverage is not enough, anyone can task the network directly: Honey Bursts let you draw a polygon on a map and incentivize contributors to drive that specific area. Need fresh imagery of a construction zone or a rural stretch of highway? You can request it and have contributors cover it within days. With Google, there is no mechanism to request coverage — you wait until the fleet comes back.
Google's per-vehicle setup has historically exceeded $100,000, with estimated per-mile collection costs of $125 or more. The Bee is designed to be affordable and mounts in any vehicle a contributor already owns. The economics are fundamentally different.
Data accessibility
Google offers a paid StreetView API, and you can embed panoramic imagery in your applications. But the terms of service prohibit creating derived products — you cannot use it to train computer vision models, extract map features, or build AI pipelines on top of it. It is designed for display, not for derivation.
Every frame in the Bee network is accessible through the Bee Maps API. Here is what it looks like to pull the latest imagery for the block you are looking at above:
curl -X POST https://beemaps.com/api/developer/imagery/latest/poly \
  -H "Authorization: Basic <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "Polygon",
    "coordinates": [[
      [-118.254, 34.050], [-118.250, 34.050],
      [-118.250, 34.054], [-118.254, 34.054],
      [-118.254, 34.050]
    ]]
  }'
You get back geotagged frames with GPS accuracy metrics, IMU data, and timestamps — building blocks for fleet management, insurance, urban planning, autonomous driving, and more.
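As a rough sketch of how you might consume that response programmatically, assuming the polygon body above is saved to a file called polygon.geojson and the frames come back under a frames field with timestamp and position members (hypothetical names; the real schema may differ):

# Sketch only: .frames, .timestamp, and .position are assumed field names
curl -s -X POST https://beemaps.com/api/developer/imagery/latest/poly \
  -H "Authorization: Basic <your-api-key>" \
  -H "Content-Type: application/json" \
  -d @polygon.geojson \
  | jq '.frames[] | {timestamp, position}'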
Developers can also deploy custom AI agents directly on the Bee itself through the Edge AI platform — running your own computer vision models on the device in real time. Google runs sophisticated AI on its own fleet cameras, but that capability is entirely internal. The Bee opens it up to any developer.
Camera perspective
Google's roof-mounted cameras capture immersive 360-degree panoramas — excellent for the "virtual tourism" use case that StreetView is best known for.
The Bee captures forward-facing imagery from dashboard height — the perspective a driver actually sees. Because it was purpose-built for mapping rather than adapted from a consumer dashcam, the optics, GPS precision, and IMU data are optimized for producing imagery that machines can reason about. This is directly useful for training computer vision models, detecting road features, and feeding the AI systems that need to understand what a road looks like from a vehicle's point of view.
The deeper point
When Google StreetView launched in 2007, modern AI models were not really a thing. Deep learning was still years from its breakthrough moment. StreetView was designed for human eyes — a visual product for people to explore streets on a screen. And it solved that problem extraordinarily well. Billions of people benefit from it.
But the world has changed. Today, the most consequential consumers of street-level imagery are not humans browsing Google Maps — they are AI systems. And these systems have very specific needs that StreetView was never designed to serve.
Consider a self-driving company training a world model. That model needs to see the same intersection across hundreds of cities, every week, with full video and telemetry — not a static panorama from 2023. It needs to learn what a construction zone looks like the day it appears, how lane markings fade over months, what a school zone looks like at 3pm versus 8am. It needs this data with an unrestricted license to train on. And it needs it from every continent, refreshed continuously, at a cost that scales.
The Bee was built for exactly this. It is not asking "how do we let a person virtually visit a street?" — it is asking "how do we continuously feed AI systems the visual data they need to understand and reason about roads?" Geotagged video and keyframes from last week, accessible through an API, with AI-detected features already extracted. From LA to Tokyo to rural Poland, refreshed by thousands of contributors driving their daily routes.
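To make that concrete, here is a minimal sketch of what "refreshed continuously" looks like from the consuming side: re-query the same areas on a schedule and accumulate the results. It assumes one GeoJSON polygon per area of interest in a polygons/ directory and reuses the endpoint shown earlier; everything else about the pipeline is up to you.

#!/usr/bin/env bash
# Sketch: pull the latest frames for each tracked area and write the raw
# responses to a dated corpus. Run daily or weekly from cron.
mkdir -p corpus
for poly in polygons/*.geojson; do
  curl -s -X POST https://beemaps.com/api/developer/imagery/latest/poly \
    -H "Authorization: Basic <your-api-key>" \
    -H "Content-Type: application/json" \
    -d @"$poly" \
    > "corpus/$(basename "$poly" .geojson)_$(date +%F).json"
done

The point is not the script; it is that freshness becomes a property you can automate against rather than a release schedule you wait on.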
These are different problems, and they call for different architectures. Google chose a centralized fleet optimized for comprehensive, immersive human browsing. The Bee chose a distributed network optimized for freshness, machine readability, and the kind of scale that AI world models demand. Both approaches produce street-level imagery. But they serve different eras of computing, and I think the world benefits from having both.
The 50 frames above were captured days ago in Los Angeles. If you want to see what that same stretch of road looks like today, a Bee contributor may have already driven it again.
Try it yourself
Access street-level imagery from Los Angeles, Phoenix, and anywhere in the Bee Maps network through the Road Intelligence API. Query any area, pull the latest frames, and see the road as it looks right now. Get an API key to start building.