Apple: You (Still) Don’t Understand the Vision Pro

Dear Apple,

I was, given my interest in virtual and augmented reality, already primed to have a high degree of interest in the Vision Pro, but even so, I appreciate how you have gone out of your way to make sure I’m intrigued. You let me try the Vision Pro the day it was announced, and while I purchased my own the day it shipped (and had it flown over to Taiwan), you recently sent me a demo version of the M5 Vision Pro (it’s definitely snappier, although I don’t like the Dual Knit Band at all; the Solo Knit Band continues to fit my head best).

However, the reason I truly know you are trying to win my heart is that not only did you finally show a live sporting event in the Vision Pro, and not only was it an NBA basketball game, but the game actually featured my Milwaukee Bucks! Sure, I had to jump through VPN hoops to watch the broadcast, which was only available in the Lakers home market, but who am I to complain about watching Giannis Antetokounmpo seal the game with a block and a steal on LeBron James in my M5 Vision Pro?

And yet, complain I shall: you have — like almost every video you have produced for the Vision Pro — once again shown that you fundamentally do not understand the device you are selling. I’m incredibly disappointed, and cannot in good faith recommend any model of the Vision Pro to basketball fans (or anyone else for that matter).

Sports on TV

Apple, you are one of the grandfathers of the tech industry at this point; it’s hard to believe that you are turning 50 this year! Still, you are much younger than TV generally, and sports on TV specifically. The first U.S. television broadcast of a sporting event was a Columbia-Princeton baseball game on May 17, 1939, on NBC; there was one camera accompanying the radio announcer. Three months later NBC televised the first Major League Baseball game, between the Brooklyn Dodgers and Cincinnati Reds; this time they used two cameras.

All televised sports face a fundamental limitation when it comes to the fan experience: the viewer is experiencing something that is happening in real life 3D on a 2D screen; the solution NBC discovered from the very beginning was to not try and recreate the in-person experience, but to instead create something uniquely suited to this new medium. Two cameras became three, then four, then 147 — that’s how many cameras Fox used for last year’s Super Bowl broadcast. Of course many of those cameras were specialized: included in that number were 27 super slow motion cameras, 23 high resolution cameras, 16 robotic cameras, 10 wireless cameras, and two SkyCams.

The job of stitching all of those cameras together into one coherent broadcast falls on the production team, housed in a specially equipped truck outside the stadium; that team coordinates with the broadcast booth to provide a seamless experience where every jump feels natural and pre-meditated, even though it’s happening in real time. It’s a great experience!

And, of course, there are the pre-game, half-time, and post-game shows, which used an additional 64 cameras, including 12 wireless cameras, eight robotic cameras, seven augmented reality cameras, and a FlyCam. No broadcast is complete without something to fill the time when the game isn’t on. After all, as advanced as TV broadcasts may be, they still face the fundamental limitation that confronted NBC: how do you translate an in-person experience into something that is compelling for people on their couch looking at a 2D screen?

Vision Pro Disappointments

When I first tried the Vision Pro, the demo included a clip from an NBA game that was later cut from the demo that shipped with the device (which was the one available in Apple Stores); it jumped out at me at the time:

What was much more compelling were a series of immersive video experiences that Apple did not show in the keynote. The most striking to me were, unsurprisingly, sports. There was one clip of an NBA basketball game that was incredibly realistic: the game clip was shot from the baseline, and as someone who has had the good fortune to sit courtside, it felt exactly the same, and, it must be said, much more immersive than similar experiences on the Quest.

It turns out that one reason for the immersion is that Apple actually created its own cameras to capture the game using its new Apple Immersive Video Format. The company was fairly mum about how it planned to make those cameras and its format more widely available, but I am completely serious when I say that I would pay the NBA thousands of dollars to get a season pass to watch games captured in this way. Yes, that’s a crazy statement to make, but courtside seats cost that much or more, and that 10-second clip was shockingly close to the real thing.

What is fascinating is that such a season pass should, in my estimation, look very different from a traditional TV broadcast, what with its multiple camera angles, announcers, scoreboard slug, etc. I wouldn’t want any of that: if I want to see the score, I can simply look up at the scoreboard as if I’m in the stadium; the sounds are provided by the crowd and PA announcer. To put it another way, the Apple Immersive Video Format, to a far greater extent than I thought possible, truly makes you feel like you are in a different place.

The first thing that has been frustrating about the Vision Pro has been the overall absence of content; Apple, you produced a number of shows for launch, and then added nothing for months. The pace has picked up a bit, but that has revealed a second frustration: I think that your production stinks!

One of the first pieces of sports content that you released was an MLS Season in Review immersive video in March 2024; I wrote in an Update:

I have a lot to say about this video and, by extension, the Vision Pro specifically, and Apple generally. Let me work my way up, starting with the video: it’s terrible. The problem — one that was immediately apparent before I got into all of the pedantry below — is that while the format is immersive, the video is not immersive at all. This is the big problem:

A stopwatch app measuring every cut in Apple's MLS video

This is a screenshot of a stopwatch Mac app I downloaded because it supported keyboard shortcuts (and I could thus use it while watching the immersive video). There are, in a five-minute video, 54 distinct shots; that’s an average of one cut every six seconds! Moreover, there wasn’t that much gameplay: only 2 minutes and 32 seconds. Worse, some of the cuts happen in the same highlight: there was one play where there was a sideline view of the ball being passed up the field, and then it switched to a behind-the-goal view for the goal. I actually missed the goal the first time because I was so discombobulated that it took me a few seconds to even figure out where the ball was.

In short, this video was created by a team that had zero understanding of the Vision Pro or why sports fans might be so excited about it. I never got the opportunity to feel like I was at one of these games, because the moment I started to feel the atmosphere or some amount of immersion there was another cut (and frankly, the cuts were so fast that I rarely if ever felt anything). This edit may have been perfect for a traditional 2D-video posted on YouTube; the entire point of immersive video on the Vision Pro, though, is that it is an entirely new kind of experience that requires an entirely new approach.

I had the exact same response when you released a video of a Metallica concert last March:

As for the concert itself, the video was indeed very cool. The opening shot following James Hetfield walking into the stadium was very compelling, and, well, it was immersive. And then you cut to another camera angle, and while that camera angle was also immersive, the video as a whole no longer was.

What followed was a very enjoyable 30 minutes or so — I’ll probably watch it again — but it felt like a particularly neat documentary, not like I was at a concert. You had a monologue from each member of the band, you had shots of the crowd, you had three songs, all, as Apple proudly noted in their press release, shot with “14 Apple Immersive Video cameras using a mix of stabilized cameras, cable-suspended cameras, and remote-controlled camera dolly systems that moved around the stage.” That means the final product was edited together from those 14 cameras and the four interviews, which is to say it was a produced artifact of a live experience; at no point did I feel like I was at the concert.

News flash: I didn’t watch the video again. I’m just not that interested in a TV-style documentary of Metallica. I added:

We are nearly two years on from that introduction, and over a year beyond the actual launch of the Vision Pro, and there has yet to be an experience like I envisioned and thought was coming. What is frustrating is that the limiting factor is Apple itself: the company had 14 Apple Immersive Video cameras at this concert, but what I want is only one. I want an Apple Immersive Video camera planted in the audience, and the opportunity to experience the concert as if I were there, without an Apple editor deciding what I get to see and when.

Needless to say, you probably already know why I thought Friday’s telecast was a big disappointment.

Bucks vs. Lakers

I understand, Apple, why it’s not easy to record or even take a screenshot of a copyrighted game; please bear with me while I describe the experience using text.

When I started the broadcast I had, surprise surprise, a studio show, specially tailored for the Apple Vision Pro. In other words, there was a dedicated camera, a dedicated presenter, a dedicated graphics team, etc. There was even a dedicated announcing team! This all sounds expensive and special, and I think it was a total waste.

Here’s the thing that you don’t seem to get, Apple: the Vision Pro is compelling precisely because it is not a 2D screen in my living room; it’s an immersive experience I wear on my head. That means that all of the lessons of TV sports production are immaterial. In fact, it’s worse than that, because insisting on all of the trappings of a traditional sports broadcast creates two big problems. First, because it is costly, less content is available than there might be otherwise. Second, it makes the experience significantly worse.

Jump ahead to game action. The best camera was this one on the scorer’s table:

I have, as I noted, had the good fortune of sitting courtside at an NBA game, and this very much captured the experience. The biggest sensation you get by being close to the players is just how tall and fast and powerful they are, and you got that sensation with the Vision Pro; it was amazing.

The problem, however, is that you would be sitting there watching Giannis or LeBron or Luka glide down the court, and suddenly you would be ripped out of the experience because the entirely unnecessary producer decided you should be looking through one of these baseline cameras under the hoop:

These are also not bad seats! I’ve had the good fortune of sitting under the basket as well. These are the seats where you really get a sense of not just the power but also the physicality of an NBA game: I would gladly watch an entire game from here. But alas, I was only granted a few seconds, before the camera changed again.

This was absolutely maddening, so maddening that I am devoting a front-page Article to a device no one but me cares about, in a desperate attempt to get someone at your company to listen. What makes the Vision Pro unique is the sense of presence: you really feel like you are wherever the Vision Pro takes you. In other words, when I’m wearing the Vision Pro and the camera actually stays fixed (like, for example, when you set up a special fourth camera specifically for the Lakers Girls performance, which I think was the single longest continuous shot in the entire broadcast), I get the sensation of sitting courtside at Crypto.com Arena, and it’s amazing. Suddenly $3,499 feels cheap!

However, when I’m getting yanked around from camera to camera, the experience is flat out worse than just watching on TV. Just think about it: would it be enjoyable to be teleported from sideline to baseline to baseline and back again, completely at the whim of some producer, and often in the middle of the play, such that you have to get your bearings to even figure out what is going on? It would be physically uncomfortable — and that’s exactly what it was in the Vision Pro.

What is so frustrating is that the right approach is so obvious that I wrote about it the day you announced this device: one camera, with no production. Just let me sit courtside and watch an NBA game. I don’t need a scoreboard; I can look up and see it. I don’t need a pre-game or post-game show; I can simply watch the players warm up. I don’t need announcers; I’d rather listen to the crowd and the players on the court. You have made a device that, for this specific use case, is better than TV in every way, yet you insist on producing content for it like it is TV! Just stop!

A Ticket to Live

There will be more games this year; from your press release last October:

Basketball fans will soon be able to experience NBA games like never before in Apple Immersive on Apple Vision Pro, with a selection of live Los Angeles Lakers matchups during the 2025-26 season, courtesy of Spectrum SportsNet. Viewers will feel the intensity of each game as if they were courtside, with perspectives impossible to capture in traditional broadcasts. The schedule of games will be revealed later this fall, with the first game streaming by early next year, available through the forthcoming Spectrum SportsNet app for Vision Pro.

That schedule was announced last week, and there are six games total (including last Friday’s). Six! That’s it.

I get it, though: producing these games is expensive. You need a dedicated studio host, a dedicated broadcast crew, multiple cameras, and a dedicated production crew, and all of that costs money. Except you don’t need those things at all. All that you need to do, to create not just a good-enough experience but a superior one, is simply set up the cameras and let me get from the Vision Pro what I can’t get from anything else: the feeling that I am actually there.

And, I would add, you shouldn’t stop with the Lakers: there should be Vision Pro cameras at every NBA game, at every NFL game, at every NHL game, at every MLB game — they should be standard issue at every stadium in the world. There should be Vision Pro cameras at every concert hall and convention center. None of these cameras need a dedicated host or announcers or production crew, because the Vision Pro isn’t TV; it’s actual presence, and presence is all you need. $3,499 is a lot of money for physically uncomfortable TV; it’s an absolute bargain if it’s a way to experience any live experience in the world on demand.

But, alas, you refuse. So nope, I still can’t recommend the Vision Pro, not because it’s heavy or expensive or has an external battery, but because you, Apple, have no idea what makes it special.