Reflecting on Apple's Vision Pro: The Future of XR, RNDR, and Spatial Computing

In the early hours of June 6th, during the WWDC event, and on the fifth day after realizing I had contracted Covid for the second time, I sat with a cup of health tea, chatting with friends over a video call. An hour had passed, and I couldn’t help but wonder whether the highly anticipated “One More Thing” would be delayed again.

Finally, at 2 am, Tim Cook appeared on screen and with a sweeping gesture, exclaimed, “One More Thing.” My friends and I cheered in excitement.

“Macintosh introduced personal computing, iPhone introduced portable computing, and Apple Vision Pro is going to introduce Spatial Computing.”

As an enthusiast of cutting-edge technology, I rejoiced at the thought of the new toy I would have next year. But as a Web3 investor with a keen interest in gaming, the metaverse, and AI, this marked the beginning of a new era that sent shivers down my spine.

You might be skeptical and wondering, “What does the upgrade of MR (Mixed Reality) hardware have to do with Web3?” Well, let’s start by discussing Mint Ventures’ thesis on the metaverse & Web3.


Our Thesis on Metaverse or the Web3 World

The premium on assets in the blockchain world can be attributed to:

The underlying foundation of trusted transactions, leading to reduced transaction costs: In the physical world, the ownership and protection of assets rely on the enforcement of state institutions, often backed by the threat of violence. In the virtual world, however, asset ownership is based on the “trust in consensus and the immutability of data” which prevents tampering. Additionally, it relies on the recognition of the asset itself. Despite the ability to copy and paste, assets like Bored Ape Yacht Club (BAYC) can fetch prices equivalent to a house in a lesser-known city. This valuation is not solely based on the slight differences between the copied and pasted image and the NFT metadata image. It is primarily driven by the market’s consensus on the “non-fungibility” of these assets, enabling their securitization.
The high securitization of assets leads to a liquidity premium.
The “permissionless premium” results from decentralized consensus mechanisms and the ability to conduct transactions without permission.

These factors contribute to the higher valuation and premium associated with assets in the blockchain world.

Virtual goods are easier to securitize compared to physical goods due to the following reasons:

The history of digital asset payments demonstrates that people’s habits of paying for virtual content took time to develop. However, it is undeniable that paying for virtual assets has become ingrained in the lives of the masses. The introduction of the iTunes Store in April 2003, for example, provided the option to purchase legitimate digital music and supported favorite artists, showing people that they could buy music instead of resorting to downloading pirated copies from the internet. In 2008, the App Store further popularized the model of one-time purchases for apps, while subsequent in-app purchases continued to contribute to Apple’s revenue from digital assets.

The evolution of payment models in the gaming industry played a significant role in the securitization of virtual assets. The initial version of games was arcade games, where players paid for the experience, akin to watching a movie. During the era of consoles, the payment model shifted to paying for cartridges/discs, similar to purchasing movies or music albums. In the later stages of the console era, purely digital game versions became available, and platforms like Steam emerged, introducing digital game marketplaces. In-game purchases also helped certain games achieve financial success. The history of changes in game payment models is closely tied to the decreasing distribution costs. From arcade machines to consoles, and eventually to personal computers and mobiles that provide access to digital game platforms for everyone, the distribution costs for games have continually decreased. With players deeply immersed in games, game assets have transformed from being just a part of the gaming experience to purchasable goods. (Although there has been a recent small trend of increasing distribution costs for digital assets over the past decade, primarily due to slower user growth, fierce competition, and gatekeepers like Meta / Google / ByteDance monopolizing attention.)

So, what comes next? Tradable virtual world assets will continue to be a theme that we believe in.

As the immersive experience in virtual worlds continues to improve, people will spend more and more time immersed in these virtual environments, leading to a shift in attention. This shift in attention will also drive a shift in valuation premiums from being heavily reliant on physical assets to virtual assets. The launch of Apple Vision Pro is set to completely transform the way humans interact with the virtual world, resulting in increased immersion and a significantly enhanced immersive experience.

Note: This is a variant definition of pricing strategy. In premium pricing strategy, a brand sets the price at a level significantly higher than the cost, filling the gap between pricing and cost with brand storytelling and experiences. Additionally, factors such as cost-based pricing, competition-based pricing, supply and demand considerations, etc., are also taken into account when pricing products. Here, we are specifically focusing on premium pricing strategy.

History and Present of XR

The exploration of Extended Reality (XR), including VR and AR, in modern society began over a decade ago:

In 2010, Magic Leap was founded. Its remarkable 2015 advertisement featuring a whale leaping out of a sports arena created a sensation in the tech industry. However, when its product was officially launched in 2018, it received widespread criticism for its poor user experience. In 2021, the company raised $500 million in funding at a post-money valuation of $2.5 billion, which valued the company below the $3.5 billion it had raised in total. In January 2022, reports emerged that the Saudi Arabian sovereign wealth fund acquired majority control through a $450 million equity and debt transaction, causing the company’s actual valuation to drop to under $1 billion.
In 2010, Microsoft began developing HoloLens and released their first AR device in 2016, followed by a second version in 2019. Priced at $3,000, the actual user experience was underwhelming.
The prototype of Google Glass was unveiled in 2011, and the first product was released in 2013. It attracted significant hype and high expectations but ultimately suffered from privacy concerns about its camera and a less than satisfactory user experience. It only sold a few tens of thousands of units in total. In 2019, an enterprise edition was released, and in 2022, a new test version was field-tested with unimpressive feedback.
In 2014, Google introduced the Cardboard VR development platform and SDK. In 2016, Daydream VR was launched, which is currently the most widely used VR platform for Android devices.
Sony’s PlayStation started developing its VR platform in 2011, and in 2016, the PSVR made its debut. Although initial sales were driven by trust in the PlayStation brand, the subsequent response was not as positive.
Oculus was founded in 2012 and acquired by Facebook in 2014. In 2016, Oculus Rift was released, followed by three more models, all emphasizing portability and lower pricing. Oculus is one of the market leaders in terms of market share.
Snap acquired Vergence Labs in 2014, a company founded in 2011 that specialized in AR glasses. This acquisition served as the prototype for Snap Spectacles. The first version was released in 2016, followed by three updated versions. Similar to many other products mentioned, Spectacles initially attracted significant attention, with long queues forming outside stores. However, the subsequent user base was limited. In 2022, Snap shut down its hardware division and refocused on smartphone-based AR.
Around 2017, Amazon began developing AR glasses based on Alexa. The first version, Echo Frames, was released in 2019, followed by a second version in 2021.

Looking back on the history of XR, we can see that expanding and cultivating this industry has proven far more challenging than anyone in the market anticipated, whether deep-pocketed tech giants with numerous scientists or ambitious startups that raised billions in funding to dedicate to XR. Since the release of the consumer-grade VR product Oculus Rift in 2016, the cumulative shipments of all VR brands, such as Samsung Gear, ByteDance’s Pico, Valve’s Index, Sony’s PlayStation VR, HTC’s Vive, etc., have amounted to less than 45 million units. Because VR devices are currently used primarily for gaming, and AR devices had not seen widespread adoption even for occasional use before the release of Vision Pro, we can roughly estimate, based on SteamVR data, that the monthly active users of VR devices may only amount to a few million.

Why haven’t XR devices gained widespread popularity? Countless failed experiences from startup companies and conclusions from investment institutions can provide some answers:

Hardware is Far From Ready

In terms of visuals, even top-notch VR devices cannot yet make the pixels on the screen disappear for the user, because of the wider field of view and the closer proximity to the eyes. To achieve true immersion, a resolution of single-eye 4k, equivalent to dual-eye 8k, is required. Additionally, the refresh rate is a crucial factor in maintaining a smooth visual experience. It is widely believed that XR devices need a refresh rate of 120Hz or even 240Hz to prevent motion sickness and provide a more realistic experience. However, the refresh rate needs to be balanced with rendering capabilities: while a game like Fortnite can support 4k resolution at 60Hz, it can only support 1440p resolution at 120Hz.
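The trade-off between resolution and refresh rate can be made concrete with a rough pixel-throughput calculation. The sketch below is only an illustration: it assumes the standard 16:9 frame sizes for 4K (3840×2160) and 1440p (2560×1440), and it treats pixels drawn per second as a crude proxy for rendering load, ignoring everything else a GPU has to do.

```python
# Rough proxy for rendering load: pixels that must be drawn per second.
# Assumption: standard 16:9 frame sizes; real headsets use different
# per-eye panels and render two views.
RESOLUTIONS = {
    "4k": (3840, 2160),
    "1440p": (2560, 1440),
}


def pixels_per_second(resolution: str, refresh_hz: int) -> int:
    """Return width * height * refresh rate for a named resolution."""
    width, height = RESOLUTIONS[resolution]
    return width * height * refresh_hz


load_4k_60 = pixels_per_second("4k", 60)
load_1440p_120 = pixels_per_second("1440p", 120)
load_4k_120 = pixels_per_second("4k", 120)

print(f"4K    @  60Hz: {load_4k_60:,} px/s")
print(f"1440p @ 120Hz: {load_1440p_120:,} px/s")
print(f"4K    @ 120Hz: {load_4k_120:,} px/s")
```

Under these assumptions, 4K at 60Hz and 1440p at 120Hz land in the same ballpark (roughly 0.50 and 0.44 billion pixels per second), while 4K at 120Hz doubles the budget. That is consistent with the Fortnite example: raising the refresh rate forces the resolution down unless rendering power grows.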

Auditory aspects may seem less important than visuals, and most VR devices have not focused much on them, but they matter for immersion. Imagine being in a virtual space where the voices of people to your left or right consistently come from above: it would greatly diminish the sense of immersion. Similarly, when a digital avatar within an AR space is fixed in a living room, and the volume of their voice remains the same as a player walks from the bedroom to the living room, it subtly reduces the sense of real space.

In terms of interaction, traditional VR devices are equipped with handheld controllers, with some devices like the HTC Vive requiring the installation of cameras in the home to track the player’s movements. Although the Oculus Quest Pro has eye tracking, it has high latency and average sensitivity, primarily used to enhance local rendering rather than for actual interactive operations. Additionally, Oculus has installed 4-12 cameras on the headset to track the user’s environment, enabling some gesture-based interaction experiences (such as picking up a virtual phone with the left hand in the VR world and using the right index finger to click the virtual space to confirm starting a game).

In terms of weight, a comfortable XR device should ideally be between 400-700g (although this is still considerably heavier compared to normal glasses weighing around 20g). However, achieving the desired clarity, refresh rate, level of interaction, matching rendering requirements (such as chip performance, size, and quantity), and several hours of basic battery life is a challenging trade-off process when it comes to the weight of XR devices.

In conclusion, for XR to become the next smartphone and a mainstream piece of consumer hardware, the device would need a resolution of 8k or higher and a refresh rate greater than 120Hz to prevent user discomfort. It should have a dozen or more cameras, a battery life of at least 4 hours (or longer, with intermittent breaks during lunch or dinner), minimal heat generation, a weight of less than 500g, and a price range of $500 – $1000. Although technology has advanced since the previous wave of XR enthusiasm between 2015-2019, achieving the aforementioned standards still presents challenges.

Even with the current limitations, users who try existing MR (VR + AR) devices will find that the experience, although not perfect, offers an immersion that cannot be matched by 2D screens. Still, there is significant room for improvement. Taking Oculus Quest 2 as an example, most available VR videos are at 1440p resolution, without reaching the Quest 2’s maximum resolution of 4K, and the refresh rate is far from 90Hz. Existing VR games also often feature relatively crude modeling and offer a limited selection of options for users to try out.

Killer App is Yet to Be Seen

The absence of a killer app is rooted in historical hardware limitations. Even with Meta minimizing its profit margins, MR headsets priced at a few hundred dollars, with relatively underdeveloped ecosystems, do not hold the same appeal as gaming consoles with their rich ecosystems and established user bases. The installed base of VR devices is estimated to be around 25-30 million, while the installed base of AAA game consoles (PS5, Xbox, Switch, PC) is 350 million. As a result, most manufacturers have abandoned support for VR, and the few games that do support VR are primarily aimed at expanding their overall platform reach, rather than being exclusively developed for VR devices.

Furthermore, as mentioned earlier, issues such as pixelation, motion sickness, limited battery life, and heavy weight prevent VR devices from providing a better experience than traditional AAA game consoles. The touted advantage of “immersion” by VR proponents is hampered by the lack of device penetration. Developers who include VR support in their games often struggle to create specialized experiences and interaction modes for VR due to the limited installed base.

Currently, when a player chooses a VR game over a non-VR game, they not only choose a new game but also forfeit the experience of socializing with the majority of their friends. Such gaming scenarios prioritize gameplay and immersion over social interaction. While VRChat might come to mind as a counterexample, a deeper exploration reveals that 90% of its users are not VR users but players sitting in front of a screen, wanting to socialize with new friends using various avatars. Consequently, it’s not surprising that the most popular VR games are rhythm-based games like Beat Saber.

Therefore, we believe that the emergence of a killer app requires the following elements:

Significant improvement in hardware performance and overall details. As mentioned in the “hardware readiness” section, this goes beyond simple improvements to screens, chips, and speakers. It requires comprehensive coordination between chips, accessories, interaction design, and operating systems. This is an area where Apple excels, as they have demonstrated with their cumulative experience developing operating systems for multiple device types over several decades.

A critical mass of user adoption. As discussed earlier, the “chicken and egg” problem—whether killer apps or XR device adoption should come first—is challenging. A killer app is unlikely to emerge with only a few million monthly active users on XR devices. During the peak of The Legend of Zelda: Breath of the Wild, game sales in the US even surpassed the installed base of the Nintendo Switch console itself. This is an excellent case study of how new hardware can achieve mass adoption. Users who purchase XR devices for the sake of experiencing XR will gradually become disappointed due to the limited content offerings, leading to discussions about their headsets gathering dust. However, players attracted by games such as Zelda will likely discover and explore other games within the Switch ecosystem.

Additionally, there is a need for unified user interaction patterns and stable device update compatibility. The former is easy to understand: using controllers and going without them produce different user behaviors and experiences, as the contrast between Apple’s Vision Pro and other VR devices on the market shows. As for device update compatibility, it can be seen in the iteration of Oculus hardware. A significant improvement in hardware performance within the same generation can sometimes constrain the user experience. The Meta Quest Pro, released in 2022, showed considerable improvements compared to the Oculus Quest 2 (also known as the Meta Quest 2) released in 2020. Compared to the 4K display of the Quest 2, the Quest Pro boasts a resolution of 5.25K, a 75% improvement in color contrast, and a refresh rate increased from 90Hz to 120Hz. In addition to the four external cameras used by the Quest 2 to understand the VR environment, the Quest Pro added eight external cameras that display colored imagery instead of black and white, significantly enhancing hand tracking and introducing facial and eye tracking. The Quest Pro also uses “foveated rendering” to concentrate computational power on the area the user’s eyes are focused on, reducing the level of detail in other areas to save processing power and battery life. In short, the Quest Pro offers far more features than the Quest 2, yet the user base for the Quest Pro might be less than 5% of the Quest 2 user base. This means that developers have to develop games simultaneously for both devices, greatly limiting the utilization of the advantages of the Quest Pro and reducing its attractiveness to users.

History rhymes, and the same story has unfolded multiple times in the world of gaming consoles. This is why console manufacturers have a generational cycle of hardware and software updates every 6-8 years. Users who purchased the first generation of the Switch, for example, do not have to worry about the incompatibility of new game software introduced on the Switch OLED and subsequent hardware updates. However, users who purchased the Wii series are unable to play games within the Switch ecosystem. Game developers targeting consoles are not designing for smartphones, which have an enormous user base (billions, versus roughly 350 million for consoles) and high user dependency (constantly carried, versus used for leisure at home). They therefore need a stable hardware experience over several development cycles to avoid excessive user diversion. Otherwise, like current VR software developers, they will have to ensure backward compatibility to maintain a sufficient user base.

Can Vision Pro solve the aforementioned issues? And what kind of transformation will it bring to the industry?

The Turning Point Brought by Vision Pro

At the Apple event on June 6th, the Vision Pro was unveiled. Based on our analysis of the challenges faced by MR (Mixed Reality) in both hardware and software, we can make the following comparisons:

Hardware:

In terms of visuals, the Vision Pro utilizes two 4K displays, resulting in a combined resolution of approximately 6K. This makes it one of the top-tier MR devices currently available. With a refresh rate of up to 96 Hz and support for HDR video playback, the Vision Pro boasts a high level of clarity and, according to tech bloggers invited to the test, virtually eliminates the feeling of dizziness.
In terms of audio, Apple has been implementing spatial audio in AirPods since 2020, allowing users to hear sounds coming from different directions, creating a three-dimensional audio experience. However, the Vision Pro is expected to take it even further. By utilizing “audio beamforming technology” and integrating LiDAR scanning into the device, the Vision Pro can analyze the acoustic characteristics of a room (including physical materials) and create a customized “spatial audio effect” that has directionality and depth, tailored to match the room’s acoustics.

In terms of interaction, the Vision Pro offers motion capture and eye-tracking without the need for any controllers, resulting in an incredibly smooth and seamless user experience. According to tech media’s hands-on experiences, there is virtually no noticeable delay. This is not only due to the precision of the sensors and the speed of computation but also because of the introduction of eye-gaze prediction, which will be further explained later.
In terms of battery life, the Vision Pro lasts for approximately 2 hours, which is comparable to the Meta Quest Pro. Although it is not considered impressive and has been a point of criticism for the Vision Pro, it’s important to note that the Vision Pro is powered externally and includes a small 5000mAh battery in the headset. This suggests that there may be room for battery swapping or additional power sources to extend endurance.
Regarding weight, according to tech media’s experiences, it weighs around 1 pound (454g), which is roughly on par with Pico and Oculus Quest 2. It should be considered a relatively good experience among MR devices (although this does not include the weight of the power pack strapped to the waist). However, compared to pure AR glasses weighing around 80g (such as Nreal, Rokid, etc.), it is still relatively heavy and may cause heat build-up. Of course, most AR glasses require connections to other devices and can only be used as an extension screen, that is, a blurred but large screen in front of the user’s eyes. In comparison, an MR device with its own chip and a truly immersive experience may offer a completely different experience.
Additionally, in terms of hardware computing power, the Vision Pro is equipped with the cutting-edge M2 series chip for system and program operations, as well as the R1 chip specifically developed for MR displays, environmental monitoring, eye-tracking, and gesture monitoring. The R1 chip is dedicated to providing MR-specific display and interaction capabilities.

On the software front, Apple not only benefits from its vast ecosystem of millions of developers, but it has also already made significant strides in the augmented reality (AR) space with the release of ARKit.

Back in 2017, Apple introduced ARKit, an augmented reality development framework compatible with iOS devices. ARKit enables developers to create augmented reality applications that leverage the hardware and software capabilities of iOS devices. Using the device’s camera, ARKit can create a map of the area and utilize CoreMotion data to detect objects such as desks, floors, and the device’s position in physical space. This allows digital assets to interact with the real world through the camera. For example, in Pokémon Go, you can see Pokémon hiding in the ground or perched on trees, seamlessly integrated into the camera view instead of simply appearing on the screen as it moves with the camera. Users do not need to perform any calibration, resulting in a seamless AR experience.

In 2017, the release of ARKit brought automatic detection of position, topology, and user facial expressions, allowing for modeling and expression capture.
In 2018, ARKit 2 was released, providing an improved CoreMotion experience, multiplayer AR games, 2D image tracking, and detection of known 3D objects such as sculptures, toys, and furniture.
In 2019, ARKit 3 was released, introducing further enhancements to augmented reality. People Occlusion was introduced, allowing AR content to be displayed in front of or behind people, with support for tracking up to three faces. Collaborative sessions were introduced, enabling new AR shared gaming experiences. Motion capture was made available to understand body positioning and movements, track joints and skeletons, enabling new AR experiences involving people rather than just objects.
In 2020, ARKit 4 was released, leveraging the built-in LiDAR sensor on 2020 iPhone and iPad models for improved tracking and object detection. ARKit 4 also introduced Location Anchors, allowing augmented reality experiences to be placed at specific geographical coordinates using Apple Maps data.
In 2021, ARKit 5 was released, giving developers the ability to build custom shaders, programmatic mesh generation, object capture, and character control. It also allowed for object capture using the built-in API as well as the LiDAR and camera capabilities in iOS 15 devices. Developers could scan an object and immediately convert it into a USDZ file, which could be imported into Xcode and used as a 3D model in ARKit scenes or applications. This greatly improved the efficiency of 3D model creation.
In 2022, ARKit 6 was released, bringing the “MotionCapture” feature that tracks characters in video frames and provides developers with a character “skeleton” that allows for the predicted position of the human head and limbs. This enables developers to create applications that seamlessly integrate AR content onto characters or hide behind characters, creating a more realistic integration with the scene.

Looking back at Apple’s groundwork on ARKit, which began in 2017, it becomes evident that Apple’s technological accumulation in the field of AR is not a result of overnight success. Instead, the company stealthily integrated AR experiences into widely adopted devices. By the time Vision Pro was released, Apple had already built a substantial foundation in terms of content and developer support. Moreover, thanks to the compatibility of ARKit development, the products created are not only targeted at Vision Pro users but also adaptable to a certain extent for iPhone and iPad users. Developers may not be constrained by the ceiling of 3 million monthly active users, but rather have the potential to test and experience their products with hundreds of millions of iPhone and iPad users.

Furthermore, the 3D video capture feature of Vision Pro partially addresses the current limitation in MR content production. Existing VR videos are mostly 1440p, which appear pixelated in the circular screen experience of MR headsets. With Vision Pro’s high-resolution spatial video capture and impressive spatial audio experience, it has the potential to greatly enhance the consumption of MR content.

Although the aforementioned features are already impressive, Apple’s imagination for MR doesn’t stop there. On the day of Apple’s MR release, a developer named @sterlingcrispin, who claims to have been involved in Apple’s neuroscientific development, stated:

Generally as a whole, a lot of the work I did involved detecting the mental state of users based on data from their body and brain when they were in immersive experiences.
So, a user is in a mixed reality or virtual reality experience, and AI models are trying to predict if you are feeling curious, mind wandering, scared, paying attention, remembering a past experience, or some other cognitive state. And these may be inferred through measurements like eye tracking, electrical activity in the brain, heart beats and rhythms, muscle activity, blood density in the brain, blood pressure, skin conductance etc.
There were a lot of tricks involved to make specific predictions possible, which the handful of patents I’m named on go into detail about. One of the coolest results involved predicting a user was going to click on something before they actually did. That was a ton of work and something I’m proud of. Your pupil reacts before you click in part because you expect something will happen after you click. So you can create biofeedback with a user’s brain by monitoring their eye behavior, and redesigning the UI in real time to create more of this anticipatory pupil response. It’s a crude brain computer interface via the eyes, but very cool. And I’d take that over invasive brain surgery any day.
Other tricks to infer cognitive state involved quickly flashing visuals or sounds to a user in ways they may not perceive, and then measuring their reaction to it.
Another patent goes into details about using machine learning and signals from the body and brain to predict how focused, or relaxed you are, or how well you are learning. And then updating virtual environments to enhance those states. So, imagine an adaptive immersive environment that helps you learn, or work, or relax by changing what you’re seeing and hearing in the background.

These neuroscience-driven technologies may mark a groundbreaking new way for machines and human intentions to synchronize. However, it’s important to note that Vision Pro is not without its flaws, one of which is its hefty price tag of $3499. This price is more than double that of the Meta Quest Pro and over seven times the price of Oculus Quest 2. Regarding this, Siqi Chen, the CEO of Runway, commented:

it might be useful to remember that in inflation adjusted dollars, the apple vision pro is priced at less than half the original 1984 macintosh at launch (over $7K in today’s dollars)

In this analogy, the pricing of Apple Vision Pro doesn’t seem excessively unreasonable. However, considering that the first-generation Macintosh only sold 372,000 units, it’s hard to imagine Apple, a company that invests heavily in MR, being comfortable with a similar situation. The reality in the coming years may not change significantly, as AR does not necessarily require glasses. In the short term, widespread adoption of Vision Pro may be challenging, and it may primarily serve as a tool for developer experiences and testing, a production tool for creators, and an expensive toy for tech nerds.

Despite these challenges, we can observe that Apple’s MR devices have begun to shake up the market, redirecting the interest of ordinary users towards MR and making the general public aware that MR has matured beyond mere PowerPoint or presentation video-like products. It makes users realize that there is an immersive head-mounted display option available alongside tablets, TVs, and smartphones. It also raises awareness among developers that MR may truly become the new trend in next-generation hardware. Furthermore, it prompts venture capitalists to recognize that this may be an investment field with tremendous potential.

Web3 Related Field

Conceptual Target of 3D Rendering + AI: RNDR

Introduction of RNDR

RNDR (Render Token) is a project that combines the concepts of metaverse, artificial intelligence (AI), and mixed reality (MR). Over the past six months, RNDR has gained attention and become a popular meme, leading the market.

RNDR is powered by Render Network, a protocol that utilizes a decentralized network for distributed rendering. The company behind Render Network is OTOY Inc, founded in 2009. OTOY is known for its rendering software OctaneRender, which is optimized for GPU rendering.

For regular creators, rendering locally can be resource-intensive for their machines, leading to a demand for cloud rendering. However, renting servers from providers like AWS or Azure for rendering purposes can be costly. This is where Render Network comes in. It allows rendering to go beyond hardware limitations by connecting creators with regular users who have idle GPUs. This enables creators to render their projects affordably, quickly, and efficiently, while node users can utilize their idle GPUs to earn some extra income.

For Render Network, there are two types of participants:

Creators: They publish rendering tasks and purchase credits using fiat currency or RNDR tokens. The Octane X software used for task publishing is available on Mac and iPad, and a fee of 0.5% to 5% is charged to cover network costs.
Node Providers (Owners of idle GPUs): Users with idle GPUs can apply to become node providers. Their eligibility for priority matching is determined based on their reputation from previous completed tasks. Once a node completes rendering, the creator reviews the rendered files and initiates the download. The fee locked in the smart contract is then transferred to the node provider’s wallet.

The tokenomics of RNDR went through changes in February, which contributed to its price surge. However, as of the time of writing, Render Network has not applied the new tokenomics to the network and has not provided a specific timeline for its implementation.

Previously, one $RNDR token had the same purchasing power as one credit, with 1 credit equal to 1 euro. When the price of $RNDR was below 1 euro, it was more cost-effective to buy $RNDR tokens than to buy credits with fiat currency. However, when the price of $RNDR rose above 1 euro, most users preferred to pay in fiat currency, leaving $RNDR without a use case. Although the protocol might buy back $RNDR tokens, other market participants had no incentive to purchase them.
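
The incentive problem in the old model comes down to a single comparison. Below is a minimal Python sketch, assuming (as described above) that one $RNDR buys the same amount of rendering as one credit and that one credit costs 1 euro. The function name and the sample prices are hypothetical and are not part of any Render Network software.

```python
CREDIT_PRICE_EUR = 1.0  # from the article: 1 credit = 1 euro


def cheaper_payment(rndr_price_eur: float) -> str:
    """Return the cheaper way to pay for the same rendering work.

    Assumes 1 $RNDR and 1 credit buy the same amount of rendering,
    so the only difference is what each unit costs in euros.
    """
    if rndr_price_eur < CREDIT_PRICE_EUR:
        return "rndr"
    if rndr_price_eur > CREDIT_PRICE_EUR:
        return "fiat"
    return "either"


# Hypothetical market prices in euros, for illustration only.
for price in (0.6, 1.0, 2.3):
    print(price, cheaper_payment(price))
```

Once the token trades above 1 euro, the rational choice is always fiat, which is why demand for the token from creators disappears under this design.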

Under the new economic model, inspired by Helium’s “BME” (Burn-and-Mint Equilibrium) model, when creators purchase rendering services with either fiat currency or $RNDR, $RNDR tokens equal to 95% of the purchase’s fiat value are burned, and the remaining 5% goes to the foundation as revenue for platform development. Node providers no longer receive earnings directly from creators’ purchases. Instead, they receive newly minted tokens as rewards, based not only on task completion metrics but also on factors such as customer satisfaction.
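
To make the 95% / 5% split concrete, here is a rough Python sketch of how a single purchase might be settled under the new model. It only encodes the split described above; the function name, the dictionary keys, and the sample job value and token price are assumptions for illustration, not Render Network’s actual implementation.

```python
BURN_SHARE = 0.95        # share of each purchase that is burned (from the article)
FOUNDATION_SHARE = 0.05  # share that goes to the foundation (from the article)


def settle_purchase(job_value_eur: float, rndr_price_eur: float) -> dict:
    """Split one rendering purchase into burned and foundation-held tokens."""
    if rndr_price_eur <= 0:
        raise ValueError("token price must be positive")
    # Token amount equivalent to the fiat value of the purchase.
    tokens = job_value_eur / rndr_price_eur
    return {
        "burned_rndr": tokens * BURN_SHARE,
        "foundation_rndr": tokens * FOUNDATION_SHARE,
        # Nodes are paid from newly minted emissions, not from the purchase.
        "paid_to_node_rndr": 0.0,
    }


# Hypothetical example: a 100-euro job with the token priced at 2 euros.
print(settle_purchase(100.0, 2.0))
```

The point of the sketch is that nothing from the purchase flows to the node directly: the purchase only removes tokens from circulation, and node income is decided separately by the emission schedule.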

It’s worth noting that in each new epoch (a time period whose length has not yet been specified), new $RNDR tokens will be minted, and the minting quantity is strictly limited. It decreases over time and is unrelated to the amount of tokens burned (see the official whitepaper’s release documentation). This change in token economics shifts the distribution of benefits for the following stakeholders:

Creators / Network Service Users: Each epoch, creators will receive a partial refund of the consumed RNDR tokens, with the proportion gradually decreasing over time.
Node Operators: Node operators will be rewarded based on the work completed, as well as factors such as real-time availability and activity.
Liquidity Providers: Liquidity providers for the DEX will also receive rewards to ensure there is sufficient RNDR available for burning.

*Source: https://medium.com/render-token/behind-the-network-btn-july-29th-2022-7477064c5cd7*

Compared to the previous income buyback mechanism, the new model lets node operators (miners) earn more when demand for rendering tasks is low. However, if the total price of rendering tasks exceeds the total amount of RNDR rewards released, node operators will earn less than under the previous model (that is, when burned tokens > newly minted tokens). In that case the $RNDR token may also become deflationary.
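
The trade-off can be sketched in a few lines of Python. This is a deliberately simplified model: it assumes that under the old mechanism nodes received what creators spent, that under the new mechanism nodes receive the epoch’s minted tokens, and it ignores fees, creator refunds, and liquidity rewards. All names and sample figures are hypothetical.

```python
def epoch_outcome(spent_rndr: float, minted_rndr: float) -> dict:
    """Compare one epoch under the old and new models, in token terms.

    spent_rndr:  tokens' worth of rendering bought by creators (burned
                 under the new model, ignoring the 5% foundation share).
    minted_rndr: tokens emitted to nodes in the epoch under the new model.
    """
    return {
        "old_model_node_income": spent_rndr,
        "new_model_node_income": minted_rndr,
        "nodes_better_off_in_new_model": minted_rndr > spent_rndr,
        # Negative value means supply shrinks, i.e. the token is deflationary.
        "net_supply_change": minted_rndr - spent_rndr,
    }


# Hypothetical low-demand and high-demand epochs with the same emission.
print(epoch_outcome(spent_rndr=50_000, minted_rndr=100_000))
print(epoch_outcome(spent_rndr=150_000, minted_rndr=100_000))
```

In the low-demand case nodes earn more than before and supply grows; in the high-demand case nodes earn less than before and supply shrinks.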

Although the price of $RNDR has seen significant growth in the past six months, the business situation of Render Network itself has not experienced a substantial increase as reflected by the token price. The number of nodes has remained relatively stable over the past two years, and the monthly allocation of $RNDR tokens to nodes has not significantly increased. However, there has been an increase in the number of rendering tasks, indicating a shift from single large transactions to multiple smaller transactions allocated by creators to the network.

*Source: https://dune.com/lviswang/render-network-dollarrndr-mterics*

While Render Network’s business has not matched the roughly five-fold rise in its token price over the past year, its Gross Merchandise Value (GMV) did grow in 2022, increasing by 70% compared to the previous year. According to the allocation of $RNDR tokens to nodes on the Dune dashboard, the GMV for the first half of 2023 is approximately $1.19 million, showing little growth compared to the same period in 2022. This level of GMV is clearly small relative to the $700 million market capitalization.
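
The gap between valuation and business volume can be put into a single number. The short Python sketch below uses the two figures quoted above and assumes, purely for illustration, that the second half of 2023 matches the first half. The function name is hypothetical.

```python
MARKET_CAP_USD = 700_000_000  # from the article
GMV_H1_2023_USD = 1_190_000   # from the article (Dune dashboard estimate)


def cap_to_gmv_multiple(market_cap: float, half_year_gmv: float) -> float:
    """Market cap divided by annualized GMV.

    Assumption: the full-year GMV is twice the half-year figure.
    """
    annualized_gmv = half_year_gmv * 2
    return market_cap / annualized_gmv


multiple = cap_to_gmv_multiple(MARKET_CAP_USD, GMV_H1_2023_USD)
print(f"Market cap / annualized GMV: {multiple:.0f}x")
```

Under this assumption the market capitalization is close to 300 times annualized GMV, and GMV is transaction volume rather than protocol revenue, so the multiple on revenue would be higher still.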

*Source: https://globalcoinresearch.com/2023/04/26/render-network-scaling-rendering-for-the-future/*

Potential Influence on RNDR Caused by Vision Pro Release

In the Medium article released on June 10th, Render Network claimed that the rendering capabilities of Octane for M1 and M2 are unique. Since Vision Pro also utilizes the M2 chip, rendering on the Vision Pro device would be no different from regular desktop rendering.

However, the question arises: why would someone choose to publish rendering tasks on a device with a battery life of only 2 hours? It is only when the Vision Pro device becomes more affordable, significantly improves battery life, and reduces weight, enabling mass adoption, that Octane might truly come into play.

What can be confirmed is that the migration of digital assets from flat devices to MR (Mixed Reality) devices will increase demand for infrastructure. When Unity announced a collaboration with Apple to make its game engine more compatible with Vision Pro, its stock price rose 17% the same day, indicating market optimism. With Disney partnering with Apple, the 3D conversion of traditional film and television content may see similar demand growth. Render Network, which specializes in film and television rendering, introduced AI-powered 3D rendering based on NeRFs (neural radiance fields) in February this year. By combining AI computation with 3D rendering, NeRFs produce real-time immersive 3D assets that can be viewed on MR devices. With Apple’s ARKit support, anyone with a higher-end iPhone can use Photoscan to generate 3D assets, and NeRF technology then renders these rudimentary Photoscan assets into immersive 3D assets that refract light from different angles. This spatial rendering will be a crucial tool for content production on MR devices and could bring Render Network considerable demand.

Whether RNDR will be able to meet this demand remains uncertain. Considering its GMV of $2 million in 2022, it is relatively insignificant compared to the costs invested in the film and television industry. Therefore, while RNDR may continue to ride the wave of the “Metaverse, XR, AI” trend and enjoy a surge in token price, generating revenue that matches its valuation remains a significant challenge.

Metaverse – Otherside, Sandbox, Decentraland, Highstreet, etc.

While I believe the core dynamics are unlikely to change fundamentally, discussions of MR inevitably turn to several major metaverse projects: Yuga Labs’ Otherside, Animoca’s The Sandbox, the pioneering blockchain metaverse Decentraland, and Highstreet, which aims to become the Shopify of VR worlds. (For a detailed analysis of the metaverse sector, please refer to section 4, “Business Analysis – Industry Analysis and Potential,” in the article at https://research.mintventures.fund/2022/10/14/full-report-apecoin-values-revisited-with-regulations-overhang-and-staking-rollout/)

However, as noted earlier in the discussion of the missing “Killer App,” most existing developers that support VR do not focus solely on VR. Even for those that excel in VR, dominating a niche market of around a million MAU (monthly active users) does not amount to an overwhelming competitive advantage. Existing products have not been adapted in detail to MR user habits and interactions, and unreleased projects start from much the same line as the large corporations and startups that have recognized Vision Pro’s potential. As Unity integrates more closely with Vision Pro, the learning curve for MR game development should flatten, and the limited market experience accumulated so far may not transfer easily to a product on the verge of mass adoption.

It is worth noting that projects that have already ventured into the VR space may have a slight advantage in terms of development progress, technological expertise, and accumulated talent.

One More Thing

If you haven’t seen the video below, it offers a firsthand look at the MR world: convenient and immersive, yet also chaotic and disordered. The integration of the virtual and the real is so seamless that people who have grown used to it see “losing their identity within the device” as a catastrophic event. The details in the video may still feel like science fiction and be hard to grasp at present, but this is likely the future we will face within the next few years.

This reminds me of another video from 2011, exactly 12 years ago, when Microsoft released Windows Phone 7 (as a Gen Z with limited memory of that era, it’s hard to imagine that Microsoft once put a lot of effort into smartphones). They created a satirical advertisement titled “Really?” about smartphones: people in the ad were constantly glued to their phones, staring at them while riding bikes, sunbathing on a beach, holding them tightly even in the shower, falling down stairs because they were distracted, and even dropping their phones into the toilet… Microsoft’s intention was to show users that their phones would rescue them from smartphone addiction. However, it turned out to be a disastrous attempt, and the “Really?” advertisement could have been renamed “Reality.” The immersive presence and intuitive interaction design of smartphones are more captivating than the seemingly anti-human “computer-in-a-phone” concept, just as the blending of virtual and reality is more addictive than pure reality.

To navigate such a future, we are exploring several directions:

Creating immersive experiences and narratives: Firstly, in terms of video, with the release of Vision Pro, shooting movies with “3D depth” has never been easier. This will change the way people consume digital content, shifting from “distant appreciation” to “immersive experiences”. Beyond video production, another area worth exploring is “3D spaces with experiential content”. This does not just mean building generic scenes from a template library or extracting a few seemingly explorable spaces from games. It refers to spaces that offer interactive, native content, and are more 3D friendly. These spaces could include a handsome piano instructor who sits next to you, highlighting corresponding keys and offering gentle encouragement when you’re feeling frustrated. They could feature a mischievous sprite hiding a key to the next level of a game in a corner of your room. Or they could even include a virtual girlfriend who understands you and walks with you. The emergence of such spaces will give rise to a creator economy that leverages blockchain technology for trust, automated settlements, asset digitization, and low communication friction in transactions. Creators can interact directly with their fans, without the hassle of registering a company or setting up Stripe for payments, and without giving up 10% (Substack) to 70% (Roblox) of their earnings to platforms. They won’t need to worry about platforms shutting down and taking their hard work with them. With a wallet, a composable content platform, and decentralized storage, these concerns can be addressed. Similar upgrades will occur in gaming and social spaces. In fact, the boundaries between gaming, movies, and social spaces will become increasingly blurred. When the experience is no longer a large floating screen at a distance but is right in front of you with depth, distance, and spatial audio interactions, players will no longer be mere “spectators” but active participants. Their actions can influence the virtual world environment (e.g., raising your hand in a jungle, causing butterflies to fly to your fingertips).
Infrastructure and communities for 3D digital assets: The 3D shooting capabilities of Vision Pro will significantly reduce the complexity of creating 3D videos, leading to a burgeoning market for content production and consumption. Upstream and downstream infrastructure for material trading, editing, and more may continue to be dominated by existing giants, or it may open up new horizons for startups, as seen with AIGC.
Hardware and software upgrades to enhance immersive experiences: Whether it’s Apple’s research into “more detailed observations of the human body to create adaptive environments” or advancements in haptic feedback and sensory experiences, there is substantial potential in this field.

Of course, entrepreneurs in this field likely have a deeper understanding, thoughtful insights, and more creative explorations than we do. I welcome DMs at @0xscarlettw to discuss and explore the possibilities of the spatial computing era.

Acknowledgements & References

I would like to express my gratitude to @fanyayun, partner at Mint Ventures, and @xuxiaopengmint, research partner at Mint Ventures, for their valuable suggestions, review, and proofreading during the writing of this article. The analysis framework for XR is derived from the series of articles by @ballmatthew, Apple WWDC, developer courses, and my own experiences with various XR devices available in the market.

https://www.youtube.com/watch?v=YJg02ivYzSs
https://www.bilibili.com/video/BV1Ps4y1q7K2/?share_source=copy_web&vd_source=fc6336b5a0337d489d6eaf7ae486e621
https://www.youtube.com/watch?v=OFvXuyITwBI
https://twitter.com/DrJimFan/status/1665794601154916352
https://www.matthewball.vc/all/why-vrar-gets-farther-away-as-it-comes-into-focus
https://twitter.com/blader/status/1666007944285274113?s=20
https://twitter.com/FEhrsam/status/1665817199284559873
https://mirror.xyz/0x30bF18409211FB048b8Abf44c27052c93cF329F2/6xR2nFi-Q5WdXIDZpEga4xS3m3AZ61hXyu6dzIEBb_E
https://rndr.gitbook.io/render-network-foundation-governance/
https://docs.google.com/spreadsheets/d/1vgNamfJsJeCOUnFGtrdBw7GJCtN25bXEIFOluJQAO64/edit#gid=365524340
