Blog | Magnopus

AJ Sciutto speaks to Meta about the creation of Alo Moves XR
External | Fri, 11 Oct 2024
https://www.meta.com/en-gb/blog/quest/alo-moves-xr-magnopus-yoga-pilates-mindfulness/

To celebrate the launch, Director of Virtual Production AJ Sciutto spoke with the Meta Quest blog team about the entire process of creating the experience, from how the partnership came about to lessons learnt, and shares why immersive experiences are the future of fitness. Read the full in-depth interview, and check out Alo Moves XR on the Quest Store!


From pages to spaces: the internet's evolution and a new era of interoperability
Sam Birley | Fri, 11 Oct 2024

Today, the internet is a series of connected pages. You move from one link to another, viewing two-dimensional pages on your device. It's not how humans were built to reason about the world, but it's a self-evident truth that it revolutionized how humanity communicates.

But as with all things in life, technology evolves.

With hardware such as Orion, Vision Pro, Quest, and Ray-Ban Meta smart glasses now available or in active development, it's increasingly easy to imagine that the internet of tomorrow will no longer be a series of connected pages, but a series of connected spaces. A spatial web.

As humans, we're hard-wired to reason about the physical world – it's in our DNA. When we interact with objects, we have expectations about how they'll behave. When we change something in the real world, we expect it to stay changed. We expect others to see the same thing in the same place, and we expect our interactions to have meaning.

It stands to reason then that a spatial web should behave like this too. A spatial web introduces not just a literal third dimension, but also a means to, finally, meaningfully connect our digital and physical realities. It's not just about making the internet 3D; it's about bringing it into our physical world in a way where it behaves and reacts to you based on the same paradigms as reality.

That might sound impossible with today's technology, but it's not. With the Connected Spaces Platform (CSP), a platform we designed to support the concept of an interoperable spatial web, the idea of giving users a way to reason about the digital world as they do with the physical world isn't science fiction; it's engineering, and many of the technical challenges have now been solved.

For instance, the services required for a spatial web aren't that different from those we depend on today: user accounts, asset management, geolocation, VOIP. Stuff that's been around for decades. Sure, there are new ones that need to tie into those (such as services for handling large volumes of users in persistent worlds, and anchored digital realities superimposed over the physical world), but it's feasible. And not just theoretically feasible – we've proven it works with our own Magnopus Cloud Services (which we've designed to plug natively into CSP).

So we know that with a set of spatial web services backing it, whichever those are, CSP allows for many users to be co-present in the same physical space, all of whom are seeing a digital layer superimposed over reality, at the same time and in the same place. And when someone changes something in that digital layer, others can see the same change in real-time, and it persists, just as the marks we make on the real world do. That's incredibly powerful.

But communicating that idea has been a challenge. Blog posts, documentation, source code – none of them really cuts it when it comes to conveying the value of this idea. Much like VR, it's tough to fully grasp this concept with words alone. You can't truly understand how it feels to be in a VR experience until you put on the headset, and you can't really appreciate the impact of a persistent spatial web without seeing it for yourself.

Companies like Epic Games have successfully shown the power of first-party titles as a vehicle to visually demonstrate the value of the software underpinning them. It's been a key factor in Unreal Engine's widespread adoption across the gaming industry, creating many outstanding examples that have allowed people to truly understand what can be achieved with the technology.

Which is where OKO comes in…

OKO

So, what exactly is OKO? Functionally, it's a set of applications we've been creating over the past four years, all built leveraging the Connected Spaces Platform. Of those, we have three that are publicly available.

  1. An iOS app (built in Unity) for capturing reality and supporting co-presence through AR.

  2. An Unreal Engine-based client built on top of CSP's C++ API. This allows experienced Unreal users to create best-in-class content with the tools and workflows they're already familiar with, and easily bring that content into OKO spaces.

  3. A browser-based web client that leverages CSP's TypeScript API, providing users a frictionless way to access or even build their own spaces on any device, anytime, anywhere.

Since all of these applications are built on CSP, and since interoperability is itself a core value of CSP, they all communicate seamlessly with one another (despite being built in different engines, tech stacks, and languages). Using the best features of each engine and platform, they make use of nearly all of the features of CSP, and so are a wonderful testament to what's possible with the library.

So, if you want to understand whether the features and concepts in CSP might be useful or valuable to you, you will find your answer through OKO.

Let's be clear though: OKO is not a game engine like Unreal or Unity. It's not a DCC tool like Blender or Maya. It's not an open-source editor like Godot, or even an asset-sharing platform like Fab.

OKO is a suite of entirely interoperable apps that enable creators to build things like digital twins, virtual production scouting, and cross-reality events. And with OKO, they will all be interactive, persistent, and connected; synchronized across realities. OKO is a spatial web ecosystem and anyone can create an account for free.

It's been a journey to get here, and the path we've taken is worth knowing about, because it tells you where we're going.

We've been fortunate to be in this space since the beginning. Through a lot of hard work, talent, and more than a little luck, we've gone on to work on and deliver some of the most challenging and interesting problems in mixed reality and cross-reality.

The OKO journey started with the World Expo in Dubai and a wonderfully evocative brief from Her Excellency Reem Al Hashimi: "Connecting minds, creating the future." We wondered: "Could we share the experience of the physical site with people around the world, and connect them across both?"

With the support of the leadership of Expo to pursue this ambitious vision, we assembled a diverse team of over 200 engineers and artists, working across seven countries. Together, we created a city-scale cross-reality space where on-site and remote visitors to Expo 2020 Dubai could connect in real-time in shared experiences.

As you might expect, there isn't a lot of off-the-shelf tech to solve a problem like that. To deliver the project (which at 4.38 km², four years on, is still the largest ever anchored digital twin) we had to build a lot of tech from scratch. And after we were done, it became clear to us that a lot of what we had invented was incredibly powerful. That a less opinionated version of the technology would be transformative for a whole range of industries.

And that's when CSP and OKO were born.

Since then, we haven't stopped building them. As the industry changed, and the projects rolled through, we've continued to learn about what makes sense in this space, what problems need solving, what problems don't (also interesting!), and through OKO, we've expressed our opinion on how to solve those problems. To date, we've mostly built OKO for us; we don't expect it to be all things to all people when it comes to the spatial web, but we think it will help many see what's possible and build their own.

One notion we increasingly get asked about is digital twins. Typically they're hard to create, hard to deliver, and even harder to anchor to reality with their physical counterpart. Why can't creating them be turnkey?

It's satisfying to answer those questions with OKO, because as a spatial web ecosystem, that's one of the things it does really well. Being able to capture a real-world environment, anchor it, and bring it into Unreal to work on it, all in the space of five minutes, still feels like magic to me – and I work on it daily.

As you might expect then, we frequently use OKO to create digital twins for a range of purposes, across diverse environments and industries, for our clients and even for ourselves.

The internal ones are my favorite. For demos and internal meetings, we have two anchored digital recreations of our own offices, built with OKO using Unreal, accessible via our Unity iOS client or browser-based web client, and anchored to the real-world offices in AR.

Any change anyone makes to the digital office stays. Anyone can visit at any time and be co-present with anyone else, even if one person is physically there and another is not. Cross-reality meetings are a hoot.

It often blows people's minds when they see it in action.

And what gets me really excited is that none of this is hidden Magnopus magic. None of this is gated away. It's accessible to anyone with an OKO account. We've taken everything we've learnt from every mixed reality project we've delivered and found a commonality that lets us, and now others, move faster.

I could rhapsodize all day about OKO and how awesome it is. But I'm not a salesperson, just someone who's been fortunate enough to have caught a glimpse of the future.

So, instead, try it for yourself. The spatial web is closer than you think.

Designing Alo Moves XR
Jesse Warfield | Thu, 26 Sep 2024

Alo Moves XR transforms how users experience yoga, Pilates, and mindfulness, making it feel like you're attending a personalized fitness class with a stunning view – all from the comfort of your home. Staying true to Alo Moves' mission of empowering people to live healthier and more fulfilling lives, this new XR experience continues that vision. Here's a behind-the-scenes look at how we collaborated with Alo Moves and Meta to create this groundbreaking fitness experience.

Partnering with Alo Moves

From the outset, our collaboration with Alo Moves and Meta centered on redefining the quality of existing fitness experiences. We aimed to create an immersive, mixed-reality environment that went beyond the typical 2D video or VR fitness experiences, transporting users while ensuring they maintained spatial awareness and fully engaged in their practice.

We experimented with various capture methodologies to determine the best way to present instructors within the headset. Our goal was to deliver a sense of presence and fidelity that made users feel like they were receiving a personalized class from a world-renowned Alo Moves instructor.

We tested stereo footage with the Red Komodo-X using a 3ality Beamsplitter, the Z-Cam Pro, and volumetric video. While stereo footage provided some sense of presence, the volumetric instructor was the clear winner, offering a truly amazing next-generation experience. We all experience the world spatially; nothing compares to being able to practice with an instructor and walk around them, or pause and rotate the three-dimensional mini-instructors to study a pose.

Volumetric Instructors

Alo Moves worked closely with their instructors to design classes specifically for the Quest headset. Partnering with Metastage, a leading Volumetric Capture & XR production studio, we captured the instructors' classes using 106 cameras on their stage. 

This process produced volumetric assets that offer users a more lifelike and immersive experience, preserving the natural movement and detail of the instructors while supporting longer, uninterrupted sessions not generally associated with traditional volumetric capture.


The experiences are streamed directly to the Quest 3 headset via a content delivery network (CDN), utilizing video compression techniques similar to those found in traditional video streaming services. 

This data stream is then assembled in the headset into a fully immersive volumetric asset. This method not only minimizes the load on the headset, keeping the experience lightweight and responsive, but also allows for frequent and seamless content updates. This is especially important for ensuring a smooth user experience, while maintaining the high-quality performance expected of modern applications.
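
As a purely conceptual sketch (not the actual Alo Moves XR code), the playback loop for a pipeline like this might look as follows; the VolumetricFrame type and the FetchSegment / DecodeVolumetricFrame / SubmitToRenderer helpers are hypothetical placeholders standing in for the real CDN fetch, volumetric codec, and renderer.

#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch of a streamed-volumetric playback loop: pull compressed
// segments from a CDN, decode them on-device, and hand the result to the renderer.
struct VolumetricFrame
{
    std::vector<float>        Positions; // decoded vertex positions for this frame
    std::vector<std::uint8_t> Texture;   // decoded colour atlas for this frame
};

// Placeholder helpers; a real client would use its HTTP stack and volumetric codec here.
std::vector<std::uint8_t> FetchSegment(const std::string& Url) { (void)Url; return {}; }
bool DecodeVolumetricFrame(const std::vector<std::uint8_t>& Bytes, VolumetricFrame& Out)
{
    (void)Out;
    return !Bytes.empty();
}
void SubmitToRenderer(const VolumetricFrame& Frame) { (void)Frame; }

void PlayClass(const std::string& ManifestUrl, int SegmentCount)
{
    for (int i = 0; i < SegmentCount; ++i)
    {
        // Each segment is a small, independently decodable chunk, which is what
        // lets content be updated server-side without shipping a new app build.
        const std::vector<std::uint8_t> Bytes =
            FetchSegment(ManifestUrl + "/segment_" + std::to_string(i));

        VolumetricFrame Frame;
        if (DecodeVolumetricFrame(Bytes, Frame))
        {
            SubmitToRenderer(Frame); // keeps on-device memory and CPU load low
        }
    }
}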

The combination of immersive visuals and photorealistic 3D instructors – including a "mini-instructor" that can be repositioned for better pose reference – creates a truly first-of-its-kind training experience.

The Portals

A key feature of the Alo Moves XR experience is the inclusion of two distinct portal types, each designed to transport users into different virtual environments that enhance their workouts and meditative practices. 

The Environmental Portal, which accompanies the fitness experiences, transports users to serene locations like Thailand and Spain, further enhancing immersion. Initially, we experimented with stereo footage and 3D elements to heighten realism, but ultimately chose an 8K photo sphere for its crystal-clear resolution and minimal resource draw. The photo sphere's curved design, along with subtle animations as part of the "window effect", gives the illusion of depth and parallax, making you feel like you are overlooking a vast vista.

The Mindfulness Portal was crafted to deliver a more introspective experience. In collaboration with Alo Moves meditation instructors, we developed a calming visual panorama designed to soothe rather than distract. The scene incorporates seamlessly looping, randomly generated elements that create an ever-changing yet tranquil atmosphere, as well as environmental lighting elements.

The serene scenes set the tone of the meditation and allow users to choose between watching the visuals or closing their eyes and simply listening to the experience. Mixed reality was chosen to allow users to reposition the portal, whether sitting or lying down, so they have total flexibility with their meditation practice.

Interactions and UI

Alo Moves XR was designed to be controller-free, with interactions optimized for hand tracking. Drawing inspiration from real-life interactions, we aimed for an intuitive and accessible design. The hands-first interface includes a white highlight indicator to show interactable elements, ensuring that interactions are easy but not prone to accidental activation.

A first-time user video helps guide newcomers through the core interactions when they first launch the experience, ensuring a smooth and intuitive onboarding process. The user interface (UI) has undergone a significant update from Alomoves.com, integrating features designed to take full advantage of the mixed-reality environment. Despite these enhancements, it retains the key functionalities and familiar design elements that long-time Alo Moves users are accustomed to, ensuring that both new and experienced users feel comfortable navigating the platform.

Personalization

Alo Moves is committed to promoting a healthy lifestyle, and personalization plays a crucial role in this. Rather than creating pressure to engage daily, our achievement system encourages long-term consistency, rewarding users for building sustainable habits over time. 

User class history is stored on our backend services, generating class recommendations based on usage, surveys, and favorites. The profile page displays overall activity and mindfulness versus fitness minutes, and allows users to filter class history by week, month, or year. These personalized features help users build long-term healthy habits.

Our collaboration with Alo Moves, Meta, and Metastage has opened up new possibilities in the wellness space, blending cutting-edge technology with fitness in ways that resonate with seasoned practitioners and newcomers alike. Alo Moves XR expands the horizons of wellness experiences, making them more accessible, immersive, and engaging. As this platform continues to grow, we're excited to see how it redefines what it means to pursue fitness and mindfulness in a progressive, digitally connected world.

Meet the Magnopians: Roo MacNeill
Magnopus | Mon, 23 Sep 2024

With over 11 years of experience in the industry, Roo has worked on everything from small-scale TV advertisements up to AAA game titles, as well as visual effects for Oscar- and BAFTA-winning films. Highlights include Christopher Robin, Avengers: Infinity War, and Transformers: Rise of the Beasts.

Born and raised in Inverness, Roo firmly believes that location should never be a barrier to success and that hidden talent and ambition exist in the most remote places; they just need to be nurtured.


Tell us more about your role at Magnopus

I'm a Lead Artist working on real-time projects. It's a role that includes a very hands-on approach to builds and creation, as well as team management and keeping our department tied in with all the other studio disciplines.

What attracted you to Magnopus in the first place?

After years of uncertainty in studios and roles where there was a constant fear of job loss or stagnant progress, Magnopus caught my eye. It looked super dynamic, career progression looked very possible, and the fast-paced change of projects and style really matched my interests. Throughout the interview process, I was drawn to the strong focus on exploring new techniques and technologies. The application process was a little longer than usual, but it's built to ensure the right people are in the right places for the long term! It ticked every box I had, and having only been here for a few months, it already feels like home.

What made you decide to pursue a career in this field?

I had a free CD trial of  from a random magazine (which barely ran on the family PC hidden in a cupboard, as the internet was still pretty fresh and no one really knew what they were meant to do with it). With that, I discovered I could make sandcastles without having to go to the beach, and *boom*, that was me off to the races.

I have built my entire career in the film side of VFX. As cool as it's been to work on massive projects, I had hit a bit of a brick wall. I was in a spot where career progression had drastically slowed down in a system where you have to wait for the next one up to drop off the board rather than progressing because of your skillset. 

Mixed in with a very pigeonholed approach to tasking, and the wild instability of film contracting, I needed to make a change and move forward. It's very energizing to feel challenged again rather than copy-paste the same process over and over for another Marvel film.

Real-time is almost an entirely new field to me, and it's so nice after grinding away to find fresh new challenges to remind me why I got into all of this in the first place – to make awesome stuff that brings me joy.

What is the biggest lesson you've learned in your career?

No matter the stress, the budget, the company, the project, the hardware, or the software, you need your team at the end of the day. That's where the real work is done. Invest as much time as you can into developing them, growing with them, and above all, developing a social work relationship with them. Make time for meetings and catch-ups no matter what your calendar looks like. Everyone has specialties and hidden knowledge. Budgets are always tight, timelines are always messy, and these are things that can be worked around, but if you burn an artist out, more often than not this can't be repaired.

What's your favorite thing to do when you're not working?

Outside of work, I'm a massive photography fan. I spend the majority of my free time shooting and editing. I find it's a great way to keep the creativity up in a fun way on my own schedule, and it's a place to further develop skills and keep up with advancing software.

What's your special skill?

The ability to find super tasty street food in countries I have never been to. I wish I could say it's from some exquisite palate or superpower of taste, but in reality, it's just from my deep binges on YouTube, so I know what 'vibes' to look for.

If you had unlimited resources and funding, what project or initiative would you launch?

I would love to create a full-length CG film staffed entirely by final-year students and self-learners, to capture the super-high creativity of those about to get started. It would give people the opportunity to experience what it's really like to work in a studio 鈥 something which I think is massively missing from the university and self-learning experience.

How do you approach challenges and setbacks?

On a case-by-case basis. You can't always plan for things breaking. The main thing I do is try to approach each one with a positive mindset rather than complaining and fighting against an idea. As a Lead, if you go into a problem with a defeated mindset, how can you expect your team to be motivated? Vent your frustrations in a way that doesn't affect the team and get to work on finding a solution; that's where most of my learning comes from. Going to the pub is always good.

What are you reading/listening to right now?

In a desperate attempt to remember all the Polish that I have forgotten, I'm watching shows I know with Polish dubbing to try to get immersed in the language again. It's the best place I have lived so far; I need more of that mountain cheese in my life.

What's your tactic for surviving a zombie apocalypse?

I would try to get the zombies on my side by grilling and serving up Michelin-star survivors to them. Everyone wants to keep a good chef around (even the undead), so if they aren't trying to kill me, it's gonna save me a massive amount of cardio.

Ben Grossmann, Magnopus CEO, speaks to A16Z Games on how video game tech is powering a revolution in Hollywood
External | Wed, 28 Aug 2024
https://a16zgames.substack.com/p/how-video-game-tech-is-powering-a

From The Mandalorian to Fallout, games tech is changing Tinseltown. Oscar-winning visual effects supervisor Ben Grossmann and virtual production developer Johnson Thomasson tell A16Z Games how.


The music is the gameplay
Dan Taylor | Thu, 15 Aug 2024

How to make an interactive rock concert as epic as possible.

As soon as we started work on bringing Metallica to Fortnite, we knew we wanted to push the boundaries of both interactivity and visual immersion. We wanted to take inspiration from previous Fortnite concerts like the visually explosive Travis Scott's Astronomical and the deeply interactive The Kid LAROI's Wild Dreams, and combine them to create insane visuals with satisfying interactivity.

Fast-forward a few months, and we'd realized that we'd have to get seriously creative to fuse spectacle and gameplay into a seamless, musical experience. Here's what we learned in the process…

Get the full playthrough of the Metallica x Fortnite event in this rock-powered video.

Cognitive load & simplicity

The players' ability to ingest and respond to what they're seeing is a critical factor in any interactive experience; however, when that experience is synced to music, everything happens to the beat, and players are swept along with the tune, rather than having control of what happens when. We took a few crucial steps early on to help prevent overstimulation for players.

First off, we set out a clear structure where sections of the experience would alternate between heavy interaction and a visual celebration of the band.

The "Stadium" sections were designed to recreate the feeling of being at Metallica's world-famous live shows, providing players an opportunity to get right up next to the band and appreciate the painstakingly mocapped performance.

The first stadium section - where players could rock out to Hit The Lights.

Each segment was book-ended with a brief cut-scene to provide a narrative through-line so any changes in activity or location made sense. In retrospect, these probably could have been longer, as they were critical to the flow – something to bear in mind when designing the structure of your experience.

For our interactive sections, we made sure that the core gameplay would be familiar to Fortnite Battle Royale players 鈥 driving, jumping, shooting and so forth. 

During these segments, all key visual elements were placed directly in the players' sight line, so they never had to miss the show while playing.

Band members warp in on the outside of a circular path, keeping them in the players' sight line.

And finally, we employed reductive design to focus on the essence of each section, to keep things as simple as possible.

The problem with creating simplicity is that it's harder than you might think. Less content requires less design, right? Unfortunately, the usual path to a beautifully simple design is to make something really complicated, then slowly chip away the noise until the core experience shines through.

For example, one of our high-level objectives for this experience was to recreate the physical chaos of the mosh pit using familiar Fortnite mechanics. We spent a lot of time experimenting with bumpers, flippers, d-launchers, impulse grenades, bounce pads, and more, each deployed in myriad combinations. We found a number of executions that really nailed the vibe and fun of the mosh pit, but somehow they took the user out of the concert experience. In the end a simple up-and-down bounce was all we needed.

Difficulty & pacing

It's worth noting that the simplicity of a game is not necessarily linked to its difficulty. Even a concept as elegantly simple as Tetris can be challenging for the player when the appropriate parameters are switched up. We needed to make sure that players of all skill levels could enjoy the experience on their first playthrough, and never fall out of step with the musical flow.

We also wanted to change the gameplay with each new song. We had five songs to fit into a 10-minute experience, so we limited each song to around two minutes, which didn't allow a lot of time for gameplay mechanics to develop.

In any interactive experience, players need to instantly understand what they have to do, so we deployed short, in-world text at the start of each section to make any objectives extremely clear: "Grind the lightning!", "Ascend the clock tower!", "Get to the show!". These simple instructions were tricky to get right, because the fewer words you have, the more each of them matters. When possible, we made sure these objectives had a visual component in the level to make sure players knew exactly where to go.

Players need clear objectives, both written and visual, to provide context, motivation and direction.

To correctly pace the levels we had to break our core mechanics down into tunable metrics (the numerical parameters of gameplay).

For example, if players have a gap to jump, there are parameters you can tune like the width of the gap, the relative height of the landing, or the size of the landing area, all of which will affect how difficult it is for players to successfully traverse that gap. Using these parameters, it's possible to define what easy, medium, and hard versions of your gameplay elements look like.
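
To make that concrete, here is a small illustrative sketch (made-up values, not data from the actual project) of how jump-gap metrics can be captured as plain tunable data per difficulty tier:

#include <cstdio>

// Illustrative only: tunable metrics for a jump gap, with presets per difficulty tier.
struct FJumpGapMetrics
{
    float GapWidth;      // horizontal distance to clear (metres)
    float LandingHeight; // landing height relative to take-off (metres)
    float LandingDepth;  // how much room the player has to land on (metres)
};

constexpr FJumpGapMetrics EasyGap   {2.0f, 0.0f, 4.0f}; // wide landing area, level jump
constexpr FJumpGapMetrics MediumGap {3.5f, 0.5f, 2.5f}; // longer jump, slight climb
constexpr FJumpGapMetrics HardGap   {5.0f, 1.0f, 1.5f}; // close to the limit of the movement metrics

int main()
{
    // A section's layout can then be authored (or validated) against a tier.
    std::printf("Hard gap: %.1fm wide, %.1fm landing area\n",
                HardGap.GapWidth, HardGap.LandingDepth);
    return 0;
}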

We made sure that the first time a player encountered a mechanic, it was the easiest possible variant, and the section contained that singular mechanic only, reducing the cognitive load and creating a gentle onboarding for each element. This was most evident in "For Whom The Bell Tolls", where we introduced players to rhythmic steps, swinging bells, and rotating cogs, each in their own isolated sections, before bringing all those elements together for the most intense part of the song.

This parametric approach can also be used to map the gameplay to the music. For each track, we mapped out the various segments of gameplay directly over the top of the music's waveform, making sure they matched in both intensity and theme.

The initial gameplay map for For Whom The Bell Tolls.

We also went to great pains to make sure that no matter a player's skill, nobody got left behind. Different techniques were deployed for each section – in the race section for "Lux Æterna" we tuned down the boost on the cars so the difference in speed was mostly perceptual (shhh… don't tell anyone!). For other sections, we created a dynamic respawn system that would revive fallen players at the optimum location relative to the music (this was less effort to implement than it sounds – a single respawn point, animated to move through the level in between beats – easy). We also broke larger sections of gameplay into smaller segments with their own discrete objectives; faster players would reach these first and chill with the visuals, or dance with the band, while slower players would have a chance to catch up, before everyone warped on to the next area together.
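
As a rough illustration of that respawn idea (a simplified, hypothetical version, not the shipped logic), the single respawn point can be driven by the song's tempo so it only advances along its authored route on beat boundaries:

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct FVector3 { float X, Y, Z; };

// Simplified sketch: one respawn point that advances along authored waypoints,
// moving only on beat boundaries so revived players rejoin in musical time.
class FBeatSyncedRespawn
{
public:
    FBeatSyncedRespawn(std::vector<FVector3> InWaypoints, float Bpm)
        : Waypoints(std::move(InWaypoints)), SecondsPerBeat(60.0f / Bpm) {}

    void Tick(float SongTimeSeconds)
    {
        const int Beat = static_cast<int>(SongTimeSeconds / SecondsPerBeat);
        if (Beat != LastBeat && !Waypoints.empty())
        {
            LastBeat = Beat;
            // Advance one waypoint per beat, clamped to the end of the route.
            CurrentIndex = std::min(CurrentIndex + 1, Waypoints.size() - 1);
        }
    }

    FVector3 GetRespawnLocation() const
    {
        return Waypoints.empty() ? FVector3{0.0f, 0.0f, 0.0f} : Waypoints[CurrentIndex];
    }

private:
    std::vector<FVector3> Waypoints;
    float SecondsPerBeat = 0.5f;
    int LastBeat = -1;
    std::size_t CurrentIndex = 0;
};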

Musical immersion

Perhaps most importantly, it was critical for us that Metallica's music didn't feel like a tacked-on soundtrack: players had to feel like they were playing the music. With this in mind, we embraced something we call Ludo-rhythmic Resonance, which has three key pillars: visual, spatial, and mechanical.

Visual is the whole game world pulsing to the beat. Do anything and everything you can to make the environment pump in time with the music. Assign different visual elements to different instruments: shake the screen on a crash cymbal, pump the FOV on the kick drum, explode lava when the vocals kick in, desaturate as you build up to a drop, then pop the color back in when it hits. This way you'll create a visual language for the sound that players will subconsciously translate, immersing them in the music.

The volcanic racetrack of "Lux Æterna" has eight unique visual components, each mapped to various musical elements of the track.

The spatial element is all about the metrics of rhythm. Knowing your metrics is the keystone of quality level design, so we spent a good chunk of our pre-production building test levels, or 'gyms', where we could reverse engineer Fortnite's metrics and map them to the music. For example, in the jumping sections of "For Whom The Bell Tolls", the platforms are spaced so that players can run and jump in time with the music. In our racing section, assuming you are going full speed, the curves switch direction every four bars, creating a rhythmic slalom. Encouraging rhythmic input is another key tool for reinforcing that musical immersion.

And finally, the mechanical element refers to synchronizing as much gameplay as possible to the beat. The volcanic jets in our race track all fire in time with the power chords, the boss always attacks every eight bars, machine guns have a rate of fire that matches the track's BPM, and the bells in the level swing when the bells in the music swing. As luck would have it, in "Lux Æterna" the cars' turbo was already tuned to the right tempo, which meant you could double-tap it to the beat for extra boost. And the more juice you can layer onto your mechanics, the more players will feel the beat. Once again, this all gets the player thinking and playing rhythmically to completely immerse them in the sound of Metallica's music.
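
A minimal sketch of that tempo maths (illustrative numbers, not project code):

#include <cstdio>

// Illustrative only: derive gameplay timings from a track's tempo so weapons,
// boss attacks, and other events land on the beat.
int main()
{
    const float Bpm            = 120.0f;                      // example tempo
    const float SecondsPerBeat = 60.0f / Bpm;                 // 0.5s per beat at 120 BPM
    const float FireInterval   = SecondsPerBeat;              // machine gun fires on every beat
    const float BossAttackGap  = 8.0f * 4.0f * SecondsPerBeat; // every eight bars of 4/4

    std::printf("Fire every %.2fs, boss attacks every %.1fs\n", FireInterval, BossAttackGap);
    return 0;
}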

The Master of Puppets shoots his death rays in time with the music, while machine guns rattle to the beat.

Put it all together

The end result of these elements creates a seamless experience full of intense gameplay, where players always know what they are doing, never get split up, play in time to the beat, and get fully immersed in the music, while still having the chance to chill and just rock out with the band.

In the "Enter Sandman" finale, lights, fire, lava, lightning, speakers, camera, and d-launchers all fire to the beat!

Along the way, we had a lot of help and support from the good folks at Epic, most of whom had worked on concerts before and/or were Harmonix alumni, so big thanks to them for sharing their extensive expertise and experience!

Hopefully, you got a chance to play Metallica – Fuel. Fire. Fury. (or at least watch the playthrough) and experienced these design principles in action. We hope you'll agree that they make for a truly epic rock experience!

Meet the Magnopians: Tunde Glover
Magnopus | Fri, 28 Jun 2024

Tunde Glover is an artist with over 20 years' experience working in games, creating wonderful 3D worlds. He has worked on quite a few different titles across many genres, with highlights such as Alien: Isolation, Halo Wars 2, and Shogun 2: Total War.
As a quarter-century 3ds Max and Photoshop user, he's still fond of the original bright user interface from two decades ago! He has enjoyed and learned a great deal from the many talented people he's crossed paths with throughout his career.


Tell us more about your role at Magnopus

Hi! At Magnopus I serve as one of the Lead Artists working on or supporting various projects throughout the studio. While delivering top-quality work for our clients is our primary focus, Magnopus is a very people-focused studio, so ensuring my team of artists is well-supported and guided is one of my most important mandates. As such, beyond creating and guiding project work, every day without exception I make time for my art team, even while navigating other responsibilities such as scheduling, general leadership meetings, and other company initiatives.

You've been at Magnopus for a few months now, but what attracted you to Magnopus in the first place?

Initially, my growing interest in novel XR experiences, especially with all the new tech development in the space, made me quite curious about who was doing all this cool work. I had heard the name 'Magnopus' a few times.

When I was approached about the idea of creating with Magnopus, I delved deeper into exploring what could be possible, and I began to get excited. It was upon watching videos about Magnopus and hearing the CEO, Ben Grossmann, and Sol Rogers (Global Director of Innovation) talk about some of this stuff that I became captivated by the prospect. As things drifted closer, I reached out to an old friend who'd been here for quite a while. It seemed he couldn't be happier with the years he'd spent at Magnopus, and I very much trust his judgment.

What made you decide to pursue a career in this field?

Much of the above, truly. It sounds all dreamy but it's pretty much accurate! I've spent over 20 years making games, and honestly, the potential for experiences feels so much broader with XR compared to what traditional games have offered so far. There are far fewer preconceptions.

I had always been interested in forging an art-based career; I just didn't really know how to make it work. The starving artist stereotype wasn't that attractive. However, having a keen interest in interactive experiences, I sort of put two and two together and ended up looking into game dev.

The precise moment it all 'clicked' was while building something in 'Macromedia Director' (top marks for anyone who remembers this) about a quarter century ago. It dawned on me at the time that what I was actually making was a 'game'.

What's your favourite thing to do when you're not working?

Spending time with family, especially playing with (and now, finally, talking with) my boy. Working out, and of course, playing a lot of games. When someone figures out how to do 'life' at 1.5 times speed, I might find time for the large backlog of TV shows, books, and even graphic novels I need to catch up on.

If you could have any other job in the world, what would it be?

As weird as this sounds, landscaping! Specifically Indian sandstone settings. I find the idea of creating a semi-permanent jigsaw puzzle in someone's garden really appealing. Especially if the garden or space has super weird shapes!

Where would you most like to travel to in the world?

I'm very much a homebody and not really a huge traveller, but I'd really like to visit one of those US diners in some rural town you tend to see in all of those TV shows, where there's some famous local speciality that everyone swears tastes incredible. Food that is both massive in size and rich in flavour. Even if it took me six months to burn off the calories, it might be worth it.

How do you want to leave a mark on the world – personally or professionally?

Earlier in life I was very keen on creating something like the defining game of the decade or inventing a whole new genre of some sort. While I still have some of these aspirations, now my main focus is trying to ensure my son has a really good life. While it sounds like the typical answer, it's very much because I like to over-prepare for everything to try and ensure success. The thing is, there seems to be absolutely nothing you can do to prepare yourself to successfully raise a child. It's like 'try hard, all the time, forever' is pretty much all there is, haha. A fun challenge all the same.

If you were on a gameshow, what would be your specialist subject?

My subject would be memorable one-liners from '90s movies – a personal favourite is from Back to the Future: "Roads? Where we're going, we don't need roads."

What are you reading/listening to/watching right now?

I'm still trying to finish Stephen Baxter's Time's Tapestry series after something like 10 years now... So much so, I'd probably need to start over again. It's especially sad since I love his books, just seemingly not enough to read them anymore! Time really has a way of getting away from you, which is pretty funny considering what I was reading. Naturally, it's all the internet's fault, and I have no agency at all.

What's your life motto or guiding principle you live by?

"Try hard, all the time" seems to work pretty well so far 😂

How a Virtual Art Department (VAD) contributed to the Fallout TV show
Magnopus | Wed, 12 Jun 2024

The Virtual Art Department (VAD) is increasingly becoming a standard in TV and filmmaking, as it can provide greater creative flexibility and efficiency than traditional methods. Here's a look at how a Virtual Art Department works and our experience of operating one for Fallout, the post-apocalyptic series produced by Kilter Films for Amazon Prime Video.

What is a Virtual Art Department? 

A Virtual Art Department (VAD) is a team that uses digital technology to create virtual environments for film and television. Unlike traditional art departments that build physical sets and props, a VAD works with 3D modeling software and real-time game engines, like Unreal Engine, to design and visualize these elements in a virtual space. 

The Importance of the VAD

By leveraging real-time technologies, VADs enable filmmakers to visualize and iterate on scenes live, making the creative process more flexible and efficient. This real-time feedback loop allows directors, cinematographers, and production designers to see and adjust digital environments and assets on the fly, ensuring a seamless blend with live-action footage. On the day of shooting, these elements are projected onto an LED wall where they can react to novel camera positions. Hence the phrase, in-camera visual effects (ICVFX).

Roles within our Virtual Art Department

A VAD brings together people with diverse skills to make the magic happen. On Fallout we used the following roles: 

  • VAD supervisor: Oversees the artistic vision and ensures it aligns with the overall production design.

  • VAD Lead/Virtual Gaffer: Responsible for managing digital lighting setups and ensuring the virtual scenes are lit to achieve the desired artistic and realistic effects, while seamlessly matching on-set lighting.

  • VAD Lighting & Look Dev Supervisor: Oversees the overall visual aesthetics, including lighting and material properties, to ensure consistency and quality across all digital assets.

  • VAD Environment Lead: Manages the design, modeling, texturing, and rendering of virtual landscapes, sets, and backgrounds to ensure they align with the artistic vision and technical requirements.

  • VAD 2D Art Director: Responsible for the visual style and quality of all 2D artwork. This includes overseeing concept art, matte paintings, and other 2D elements.

  • VAD Matte Painter: Creates detailed and realistic digital backgrounds and environments that seamlessly integrate into virtual scenes. They paint 2D images or composite photographic elements to achieve the desired look, enhancing the visual depth and atmosphere of scenes.

  • VAD Artist: Creates and develops various digital assets, including 3D models, textures, and environments. Collaborates with other team members to ensure that digital assets align with the overall artistic vision and technical specifications, contributing to the overall visual storytelling of the production.

  • VAD Technical Artist: While both VAD Artists and VAD Technical Artists create digital assets, the latter focuses more on ensuring that these assets are efficiently created, integrated, and perform well within the virtual environment. Technical Artists optimize workflows, develop tools, troubleshoot issues, and ensure that digital assets are efficiently created and integrated without compromising performance or quality.

  • VAD Animation & Rigging: Responsible for creating character and object animations and developing the underlying skeletal structures (rigs) that enable those animations to be applied efficiently and realistically within the virtual environment.

The Magnopus team on set for Fallout.

How the VAD was used on Fallout 

Script Breakdown and Environment Design

One of the first things we did was provide creative input for leveraging virtual production across the entire show. At the time, only the pilot had been written, and our involvement on the project was much like that of a creative department head. We worked with the filmmakers to break down the scripts into scenes and environments that would benefit the most from in-camera visual effects and other virtual production techniques. They settled on key environments such as the picnic area and vault door scenes in Vault 33, the cafeteria in Vault 4, and the New California Republic's base inside the Griffith Observatory. Additionally, any scenes involving the Vertibird were earmarked for LED process shots.

Virtual Set Construction

The VAD built all virtual sets entirely inside Unreal Engine. This approach allowed the filmmakers to use Unreal鈥檚 suite of virtual production tools throughout the creative process. Virtual scouting tools enabled the filmmakers to perform tech scouts in VR, block out action, and place cameras and characters in precise locations. This meticulous planning was crucial for creating a heatmap of the environment, helping to focus creative efforts on the most critical areas and optimize resources effectively.

Real-time Modifications and Flexibility

Working in real-time 3D was essential for the production. Unreal Engine offered the flexibility to make creative modifications to the set during pre-production and even on shooting days. For instance, during the shooting of the Vault Door set, the idea to dynamically change the lighting mid-shot was implemented swiftly by the VAD, showcasing the agility and responsiveness that real-time tools provided.

In this pivotal scene, Lucy leaves the Vault for the first time. Authenticity for this moment is key – and finding the right blend between physical and virtual gives the audience access to that authenticity. To that end, much of the set was physical – the handrails, the walkway, the plank, the control mechanisms, and even the door itself. Meanwhile, the interior walls of the vault were rendered virtually, extending the scope of the set. To make room for the giant moving vault door, we offset that set piece from the LED wall itself and filled the gap with floating LED panels instead. These were giant wild walls the crew could position at any point in the set to accommodate extreme angles. Mounting these panels with motion trackers allowed the team to dynamically update the image on screen no matter where they were placed – effectively creating a moving window into the virtual world. This streamlined production by eliminating the need for the crew to search for expansive caverns or construct large set pieces on a stage, all without sacrificing visual impact.
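
In Unreal terms, the heart of that "moving window" is simply re-applying each tracker's pose to the object that represents its panel every frame. The snippet below is a heavily simplified, hypothetical stand-in for the production setup; GetTrackedPanelTransform() is an assumed placeholder for whatever tracking source feeds the wall.

// Heavily simplified sketch, not the production setup: each frame, the proxy that
// represents a floating LED panel copies the pose reported by its motion tracker,
// so the imagery rendered for that panel stays aligned with its physical position.
// GetTrackedPanelTransform() is a hypothetical stand-in for the tracking system.

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "TrackedPanelProxy.generated.h"

FTransform GetTrackedPanelTransform(); // hypothetical: latest pose from the tracker

UCLASS()
class ATrackedPanelProxy : public AActor
{
    GENERATED_BODY()

public:
    ATrackedPanelProxy()
    {
        PrimaryActorTick.bCanEverTick = true;
    }

    virtual void Tick(float DeltaSeconds) override
    {
        Super::Tick(DeltaSeconds);

        // Re-anchor the virtual panel wherever the physical panel was wheeled to,
        // keeping the "moving window into the virtual world" correct.
        SetActorTransform(GetTrackedPanelTransform());
    }
};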

Seamless Integration with Physical Sets

A significant aspect of the VAD's work involved coordinating with the physical art department to ensure a seamless blend between physical and virtual elements. This included building 1:1 scale 3D versions of a number of sets in Unreal, even those that did not necessarily plan on using an LED volume stage. This work allowed the filmmakers to visualize scenes, compose shots, inform every department on the show about the creative intent, and help anticipate any potential difficulties. For scenes that did intend to use the LED volume stage, this close collaboration ensured that the digital set matched the physical set pieces, maintaining consistency in light and color, materials, and design.

We faced a unique challenge with this environment from episode 1. There is a cornfield inside Vault 32 where most community activities take place, including Lucy's wedding. The scene features layers of practical corn that extend into the virtual set, as well as apple trees, distant hallways leading to the neighboring vaults, and a synthetic nuclear-powered projection of a bucolic farm setting. The result is a sort of odd twist on LED volume production. The story calls for an in-world projection of virtual imagery onto a giant wall; meanwhile, as crew members on set, we are looking at that very same imagery projected onto an LED wall. At times, it really did feel as if we were inside the world of Fallout. This particular environment required multiple layers of virtual corn, six different lighting setups – as well as dynamic hooks to enable lighting cues in real time – and a virtual projection surface capable of playing back 8K image sequences at 24fps.

On-set Workflow and Collaboration

On set, the VAD worked closely with multiple departments to ensure the highest fidelity for final imagery. While on-set Virtual Production Unreal Operators maintained camera tracking, adjusted frustums, loaded and unloaded scenes, ensured the proper functioning of the LED wall, and wrangled shot data, the VAD team could come in and adjust the look of the scene in real-time between shots. This allowed the VAD to operate much like a traditional on-set Art Department 鈥 responding to the ever-changing needs of production. This close collaboration was crucial in achieving the desired visual effects and maintaining the creative vision of the show.

Challenges

One of the biggest challenges faced by the VAD was shooting on 35mm film within an LED volume, a relatively unprecedented approach. This required extensive testing and fine-tuning of genlock, color calibration, and exposure settings. Despite these challenges, the VAD's expertise allowed for the successful capture of final pixels on set, minimizing the need for extensive post-production VFX work.

While there were challenges, the VAD we put together for Fallout worked because of meticulous planning and the seamless integration of virtual and physical elements. The ability to make real-time adjustments and collaborate closely with various departments ensured that the show maintained its unique visual style. The innovative use of Unreal Engine and other virtual production tools not only enhanced the storytelling but also set new benchmarks for virtual production workflows in the television industry.

Director of Virtual Production, AJ Sciutto, speaks to Women's Wear Daily about the groundbreaking development of Alo Moves XR
External | Thu, 30 May 2024
https://wwd.com/business-news/business-features/alo-moves-yoga-classes-meta-quest-magnopus-mr-1236406266/

Alo Moves XR for Meta Quest 3 breaks new ground as the first fitness app employing volumetrically captured 3D classes. This technology provides lifelike instructors to guide users seamlessly through every movement. The app employs room mapping and object detection, enabling users to practice mindfulness, relaxation, yoga, and Pilates safely and comfortably in a virtual studio.

AJ Sciutto speaks to Women's Wear Daily to uncover how we leveraged cutting-edge MR technology to create a truly immersive wellness experience.


Meet the Magnopians: Daksh Sahni
Magnopus | Fri, 24 May 2024

Daksh Sahni, Senior Product Manager at Magnopus, is a successful leader with 16 years in AR/VR and game development. His experience includes contributing to some of the world's best-selling AAA games at Activision, working on the first FDA-approved Virtual Reality therapeutic product, and developing AR hardware at Samsung Research of America.


Tell us more about your role at Magnopus

I lead the Product Solutions team, working on our technologies and CSP. The team operates at the intersection of business development, technology feature development, and user experience design. We interface with potential customers and clients by providing product demos and onboarding sessions. We constantly prototype new cutting-edge product use cases. Additionally, we serve as the voice of the user within the product ecosystem. By collaborating closely with other users across the studio and third-party entities, we gather feedback and translate it into actionable insights, creating UX flows and writing product requirements.

For those who don't know, can you briefly explain what OKO is?

OKO enables users to easily create and share cross-reality experiences – connecting people, places, and things across physical and digital worlds.

OKO is a suite of apps and plugins that operate across game engines, connected to a comprehensive set of cloud services, accessible across many devices. It's built on the open-source Connected Spaces Platform (and as many other open-source and industry standards as possible).

What moment has had the most significant impact on your life?

The most impactful moment of my professional life occurred when I made a significant career pivot from traditional architecture to game development. While attending the graduate program at UCLA School of Architecture, I seized an opportunity to intern at a video game startup in Santa Monica. My role involved conducting architectural research to create a digital twin of the City of Los Angeles for a project intended to rival the mega-blockbuster Grand Theft Auto (GTA). This opportunity proved to be transformative, ultimately resulting in the acquisition of the studio by Activision. From that point on, there was no turning back.

What is the biggest lesson you've learned in your career?

There are too many. But from my experience, I've noticed that while companies and projects may change over time, cultivating relationships and friendships with the people you collaborate with daily can have a lasting impact.

What's your favorite thing to do when you're not working?

Hiking / walking / being in nature. Appreciating art/design. Hanging out with friends.

If you had unlimited resources and funding, what project or initiative would you launch?

I'm already working on a version of the product I would launch! 😀 As an immigrant in this country with family all over the world, I deeply crave a connection to the people and places that have shaped my life and memories. While nothing can replace real experiences, the ability to use technology to connect and meaningfully engage with diverse cultures, people, and places is something I am deeply passionate about.

If you could wake up in the body of another person (just for one day) who would it be and why?

Without getting into the politics of it, I'd love to be the person who can influence the end of these crazy wars!

How would your friends describe you?

Loyal, honest, reliable. 

How we wrote a GPU-based Gaussian Splats viewer in Unreal with Niagara
Alessio Regalbuto | Wed, 15 May 2024

In this article, I want to share our journey of writing a fully functional Gaussian Splat viewer for Unreal Engine 5, starting right from the ground up.

Getting the ball rolling

First of all, let's quickly recap what Gaussian Splatting is. In short, it's a process that produces something similar to a point cloud, but where each point is replaced by a colored elliptical shape. Each shape changes and stretches depending on the camera position and perspective, blending into a continuous representation of space. This helps keep visual information such as reflections and lighting intact in the captured digital twin, retaining details as realistically as possible.

For more info, feel free to check out my previous articles:

The first challenge was to understand specifically what format a Gaussian Splat file uses, and which one is the most commonly accepted by the industry. After in-depth research, we identified two main formats that are currently popular: .ply and .splat.

After some consideration, we chose the .ply format as it covered a wider range of applications. This decision was also driven by looking at other tools in this space, which allow importing Gaussian Splats in the form of .ply files only, even if they also offer to export them as .splat files.

What does a .PLY file look like?

There are two different types of .ply files to start with:

  • ASCII-based ply files, which store data in textual form.

  • Binary-based ply files, which are less readable.

We can think of a ply file as a very flexible format for specifying a set of points and their attributes, with a set of properties defined in its header. These instruct the parser on how the data contained in its body should be interpreted. A very informative guide on the generic structure of .ply files is available online for reference.

Here is an example of what a typical Gaussian Splat .ply file looks like:

ply
format binary_little_endian 1.0
element vertex 1534456
property float x
property float y
property float z
property float nx
property float ny
property float nz
property float f_dc_0
property float f_dc_1
property float f_dc_2
property float f_rest_0
(... f_rest from 1 to  43...)
property float f_rest_44
property float opacity
property float scale_0
property float scale_1
property float scale_2
property float rot_0
property float rot_1
property float rot_2
property float rot_3
end_header
  • The first line identifies the file as a ply file.

  • The second line establishes if the format of the data stored after the header is ASCII based or binary based (the latter in this example).

  • The third line tells the parser how many elements the file contains. In our example, we have 1534456 elements, i.e. splats.

  • From the fourth line until the 鈥渆nd_header鈥 line, the entire structure of each element is described as a set of properties, each with its own data type and name. The order of these properties is commonly followed by most of the Gaussian splat .ply files. It is worth noting that regardless of the order, the important rule is that all the non-optional ones are defined in the file, and the data follows the declared structure.

Once the header section ends, the data for each element is provided by the ply body. Each element in the body must strictly follow the order declared in the header to be parsed correctly.

This can give you an idea of what to expect specifically when we want to describe a single Gaussian Splat element loaded from a ply file:

  • A position in space in the form XYZ (x, y, z);

  • [Optional] Normal vectors (nx, ny, nz);

  • Zero order Spherical Harmonics (f_dc_0, f_dc_1, f_dc_2), which dictate what color the single splat should have by using a specific mathematical formula to extract the output RGB value for the rendering;

  • [Optional] Higher order Spherical Harmonics (from f_rest_0 to f_rest_44), which dictate how the color of the splat should change depending on the camera position. This is basically to improve realism for the reflections or lighting information embedded into the Gaussian splat. It is worth noting that this information is optional, and that files that embed it will be a lot larger than zero-order-only based ones;

  • An opacity (opacity), which establishes the transparency of the splat;

  • A scale in the form XYZ (scale_0, scale_1, scale_2);

  • An orientation in space in the quaternion format WXYZ (rot_0, rot_1, rot_2, rot_3).

All this information has its own coordinate system, which needs to be converted into Unreal Engine once loaded. This will be covered in more detail later in this article.

Now that you are familiar with the data we need to deal with, you are ready for the next step.

Parsing a .PLY file into Unreal

For our implementation, we wanted to support both ASCII and binary ply files, so we needed a way to quickly parse their data and store them accordingly. Luckily, ply files are not new. They have been used for 3D models for a long time, even before Gaussian Splats became popular. Therefore, several .ply parsers exist on GitHub and can be used for this purpose. We decided to adapt the implementation of Happly, a general-purpose, open-source, header-only ply parser written in C++ (big kudos to the author).

Starting from the implementation of Happly, we adapted its parsing capabilities to the coding standards of Unreal and ported it into the game engine, being mindful of the custom garbage collection and data types expected by Unreal. We then adapted our parsing code to map the parsed properties onto the Gaussian Splat data layout described earlier.

The next logical step, once we knew how the data looked and how to read it from a file, was to store it somewhere. This meant we needed a class or a struct that could hold all this data for a specific lifetime within the Engine. Time to dig into some C++ code!

How could we define a single Gaussian Splat in Unreal?

The easiest way to store each Gaussian Splat data was to define a custom USTRUCT in Unreal, optionally accessible by Blueprints, implemented along the following lines:

/**
 * Represents parsed data for a single splat, loaded from a regular PLY file.
 */
USTRUCT(BlueprintType)
struct FGaussianSplatData
{

GENERATED_BODY()

// Splat position (x, y, z)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
FVector Position;

// Normal vectors [optional] (nx, ny, nz)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
FVector Normal;

// Splat orientation coming as wxyz from PLY (rot_0, rot_1, rot_2, rot_3)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
FQuat Orientation;

// Splat scale (scale_0, scale_1, scale_2)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
FVector Scale;

// Splat opacity (opacity)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
float Opacity;

// Spherical Harmonics coefficients - Zero order (f_dc_0, f_dc_1, f_dc_2)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
FVector ZeroOrderHarmonicsCoefficients;

// Spherical Harmonics coefficients - High order (f_rest_0, ..., f_rest_44)
UPROPERTY(EditAnywhere, BlueprintReadWrite)
TArray<FVector> HighOrderHarmonicsCoefficients;

FGaussianSplatData()
: Position(FVector::ZeroVector)
	, Normal(FVector::ZeroVector)
	, Orientation(FQuat::Identity)
	, Scale(FVector::OneVector)
	, Opacity(0)
	{
	}
};

One instance of this struct was created per splat during the parsing phase and added to a TArray of splats, ready for the visualization steps that follow.
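As a minimal sketch, assuming a hypothetical helper that receives the per-property arrays produced by the parser (the function and its parameters are illustrative, not the article's actual code), the packing step could look like this:

TArray<FGaussianSplatData> BuildSplatArray(
    const TArray<float>& X, const TArray<float>& Y, const TArray<float>& Z,
    const TArray<float>& Opacity)
{
    TArray<FGaussianSplatData> Splats;
    Splats.Reserve(X.Num());

    for (int32 i = 0; i < X.Num(); ++i)
    {
        FGaussianSplatData Splat;
        // PLY values are stored in the file's own coordinate system; the
        // conversion to Unreal coordinates happens later (see the dedicated
        // section below).
        Splat.Position = FVector(X[i], Y[i], Z[i]);
        Splat.Opacity = Opacity[i];
        Splats.Add(Splat);
    }
    return Splats;
}

The remaining properties (normals, scale, orientation, and spherical harmonics coefficients) are filled in the same way from their respective property arrays.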

Now that we have the core data, let鈥檚 dive into the most challenging and fun part: transferring the data to the GPU so a Niagara system can read it!

Why Niagara for Gaussian Splats?

Niagara is the natural candidate for representing particles inside Unreal. Specifically, a Niagara system is made up of one or more Niagara emitters, which are responsible for spawning particles and updating their states every frame.

In our specific case, we will use a single Niagara emitter to make a basic implementation. As an example, we will call it "GaussianSplatViewer".

Now that we have our new shiny emitter, we need a way to "pass" the splats' data into it, so that for each splat we can spawn a corresponding particle in space. You might wonder, is there anything in Unreal we could use out of the box to do that for us? The answer is yes, and it is called the "Niagara Data Interface" (NDI).

What is a Niagara Data Interface (NDI) and how to write one

Imagine you want to tell the Niagara emitter, 鈥淗ey, I have a bunch of points I read from a file that I want to show as particles. How can I make you understand what position each point should be in?鈥 Niagara would reply, 鈥淢ake me a beautiful NDI that I can use to understand your data and then retrieve the position for each particle from it鈥.

You might wonder, how do I write this NDI and what documentation can I find? The answer is simple: the Engine source code itself contains many NDI implementations for its own particle systems, and they're an excellent source of inspiration for building your own! The one we took the most inspiration from was UNiagaraDataInterfaceAudioOscilloscope.

Here鈥檚 how we decided to structure a custom NDI to make each splat 鈥渦nderstandable鈥 by Niagara when passing it through. Keep in mind that this class will hold the list of Gaussian Splats we loaded from the PLY file so that we can access their data from it and convert it into Niagara-compatible data types for use within the particles.

Firstly, we want our NDI class to inherit from UNiagaraDataInterface, which is the interface a Niagara system expects to treat custom data types via NDI. To fully implement this interface, we needed to override several functions, which I present below.
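Before going through the overrides one by one, here is a hedged sketch of how the class declaration could look. It simply collects the CPU-side overrides discussed first in this section; the UCLASS specifiers, display name, and property category are illustrative rather than taken from the original code, and the GPU-related overrides are listed in a later section.

UCLASS(EditInlineNew, Category = "Gaussian Splatting", meta = (DisplayName = "Gaussian Splat NDI"))
class UGaussianSplatNiagaraDataInterface : public UNiagaraDataInterface
{
    GENERATED_BODY()

public:
    // Splats loaded from the PLY file, filled by the parsing code.
    UPROPERTY(EditAnywhere, Category = "Gaussian Splatting")
    TArray<FGaussianSplatData> Splats;

    //~ UNiagaraDataInterface interface (each override is explained below)
    virtual void PostInitProperties() override;
    virtual void GetFunctions(TArray<FNiagaraFunctionSignature>& OutFunctions) override;
    virtual void GetVMExternalFunction(const FVMExternalFunctionBindingInfo& BindingInfo, void* InstanceData, FVMExternalFunction& OutFunc) override;
    virtual bool CanExecuteOnTarget(ENiagaraSimTarget Target) const override;

protected:
    virtual bool CopyToInternal(UNiagaraDataInterface* Destination) const override;
    virtual bool Equals(const UNiagaraDataInterface* Other) const override;
};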

GetFunctions override

When overriding this function, we are telling Niagara "here is the list of functions I am defining, so that I can use them inside your Niagara modules". Each signature tells the system what inputs and outputs the function expects, what it is called, and whether it is static or non-static.

// Define the functions we want to expose to the Niagara system from
// our NDI. For example, we define one to get the position from a
// Gaussian Splat data.
virtual void GetFunctions(TArray<FNiagaraFunctionSignature>& OutFunctions) override;

Here is a sample implementation of GetFunctions, which exposes a GetSplatPosition function to any Niagara system using this NDI. We want GetSplatPosition to have exactly 2 inputs and 1 output:

  • An input that references the NDI that holds the Gaussian splats array (required to access the splats data through that NDI from a Niagara system scratch pad module);

  • An input of type integer to understand which of the splats we request the position of (this will match a particle ID from the Niagara emitter, so that each particle maps the position of a specific Gaussian splat);

  • An output of type Vector3 that gives back the position XYZ of the desired Gaussian splat, identified by the provided input Index.

void UGaussianSplatNiagaraDataInterface::GetFunctions(
    TArray<FNiagaraFunctionSignature>& OutFunctions)
{   
   // Retrieve particle position reading it from our splats by index
   FNiagaraFunctionSignature Sig;
   Sig.Name = TEXT("GetSplatPosition");
   Sig.Inputs.Add(FNiagaraVariable(FNiagaraTypeDefinition(GetClass()),
       TEXT("GaussianSplatNDI")));
   Sig.Inputs.Add(FNiagaraVariable(FNiagaraTypeDefinition::GetIntDef(),
       TEXT("Index")));
   Sig.Outputs.Add(FNiagaraVariable(FNiagaraTypeDefinition::GetVec3Def(),
       TEXT("Position")));
   Sig.bMemberFunction = true;
   Sig.bRequiresContext = false;
   OutFunctions.Add(Sig);
}

Similarly, we will also define other functions inside GetFunctions to retrieve the scale, orientation, opacity, spherical harmonics, and particle count of our Gaussian splats. Each particle will use this information to change shape, color, and aspect in space accordingly.
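As an illustration of that pattern, a hedged sketch of one more signature added inside the same GetFunctions override might look like the following (the function and pin names are ours, not the article's):

   // Retrieve a splat's opacity by index, mirroring GetSplatPosition
   FNiagaraFunctionSignature OpacitySig;
   OpacitySig.Name = TEXT("GetSplatOpacity");
   OpacitySig.Inputs.Add(FNiagaraVariable(FNiagaraTypeDefinition(GetClass()),
       TEXT("GaussianSplatNDI")));
   OpacitySig.Inputs.Add(FNiagaraVariable(FNiagaraTypeDefinition::GetIntDef(),
       TEXT("Index")));
   OpacitySig.Outputs.Add(FNiagaraVariable(FNiagaraTypeDefinition::GetFloatDef(),
       TEXT("Opacity")));
   OpacitySig.bMemberFunction = true;
   OpacitySig.bRequiresContext = false;
   OutFunctions.Add(OpacitySig);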

GetVMExternalFunction override

This override is necessary to allow Niagara to call the functions we declared in GetFunctions from Niagara nodes, so that they become available within Niagara graphs and scratch pad modules. It works together with Unreal's DEFINE_NDI_DIRECT_FUNC_BINDER macro, which is designed for exactly this purpose. Below is an example using the GetSplatPosition function.

// We bind the following function for use within the Niagara system graph
DEFINE_NDI_DIRECT_FUNC_BINDER(UGaussianSplatNiagaraDataInterface, GetSplatPosition);


void UGaussianSplatNiagaraDataInterface::GetVMExternalFunction(const FVMExternalFunctionBindingInfo& BindingInfo, void* InstanceData, FVMExternalFunction& OutFunc)
{
   if(BindingInfo.Name == *GetPositionFunctionName)
   {
       NDI_FUNC_BINDER(UGaussianSplatNiagaraDataInterface,
         GetSplatPosition)::Bind(this, OutFunc);
   }
}


// Function defined for CPU use, understandable by Niagara
void UGaussianSplatNiagaraDataInterface::GetSplatPosition(
  FVectorVMExternalFunctionContext& Context) const
{
   // Input is the NDI and Index of the particle
   VectorVM::FUserPtrHandler<UGaussianSplatNiagaraDataInterface> 
     InstData(Context);


   FNDIInputParam<int32> IndexParam(Context);
  
   // Output Position
   FNDIOutputParam<float> OutPosX(Context);
   FNDIOutputParam<float> OutPosY(Context);
   FNDIOutputParam<float> OutPosZ(Context);


   const auto InstancesCount = Context.GetNumInstances();


   for(int32 i = 0; i < InstancesCount; ++i)
   {
       const int32 Index = IndexParam.GetAndAdvance();


       if(Splats.IsValidIndex(Index))
       {
           const auto& Splat = Splats[Index];
           OutPosX.SetAndAdvance(Splat.Position.X);
           OutPosY.SetAndAdvance(Splat.Position.Y);
           OutPosZ.SetAndAdvance(Splat.Position.Z);
       }
       else
       {
           OutPosX.SetAndAdvance(0.0f);
           OutPosY.SetAndAdvance(0.0f);
           OutPosZ.SetAndAdvance(0.0f);
       }
   }
}

Note that the definition of GetSplatPosition is implemented to make this NDI CPU compatible.

Copy and Equals override

We also need to override these functions so that Niagara knows how to copy and compare NDIs of our class. Specifically, we instruct the engine to copy the list of Gaussian Splats when one NDI is copied into another, and to consider two NDIs equal only if they hold exactly the same Gaussian Splat data.

virtual bool CopyToInternal(UNiagaraDataInterface* Destination) const override;
virtual bool Equals(const UNiagaraDataInterface* Other) const override;
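For reference, here is a minimal sketch of how these two overrides could be implemented, assuming the splats live in the Splats array shown in the class declaration (the count-only comparison in Equals is a deliberate simplification; a full implementation would compare the splat data element by element):

bool UGaussianSplatNiagaraDataInterface::CopyToInternal(UNiagaraDataInterface* Destination) const
{
    if (!Super::CopyToInternal(Destination))
    {
        return false;
    }

    // Copy our splat data into the destination NDI.
    UGaussianSplatNiagaraDataInterface* TypedDestination =
        CastChecked<UGaussianSplatNiagaraDataInterface>(Destination);
    TypedDestination->Splats = Splats;
    return true;
}

bool UGaussianSplatNiagaraDataInterface::Equals(const UNiagaraDataInterface* Other) const
{
    if (!Super::Equals(Other))
    {
        return false;
    }

    // Simplified check: treat NDIs with the same number of splats as equal.
    const UGaussianSplatNiagaraDataInterface* TypedOther =
        CastChecked<const UGaussianSplatNiagaraDataInterface>(Other);
    return TypedOther->Splats.Num() == Splats.Num();
}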

CanExecuteOnTarget override

This override tells the Niagara system whether our NDI functions should execute on the CPU or on the GPU. Initially we wanted it to work on the CPU for debugging, but for the final version we changed it to target the GPU instead. I will explain this choice later.

virtual bool CanExecuteOnTarget(ENiagaraSimTarget Target) const override { return Target == ENiagaraSimTarget::GPUComputeSim; }

Additional overrides required for our NDI to work on the GPU too

We also need to override the following functions, so that we can instruct Niagara on how our data will be stored on the GPU (for a GPU compatible implementation) and how the functions we declared will be mapped onto the GPU via HLSL shader code. More on this later.

// HLSL definitions for GPU
virtual void GetParameterDefinitionHLSL(const FNiagaraDataInterfaceGPUParamInfo& ParamInfo, FString& OutHLSL) override;


virtual bool GetFunctionHLSL(const FNiagaraDataInterfaceGPUParamInfo& ParamInfo, const FNiagaraDataInterfaceGeneratedFunction& FunctionInfo, int FunctionInstanceIndex, FString& OutHLSL) override;


virtual bool UseLegacyShaderBindings() const override { return false; }


virtual void BuildShaderParameters(FNiagaraShaderParametersBuilder& ShaderParametersBuilder) const override;


virtual void SetShaderParameters(const FNiagaraDataInterfaceSetShaderParametersContext& Context) const override;

CPU vs GPU based Niagara system

Each emitter in a Niagara particle system can run on the CPU or on the GPU. It's very important to establish which of the two to choose, because each has trade-offs.

Initially, for a simple implementation, we went for the CPU based Niagara emitter. This was to make sure that the splat data and coordinates were correctly reproduced in terms of position, orientation, and scale inside the Niagara system.

However, there are some important limitations for CPU based emitters: 

  • They cannot spawn more than 100K particles;

  • They rely entirely on the CPU, so they take frame time away from other scripts' execution, resulting in lower frame rates, especially when dealing with the maximum number of supported particles;

  • GPUs handle massively parallel workloads much better than CPUs, which makes them far better suited to large volumes of particles.

While it makes sense for debugging to accept the CPU's 100K particle limit, it's definitely not the right setup to scale up, especially when you want to support bigger Gaussian Splat files that may contain millions of particles.

In a second iteration, we decided to switch to a GPU based emitter. This not only relies on the GPU completely without affecting the CPU, but can also support up to 2 million spawned particles, which is 20x more than what is supported on the CPU.

The side effect of executing on the GPU is that we also needed to take care of GPU resource allocation and management, requiring us to get our hands dirty with HLSL shader code and data conversion between CPU and GPU.

How? You guessed it, by extending our beautiful custom NDI.

From PLY file to the GPU via the NDI

Thanks to our custom NDI, we have full control over how our data is stored in memory and how it is converted into a Niagara compatible form. The challenge now is to implement this via code. For simplicity, let鈥檚 break our goal down into two parts:

  1. Allocate memory on the GPU to hold Gaussian Splat data coming from the CPU.

  2. Transfer Gaussian Splat data from the CPU to the prepared GPU memory.

Prepare the GPU memory to hold Gaussian Splat data

The first thing to be aware of is that we cannot use Unreal data types like TArray (which holds the list of Gaussian Splats in our NDI) when we define data on the GPU. This is because TArray is designed for CPU use and is stored in CPU-side RAM, which is only accessible by the CPU. Instead, the GPU has its own separate memory (VRAM) and requires specific types of data structures to optimize access, speed, and efficiency.

To store collections of data on the GPU, we needed to use GPU buffers. There are different types available:

  • Vertex Buffers: store per-vertex data such as positions, normals, and texture coordinates;

  • Index Buffers: used to tell the GPU the order in which vertices should be processed to form primitives;

  • Constant Buffers: store values such as transformation matrices and material properties that remain constant for many operations across the rendering of a frame;

  • Structured Buffers and Shader Storage Buffers: more flexible as they can store a wide array of data types, suitable for complex operations.

In our case, I decided to follow a simple implementation, where each Gaussian Splat attribute is stored in its own buffer (i.e. a positions buffer, a scales buffer, an orientations buffer, and a combined buffer for zero-order spherical harmonics and opacity).

Note that both buffers and textures are equally valid data structures for holding splat data on the GPU. We opted for buffers because we felt the implementation was more readable, and it also avoided an issue with the texture-based approach where the last row of pixels was often not entirely full.

To declare these buffers in Unreal, we needed to add the definition of a "shader parameter struct", which uses an Unreal Engine macro to tell the engine this is a data structure supported by HLSL shaders (hence supported by GPU operations). Here is an example:

BEGIN_SHADER_PARAMETER_STRUCT(FGaussianSplatShaderParameters, )
   SHADER_PARAMETER(int, SplatsCount)
   SHADER_PARAMETER(FVector3f, GlobalTint)
   SHADER_PARAMETER_SRV(Buffer<float4>, Positions)
   SHADER_PARAMETER_SRV(Buffer<float4>, Scales)
   SHADER_PARAMETER_SRV(Buffer<float4>, Orientations)
   SHADER_PARAMETER_SRV(Buffer<float4>, SHZeroCoeffsAndOpacity)
END_SHADER_PARAMETER_STRUCT()

It is worth noting that these buffers can be further optimized since the W coordinate remains unused by position and scales (they only need XYZ). To improve their memory footprint it would be ideal to adopt channel packing techniques, which are out of the scope of this article. It is also possible to use half precision instead of full floats for further optimization.

Before the buffers, we also define an integer that tracks the number of splats we need to process (SplatsCount), and a GlobalTint vector, an RGB value we can use to change the tint of the Gaussian Splats. This definition goes into the header file of our NDI class.

We also need to inject custom shader code for the GPU to declare our buffers so that they can be referenced later on and used by our custom shader functions. To do it, we inform Niagara through the override of GetParameterDefinitionHLSL:

void UGaussianSplatNiagaraDataInterface::GetParameterDefinitionHLSL(
  const FNiagaraDataInterfaceGPUParamInfo& ParamInfo, FString& OutHLSL)
{
  Super::GetParameterDefinitionHLSL(ParamInfo, OutHLSL);


  OutHLSL.Appendf(TEXT("int %s%s;\n"), 
    *ParamInfo.DataInterfaceHLSLSymbol, *SplatsCountParamName);
  OutHLSL.Appendf(TEXT("float3 %s%s;\n"),
    *ParamInfo.DataInterfaceHLSLSymbol, *GlobalTintParamName);
  OutHLSL.Appendf(TEXT("Buffer<float4> %s%s;\n"),
    *ParamInfo.DataInterfaceHLSLSymbol, *PositionsBufferName);
  OutHLSL.Appendf(TEXT("Buffer<float4> %s%s;\n"),
    *ParamInfo.DataInterfaceHLSLSymbol, *ScalesBufferName);
  OutHLSL.Appendf(TEXT("Buffer<float4> %s%s;\n"),
    *ParamInfo.DataInterfaceHLSLSymbol, *OrientationsBufferName);
  OutHLSL.Appendf(TEXT("Buffer<float4> %s%s;\n"),
    *ParamInfo.DataInterfaceHLSLSymbol, *SHZeroCoeffsBufferName);
}

Effectively, this means that a Niagara system using our custom NDI will have this shader code generated under the hood, which allows us to reference these GPU buffers from our own HLSL shader code in the next steps. For convenience, the parameter names (SplatsCountParamName, PositionsBufferName, and so on) are defined once as FString constants, which keeps the code maintainable.

Transfer Gaussian Splat data from CPU to GPU

Now the tricky part: we need to 鈥減opulate鈥 the GPU buffers using C++ code as a bridge between the CPU memory and the GPU memory, specifying how the data is transferred.

To do this, we decided to introduce a custom "Niagara data interface proxy" - a data structure used as a bridge between the CPU and the GPU. This proxy helped us push our buffer data from the CPU side into the buffers declared as shader parameters for the GPU. Inside the proxy we defined the buffers themselves, along with the functions to initialize and update them.

I know this seems to be getting very complicated, but from a logical point of view it is quite simple, and I can help you understand the system by visualizing the full concept in this diagram:
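To make the shape of that bridge more concrete, here is a hedged sketch of what the proxy declaration could look like. Only FNiagaraDataInterfaceProxy and FReadBuffer are engine types; the member names simply mirror those used in the SetShaderParameters override shown later, and the two functions are left as declarations since their render-thread bodies depend on the engine version.

struct FNDIGaussianSplatProxy : public FNiagaraDataInterfaceProxy
{
    // GPU buffers mirroring the shader parameters declared earlier.
    FReadBuffer PositionsBuffer;
    FReadBuffer ScalesBuffer;
    FReadBuffer OrientationsBuffer;
    FReadBuffer SHZeroCoeffsAndOpacityBuffer;

    // Constants mirrored from the game-thread NDI.
    int32 SplatsCount = 0;
    FVector3f GlobalTint = FVector3f(1.0f, 1.0f, 1.0f);

    // Allocates the GPU buffers for the given number of splats.
    void InitializeBuffers(int32 InSplatsCount);

    // Uploads the latest CPU-side splat data into the GPU buffers.
    void UpdateBuffers(const TArray<FGaussianSplatData>& InSplats);

    // Required by FNiagaraDataInterfaceProxy; we do not use per-instance data.
    virtual int32 PerInstanceDataPassedToRenderThreadSize() const override { return 0; }
};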

Now that we have a complete overview of our system, there are some final little details we need to refine in order for it to be fully operational.

We already have the buffers鈥 definitions for the GPU as HLSL code via the GetParameterDefinitionHLSL function. Now, we need to do the same for the functions we previously defined in GetFunctions, so the GPU understands how to translate them into HLSL shader code.

Let's take the GetSplatPosition function as an example: we previously saw how it was defined for use on the CPU, and now we need to extend its definition so that it is also declared for the GPU. We can do this by overriding GetFunctionHLSL in our custom NDI:

bool UGaussianSplatNiagaraDataInterface::GetFunctionHLSL(
  const FNiagaraDataInterfaceGPUParamInfo& ParamInfo, const
  FNiagaraDataInterfaceGeneratedFunction& FunctionInfo, 
  int FunctionInstanceIndex, FString& OutHLSL)
{
   if(Super::GetFunctionHLSL(ParamInfo, FunctionInfo,
     FunctionInstanceIndex, OutHLSL))
  {
    // If the function is already defined on the Super class, do not
    // duplicate its definition.
    return true;
  }
  
  if(FunctionInfo.DefinitionName == *GetPositionFunctionName)
  {
    static const TCHAR *FormatBounds = TEXT(R"(
      void {FunctionName}(int Index, out float3 OutPosition)
      {
        OutPosition = {PositionsBuffer}[Index].xyz;
      }
    )");
    const TMap<FString, FStringFormatArg> ArgsBounds =
    {
     {TEXT("FunctionName"), FStringFormatArg(FunctionInfo.InstanceName)},
     {TEXT("PositionsBuffer"),
       FStringFormatArg(ParamInfo.DataInterfaceHLSLSymbol + 
         PositionsBufferName)},
    };
    OutHLSL += FString::Format(FormatBounds, ArgsBounds);
  }
  else
  {
    // Return false if the function name does not match any expected.
    return false;
  }
  return true;
}

As you can see, this part of the code simply adds to the OutHLSL string the HLSL shader code that implements our GetSplatPosition for the GPU. Whenever Niagara is GPU based and the GetSplatPosition function is called by the Niagara graph, this shader code on the GPU will be executed.

For brevity, I did not include the HLSL shader code for the scale, orientation, spherical harmonics, and opacity getter functions. However, the idea is the same; we would just add them inside GetFunctionHLSL.

Finally, the actual code to transfer data from the CPU to the GPU via the DIProxy is handled by the override of SetShaderParameters:

void UGaussianSplatNiagaraDataInterface::SetShaderParameters(
  const FNiagaraDataInterfaceSetShaderParametersContext& Context) const
{
  // Initialize the shader parameters to reference the buffers
  // held by our proxy
  FGaussianSplatShaderParameters* ShaderParameters =
    Context.GetParameterNestedStruct<FGaussianSplatShaderParameters>();
  if(ShaderParameters)
  {
    FNDIGaussianSplatProxy& DIProxy = 
      Context.GetProxy<FNDIGaussianSplatProxy>();


      if(!DIProxy.PositionsBuffer.Buffer.IsValid())
      {
        // Trigger buffers initialization
        DIProxy.InitializeBuffers(Splats.Num());
      }


      // Constants
      ShaderParameters->GlobalTint = DIProxy.GlobalTint;
      ShaderParameters->SplatsCount = DIProxy.SplatsCount;
      // Assign initialized buffers to shader parameters
      ShaderParameters->Positions = DIProxy.PositionsBuffer.SRV;
      ShaderParameters->Scales = DIProxy.ScalesBuffer.SRV;
      ShaderParameters->Orientations = DIProxy.OrientationsBuffer.SRV;
      ShaderParameters->SHZeroCoeffsAndOpacity =
        DIProxy.SHZeroCoeffsAndOpacityBuffer.SRV;
  }
}

Specifically, this transfers the buffer data from the NDI proxy (DIProxy) into the corresponding HLSL shader parameters, as described by the FGaussianSplatShaderParameters struct.

That was a lot of code! If you managed to follow the full process, congratulations! You are now pretty much done with the low-level implementation. Let鈥檚 back up one level and finish some of the leftovers to complete our Gaussian Splat viewer!

Register our custom NDI and NDI proxy with Niagara

One last thing required to access our custom NDI inside the Niagara property types is registering it with the FNiagaraTypeRegistry. For convenience, we decided to do it inside the PostInitProperties of our NDI, where we also create the NDI proxy that will transmit data from the CPU to the GPU.

void UGaussianSplatNiagaraDataInterface::PostInitProperties()
{


  Super::PostInitProperties();


  // Create a proxy, which we will use to pass data between CPU and GPU
  // (required to support the GPU based Niagara system).
  Proxy = MakeUnique<FNDIGaussianSplatProxy>();
 
  if(HasAnyFlags(RF_ClassDefaultObject))
  {
    ENiagaraTypeRegistryFlags DIFlags =
      ENiagaraTypeRegistryFlags::AllowAnyVariable |
      ENiagaraTypeRegistryFlags::AllowParameter;


    FNiagaraTypeRegistry::Register(FNiagaraTypeDefinition(GetClass()), DIFlags);
  }


  MarkRenderDataDirty();
}

Here is a screenshot of our updated shiny Niagara system making use of our custom NDI and getter functions exposed in its graph!

The big challenge of converting from PLY to Unreal coordinates

There is hardly any documentation currently available online to explicitly specify the conversions required to transform data coming from a PLY file into Unreal Engine. 

Here are some funny (and painful) failures we had to go through before finding the right conversions.


After many trials and mathematical calculations, we were finally able to establish the proper conversion. For your convenience, here is the list of operations to do it:

Position (x, y, z) from PLY
Position in UE = (x, -z, -y) * 100.0f

Scale (x, y, z) from PLY
Scale in UE = (1 / (1 + exp(-x)), 1 / (1 + exp(-y)), 1 / (1 + exp(-z))) * 100.0f

Orientation (w, x, y, z) from PLY
Orientation in UE = normalized(x, y, z, w)

Opacity (x) from PLY
Opacity in UE = 1 / (1 + exp(-x))

In order to keep performance optimal, these conversions are performed on load rather than at runtime, so that once the splats are in the scene, no update is required per frame.
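As a minimal sketch, assuming the parsed values are already stored in the FGaussianSplatData struct from earlier (the helper function and its name are ours, not the original code), the on-load conversion could look like this:

void ConvertSplatToUnreal(FGaussianSplatData& Splat)
{
    const float MetersToCentimeters = 100.0f;
    auto Sigmoid = [](float V) { return 1.0f / (1.0f + FMath::Exp(-V)); };

    // Position: swap and negate axes, then convert metres to centimetres.
    const FVector P = Splat.Position;
    Splat.Position = FVector(P.X, -P.Z, -P.Y) * MetersToCentimeters;

    // Scale: sigmoid activation, then convert metres to centimetres.
    const FVector S = Splat.Scale;
    Splat.Scale = FVector(Sigmoid(S.X), Sigmoid(S.Y), Sigmoid(S.Z)) * MetersToCentimeters;

    // Orientation: the parser is assumed to have already mapped the PLY
    // wxyz components into the FQuat; here we only normalize the result.
    Splat.Orientation.Normalize();

    // Opacity: sigmoid activation.
    Splat.Opacity = Sigmoid(Splat.Opacity);
}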

Here is how the resulting Gaussian Splat viewer looks once the right conversions are applied at the end of the process described in this article.

There are some more bits and bobs of code to deal with further geometric transformations and clipping, but those remain outside of the scope of this article.

The final result with some more feedback

This has been a very long journey, resulting in a very long article, I admit. But I hope it has inspired you to better understand how Niagara in Unreal can be customized to interpret your custom data; how it is possible to optimize its performance via GPU-based HLSL shader code injected from your custom Niagara Data Interface and Niagara Data Interface Proxy; and finally how Gaussian Splats can be viewed in the viewport after all this hard work!

Thank you for following this journey and feel free to and on LinkedIn for more tech-based posts in the future!

Happy coding! 馃檪

]]>
How we wrote a GPU-based Gaussian Splats viewer in Unreal with Niagara
Meet the Magnopians: Chris Kinch草莓视频在线Tue, 30 Apr 2024 09:45:44 +0000/blog/meet-the-magnopians-chris-kinch618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:662fbf489fe629730e175bc5Chris Kinch is a Psychology Lecturer turned 3D Artist, using his work ethic from academia to teach himself 3D art for games during the COVID pandemic. He is a Junior Artist at 草莓视频在线, and has recently celebrated his 1 year anniversary at the company. We caught up with him to discover what attracted him to 草莓视频在线 in the first place, and hear more about his pivot from Psychology to Art. 


Tell us more about your role at 草莓视频在线

I鈥檓 a Junior Artist, so my day-to-day is mostly spent modelling and texturing assets for our current project with direction from our Senior Artists. I'd describe my current role as 'generalist' 鈥 we're a fairly small team so it's helpful to be able to move around based on the project's needs, whether that's environment work, props, set dressing, level creation, etc.

You鈥檝e been at 草莓视频在线 for a year now, but what attracted you to 草莓视频在线 in the first place?

To be 100% honest, it was my mentor at the time who recommended I apply for the position! The name 草莓视频在线 wasn't on my radar, but it turned out that 草莓视频在线 had worked on the first VR game I ever played 鈥 Mission: ISS.

Looking back, it was the recruitment process that made 草莓视频在线 stand out against other companies I was interviewing for. The art test set for me was a bespoke task based on feedback that the Art Director at the time had given on one of my portfolio pieces. It's rare to get feedback even at the end of an interview, so right from the start I got the impression that 草莓视频在线 was a company that invested in its artists. 

Mission: ISS lets users explore the International Space Station in detail and understand what it鈥檚 like to be an astronaut in a way that鈥檚 never before been possible.

What made you decide to pursue a career in this field?

Before 3D, I was part way through a PhD in Psychology but that all came to a stop during the COVID lockdowns. I've always been interested in art as a hobby, so with a lot of time suddenly on my hands, and very much on a whim, I took a stab at some 3D tutorials on YouTube. I can't say for sure what pulled me in, but I was totally hooked! It's terribly clich茅, but after some time I knew I wanted to do this as a career. After some emotional discussions, I withdrew from my PhD and put my best foot forward!

What skills are essential for anyone in your role?

More virtue than skill, but I'd say humility is really important. We can put so much effort into the work we do that sometimes it can be difficult to remove ourselves from the process of receiving feedback and iterating. Similarly, I think it's important to learn not to be precious about our work 鈥 sometimes the best approach is to start from scratch even though that can be really difficult!

I also think problem-solving is sometimes overlooked when you're getting started in 3D. It can be a bit more technical than other art disciplines, so not being discouraged when things don't work as expected and learning to troubleshoot and research solutions are important. It's kind of inseparable from the discipline, so if you can learn to enjoy it (almost) as much as the creative parts, you'll have far fewer headaches!

What鈥檚 the best piece of advice you鈥檝e ever been given?

Your 100% effort doesn't always look the same. Sometimes, you might only have 50% left in the tank - if you give that 50%, you are giving 100% of what you have in that moment. 

If you could have any other job in the world, what would it be?

Tough one. Maybe a carpenter? I feel like I'd still want to make things. 

Where would you most like to travel to in the world?

I'd love to go back to Japan. I spent a week there and it wasn't nearly enough time!

How do you want to leave a mark on the world 鈥 personally or professionally?

When I was starting out learning 3D, I relied a lot on the generosity of other artists who created free instructional content or volunteered their time to give advice (to be honest, I still do!). One day, I'd also like to be in a position where I can provide similar help to other new artists.  

What are you reading/listening to/watching right now?

Right now I'm watching The Rookie 鈥 that Cop Cuties song went viral on TikTok and has had me in a chokehold ever since. I'm also finishing up rereading The Witcher books!

]]>
Meet the Magnopians: Chris Kinch
Ben Grossmann speaks to fxguide about how LED volumes were used as a storytelling device in 鈥楩allout鈥.ExternalWed, 17 Apr 2024 09:07:00 +0000https://www.fxguide.com/fxfeatured/inside-the-led-bunker-of-fallout/618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:66e946d2bcf7e65482b7e0b6

The article explores the use of LED volumes in the production of Amazon Prime's Fallout. It details how physical and digital production methods combined to create immersive environments, utilizing advanced Unreal Engine technology for in-camera visual effects (ICVFX). The collaboration between the virtual art department and production teams streamlined workflows, enhancing creativity and efficiency. The LED volumes enabled seamless integration of real-time 3D assets with practical sets, reducing the need for post-production VFX replacement and capturing final pixels directly on set.

Permalink

]]>Ben Grossmann speaks to fxguide about how LED volumes were used as a storytelling device in 鈥楩allout鈥.Sharing models and custom nodes in ComfyUIMarianne GorczycaWed, 20 Mar 2024 09:24:18 +0000/blog/sharing-models-and-custom-nodes-in-comfyui618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:65f977eb1cf1dd191def902aIntroduction

Image generated using ComfyUI

The ability to create content in response to image or text-based prompts using Generative AI is a burgeoning area of Artificial Intelligence. Stable Diffusion models are a family of Generative AI models spearheaded by Stability AI in 2022 that are gaining in popularity. Stable Diffusion models are open-source and allow you to train new models on your own datasets. While all diffusion models enable generating images from prompts, the technology used for Stable Diffusion requires less processing power and is more readily utilized on consumer-grade graphics cards. ComfyUI is a popular node-based Stable Diffusion graphical user interface (GUI) that generates images in response to positive and negative prompts.

The Stable Diffusion process requires a model checkpoint as a basis for generating images. and are popular sites for downloading models.

While ComfyUI鈥檚 default nodes provide basic capabilities for generating images, it鈥檚 likely that you will need to install additional nodes for more advanced processing. Additional nodes are available from a variety of sources. has links to more than 700 Github repositories.

Our need

By default, ComfyUI accesses models and nodes via 鈥榤odels鈥 and 鈥榗ustom_nodes鈥 folders that are part of its individual installation folder hierarchy. Here at 草莓视频在线, we wanted to benefit from an installation that would be shared among project team members.

A shared installation addresses some downsides of working with ComfyUI. The files for the models and nodes take a lot of storage space. With individual installs, new users need to put the necessary models and nodes in place, which is time-consuming. Also, versions used across the team can easily get out of sync.

Our solution

We filled this need with ComfyUI鈥檚 鈥--extra-model-paths-config鈥 command-line argument and pointed it to a shared network drive.  At first glance, one might think that this argument is used to specify paths for models only. In fact, its value is a yaml file that points to additional locations for models as well as custom nodes.

Using a centralized drive created several efficiencies in our use of ComfyUI. We saved disk space by avoiding multiple installs of large files. Onboarding was faster because the models and custom nodes were immediately available for new users. By using one copy of the files, versioning was easily managed. Updates were instantly available to everyone. Cloud-based resources became an option, providing flexibility, scalability, and accessibility for team members working remotely or in distributed environments.

Sample configuration

In this example, 'extra_model_paths.yaml' is in 'X:\comfyui_models', which has subfolders 'models' and 'custom_nodes'. The X drive in this example is mapped to a networked folder which allows for easy sharing of the models and nodes. The contents of the yaml file are shown below.

#config for comfyui

#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.

comfyui:
    base_path: X:\\comfyui_models
    checkpoints: models\checkpoints\
    controlnet: models\controlnet\
    custom_nodes: custom_nodes\
    loras: models\loras\

The 'extra_model_paths.yaml' file is supplied when ComfyUI is started from the command line as shown here:

python main.py --extra-model-paths-config X:\comfyui_models\extra_model_paths.yaml

Upon startup from the command line, the console output includes the information below, which shows that the paths specified in the yaml file are used in addition to the default locations. 

 Import times for custom nodes:
   0.3 seconds: X:\\comfyui_models\custom_nodes\ComfyUI_UltimateSDUpscale
   0.3 seconds: X:\\comfyui_models\custom_nodes\ComfyUI-Manager
   0.7 seconds: X:\\comfyui_models\custom_nodes\comfyui_controlnet_aux
   0.8 seconds: X:\\comfyui_models\custom_nodes\ComfyUI_Comfyroll_CustomNodes
   1.9 seconds: X:\\comfyui_models\custom_nodes\ComfyUI-Inference-Core-Nodes
   3.6 seconds: X:\\comfyui_models\custom_nodes\ComfyUI-Impact-Pack
   3.8 seconds: C:\comfyui\ComfyUI\custom_nodes\was-node-suite-comfyui

Sample ComfyUI workflow

In this ComfyUI workflow, the 鈥業mage Save鈥 custom node is from 鈥榳as-node-suite-comfyui鈥 on the C-drive and the 鈥業mage Luminance鈥 node is from 鈥榗omfyui_controlnet_aux鈥 on the X-drive. Both nodes are found and the image is generated as desired.

Conclusion

At 草莓视频在线 we found that this process improved productivity on our projects, both by saving setup time when adding team members and by reducing the need to debug errors for missing components. By not requiring hands-on installs for each user, this solution fosters automation of our workflow.

We hope that you see the same benefits when leveraging this tip.


It was brought to our attention after publishing this article that while ComfyUI itself remains secure, a malicious custom node called "ComfyUI_LLMVISION" was uploaded by a user and contained code designed to steal sensitive user information, including browser passwords, credit card details, and browsing history. Though this particular custom node is unrelated to this article, we wanted to remind you to remain vigilant when integrating third-party components into your workflows. Read more about this particular incident .


]]>
Sharing models and custom nodes in ComfyUI
Meet The Magnopians: Ian Villamor草莓视频在线Mon, 11 Mar 2024 09:54:13 +0000/blog/meet-the-magnopians-ian-villamor618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:65ead3e42f0cdc0adba45d17

Ian Villamor is a Senior Cultural & Office Ambassador at 草莓视频在线. He鈥檚 had a varied career history including roles in production, social media, and operations. Ian thrives in people management roles and enjoys being front of house for the LA office. 


Tell us more about your job role at 草莓视频在线

I work as part of the Client Hospitality team to make sure visitors to 草莓视频在线 have the best experience possible. That also includes looking after our employees! I am a people person and thrive on engaging with all types of people. I love being the first face people see when they walk into the office!

What attracted you to 草莓视频在线?

Believe it or not, I worked alongside Ben Grossmann, Alex Henning, and Lap Van Luu at a company called The Syndicate way back in the early 2000s. So when I was looking for a change of role after being in production, I saw that these amazing people had set up a little company called 草莓视频在线 which was taking the world by storm. 

What鈥檚 your origin story, what path did you take to get here?

I鈥檝e been in production and post-production for most of my working life. I also got into Instagram when it first launched and ran my own company doing social media, and working with the influencers of the time (before they were called influencers!). I also worked with Animal Planet and had the opportunity to travel the world to places like India, Borneo, and Indonesia. It was incredible. I鈥檝e done a lot of stuff and had many careers that have led me here. 

What three skills are essential for anyone in your role?

This is tough because I鈥檇 say number one is patience. And that鈥檚 not something that can be taught or bought. You鈥檝e just got to learn to be patient with people as you鈥檙e dealing with lots of different personalities, and also different levels of seniority. From CEOs of the biggest companies in the world to the maintenance team, to your goofball colleagues, you鈥檝e got to adjust on the fly to adapt your communication style. 

The second is remembering the small details about people (this is one of my superpowers!). Make sure to remember people鈥檚 names, the names of their partners, kids, dogs, how they like their coffee, etc. It makes people feel valued, seen, and appreciated. People will never forget you that way, and in our industry, it can be that personal touch that makes people want to come back and work with us, or work for us because they know they matter to us.

The last, and probably most important one, is anticipation. Anticipating people鈥檚 needs will get you far because it allows people to trust you. If I already know what a client or employee is thinking before they mention it, they know I鈥檝e got them, understand them, and will look after them. There鈥檚 so much stuff people need to make them be their best, but wouldn鈥檛 necessarily ask for. If you can give them those things before they even ask, you鈥檒l get the best out of your employees and clients will want to come back.

If you could have any other job in the world, what would it be?

This one's easy, a comedian! But in all honesty, I'm too scared to do it. As I get older I realize that I don't care as deeply about others' perception of me, so I should just do it! Nobody wants to be booed off stage, but if that's the worst that can happen then I should totally give it a go.

Who are your 3 dream dinner guests?

Jodie Comer, she is beauty and brains combined, and she would keep us entertained with her incredible array of accents. And Robin Williams, because I feel like he would be an amazing mentor, and we鈥檇 feed off each other鈥檚 energy. Lastly, the legend that is Kobe Bryant, because he鈥檚 the GOAT. From all of these people, I think I鈥檇 just learn so much.

]]>
Meet The Magnopians: Ian Villamor
An approach to automatic object placement in mixed reality for more tailored experiencesDouglas Cooper-HindThu, 29 Feb 2024 15:09:10 +0000/blog/an-approach-to-automatic-object-placement-in-mixed-reality-for-more-tailored-experiences618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:65e0734efab60c5c49c04a0f

Introduction

We recently worked with Meta to develop a mixed reality (MR) adventure built in Unity. The objective was to use features to show Quest developers how they can turn any room into a tailored gameplay experience. For this project, we also built a procedural spawning system from the ground up. This allows virtual objects to be seamlessly integrated with the physical room - and it is what we'll dive into in this blog post. For more information on the MR experience, check out the case study here.

Scene Understanding 

To place objects in a room, the Quest headset first needs to understand the shape and layout of the room. Depending on which headset you鈥檙e using, you鈥檒l need to trace the outline of the walls or scan the room to get the shape of the room, then manually outline and label the doors, windows, and furniture. This functionality to scan and label a room is built into the headset, via the Space Setup menu. The results of the setup are available via the Meta XR SDK. This relatively simple solution to spatial understanding means that it only supports planes and cubes, but does simplify reasoning around the space for developers. 

Procedural Placement

Using the scene primitives returned by the Space API, a 3D grid of cells is generated to cover the entire room and track where scene objects are placed. This allows the room setup system to detect safe locations for placing game objects and minimize overlapping with real-world objects.

Free space in the 3D grid is visualized as green cells, while non-valid locations are shown in red. For example, in the following image, you can see available space in green, but locations blocked by tables, a sofa, and a sideboard are marked in red. As new digital content is placed in the environment more cells are marked as non-valid so digital assets do not overlap with each other. 

There are four categories of placement location: "floor", "wall", "desks", and "against wall". "Against wall" objects are the ones that need to be on the floor but also against a wall, for example, a cabinet or wardrobe.

As mentioned, when an object is placed, it blocks out the cells that it covers to prevent other objects from being placed in the same location. As a result, we have to be mindful of the order we place objects depending on the space requirements.

Objects are placed in the following order:

  1. Objects on the floor against a wall

  2. Objects on walls

  3. Objects on desks (if no desks are available, fallback to the floor)

  4. Objects on the floor

  5. Objects on any horizontal location

Within each category, objects are placed in order of size 鈥 from largest to smallest. If an object fails to find a safe location, it鈥檚 placed in a random logical location. For example, wall objects will stay on walls and floor objects will stay on the floor, even if they are overlapping with other digital objects or scene objects.

Objects that are not on or against a wall face towards the center of the largest open space in the room, in most spaces this should mean that items are placed in ways that users can reach them. 

Other methods we tried included:

  • Facing the user; however, the user may be standing off in one corner during the setup process, resulting in less ideal rotations.

  • Simply pointing at the geometric center of the room. This is also not ideal, as the center may fall in the middle of a piece of furniture such as a table, meaning users would need to awkwardly lean over the table to reach items.

The center of the largest open space is calculated on the floor, while the Scene Understanding is loading and cells are generated. We calculate the distance to the nearest blocked cell, creating a distance field. The center location is the floor cell that is furthest from any blocked cell. 
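As an illustration of the idea (a standalone C++ sketch of a multi-source breadth-first search, not the project's Unity implementation), the distance field can be built by seeding the search with every blocked cell and expanding outwards; the free cell with the largest resulting distance marks the center of the largest open space.

#include <queue>
#include <vector>

struct Cell { int X; int Y; };

std::vector<int> ComputeDistanceField(const std::vector<bool>& Blocked, int Width, int Height)
{
    std::vector<int> Distance(Width * Height, -1);
    std::queue<Cell> Frontier;

    // Seed the search with every blocked cell at distance 0.
    for (int Y = 0; Y < Height; ++Y)
        for (int X = 0; X < Width; ++X)
            if (Blocked[Y * Width + X]) { Distance[Y * Width + X] = 0; Frontier.push({X, Y}); }

    // Expand outwards: each free cell ends up holding its distance to the nearest blocked cell.
    const int DX[4] = {1, -1, 0, 0};
    const int DY[4] = {0, 0, 1, -1};
    while (!Frontier.empty())
    {
        const Cell C = Frontier.front();
        Frontier.pop();
        for (int I = 0; I < 4; ++I)
        {
            const int NX = C.X + DX[I];
            const int NY = C.Y + DY[I];
            if (NX < 0 || NY < 0 || NX >= Width || NY >= Height) continue;
            if (Distance[NY * Width + NX] != -1) continue;
            Distance[NY * Width + NX] = Distance[C.Y * Width + C.X] + 1;
            Frontier.push({NX, NY});
        }
    }
    // The index of the maximum value is the floor cell furthest from any blocked cell.
    return Distance;
}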

A challenge throughout the placement system was how to use the space most effectively, specifically making it work well in small spaces. We expected the experience to mostly be played in residential settings, where large open spaces are not usually present.


To help pack the objects more effectively, the largest compromise made was to rotate floor-only objects to make them align to the cells on the floor. In this way, they block the minimal number of floor cells and would most likely pack together closely in smaller spaces. Although aligned with the floor cells, the object is still rotated to the closest rotation that faces the calculated room center. 

Since we do not have control over the size and shape of the space that the game is played in, the cell-based placement system can potentially fail to place all objects safely, especially in smaller spaces.

So, after all objects have been automatically placed, we apply a few seconds of 鈥渆asing鈥, where any objects that are overlapping are gently repelled away from each other.

The easing is done using the Unity physics engine function , which gives both a direction and distance to move objects so they no longer overlap. Rather than immediately placing the objects at these new locations, the system smoothly moves towards the location. Easing is applied gradually so if the movement causes new overlaps to occur all objects have the opportunity to move away from each other and prevent penetration through walls.

Once the placement is complete, the user then previews the generated layout. In this phase, the game objects are visualized as colored boxes that represent the volume occupied by the digital game objects:

  • Blue boxes: items that need to be physically reached during the experience because the user will interact with them using hands/controllers.

  • Green boxes: items that need to be visible, but don't need to be reached.

  • Red boxes: items that are in a non-valid location, are overlapping with each other, or are overlapping with scene objects. In this phase, the user has an opportunity to manually move these boxes around, if necessary, to make sure there are no red boxes and that all blue boxes can be physically reached.

After this, the user can confirm the layout and start the experience. Object positions are then stored and used during gameplay to avoid re-doing these heavy calculations.

This system means that every time the experience is played the objects get a new arrangement which adds variety and fun.

Where Next?

This is an exciting area of XR development that鈥檚 constantly evolving, some small ideas for how this system could be expanded on to improve user experience or reliability could include:

  • A soft prompt to the user if objects continue to overlap after easing has completed, directing their attention to the issue.

  • Add functionality that lets developers provide a map of where pieces of digital content should sit relative to one another, which is then adapted to the available space.

  • For applications that have a large number of objects, it may make sense to group smaller objects with larger ones. For example, placing objects under digital tables.

Head over to GitHub for the documentation and links to the open-source code.

]]>
An approach to automatic object placement in mixed reality for more tailored experiences
Trilinear Point Splatting (TRIPS) and its advantages over Gaussian Splatting and ADOPAlessio RegalbutoTue, 20 Feb 2024 10:59:25 +0000/blog/trilinear-point-splatting-trips-and-its-advantages-over-gaussian-splatting-and-adop618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:65cdfc8b98c6e56fd8e74d8f

Gif extracted from

Introduction

Since the great success of and Gaussian Splatting in the creation of digital twins of existing physical places for virtual production and immersive experiences, new approaches and innovative research have pushed the boundaries of visual realism even further.

In this article, we will have a look at "TRIPS", or "Trilinear Point Splatting", a new technique in the radiance fields realm that could represent the next step in visual fidelity for real-time rendered digital twins.

A Recap on Gaussian Splatting and ADOP

What is Gaussian Splatting?

If you're not familiar with Gaussian Splatting, don't worry! Check out this comprehensive article to gain a better understanding of how it works and stay informed about developments in this research field.

In short, 3D Gaussian Splatting is a sophisticated technique in computer graphics that creates high-fidelity, photorealistic 3D scenes by projecting points, or "splats," from a point cloud onto a 3D space, using Gaussian functions for each splat instead of the traditional 鈥渄ots鈥. This produces more accurate reflections, colors, and refractions for the captured scene while providing performant rendering for the digital environment in real-time.

Despite its benefits and great performance improvements, especially when compared to traditional NeRF techniques, some of the resulting Gaussian Splats are affected by visual artifacts, such as 鈥渇loaters鈥 (pieces of the point cloud that are not visualized correctly and look like floating pieces of clouds in the wrong place) and blurry areas (especially where the input photos present small overlaps or fail camera alignment during the training process).

Luckily, workarounds exist to improve the visual quality of the final result, such as with 鈥溾, which could be a promising alternative.

Gif extracted from

However, as of today the code for this deblurring technique is still a work in progress and not yet available on GitHub. I would recommend following the updates of this specific research, which in the future might offer good results in terms of visual realism and clarity for Gaussian splats.

What is ADOP?

Another state-of-the-art technique that deals with the problem differently is ADOP (Approximate Differentiable One-Pixel Point Rendering).

Source:

There is a GitHub repository that shows examples of its applications and offers the tools to train custom datasets. Alongside the detailed explanation and powerful UI developed for this project, the experimental ADOP VR Viewer is another interesting utility, which uses OpenVR/SteamVR to visualize the final result in virtual reality.

Source:


The ADOP pipeline supports camera alignment using COLMAP, which aligns with the workflow also supported by . This involves aligning the input images before training them with the provided training models. To facilitate this process, the ADOP repository provides a "colmap2adop converter", which must be used before running the "adop_train" executable to generate the final result.

Despite its advantages, ADOP can be affected by temporal instability (which causes some areas of the viewed environment to change depending on the viewing angle) and reduced visual performance due to the specific neural reconstruction network it requires, which cannot effectively address large gaps in the point cloud or deal with unevenly distributed data (as discussed in their paper).

Wouldn鈥檛 it be great to have an alternative that combines the approach of Gaussian Splatting and ADOP to achieve better results? Enter TRIPS, the new Trilinear Point Splatting technique for Real-Time Radiance Field Rendering.

Introducing TRIPS

Developed by a team from , TRIPS combines the best of Gaussian Splatting and ADOP with a novel approach that rasterizes the points of a point cloud into a screen-space image pyramid.

Technical Innovations of TRIPS

Source:

In contrast to Gaussian-shaped splats, TRIPS renders a point cloud using cubic (or volumetric) splats, defined trilinearly as 2x2x2 blocks, into multi-layered feature maps, which are then processed via a differentiable pipeline, and rendered into a tone-mapped final image


Each Trilinear Point Splat holds several pieces of information, namely:

  • The neural descriptors of the Point (colors).

  • The transparency of the Point.

  • The position in space of the Point.

  • The world space size of the Point.

  • Camera intrinsic parameters (lens parameters, aspect ratio, camera sensor parameters).

  • The extrinsic pose of the target view.

A novelty in their method is that they 鈥渄o not use multiple render passes with progressively smaller resolutions鈥 (as explained in their ) since it would cause severe overdraw in the lower resolution layers. Instead, their paper explains that they only 鈥渃ompute the two layers which best match the point鈥檚 projected size and render it only into these layers as 2x2 splat鈥.

Trilinear point splatting 鈥 image source:

The core of TRIPS involves projecting each point into the target image and assigning it to a specific layer of an image pyramid based on the point's screen space size and point size. Larger points are written to lower-resolution layers, which cover more space in the final image. This organization is facilitated by a trilinear write operation, optimizing the rendering of point clouds by efficiently managing spatial and resolution variability.
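As a rough illustration of that layer-assignment idea (our own simplification, not the authors' code), selecting the two pyramid layers that bracket a point's projected size could look like this:

#include <algorithm>
#include <cmath>
#include <utility>

// Returns the indices of the finer and coarser pyramid layers for a point,
// plus the blend factor towards the coarser layer, based on the point's
// projected size in pixels. Layer 0 holds roughly 1-pixel points and each
// coarser layer covers twice the size.
std::pair<int, int> SelectPyramidLayers(float ProjectedSizePixels, int NumLayers, float& OutBlendToCoarser)
{
    const float LogSize = std::log2(std::max(ProjectedSizePixels, 1.0f));
    const int Finer = std::min(static_cast<int>(LogSize), NumLayers - 1);
    const int Coarser = std::min(Finer + 1, NumLayers - 1);
    OutBlendToCoarser = LogSize - static_cast<float>(Finer); // Fractional part drives the blend.
    return {Finer, Coarser};
}

In the actual method, the point is then written as a 2x2 splat into each of the two selected layers, with bilinear weights inside each layer and this fractional weight across layers, which is where the "trilinear" in the name comes from.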

This technique shows a conceptual resemblance to hierarchical occlusion culling and Forward+ rendering. Despite their operational differences (occlusion culling efficiently manages visibility through depth hierarchies, and Forward+ optimizes lighting by organizing light sources relative to the camera's view frustum), each approach relies on the same foundational strategy: optimizing processing by distributing data across hierarchical structures. TRIPS accomplishes this via an image pyramid, hierarchical z-map-based occlusion culling via depth layers, and Forward+ lighting via spatial decomposition of the scene. These methodologies demonstrate the versatility and effectiveness of hierarchical data structures in enhancing rendering performance across various domains.

The complete rendering pipeline adopted by TRIPS at a high level is:

  1. Start from a set of input images with camera parameters and a dense point cloud, which can be obtained through methods like multi-view stereo or LiDAR sensing.

  2. Project the neural color descriptors of each point into an image pyramid using the TRIPS technique mentioned above.

  3. Each pixel on each layer of the pyramid stores a depth-sorted list of colors and alpha values that are then blended together resulting in a specific pixel value for each layer of the pyramid.

  4. Each layer ("feature layer") of the resulting pyramid is then given to a neural network, which computes the resulting image via a gated convolution block.

  5. A final post-processing pass applies exposure correction, white balance, and color correction through a physically based tone mapper.
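Put together, the data flow looks roughly like the sketch below. Every function and object name here is a placeholder used to show the shape of the pipeline; the actual implementation is a differentiable renderer, not these stand-ins.

```python
def render_trips_frame(points, view, splatter, neural_net, tone_mapper):
    """Schematic TRIPS data flow; every callable here is a stand-in, not the real API."""
    # Step 2: rasterize the neural descriptors into an image pyramid via trilinear splatting.
    pyramid = splatter.splat_to_pyramid(points, view)

    # Step 3: per pixel and per layer, alpha-blend the depth-sorted descriptor lists.
    feature_layers = [layer.alpha_blend_front_to_back() for layer in pyramid]

    # Step 4: a small neural network (gated convolutions) fuses the feature layers into an image.
    hdr_image = neural_net(feature_layers)

    # Step 5: physically based tone mapping applies exposure, white balance, and color correction.
    return tone_mapper(hdr_image)
```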

For more technical information about the mathematics behind the approach introduced by TRIPS, I recommend reading the paper.

Comparative Analysis

The innovative rasterization approach and the "differentiable trilinear point splatting" introduced by TRIPS allow it to preserve fine details in the reconstructed digital twin, avoiding the blurriness of an equivalent Gaussian Splatting scene while also reducing the number of floaters.

3D Gaussian Splatting (3D GS) vs TRIPS - gif extracted from

In addition, the temporal stability of TRIPS is an improvement over ADOP. The final result visualizes the scene more faithfully: images remain coherent across different viewing angles, so each object is more distinguishable and visually stable.

ADOP vs TRIPS - gif extracted from

In the GIF above, the improved temporal stability of TRIPS over ADOP is most noticeable in the grass and in the leaves of the trees in the background.

The paper provides many more comparisons of the approach versus traditional Gaussian Splatting, ADOP, and additionally . In terms of the fine details recovered after training, the clear winner is TRIPS, with Gaussian Splatting in second place.

Example showing how compares to the more recent Gaussian Splatting and TRIPS training models - gif extracted from

It is also worth mentioning that TRIPS can render the scene at around 60 frames per second, depending on scene complexity and resolution. The only caveat is that camera intrinsics and extrinsics need to be estimated a priori (for example via ).

Applications for TRIPS

When we consider applications of TRIPS in the real world, there are many opportunities, most of which are similar to those afforded by Gaussian Splatting, which you can find in my previous article on the topic.

In the architecture and real estate industries, TRIPS enables immersive virtual tours, allowing clients to explore properties in stunning detail before construction or renovation.

In the entertainment industry, TRIPS could enhance virtual production, providing filmmakers with more realistic and interactive environments for movies or video games.

Additionally, in cultural heritage preservation, TRIPS could offer a tool for creating detailed digital twins of historic sites, enabling virtual exploration and aiding in restoration efforts by capturing and visualizing intricate details with unprecedented clarity.

In Conclusion

It has only been a couple of weeks since the release of the repository, which has already triggered a lot of interest in this novel technique. It's great to see that the author has been very active in updating the page instructions and resolving issues reported by the community.

So far the visual results look very promising, even if some users report that the required training times can be significantly longer compared to other techniques. As an example, when using the "mipnerf360 garden" dataset, it could take around 12 hours of computation on a desktop NVIDIA RTX 3090 GPU to complete one-sixth of the overall computation (according to a report on ). In response to these reported times, the author mentioned that on an NVIDIA RTX 4090 GPU, training on the same dataset would instead take around four hours, which reflects the demanding computational power required to produce the final training results.

I'm sure we'll see more improvements in the coming weeks for this repository, which could potentially make TRIPS the new king of real-time radiance field rendering and digital twin visualization.

I cannot wait to see where this technology will lead us, and I will definitely keep a star on this repository on GitHub. Congratulations to the authors of this fantastic research; they deserve gratitude from the open-source community for the great effort they put into pushing the boundaries of visual realism and digital technology even further.

I hope this article has helped you discover something new and has inspired you to research even more about this topic.

Feel free to and 草莓视频在线 for further updates on the latest technological novelties and interesting facts. Thanks for reading and good luck with your projects!

]]>
Trilinear Point Splatting (TRIPS) and its advantages over Gaussian Splatting and ADOP
It's not all just "AI"Jerome MeyersThu, 15 Feb 2024 11:19:10 +0000/blog/its-not-all-just-ai618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:65ca36dd64517b595a9878d9Apps like ChatGPT and Midjourney have brought "artificial intelligence" into the limelight... or is it "machine learning"? And when we encounter other terms like "neural networks", "deep learning", "stable diffusion", and "large language models", it can all get a little perplexing.

We often have some idea of how these technologies can be used, but just as often we lack any kind of understanding of how everything fits together underneath the surface. In this and a follow-up article, we're going to shoot from the hip for a quick, intuitive understanding of how these concepts and tools relate to one another.

AI vs Machine Learning

There is some confusion between "AI" and "ML", with many people using the terms interchangeably. This confusion is understandable, especially given the dominance of ML approaches. However, the terms have distinct meanings, with one subsuming the other.

Artificial intelligence is the most generic top-level term for the collection of approaches to simulating/automating intelligent behavior. As such, AI encompasses all traditional approaches for reaching said goal. Some popular and/or historical approaches are expert systems, search algorithms, fuzzy logic, and machine learning.

Machine learning, as a sub-discipline within AI, specifically involves fitting mathematical functions to input/output data to predict outputs from novel inputs. It's a data-driven approach, made possible by the abundance of data. Said differently, machine learning involves the development of algorithms that can automatically learn patterns and relationships from large amounts of data.
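As a toy illustration of "fitting a mathematical function to input/output data" (a deliberately minimal example, not representative of modern ML systems), here is a straight line fitted to noisy observations:

```python
import numpy as np

# Toy data: the outputs were generated as y = 3x + 2 plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3 * x + 2 + rng.normal(scale=0.5, size=x.shape)

# "Learning" here is just fitting a line so it predicts y from x;
# np.polyfit finds the slope and intercept that minimise the squared error.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"learned function: y = {slope:.2f}x + {intercept:.2f}")

# The fitted function can now predict outputs for novel inputs.
print("prediction for x = 20:", slope * 20 + intercept)
```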

Most of the widely popular applications of AI today are achieved through the subset of techniques collected under the heading of "machine learning", so it is not surprising that AI and ML are used interchangeably outside academic contexts.

Machine learning plays such a big role in AI these days because its techniques leverage several related advancements of recent decades. Importantly, their matrix math is easily programmed on computers and can be accelerated with GPUs and clustering. Training also crucially depends on, and benefits from, the abundance of data made available in recent years.

In essence, AI is the broader concept, while machine learning is a subset of AI that specifically focuses on training algorithms to learn from data and improve their performance over time.

Neural Networks

When it comes to the discipline of machine learning, there exist many different mathematical approaches to "learn from data" or "fit mathematical functions to data". One approach is to construct multiple decision trees and combine their predictions. In contrast, the "neural network" approach leverages research and insights into the structure and organization of the human brain as well as mathematical techniques like linear algebra.

How does this actually work?

No article of this kind should be without a densely connected neural network diagram:

Credit: ALEXANDER LENAIL

But... how do we read this? What is it actually saying? Is it saying anything? Does it translate into code? Where's the matrix?

Let's use this diagram to help connect our own dots in this complicated conceptual space.

An Algorithm

At the very highest level, you can read it as describing a flow of computation from left to right. It's an algorithm.

One level below that, we can say that it describes a function that takes in an array of 4 numbers on the left (the 4 empty circles), expands those into 12 numbers, reduces those to 10 numbers and finally outputs an array of 2 numbers on the right. We can see these are labeled "input layer", "hidden layer" and "output layer" in the diagram.

If we look even closer, the first element of the diagram we see on the left is the circle. Circles represent inputs that accept a value, but they are also "activation functions" that transform the value provided to them. A common activation function is ReLU, which keeps the value if it's greater than 0 and returns 0 otherwise. Another is the sigmoid function, which turns any number into a value between 0 and 1.
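Both activation functions are only a couple of lines of code. Here is a minimal sketch (using NumPy for convenience):

```python
import numpy as np

def relu(x):
    """Keep positive values, clamp everything else to 0."""
    return np.maximum(0, x)

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

print(relu(np.array([-2.0, 0.5, 3.0])))      # [0.  0.5 3. ]
print(sigmoid(np.array([-2.0, 0.0, 2.0])))   # [~0.12  0.5  ~0.88]
```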

The next thing we see, starting on the left and moving right, are the lines that connect circles on the first layer and the circles on the second (labeled "hidden") layer. These can be thought of as weighted connections between inputs that multiply the input value by the connection's weight as the value "passes through it".

Lots of things could be varied about this diagram. Beyond the more obvious choices of adjusting the weights of the connections, varying the number of inputs on any of the layers, and adding hidden layers, it should be noted that not every input on a layer needs to be connected to every other input on the nearby layers. Also, inputs need not be connected only to inputs one layer away; you can even connect the output layers to the input layers.

While all that may make sense in a vacuum, how is any of it actually useful? How could this magnificent abstraction be in any way put into practice? 

This question is particularly relevant in consideration of the fact that all machine learning algorithms operate on numbers.

Tokens

Consider that a first step to, for instance, processing a paragraph of text in a ChatGPT-like LLM scenario is to convert the paragraph of text into "tokens". You can see how ChatGPT does this using the online tokenizer, where you can look at the tokenized text and the Token IDs.

For instance, the sentence Magritte made the statement "this is a question." ?

is broken into the following fragments:

These fragments are converted into the following array of numbers, or integer token IDs:

[34015, 99380, 1903, 279, 5224, 1054, 576, 374, 264, 3488, 2029, 30]

The idea behind these numbers is just that they are consistent, so that the same fragments of sentences will always result in the same sequence of numbers. This way, as the tokens are found again and again amongst the billions of training sentences, patterns can settle out of their recurring proximities to one another. These patterns are what the mathematical functions are fitting to.
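You can reproduce this kind of tokenization yourself with the open-source tiktoken library. The sketch below uses the cl100k_base encoding; depending on the encoding you pick, the exact IDs may differ from the array above, so treat the output as illustrative.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # one of OpenAI's published encodings

text = 'Magritte made the statement "this is a question." ?'
token_ids = enc.encode(text)

print(token_ids)                              # a list of integers, similar to the array above
print([enc.decode([t]) for t in token_ids])   # the text fragment behind each ID

# The mapping is consistent: encoding the same text always yields the same IDs.
assert enc.encode(text) == token_ids
```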

These tokens are passed to the input layer (of a much more elaborate artificial neural network than what we have pictured above). In fact, tokens using this same encoding are expected to come out of the output layer of an LLM to be converted back into human-readable text. We'll go into greater detail about how this works when we discuss LLMs specifically.

Context Windows

This already helps us to understand a common feature of tools like ChatGPT called the "context window", which determines how large our prompt can be. We can read that ChatGPT 4.0 Turbo has a context window of 128K, so it can be given 128,000 tokens (or pieces) of text at once. This context window is the size of the input array, in tokens. So, using ChatGPT 4.0 Turbo, you can pass 128,000 tokens (about 100,000 words) to the input layer.

In summary, the diagram above describes what happens to input values. The lines connecting nodes represent "weights" that are multiplied by the values arriving from the left. The values flow through the computation, adjusted by weights and activation functions, and are expanded, combined, and reduced as they pass through the layers.
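In code, that whole flow reduces to a few matrix multiplications interleaved with activation functions. Below is a minimal sketch with randomly initialised weights, matching the 4 -> 12 -> 10 -> 2 shape described earlier (biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(42)

# Randomly initialised weight matrices for a 4 -> 12 -> 10 -> 2 network.
W1 = rng.normal(size=(4, 12))
W2 = rng.normal(size=(12, 10))
W3 = rng.normal(size=(10, 2))

def relu(x):
    return np.maximum(0, x)

def forward(inputs):
    """One left-to-right pass: multiply by weights, apply activations, layer by layer."""
    hidden1 = relu(inputs @ W1)    # 4 values expand to 12
    hidden2 = relu(hidden1 @ W2)   # 12 values reduce to 10
    return hidden2 @ W3            # 10 values reduce to the 2 outputs

x = np.array([0.5, -1.2, 3.0, 0.1])
print(forward(x))  # the same 4 values in always give the same 2 values out
```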

Deep Learning

Deep learning is simply an approach to machine learning that uses artificial neural networks with 2 or more hidden layers. The above diagram is technically a "deep neural network". What's been found is that by adding many hidden layers and playing around with those activation functions and the connections between layers, a surprising degree of order can be automatically extracted from input data. It's more of a craft than a science at this point.

It should be acknowledged that nobody really understands why a dense interconnection of weights and activation functions, organized in just the right way (but differently here and there), can extract and functionally represent complex and implicit relationships between inputs and outputs.

There's the actual magic.

If you've come this far and find yourself wanting more, consider (free ).

Training & Models

Now that we've waded into neural networks and stuck our big toe into the deep learning current, let's consider training and the output of training.

First, it's necessary to understand that training occurs at a higher level than the generic computational flow described above. That was simply a computation performed on an array of inputs, and like any mathematical function, by putting the same array of 4 values in, you'll always get the same array of 2 values out.

The computational flow we described in the previous section is the 'model', and as we alluded to earlier, it can be reduced to a series of matrix multiplications with the weights and other parameters included.

On the other hand, training a model occurs at a different level and involves using one or more strategies to adjust the weights between the inputs. One way to accomplish this is by running the model over and over on training data that consists of inputs paired with known outputs, and adjusting the weights to bring the actual output closer to the expected output. Each of these runs through the full training dataset is known as an "epoch". A model is finalized by doing many epochs. This is called supervised learning. There's also , , and more.

When it comes to supervised learning, you want to set aside some training data as validation data. This data isn't used for training but helps test the model on unseen examples, allowing you to determine when training is sufficient. This is an important step to avoid overfitting your model to your training data. If a model is overfit to its training data, it will likely not handle new data well.
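Putting the pieces together, here is a deliberately tiny supervised training loop (plain gradient descent on a linear model) showing epochs, weight adjustment, and a held-out validation set. The dataset and model are made up for illustration; real training uses far larger models, datasets, and frameworks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised dataset: inputs paired with known outputs (y = x1 - 2*x2 plus noise).
X = rng.normal(size=(200, 2))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.05, size=200)

# Hold out some data as a validation set; it is never used to adjust the weights.
X_train, y_train = X[:160], y[:160]
X_val, y_val = X[160:], y[160:]

w = np.zeros(2)                # the "model": just two weights here
learning_rate = 0.5

for epoch in range(20):        # each full pass over the training data is an epoch
    error = X_train @ w - y_train
    grad = X_train.T @ error / len(y_train)    # gradient of the mean squared error
    w -= learning_rate * grad                  # nudge weights toward the expected outputs

    val_loss = np.mean((X_val @ w - y_val) ** 2)
    print(f"epoch {epoch:2d}  validation loss {val_loss:.4f}")

print("learned weights:", w)   # should end up close to [1, -2]
```

Watching the validation loss is what tells you when to stop: if it starts climbing while the training error keeps falling, the model is overfitting.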

It should be noted that the output of the "training" process is a modified version of the model itself. The adjustments were to the weights, and perhaps to other aspects of the model such as the activation functions or other parameters. The accumulated refinements of the malleable variables of the original model are the whole tamale; they are what's produced by the ocean-boiling GPU clusters.

These models can't be further flattened or simplified; they are already the flattened simplification of the complicated implicit structure we magically extracted from our training data.

What is delivered as a whole is the sequence of transformations the values undergo through the layers: multiplication by the weights and the application of the activation functions. This is the model everyone keeps talking about (and downloading).

Fine Tuning & LoRAs

These facts about the model imply that if you can get a pre-trained model you can train it further, since to train it is simply to adjust some of the values contained within the model. This is called fine-tuning. There are many approaches to fine-tuning and the details will depend on the kind of model you are adjusting and what you are trying to achieve. 

A fine-tuned model will be mostly the same size as the original. As model architectures grow larger, this can become a challenge, as they require more and more powerful hardware to store the entire model in memory at once, as is necessary for training. Training modern models often happens on huge clusters of computers that are not available to the average business or hobbyist.

Low-Rank Adaptation (LoRA) is a method that addresses these issues by storing a smaller set of weights that can be expanded and merged into specific layers of the pretrained model, such as the cross-attention layers. While you can use a fine-tuned model the same way you use the original pre-trained model, with LoRAs you have to apply the adjustment weights after loading the model itself. LoRAs can be used with both pre-trained and fine-tuned models, although they may target or work better with one or the other.
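The arithmetic behind this is compact enough to sketch. Instead of storing a full replacement weight matrix, a LoRA stores two thin matrices whose product is the adjustment, which is scaled and added to the frozen weights at load time. The shapes, rank, and scaling factor below are illustrative choices, not values from any particular model or framework.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 1024, 1024, 8         # a full layer vs. a rank-8 adaptation

W = rng.normal(size=(d_out, d_in))        # pre-trained weight matrix (kept frozen)

# The LoRA consists of only these two small matrices, trained on the new task.
A = rng.normal(size=(rank, d_in)) * 0.01
B = np.zeros((d_out, rank))               # commonly initialised to zero
alpha = 16                                # scaling factor applied when merging

# At load time the low-rank adjustment is expanded and merged into the frozen weights.
W_adapted = W + (alpha / rank) * (B @ A)

print(f"full layer:   {W.size:,} values")
print(f"LoRA weights: {A.size + B.size:,} values")   # a small fraction of the full layer
```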

The cross-attention layers that are adjusted during LoRA training for Stable Diffusion are part of the transformer neural network architecture.

We'll get into the role cross-attention plays when we discuss transformers in a future article that digs into large language models (LLMs) and Stable Diffusion image generators. With this article, we've laid the foundations to gain an intuitive understanding of the elements at play in the more complicated applications making waves in society today.

]]>
It's not all just "AI"
Meet the Magnopians: Blair McLaughlin草莓视频在线Fri, 09 Feb 2024 12:38:45 +0000/blog/meet-the-magnopians-blair-mclaughlin618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:6537899c9768ee2ede20ea11

Blair has over 18 years of experience in the video games industry. She has worked on titles including Kingdom Hearts 1.5 and 2.5, Disney Infinity, Star Wars Rebels: Recon Missions, Epic Mickey 1 & 2, Ducktales: Remastered, Club Penguin: Elite Penguin Force, and DDR: Disney Channel Edition. She joined 草莓视频在线 in 2022 as a Project Manager and has worked on multiple top-secret projects, as well as the Augmented Reality Education App for the Holocaust Museum Los Angeles.


Tell us more about your job role at 草莓视频在线

I'm a project manager for the operations team on the studio side of 草莓视频在线, so I help manage the internal project teams by generating reports, running meetings, coordinating people, and handling general organization. I primarily support the team in their day-to-day operations and make sure we are meeting client specifications for a project.

What's your origin story? How did you get to where you are today?

I grew up refurbishing arcade cabinets in the garage with my dad, so I've always loved video games. In 2005 when I was out on my own looking for a job, a friend of mine lined up an interview for me and helped me write my first ever resume. Knowing nothing about the company or position I was interviewing for, I turned up at the location and in the lobby hung a huge Disney sign!

I ended up landing a job as a QA tester at Disney and after that, moved around various places in a myriad of jobs including licensing and legal, localization, production and project management. Then in 2022, I applied at 草莓视频在线 for this job, and the rest is history!

What attracted you to 草莓视频在线 in the first place?

I'd had my eye on 草莓视频在线 for a while. I was drawn to how secretive and low-key they were, and yet how amazing the work being produced was. Also, the name 草莓视频在线 / Magnum Opus: I was like "these people are talented. I want to work with these people!".

What is one app, or software, you couldn't do your job without?

I am obsessed with Google Sheets. I like formulas and automating things (I even have a personal budget spreadsheet that I've been perfecting for the last five years) and I see spreadsheets as a sort of puzzle.

What skills are essential for anyone in your role?

The first one is to learn how to communicate with different types of people. Everyone likes to communicate a bit differently, and as someone whose job relies on people communicating with you, you have to figure out how people communicate and adapt to them.

The second is multitasking. As a PM you'll always be spinning 20 plates at once and it's important to not get overwhelmed by that and just learn to embrace the chaos.

Lastly, and linked to number 2, is to be flexible. I'm constantly learning to be more flexible, change on the fly, and be open to other ways of solving problems.

What's the biggest lesson you've learned in your career?

Try everything once. My career history is full of loads of different types of roles, and it helped me land on something I really enjoy doing now. You never know what you're capable of until you do it. And don't let the things that don't fulfil you take up too much of your time.

If you could have any other job in the world, what would it be?

It sounds tedious, but I'd love to be a librarian. I've been a student librarian before and I loved the quiet organization and repetition of the job. Plus, unlimited access to all the books.

Where would you most like to travel to in the world?

I haven't done a lot of travelling in my life so far, but my friend and I have recently become obsessed with luxury sleeper trains. We've been talking about taking the Orient Express journey in one of their suite cars (with 24-hour butler service and unlimited champagne), so maybe one day we will make that happen.

]]>
Meet the Magnopians: Blair McLaughlin
2023 immersive industry recap: Meta Quest 3, Apple Vision Pro, and a year of innovation and collaboration草莓视频在线Thu, 21 Dec 2023 11:12:21 +0000/blog/2023-immersive-industry-recap618131cf8cd8e779321e9666:6256e7c502af7b62119329a0:6584142a080fe21a3ab05b392023 was a transformative year that saw 'spatial' emerge as our word du jour; the groundbreaking launch of UEFN; and the long-awaited entry of Apple into the immersive industry. So here's a quick recap of all the significant moments from the last year, as well as some things we're proud of.

Meta Quest 3 was Released

The unveiling and launch of the Meta Quest 3 marked a significant milestone in bridging the gap between the physical and digital worlds, as the world's first mass-market mixed reality headset. The new headset boasted a slimmer profile, louder audio, and improved resolution compared to its predecessor, with the added feature of full-color Passthrough enabling cross-reality experiences. We had the honor of working with the Meta team on an open-sourced project that shows developers the full capabilities of the MR SDK (if you are very observant you may have already spotted a sneak peek at Connect this year!), so watch this space; we can't wait to share it with you.

Apple Vision Pro was Announced

Announced at WWDC 2023, Apple's entry into the immersive industry went far beyond just another addition to the existing landscape. The company's first 'spatial computer', called the 'Vision Pro', promised everything you'd expect from the world's biggest tech company: a smooth user experience, ecosystem integration, and exceptional quality.

Apple's investment in this space holds the promise of driving widespread adoption, shaping the trajectory of immersive experiences, and igniting a wave of innovation within the XR domain. The momentum is only going to grow when the headset becomes available; Apple's website says 'early 2024', so not long now!

Fortnite Continued to Dominate Headlines

Fortnite OG. Credit:

Towards the back end of this year, Fortnite has been transporting users back to the 'good ol' days' of 2018, with each update of the game bringing a different phase of Battle Royales past. Powered by nostalgia, the decision was made to get players back into the game before launching a trio of new experiences extending beyond the classic Battle Royale. During the event at the start of December, which featured a virtual concert from Eminem, Fortnite broke records with an incredible player count. The platform continues to innovate and find new ways to revive dormant players and bring in new ones. With the recent partnerships with toy giant and rap legend , as well as the announcement of , the momentum is only going to continue.

Earlier this year the Unreal Editor for Fortnite (UEFN) was released, enabling developers to design and publish games directly in Fortnite. UEFN provides access to many of Unreal Engine 5's powerful tools and workflows to create Fortnite islands, including custom asset import, modeling, materials and VFX, Sequencer and Control Rig, and more. It's a game changer for creators, as they can create AAA-quality games within Fortnite without being limited to the framework or aesthetic of a traditional Fortnite design.

Sony Immersive Music 草莓视频在线s reached out to us to collaborate on an experience that showcases the power of UEFN, so we developed a stunning 'music-first' arena battle experience, the first ever Fortnite Creative Island with licensed music. You can play using island code: 0400-8367-8548, and read more about the project here.

Working Together to Accelerate the Next Web

This past year has seen companies coming together to accelerate the creation of the next spatial web and ensure they lay the right foundations for it. 

April 2023 saw the incorporation of the Metaverse Standards Forum, a non-profit, member-funded consortium of standards-related organizations, companies, and institutions that are cooperating to foster interoperability for an open and inclusive metaverse. The Forum generates domain and technical reports, use case and requirements recommendations, pilots, testbeds and plugfests, open source tooling, best practices and guidelines, and other data, insights, and visibility to enable standards organizations to accelerate the development and evolution of standards that will be essential to building the metaverse. Over 2,500 organizations have now joined the Forum, and more are joining every day!

Leading the way in this space is Cesium, a company that provides 3D geospatial software components, facilitating the creation of the open standards and knowledge needed to drive the internet's progression from 2D to fully immersive 3D. Cesium created 3D Tiles, an open standard for streaming massive 3D geospatial datasets, to be the spatial index for the metaverse. In 2023, the company announced "Cesium for Omniverse," an extension that enables 3D geospatial capability for NVIDIA Omniverse, a real-time 3D graphics collaboration development platform. Cesium for Omniverse is Cesium's fifth runtime engine, joining open-source CesiumJS, and the open source plugins Cesium for Unreal, Cesium for Unity, and Cesium for O3DE.

In July, we open-sourced our Connected Spaces Platform (CSP). We've been working on this technology for over five years, at times with more than 60 people, and have gone through rounds of testing on public experiences and private betas with developers. The entire 草莓视频在线 team was so proud to finally release their hard work to the public.

CSP facilitates the development of interoperable spatial applications, meaning the same experience can be accessible across different devices, through different technologies, and even across physical and digital spaces. Currently, the user experience and standards for the spatial internet are being actively developed by just a handful of the world's largest tech companies. Even though they're making great progress in demonstrating what the future holds, developer communities believe that fostering this kind of significant change to the baseline experience requires accessibility and interoperability. We are supporting the vision of organizations like the Metaverse Standards Forum and the W3C.

By making this available under a royalty-free, permissive free software license, we're helping lay the foundations for a more diverse spatial internet that communities can build on together. We hope that others on a journey similar to ours can benefit from the work we've done to facilitate the interoperability of spaces connected across physical and digital worlds. We know we keep telling you to 'watch this space', but we really are hard at work building some big things on top of the Connected Spaces Platform, and we will be able to share them with you in the New Year - promise!

草莓视频在线 Celebrated Turning 10

2023 was the 10th anniversary of 草莓视频在线, and at such a milestone we can't help but look back and appreciate the last decade and the hundreds of projects we've shipped (most of which we aren't allowed to talk about!). From amplifying important stories with an augmented reality app for the Holocaust Museum Los Angeles to taking over 4 million people to space (virtually!), we've worked with some truly brilliant teams over the years. Our own team has grown across two continents, and we continue to have some of the best and brightest minds in the industry fighting the good fight with us. Thank you to everyone who has been on this incredible journey with us.

]]>
2023 immersive industry recap: Meta Quest 3, Apple Vision Pro, and a year of innovation and collaboration