AI’s potential is limited only by our imagination, and at Ramen VR we’ve been exploring its power within game development.
Earlier this month, Ramen VR sponsored Stanford XR's Immerse the Bay Hackathon. Our Co-founder, Lauren Frazier, and Director of Operations, Kristani Alcantara, had a blast hosting a workshop, "How to Make a Good Game...Fast" and serving as judges for the "XR Game Jam" track. In awe of what the hackers produced in only 36 hours, we decided to host our first ever internal AI Hackathon!
The Ramen VR team was challenged with a simple prompt: Use AI to produce a game development artifact in the span of three days. Participants were required to use at least one AI tool in their pipeline/artifact and could work solo or as a duo.
Our intention was to encourage creative recharge and potentially supercharge our dev processes with AI. In the end, the team gained tons of insights and produced various creations. Many of the prototypes turned out to be viable work streams for the future. In this blog post, we’ll highlight two creations: ‘Chiptune DJ’ and ‘CCG Card Generator’.
The Creation
Jordan, our Music Composer and Sound Designer, created a Python tool called 'Chiptune DJ', built to turn any song into a Chiptune remix. The tool uses Spleeter, a TensorFlow-based AI source separation library, to split audio into stems (individual instrument layers).
Jordan was inspired by seeing fan-made chiptune remixes of his favorite bands’ songs growing up. For him, it was always amazing to see how much a simple instrument swap can transform a piece of music, and it was always fun to hear a familiar song reimagined in that way.
The Chiptune DJ tool was created to help creators quickly iterate on musical themes or motifs, a common practice in game and film composition. While the current version focuses on Chiptune remixes, the process could be adapted for other genres.
The AI Components
- ChatGPT was used to make the Python scripts/GUI
- Spleeter was used to analyze music and split it into individual stems
- OpenArtAI was used to make the app cover image
The Process
Jordan built the GUI in PyQt, a Python UI framework, using ChatGPT to help write both the interface and the underlying scripts. To split the music, he used Spleeter, Deezer’s TensorFlow-based source separation library. Once an audio file is imported into Chiptune DJ, Spleeter separates the audio into multiple files called stems, one per layer/instrument. The stems are converted to MIDI files and reimported into Chiptune DJ. You type in the desired tempo (BPM), click “Roll Chiptune Remix”, et voilà! Your Chiptune remix is exported instantly. You can keep rolling remixes as many times as you’d like until you discover a version you’re happy with. From a single starting audio file, Chiptune DJ lets you quickly iterate and produce multiple versions of a song.
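To make the “roll” step concrete, here is a minimal sketch of what re-rolling a remix could look like once stems have already been separated and converted to MIDI note numbers. All names here (`roll_remix`, `CHIP_WAVEFORMS`, the stem dictionary) are illustrative assumptions, not Chiptune DJ’s actual code:

```python
import random

# Illustrative sketch of a "Roll Chiptune Remix" step. Assumes each stem has
# already been separated (e.g. by Spleeter) and converted to MIDI note numbers.

CHIP_WAVEFORMS = ["square", "pulse25", "triangle", "noise"]

def roll_remix(stems, bpm, seed=None):
    """Assign a random chiptune waveform to each stem and return a remix plan."""
    rng = random.Random(seed)
    return {
        "bpm": bpm,
        "tracks": [
            {
                "stem": name,
                "waveform": rng.choice(CHIP_WAVEFORMS),
                # shift the whole stem by a random octave for variety
                "notes": [n + 12 * rng.choice([-1, 0, 1]) for n in notes],
            }
            for name, notes in stems.items()
        ],
    }

stems = {"bass": [36, 38, 41], "lead": [72, 74, 76, 79]}
remix = roll_remix(stems, bpm=140, seed=1)
```

Because each roll just re-randomizes waveform and octave choices over the same MIDI data, generating another candidate version is essentially free.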
The Challenges
Spleeter didn’t always produce the cleanest stems: Jordan could sometimes hear one instrument bleeding into another’s stem. To maximize success with Spleeter, he composed music with very clearly defined instrument ranges (drums, bass, and a mid/high-range synthesizer), which produced the cleanest results for the demo. Converting the audio to MIDI also proved to be a challenge. Jordan wanted AI to analyze the notes and output them as MIDI files, but ran into consistency issues: Python libraries like Librosa and Mido had difficulty with polyphonic note recognition and struggled with percussion and drum parts. As a result, Jordan had to create the MIDI files manually to complete the prototype, though he sees this as a solvable challenge given more time.
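To see why polyphony is the hard part, consider a naive pitch estimator (not from Jordan’s tool; purely illustrative). A simple autocorrelation search recovers a single tone’s pitch well, but the whole approach assumes there is only one period in the signal, which stops holding the moment two notes overlap:

```python
import math

# Illustrative: naive autocorrelation pitch estimation. Works on a single
# (monophonic) tone; with overlapping notes there is no single dominant
# period, which is one reason polyphonic transcription is hard.

SAMPLE_RATE = 8000

def sine(freq, n_samples, rate=SAMPLE_RATE):
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n_samples)]

def estimate_pitch(signal, rate=SAMPLE_RATE, lo=80, hi=1000):
    """Return the frequency whose autocorrelation lag scores highest."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(rate // hi, rate // lo + 1):
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag

mono = sine(440, 2048)
print(estimate_pitch(mono))  # close to 440 Hz (8000/18 ≈ 444)
```

Percussion is even worse for this kind of analysis: drum hits are broadband noise with no stable period at all, matching the trouble Jordan saw with drum parts.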
Lessons & Future Development
“Music analysis is much more complex than I initially thought, and doesn’t assume anything. You have to approach it with both a creative mindset and a more technical/mathematical mindset to achieve the best results.” - Jordan
In terms of working with AI, Jordan shared that AI assistance has come a long way in collaboration and iteration; the back and forth with ChatGPT felt like something previously only possible with another person. The tool itself has real utility for composition: composers often take a motif and rearrange it so it can appear in different parts of a game, and Chiptune DJ is a fun, quick way to explore that. Once at the MIDI level, the remixing possibilities are endless, and the tool could be expanded to remix any musical genre.
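Motif rearrangement is exactly the kind of thing that becomes trivial once notes are MIDI numbers. As a hedged illustration (not Chiptune DJ code), the classic variation techniques reduce to a few one-liners:

```python
# Illustrative motif transforms at the MIDI level: once notes are integers,
# the classic composer's variations are simple arithmetic.

MOTIF = [60, 62, 64, 67]  # C4, D4, E4, G4

def transpose(notes, semitones):
    """Shift the whole motif up or down by a number of semitones."""
    return [n + semitones for n in notes]

def retrograde(notes):
    """Play the motif backwards."""
    return list(reversed(notes))

def invert(notes):
    """Mirror each interval around the first note."""
    root = notes[0]
    return [root - (n - root) for n in notes]

print(transpose(MOTIF, 7))  # [67, 69, 71, 74] — same motif, up a fifth
print(invert(MOTIF))        # [60, 58, 56, 53] — intervals flipped
```

Layer a different chiptune waveform on top of each variation and you get the genre-agnostic remixing Jordan describes.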
Created by Ryan (Ramen VR Gameplay Engineer) and George (Ramen VR Game Designer)
The Creation
Teammates George (Game Designer) and Ryan (Gameplay Engineer) created ‘CCG Card Generator’, a Unity editor tool that uses GPT-4 and DALL-E 3 to rapidly generate Hearthstone-style CCG cards. The generator creates card images, JSON data, and automatically uploads images to a CDN hosted on Google Cloud.
They chose this project mainly because one of them already owned the CCG Kit Unity Asset, which provided a solid game framework to build on, letting them focus on making an AI tool to generate content rather than also building a game or app to house it. Essentially, they wanted a supercharged version of a ProcGen tool for generating cards.
The AI Components
- DALL-E 3 was used for generating the card images
- ChatGPT was used for:
  - Turning deck themes into card image generation prompts, and generating stats for each card
  - Server backend setup assistance and troubleshooting
  - General coding assistance and development of the Content Delivery Network (CDN)
The Process
The card generation process starts with the user writing a 1-2 sentence description, which GPT-4 turns into an image generation prompt for DALL-E 3. The resulting image is saved to the user’s desktop with a temporary file name, then passed back through GPT-4 to generate the card’s JSON data. The system includes error handling and retries (up to five attempts) to ensure valid JSON generation. The card image file is then renamed to match the card and user IDs, reimported with standardized settings, and uploaded to the CDN. A card set generation feature expands on the single-card process, taking a one-sentence theme prompt and generating five themed card prompts.
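The retry step above can be sketched as a small loop. This is a minimal illustration with a stub standing in for the real GPT-4 call; the function names, required fields, and stub responses are all assumptions, not the team’s actual code:

```python
import json

# Sketch of the "retry until valid JSON" pattern, with a stub model call.

MAX_ATTEMPTS = 5

def generate_card_json(prompt, call_model):
    """Call the model until it returns parseable card JSON, up to MAX_ATTEMPTS."""
    last_error = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        raw = call_model(prompt, attempt)
        try:
            card = json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err  # malformed output: try again
            continue
        if {"name", "cost", "attack", "health"} <= card.keys():
            return card, attempt
    raise ValueError(f"no valid card JSON after {MAX_ATTEMPTS} attempts: {last_error}")

# Stub model: fails on the first attempt, then returns valid data.
responses = ["not json", '{"name": "Moriarty\'s Web", "cost": 3, "attack": 2, "health": 4}']
card, attempts = generate_card_json("spy-themed deck", lambda p, a: responses[a - 1])
```

The cap on attempts keeps a stubbornly malformed generation from looping forever, at the cost of occasionally surfacing a failure to the user.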
The Challenges
George and Ryan ran into challenges at almost every step of the AI integration. On the card JSON data generation side, they faced ability description discrepancies and hallucinations: GPT would attempt to implement card abilities that weren’t possible in the existing card system, or generate effects in direct opposition to their descriptions. For example, Moriarty’s Web was described as “Confounds opponents by shuffling clue cards into their deck, causing strategic disruption,” but really just made the opponent draw 3 cards. This could potentially be solved with more prompt engineering, hand-crafted example card JSON matched to the example input prompts, or by handling description generation separately from the actual effect generation.
Lessons & Future Development
AI is very good at generating a large quantity of content, but that content needs refinement and curation before use. It was also surprisingly good at turning rather simple prompts into nearly usable data. Given more time, they likely could have fixed the description hallucinations and sent generated card JSON back to GPT for evaluation, catching cards that only benefit the opponent.
Moving forward, the JSON handling could be refined. Rather than retrying until GPT creates valid data, a JSON schema could be used alongside GPT’s Structured Outputs feature to ensure valid data generation. Generation runtime and feedback could also be developed: rather than only generating cards in an editor window, the tool could let users generate in game, show generated info as it is created, and provide the option to accept or reject the generated card before it is saved.
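A schema check inverts the retry approach: instead of parsing and hoping, the card is validated against an explicit shape, and failures come back as actionable messages. Here is a hedged, stdlib-only stand-in for that idea (field names and types are assumptions; a real implementation would use a JSON Schema document or Structured Outputs):

```python
# Minimal stand-in for JSON-Schema-style validation of generated card data.
# Field names/types are illustrative assumptions.

CARD_SCHEMA = {
    "name": str,
    "cost": int,
    "attack": int,
    "health": int,
    "description": str,
}

def validate_card(card):
    """Return a list of problems; an empty list means the card is valid."""
    problems = [f"missing field: {k}" for k in CARD_SCHEMA if k not in card]
    problems += [
        f"wrong type for {k}: expected {t.__name__}"
        for k, t in CARD_SCHEMA.items()
        if k in card and not isinstance(card[k], t)
    ]
    return problems

good = {"name": "Moriarty's Web", "cost": 3, "attack": 2, "health": 4,
        "description": "Opponent draws 3 cards."}
bad = {"name": "Mystery", "cost": "three"}
```

With Structured Outputs the model is constrained to the schema up front, so this kind of after-the-fact check becomes a safety net rather than the main line of defense.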
Having three days to freely explore AI tools in a friendly competitive environment produced invaluable learnings. We upleveled across disciplines (audio, engineering, art, QA, design), produced a few promising tools, and shared a few laughs over AI “hallucinations”. At Ramen VR we’ve done Game Jams in the past, but this was our first Hackathon and we hope to continue doing more.
If any of these prototypes excite you, or you have your own idea(s) based on the prompt, we’d love to hear from you! We’re looking for talented folks to build next gen game dev tools and interactive experiences. Check out our careers page for more details.