KK Slider Deepfake
Have KK Slider record a custom message for you!
---
In this project, I created a KK Slider "Deepfake" with Python. Using this program, you can have KK Slider from Animal Crossing record a custom message for your own or a friend's special occasion, à la cameo.com! I will walk you through this journey in case you want to create a KK Slider Deepfake program of your own. You could also use this project as a framework to make any Animal Crossing character or another animated character record a custom message.
This program generates both the audio and video for KK Slider. It accepts a custom message as input. The video is constructed using footage from Animal Crossing: New Horizons using moviepy, ImageMagick, and ffmpeg. The audio is generated using a slightly modified version of equalo-official's animalese generator.
Since this program is rather RAM-intensive and requires quite a few dependencies, I supplied two versions of the program. The first is the DIY version, which shows you how to replicate the environment and run the code locally on your machine. The second is a Jupyter Notebook that you can run for free in Google Colaboratory using Google's computing resources.
All the code for this project can be found here: https://github.com/mjdargen/KK-Slider-Deepfake
If you want to run the program immediately without setup, jump to the section "Option 2 - Running in Google Colaboratory".
Video Capture
In order to be able to cut together videos to make it look like KK Slider was speaking, I had to scour the game to find a chunk of good footage to capture. KK Slider doesn't appear often in Animal Crossing: New Horizons. When he does, he typically is surrounded by other characters or playing a song over the credits. To my knowledge, there is only a single scene in Animal Crossing: New Horizons that was suitable. It's when KK shows up on your island for the first time. I had one shot so I had to be sure to capture the video perfectly.
I cut up the video into 4 separate portions to construct the final video. First I cut off the beginning and end where he is playing guitar to use as the intro and outro for the final video. I then cut out a few distinct parts to use for portions when he is talking and portions when he is silent for the custom middle portion. These are further cut up and spliced together in the script to create the illusion that he is speaking. With the videos ready, it was time to proceed with designing the Python script!
Preparing Script
After prompting the user for input, the script has to be prepared. Since most of us don't speak Animalese, there have to be subtitles. The text_processing() function subdivides the user input to make sure everything fits. I use nltk (Natural Language ToolKit) to tokenize the input into sentences. I then use the textwrap module to wrap words onto separate lines. I used the tokenized sentence structure to try to keep entire sentences together on a single subtitle card (i.e. frame of the video). A sentence is only split up if it is longer than what will fit on one card.
In the image above, you see the input text split up into multiple lists. The first value in the list is the entire dialogue for that frame. The subsequent three values in the list correspond to the text displayed on each of the 3 lines of the subtitle cards. Once the text has been processed, it's time to generate the video and audio.
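To make the card structure concrete, here is a minimal sketch of this kind of splitting. It is not the actual text_processing() code: the 30-character line width, the helper names, and the regex fallback (used only when nltk or its punkt data is unavailable) are all assumptions for illustration.

```python
import re
import textwrap

LINE_WIDTH = 30      # assumed characters per subtitle line
LINES_PER_CARD = 3   # three lines per subtitle card, as described above


def split_sentences(text):
    """Split text into sentences with nltk, or a rough regex if nltk is unavailable."""
    try:
        import nltk
        return nltk.sent_tokenize(text)
    except (ImportError, LookupError):
        # Stand-in splitter: break after ., ! or ? followed by whitespace.
        return re.split(r"(?<=[.!?])\s+", text.strip())


def make_cards(text, width=LINE_WIDTH, lines_per_card=LINES_PER_CARD):
    """Wrap text and return cards shaped like [dialogue, line1, line2, line3]."""
    lines = textwrap.wrap(text, width=width)
    cards = []
    for start in range(0, len(lines), lines_per_card):
        chunk = lines[start:start + lines_per_card]
        dialogue = " ".join(chunk)
        chunk = chunk + [""] * (lines_per_card - len(chunk))  # pad short cards
        cards.append([dialogue] + chunk)
    return cards


def build_cards(message, width=LINE_WIDTH, lines_per_card=LINES_PER_CARD):
    """Pack whole sentences onto cards; split a sentence only if it cannot fit."""
    cards = []
    pending = ""
    for sentence in split_sentences(message):
        candidate = (pending + " " + sentence).strip()
        if len(textwrap.wrap(candidate, width=width)) <= lines_per_card:
            pending = candidate          # still fits on one card
        else:
            if pending:
                cards.extend(make_cards(pending, width, lines_per_card))
            pending = sentence           # start a new card with this sentence
    if pending:
        cards.extend(make_cards(pending, width, lines_per_card))
    return cards
```

Calling build_cards("Hi there. How are you?") returns a single card, because both sentences fit within three lines. A sentence that wraps to more than three lines is spread over several cards.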
Recording Script: Audio
Over time, people have picked up on how the audio is generated for Animal Crossing characters. The characters speak a language called Animalese. Generally, the way the language works is that each letter of the word is pronounced. These individual pronunciations are strung together to create the word. The audio is then sped up to create the characteristic sound. The audio for characters with higher-register voices is sped up more than those with lower-register voices.
To generate the audio for this program, I used equalo-official's animalese generator. In equalo's project, they use pydub to speed up and splice together raw audio recordings of them speaking the individual letters. I took their code and modified it to be used in this project. All audio generation is handled by the audio_processing() function.
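The "one sound per letter, then speed it up" idea can be sketched with only the standard library. This is not equalo-official's generator or the audio_processing() function: here short sine tones stand in for the recorded letter sounds, the wave module stands in for pydub, and the sample rate, clip length, and speed factor are assumed values.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050   # assumed recording rate in Hz
CLIP_SECONDS = 0.12   # assumed length of each letter sound


def letter_clip(letter):
    """Stand-in for a recorded letter: a short tone whose pitch depends on the letter."""
    freq = 220 + 15 * (ord(letter) - ord("a"))
    count = int(SAMPLE_RATE * CLIP_SECONDS)
    return [int(12000 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
            for i in range(count)]


def animalese_samples(text):
    """String one clip per letter together; anything else becomes a short silence."""
    samples = []
    for char in text.lower():
        if "a" <= char <= "z":
            samples.extend(letter_clip(char))
        else:
            samples.extend([0] * len(letter_clip("a")))
    return samples


def write_wav(path, samples, speed=2.0):
    """Write 16-bit mono audio; a higher frame rate plays it back faster and higher."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(int(SAMPLE_RATE * speed))
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

For example, write_wav("kk.wav", animalese_samples("hello there"), speed=2.0) produces a file that plays twice as fast as the clips were generated. A larger speed value would suit a higher-register character.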
Recording Script: Video
Generating Video Scenes
After creating the audio, I used the contextlib module to compute the length of the audio file. I used the length of the audio to dictate the length of the video. To create the custom video from the clips I had created earlier, I used moviepy. In this project, there were two key parts of the video generation: making it look like KK Slider was talking and adding subtitle text. All video generation is handled by the video_processing() function.
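Measuring the audio length might look like the sketch below. It assumes the audio is a WAV file read with the standard wave module, with contextlib.closing making sure the file gets closed; the function name is hypothetical.

```python
import contextlib
import wave


def wav_duration(path):
    """Return the length of a WAV file in seconds (frames divided by frame rate)."""
    with contextlib.closing(wave.open(path, "rb")) as wav:
        return wav.getnframes() / float(wav.getframerate())
```

The returned number of seconds can then be used to decide how long the talking footage for that subtitle card needs to be.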
Based on the length of the audio, I would slice up the talking clip to make it look like KK was speaking for the duration of the audio clip. At the end of every cue card, there would be a pause period to give the viewer the opportunity to read the full subtitle card before it would carry on to the next portion of the dialogue. To generate the subtitles, I downloaded the FinkHeavy font to match the font style in the game. I then added the subtitles letter by letter so that the timing matched up with the audio recording.
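The timing logic described above can be sketched without any video library. Both function names and the default pause length are assumptions, and the real video_processing() function may divide the time differently. The first function repeats the talking clip until the audio is covered. The second works out when each additional letter of the subtitle should appear.

```python
def talking_segments(audio_seconds, clip_seconds):
    """Return (start, end) slices of the talking clip that together cover the audio."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    segments = []
    remaining = audio_seconds
    while remaining > 1e-9:
        length = min(clip_seconds, remaining)
        segments.append((0.0, length))   # replay the clip from its beginning
        remaining -= length
    return segments


def reveal_schedule(text, audio_seconds, pause_seconds=1.0):
    """Return (start, duration, visible_text) tuples, adding one letter at a time."""
    if not text:
        return []
    per_letter = audio_seconds / len(text)
    schedule = []
    for count in range(1, len(text) + 1):
        duration = per_letter
        if count == len(text):
            duration += pause_seconds    # hold the full card so it can be read
        schedule.append(((count - 1) * per_letter, duration, text[:count]))
    return schedule
```

Each tuple from reveal_schedule() describes one text overlay: when it starts, how long it stays on screen, and which part of the subtitle is visible.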
Since video processing is very RAM-intensive, I called Python's garbage collector and deleted objects after use to free up RAM before generating the next frame. moviepy can fill up your RAM very quickly, and I found this happening with longer messages. To mitigate this, I decided to save each individual portion of dialogue as its own temporary video file. These files can then be concatenated with ffmpeg at the end, which avoids crashing my computer.
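The render-one-scene-at-a-time pattern might look roughly like this. The render_card callback and the file naming are hypothetical. The sketch only assumes that the callback returns an object with write_videofile() and close() methods, as moviepy clips do.

```python
import gc
import os


def render_scenes(cards, render_card, out_dir):
    """Render each card to its own temporary file, freeing memory between cards."""
    paths = []
    for index, card in enumerate(cards):
        path = os.path.join(out_dir, "scene_%03d.mp4" % index)
        clip = render_card(card)      # hypothetical callback that builds the clip
        clip.write_videofile(path)    # save this scene to disk
        clip.close()                  # release readers and file handles
        del clip                      # drop the last reference to the clip
        gc.collect()                  # reclaim memory before the next card
        paths.append(path)
    return paths
```

Only one scene is held in memory at a time, so peak RAM use depends on the longest card rather than on the length of the whole message.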
ffmpeg Video Concatenation
After all of the scenes have been recorded, ffmpeg is used to concatenate all the temporary video files. The intro, custom recorded scenes, and outro are combined into a single video. In order to call ffmpeg, I used the subprocess module. The program detects the operating system so that it uses commands recognized by that environment. All of this is handled by the video_concatentation() function. This function is also responsible for cleaning up any temporary files.
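One way to drive ffmpeg from Python is its concat demuxer, sketched below. This is an assumption about the approach, not a copy of the project's function: the helper names and the scenes.txt list file are hypothetical. Passing the command as a list of arguments avoids most shell differences between operating systems.

```python
import os
import subprocess


def write_concat_list(video_paths, list_path):
    """Write the text file the concat demuxer reads: one "file '...'" line per clip."""
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in video_paths:
            # A single quote inside a quoted path is written as '\''.
            escaped = os.path.abspath(path).replace("'", "'\\''")
            handle.write("file '%s'\n" % escaped)


def build_concat_command(list_path, output_path):
    """Build the ffmpeg call; "-c copy" joins the clips without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output_path]


def concatenate(video_paths, output_path, list_path="scenes.txt"):
    """Join the clips in order, then remove the temporary list file."""
    write_concat_list(video_paths, list_path)
    try:
        subprocess.run(build_concat_command(list_path, output_path), check=True)
    finally:
        os.remove(list_path)
```

Stream copying only works cleanly when the intro, scenes, and outro share the same codec, resolution, and frame rate. If they differ, the "-c copy" arguments would need to be replaced with re-encoding options.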
Two Options for Running the Program
As mentioned earlier, there are two options for running this program.
Option 1: Local
This program requires a lot of dependencies and consumes a lot of RAM. This option will walk you through how to set up the environment and run the program on your local machine. Only go this route if you really want to run this on your own hardware.
Option 2: Remote
This option walks you through how to use Google Colaboratory to run your program for free on a remote machine. There is no real set-up required!
Option 1 - Running on Your Local Machine
Installing Dependencies
You need to install the following programs on your computer in order to run this program:
- Python 3 with pip (tested with Python 3.7): https://www.python.org/downloads/
- ImageMagick: https://imagemagick.org/script/download.php
  - Scroll down and install the appropriate distribution
  - Make sure you check the box to install "convert"!!
- ffmpeg: https://ffmpeg.org/
After installing ImageMagick on Linux, you will need to run the following command:
sed -i '/<policy domain="path" rights="none" pattern="@\*"/d' /etc/ImageMagick-6/policy.xml
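Before moving on, you can check that the command-line tools are visible on your PATH. The small script below is a hypothetical helper, not part of the repository. It looks for ffmpeg and for ImageMagick under both of its common command names (convert for version 6 and legacy installs, magick for version 7).

```python
import shutil
import sys


def find_tools(names=("ffmpeg", "convert", "magick")):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def report(tools):
    """Format the Python version and tool locations as a readable summary."""
    lines = ["Python %d.%d" % sys.version_info[:2]]
    for name, location in tools.items():
        lines.append("%-8s %s" % (name, location or "NOT FOUND"))
    return "\n".join(lines)


print(report(find_tools()))
```

On Windows, a convert result that points into the Windows system folder is the built-in disk utility rather than ImageMagick, so check the printed path.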
Cloning the Repository
You need to clone the repository, either by going to the GitHub repo and downloading the files from your browser or by running the following command using a git client:
git clone https://github.com/mjdargen/KK-Slider-Deepfake.git
Installing Python Packages
This program requires the following Python packages:
- nltk - Natural Language ToolKit
- pydub - audio manipulation
- moviepy[optional] - video editing - need at least version 1.0.3
You can install the packages with pip using the requirements.txt file I provided. Navigate with your terminal to the cloned directory, then run the following command.
pip3 install -r requirements.txt
Running the Program
To run the program, use the following command:
python3 kkdeepfake.py
The first time you run the program, you will need to download the nltk datasets and models. Run the Python interpreter by typing "python3" into your terminal. Then type the following commands:
import nltk
nltk.download('punkt')
After you've successfully downloaded the datasets, you can type quit() to exit out of the Python interpreter.
Option 2 - Running in Google Colaboratory
Intro to Colab
Google Colaboratory is a Python development environment using Jupyter Notebooks that allows you to connect to Google’s powerful cloud computing resources and run Python code. It allows you to maintain a runtime environment where you can install packages, access data online, navigate a file system, and store/reuse data.
To run the program in Google Colaboratory, go to the following link: https://colab.research.google.com/drive/14d4U6Yhi...
Executing Code in Colab
There are two different types of cells: code & text. Text cells are used to describe what's going on. Code cells can be executed one at a time by clicking the play button. The Colab notebook is broken up into 3 steps: Install Dependencies, Run Program, and Download Video.
To run Step 1, click the play button next to the 1st code cell. This will connect to the cloud resource as shown by the green checkmark and RAM/Disk usage status towards the top right corner. This cell can take a couple of minutes to run as it is installing all of the dependencies. You only ever need to run this step once per session.
After that cell has completed running, click the play button next to the 2nd code cell to run the program. The program will prompt you for input just below the code cell. Type the message you want KK to record and hit enter when you are finished. Running the program can take a while depending on the length of your message. For the example message I supplied, it took ~8 minutes to execute.
Note: There is also a Jupyter Notebook in the GitHub repository that you can use.
Demo
Check out the demo below!
Source Code
To view the source code, visit the GitHub repository: https://github.com/mjdargen/KK-Slider-Deepfake