voice-changer/Realtime_Voice_Changer_on_Colab.ipynb

207 lines
8.8 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/w-okada/voice-changer/blob/master/Realtime_Voice_Changer_on_Colab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"### [w-okada's Voice Changer](https://github.com/w-okada/voice-changer) | **Colab**\n",
"\n",
"---\n",
"\n",
"## **⬇ VERY IMPORTANT ⬇**\n",
"\n",
"You can use the following settings for better results:\n",
"\n",
"If you're using a index: `f0: RMVPE_ONNX | Chunk: 112 or higher | Extra: 8192`<br>\n",
"If you're not using a index: `f0: RMVPE_ONNX | Chunk: 96 or higher | Extra: 16384`<br>\n",
"**Don't forget to select a T4 GPU in the GPU field, <b>NEVER</b> use CPU!\n",
"> Seems that PTH models performance better than ONNX for now, you can still try ONNX models and see if it satisfies you\n",
"\n",
"\n",
"*You can always [click here](https://github.com/YunaOneeChan/Voice-Changer-Settings) to check if these settings are up-to-date*\n",
"\n",
"---\n",
"\n",
"### <font color=red>⬇ Always use Colab GPU! (**IMPORTANT!**) ⬇</font>\n",
"You need to use a Colab GPU so the Voice Changer can work faster and better\\\n",
"Use the menu above and click on **Runtime** » **Change runtime** » **Hardware acceleration** to select a GPU (**T4 is the free one**)\n",
"\n",
"---\n",
"**Credits**<br>\n",
"Realtime Voice Changer by [w-okada](https://github.com/w-okada)<br>\n",
"Notebook files updated by [rafacasari](https://github.com/Rafacasari)<br>\n",
"Recommended settings by [YunaOneeChan](https://github.com/YunaOneeChan)\n",
"\n",
"**Need help?** [AI Hub Discord](https://discord.gg/aihub) » ***#help-realtime-vc***\n",
"\n",
"---"
],
"metadata": {
"id": "Lbbmx_Vjl0zo"
}
},
{
"cell_type": "code",
"source": [
"# @title Clone repository and install dependencies\n",
"# @markdown This first step will download the latest version of Voice Changer and install the dependencies. **It can take some time to complete.**\n",
"%cd /content/\n",
"\n",
"!pip install colorama --quiet\n",
"from colorama import Fore, Style\n",
"import os\n",
"\n",
"print(f\"{Fore.CYAN}> Cloning the repository...{Style.RESET_ALL}\")\n",
"!git clone https://github.com/w-okada/voice-changer.git --quiet\n",
"print(f\"{Fore.GREEN}> Successfully cloned the repository!{Style.RESET_ALL}\")\n",
"%cd voice-changer/server/\n",
"\n",
"print(f\"{Fore.CYAN}> Installing libportaudio2...{Style.RESET_ALL}\")\n",
"!apt-get -y install libportaudio2 -qq\n",
"\n",
"print(f\"{Fore.CYAN}> Installing pre-dependencies...{Style.RESET_ALL}\")\n",
"# Install dependencies that are missing from requirements.txt and pyngrok\n",
"!pip install faiss-gpu fairseq pyngrok --quiet\n",
"!pip install pyworld --no-build-isolation --quiet\n",
"print(f\"{Fore.CYAN}> Installing dependencies from requirements.txt...{Style.RESET_ALL}\")\n",
"!pip install -r requirements.txt --quiet\n",
"\n",
"print(f\"{Fore.GREEN}> Successfully installed all packages!{Style.RESET_ALL}\")"
],
"metadata": {
"id": "86wTFmqsNMnD",
"cellView": "form",
"_kg_hide-output": false,
"execution": {
"iopub.status.busy": "2023-09-14T04:01:17.308284Z",
"iopub.execute_input": "2023-09-14T04:01:17.308682Z",
"iopub.status.idle": "2023-09-14T04:08:08.475375Z",
"shell.execute_reply.started": "2023-09-14T04:01:17.308652Z",
"shell.execute_reply": "2023-09-14T04:08:08.473827Z"
},
"trusted": true
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# @title Start Server **using ngrok**\n",
"# @markdown This cell will start the server, the first time that you run it will download the models, so it can take a while (~1-2 minutes)\n",
"\n",
"# @markdown ---\n",
"# @markdown You'll need a ngrok account, but <font color=green>**it's free**</font> and easy to create!\n",
"# @markdown ---\n",
"# @markdown **1** - Create a <font color=green>**free**</font> account at [ngrok](https://dashboard.ngrok.com/signup) or **login with Google/Github account**\\\n",
"# @markdown **2** - If you didn't logged in with Google/Github, you will need to **verify your e-mail**!\\\n",
"# @markdown **3** - Click [this link](https://dashboard.ngrok.com/get-started/your-authtoken) to get your auth token, and place it here:\n",
"Token = '' # @param {type:\"string\"}\n",
"# @markdown **4** - *(optional)* Change to a region near to you or keep at United States if increase latency\\\n",
"# @markdown `Default Region: us - United States (Ohio)`\n",
"Region = \"us - United States (Ohio)\" # @param [\"ap - Asia/Pacific (Singapore)\", \"au - Australia (Sydney)\",\"eu - Europe (Frankfurt)\", \"in - India (Mumbai)\",\"jp - Japan (Tokyo)\",\"sa - South America (Sao Paulo)\", \"us - United States (Ohio)\"]\n",
"\n",
"#@markdown **5** - *(optional)* Other options:\n",
"ClearConsole = True # @param {type:\"boolean\"}\n",
"\n",
"# ---------------------------------\n",
"# DO NOT TOUCH ANYTHING DOWN BELOW!\n",
"# ---------------------------------\n",
"\n",
"%cd /content/voice-changer/server\n",
"\n",
"from pyngrok import conf, ngrok\n",
"MyConfig = conf.PyngrokConfig()\n",
"MyConfig.auth_token = Token\n",
"MyConfig.region = Region[0:2]\n",
"#conf.get_default().authtoken = Token\n",
"#conf.get_default().region = Region\n",
"conf.set_default(MyConfig);\n",
"\n",
"import subprocess, threading, time, socket, urllib.request\n",
"PORT = 8000\n",
"\n",
"from pyngrok import ngrok\n",
"ngrokConnection = ngrok.connect(PORT)\n",
"public_url = ngrokConnection.public_url\n",
"\n",
"from IPython.display import clear_output\n",
"\n",
"def wait_for_server():\n",
" while True:\n",
" time.sleep(0.5)\n",
" sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n",
" result = sock.connect_ex(('127.0.0.1', PORT))\n",
" if result == 0:\n",
" break\n",
" sock.close()\n",
" if ClearConsole:\n",
" clear_output()\n",
" print(\"--------- SERVER READY! ---------\")\n",
" print(\"Your server is available at:\")\n",
" print(public_url)\n",
" print(\"---------------------------------\")\n",
"\n",
"threading.Thread(target=wait_for_server, daemon=True).start()\n",
"\n",
"!python3 MMVCServerSIO.py \\\n",
" -p {PORT} \\\n",
" --https False \\\n",
" --content_vec_500 pretrain/checkpoint_best_legacy_500.pt \\\n",
" --content_vec_500_onnx pretrain/content_vec_500.onnx \\\n",
" --content_vec_500_onnx_on true \\\n",
" --hubert_base pretrain/hubert_base.pt \\\n",
" --hubert_base_jp pretrain/rinna_hubert_base_jp.pt \\\n",
" --hubert_soft pretrain/hubert/hubert-soft-0d54a1f4.pt \\\n",
" --nsf_hifigan pretrain/nsf_hifigan/model \\\n",
" --crepe_onnx_full pretrain/crepe_onnx_full.onnx \\\n",
" --crepe_onnx_tiny pretrain/crepe_onnx_tiny.onnx \\\n",
" --rmvpe pretrain/rmvpe.pt \\\n",
" --model_dir model_dir \\\n",
" --samples samples.json\n",
"\n",
"ngrok.disconnect(ngrokConnection.public_url)"
],
"metadata": {
"id": "lLWQuUd7WW9U",
"cellView": "form",
"_kg_hide-input": false,
"scrolled": true,
"trusted": true
},
"execution_count": null,
"outputs": []
}
],
"metadata": {
"colab": {
"provenance": [],
"private_outputs": true,
"include_colab_link": true,
"gpuType": "T4",
"collapsed_sections": [
"iuf9pBHYpTn-"
]
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"nbformat": 4,
"nbformat_minor": 0
}