
Get Models for BuddyGenAI


BuddyGenAI uses AI models for chat and text generation, image generation, text-to-speech (TTS), and speech-to-text. TTS and speech-to-text are optional, but you're required to provide a chat model and an image model.

Compatibility

You'll need to determine which models your GPU can run. The less VRAM your GPU has, the smaller the model you'll need to use. You might manage to run a larger model than your VRAM comfortably allows, but generation will be slower and the app may crash.

First, check how much VRAM your GPU has. You can do this by opening Task Manager, going to the Performance tab, and clicking on GPU. You'll see a graph of your GPU usage and the amount of VRAM your GPU has. Note how much VRAM you have available.
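If you have an NVIDIA GPU, you can also check VRAM from the command line with `nvidia-smi` instead of Task Manager. A minimal sketch (assuming `nvidia-smi` is on your PATH; the function names are my own):

```python
import subprocess

def parse_vram_mib(csv_output: str) -> list[int]:
    """Parse `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`
    output into a list of per-GPU VRAM totals in MiB."""
    return [int(line.strip()) for line in csv_output.splitlines() if line.strip()]

def gpu_vram_mib() -> list[int]:
    """Query total VRAM per installed GPU (NVIDIA only)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_mib(out)
```

On an RTX 3060 12GB, `gpu_vram_mib()` would report roughly 12288 MiB; divide by 1024 to compare against the GB estimates below.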

The following sections provide estimates of the VRAM required for each model. The list is not exhaustive; it covers the models I've used with the app (on an RTX 3060 12GB).

The app expects the following file extensions for models:

  • Chat Model: .gguf
  • Image Model: .safetensors
  • TTS Voice: .onnx and adjacent .json
  • Speech-to-Text Model: .bin
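A quick way to sanity-check a downloaded file against this list is to compare its extension before importing it. A small sketch (the dictionary and function names are my own, not part of the app):

```python
from pathlib import Path

# Expected file extension per model type, from the list above.
# TTS voices additionally need an adjacent .json config file.
EXPECTED_EXT = {
    "chat": ".gguf",
    "image": ".safetensors",
    "tts": ".onnx",
    "stt": ".bin",
}

def check_model_file(kind: str, path: str) -> bool:
    """Return True if the file has the extension BuddyGenAI expects for this model type."""
    return Path(path).suffix.lower() == EXPECTED_EXT[kind]
```

For example, a chat model file should end in `.gguf`, so `check_model_file("chat", "model.safetensors")` flags a mixed-up download.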

Chat Models (Required)

My preferred chat model has been Meta's Llama 3, quantized by bartowski. You can find links below to download quantized (smaller) versions of the model.

(The higher the number e.g. Q8, the better the model, but the more VRAM it will require.)

Model     File Size   Est. VRAM Usage
Q2_K      3.17GB      4.5GB             Download
Q4_K_M    4.92GB      5.9GB             Download
Q8_0      8.54GB      9.1GB             Download
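One way to choose: take the highest-quality quant whose estimated VRAM usage fits in your GPU with some headroom left for the OS and display. A sketch using the estimates above (the 1GB default headroom is my own rule of thumb, not an app setting):

```python
# Estimated VRAM usage (GB) for the Llama 3 quants listed above.
QUANT_VRAM_GB = {"Q2_K": 4.5, "Q4_K_M": 5.9, "Q8_0": 9.1}

def largest_fitting_quant(vram_gb: float, headroom_gb: float = 1.0):
    """Return the highest-quality quant that fits in the given VRAM,
    leaving headroom for the OS/display, or None if nothing fits."""
    usable = vram_gb - headroom_gb
    fitting = [(est, quant) for quant, est in QUANT_VRAM_GB.items() if est <= usable]
    return max(fitting)[1] if fitting else None
```

On a 12GB card this picks Q8_0; on an 8GB card it falls back to Q4_K_M, and below roughly 5.5GB nothing on this list fits comfortably.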

Image Model (Required)

I've been using Toonify SD 1.5.

Model            File Size   Est. VRAM Usage
Toonify SD 1.5   1.98GB      3GB               Download

TTS Voices (Optional)

The app uses Piper TTS, which runs on the CPU and is not very resource-intensive. Here are resources to find voices:

When importing a voice, you'll need to download and select both a .onnx and a .json file. The .json file's name should include the full name of the .onnx file (e.g. en_US-lessac-medium.onnx and en_US-lessac-medium.onnx.json).
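That naming rule can be checked programmatically before importing. A minimal sketch (the function name is my own; the check simply mirrors the convention described above):

```python
from pathlib import Path

def is_matching_voice_config(onnx_path: str, json_path: str) -> bool:
    """Check that a Piper voice's .json config name includes the full
    .onnx filename, e.g. en_US-lessac-medium.onnx -> en_US-lessac-medium.onnx.json."""
    return Path(onnx_path).name in Path(json_path).name
```

If this returns False, you've likely paired a voice with the wrong config file.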

Speech-to-Text Models (Optional)

I've been using the Distil-Whisper large-v3 model for increased speed and lower VRAM usage.

To download, go to the link below, open the "Files and versions" tab, click the file ggml-distil-large-v3.bin, and then click Download.

Model                     File Size   Est. VRAM Usage
Distil-Whisper large-v3   1.52GB      ??GB