ComfyUI GeminiOllama Extension

This extension integrates Google's Gemini API, Ollama, and several image processing tools into ComfyUI, so you can use these models and tools directly within your ComfyUI workflows.

Features

  • Support for Gemini and Ollama APIs
  • Text and image input capabilities
  • Streaming option for real-time responses
  • FLUX Resolution tools for image sizing
  • ComfyUI Styler for advanced styling options
  • Raster to Vector (SVG) conversion
  • Text splitting and processing
  • Easy integration with ComfyUI workflows

Nodes

1. Gemini API

The Gemini API node allows you to interact with Google's Gemini models:

  • Text input field for prompts
  • Model selection:
    • gemini-1.5-pro-latest
    • gemini-1.5-pro-exp-0801
    • gemini-1.5-flash
  • Streaming option for real-time responses

2. Ollama API

Integrate local language models running via Ollama:

  • Text input field for prompts
  • Dropdown for selecting Ollama models
  • Customizable model options

3. FLUX Resolutions

Provides advanced image resolution and sizing options:

  • Predefined resolution presets (e.g., 768x1024, 1024x768, 1152x768)
  • Custom sizing parameters:
    • size_selected
    • multiply_factor
    • manual_width
    • manual_height
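
The following sketch shows one way these four parameters could combine into a final width and height. It is an assumption about the behavior, not the node's actual code: it assumes `size_selected` is a plain `WIDTHxHEIGHT` string, that the manual values take precedence when both are positive, and that `multiply_factor` scales the result.

```python
def resolve_size(size_selected, multiply_factor=1, manual_width=0, manual_height=0):
    """Return (width, height) from a preset string or a manual override."""
    if manual_width > 0 and manual_height > 0:
        # Assumed rule: manual dimensions win over the preset when both are set.
        width, height = manual_width, manual_height
    else:
        # Assumed preset format: "WIDTHxHEIGHT", for example "768x1024".
        width_text, height_text = size_selected.lower().split("x")
        width, height = int(width_text), int(height_text)
    return int(width * multiply_factor), int(height * multiply_factor)
```

For example, `resolve_size("1024x768", multiply_factor=2)` gives `(2048, 1536)` under these assumptions.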

4. ComfyUI Styler

Extensive styling options for various creative needs:

  • Advertising styles (e.g., automotive, corporate, fashion editorial)
  • Art styles (e.g., abstract, art deco, cubist, impressionist)
  • Futuristic styles (e.g., biomechanical, cyberpunk)
  • Additional categories like composition, environment, and texture

5. Raster to Vector (SVG) and Save SVG

Convert raster images to vector graphics and save them:

Raster to Vector node parameters:

  • colormode
  • filter_speckle
  • corner_threshold
  • ... (and more)

Save SVG node options:

  • filename_prefix
  • overwrite_existing
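
To make the two Save SVG options concrete, here is a minimal sketch of how an output path might be chosen. The helper name, the counter suffix format, and the rule that a numbered file is created when `overwrite_existing` is off are all assumptions for illustration; the node itself may behave differently.

```python
from pathlib import Path


def svg_output_path(output_dir, filename_prefix, overwrite_existing=False):
    """Pick the path to write an SVG to, optionally avoiding existing files."""
    directory = Path(output_dir)
    candidate = directory / f"{filename_prefix}.svg"
    if overwrite_existing:
        # Reuse the same file name every time.
        return candidate
    counter = 1
    while candidate.exists():
        # Assumed naming scheme: append a zero-padded counter until the name is free.
        candidate = directory / f"{filename_prefix}_{counter:05d}.svg"
        counter += 1
    return candidate
```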

6. TextSplitByDelimiter

Split text based on specified delimiters:

  • Input text field

  • Delimiter options:

    • split_regex
    • split_every
    • split_count
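
The README does not spell out how the three delimiter options interact, so the sketch below is only one plausible reading: `split_regex` is treated as a regular expression to split on, `split_every` groups that many consecutive pieces back together, and `split_count` caps the number of results (0 meaning no cap). The function name and these semantics are assumptions.

```python
import re


def split_text(text, split_regex=r",", split_every=1, split_count=0):
    """Split text on a regex, optionally regroup pieces and cap the result length."""
    # Split on the pattern and drop empty or whitespace-only pieces.
    pieces = [p.strip() for p in re.split(split_regex, text) if p.strip()]
    # Assumed meaning of split_every: join every N consecutive pieces with a space.
    step = max(1, split_every)
    groups = [" ".join(pieces[i:i + step]) for i in range(0, len(pieces), step)]
    # Assumed meaning of split_count: keep at most that many groups (0 keeps all).
    return groups[:split_count] if split_count > 0 else groups
```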

7. RMBG (Remove Background)

The RMBG node allows you to remove the background from images using the BRIA model.

  • Model Download: RMBG-1.4 model
  • Model Location: Download and place the model file in ComfyUI/custom_nodes/ComfyUI-OllamaGemini/RMBG-1.4/.

Inputs:

  • rmbgmodel: Path to the RMBG model.
  • image: The image from which the background will be removed.

Outputs:

  • image: The image with the background removed.
  • mask: The mask representing the removed background.

Usage:

  1. Load the RMBG model using the BRIA_RMBG Model Loader node.
  2. Connect the output rmbgmodel to the RMBG node.
  3. Input the image you want to process, and connect it to the RMBG node's image input.
  4. The output image will be the image with the background removed, and the mask output will provide the corresponding mask.

Installation

  1. Clone this repository into your ComfyUI's custom_nodes directory:

    cd /path/to/ComfyUI/custom_nodes
    git clone https://github.com/yourusername/GeminiOllama.git
  2. Install the required dependencies:

    pip install google-generativeai vtracer

Configuration

Gemini API Key Setup

  1. Obtain a Gemini API key from Google AI Studio: https://aistudio.google.com/app/u/1/apikey

  2. Add the key to the config.json file in the extension directory:

    {
      "GEMINI_API_KEY": "your_api_key_here"
    }

Ollama Setup

  1. Install Ollama following the instructions on the Ollama GitHub page.

  2. Start the Ollama server (usually runs on http://localhost:11434).

  3. (Optional; only needed if Ollama runs on a different host or port) Add the Ollama URL to your config.json:

    {
      "GEMINI_API_KEY": "your_api_key_here",
      "OLLAMA_URL": "http://localhost:11434"
    }
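
For reference, a config file like the one above could be read as in the sketch below. This is not the extension's actual loader: the function name `load_config` is hypothetical, and falling back to `http://localhost:11434` when `OLLAMA_URL` is absent is an assumption based on the default address mentioned above.

```python
import json
from pathlib import Path

# Default address of a locally running Ollama server.
DEFAULT_OLLAMA_URL = "http://localhost:11434"


def load_config(path):
    """Read config.json and return the Gemini key and the Ollama URL."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        # None when the key is missing, so callers can report a clear error.
        "GEMINI_API_KEY": config.get("GEMINI_API_KEY"),
        # Assumed fallback when OLLAMA_URL is not set in the file.
        "OLLAMA_URL": config.get("OLLAMA_URL", DEFAULT_OLLAMA_URL),
    }
```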

Usage

After installation and configuration, the new nodes will be available in ComfyUI. Drag and drop them into your workflow to start using their features.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
