BaseLVM API Reference


Overview

The BaseLVM class provides an interface for working with Large Vision Models (LVMs) that can process both text and images. It’s designed for multimodal AI tasks that require understanding visual content.

Class Reference

BaseLVM

class lmitf.base_lvm.BaseLVM(api_key: str | None = None, base_url: str | None = None)[source]

Bases: object

Wrapper class for the OpenAI LVM (Language Vision Model) client.

Provides a simplified interface to the OpenAI Vision API, supporting image processing and text generation. Automatically handles environment-variable configuration and maintains a history of API calls.

client

OpenAI image-processing client instance

Type:

openai.Image

call_history

History of API call responses

Type:

list[str | dict[str, Any]]

__init__(api_key: str | None = None, base_url: str | None = None)[source]

Initialize the LVM client.

Parameters:
  • api_key (str, optional) – OpenAI API key. If not provided, it is read from the OPENAI_API_KEY environment variable.

  • base_url (str, optional) – API base URL. If not provided, it is read from the OPENAI_BASE_URL environment variable.

create(prompt: str, model: str = 'gpt-image-1', size: str = '1024x1024') → Image[source]
edit(image: Image | list[Image], prompt: str, model: str = 'gpt-image-1', size: str = '1024x1024', input_fidelity: str = 'low') → Image[source]

Edit an existing image with a prompt and optional mask.

The image and mask (if provided) are sent as file-like objects. Returns the first edited image as a PIL Image.
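Sending a PIL Image as a file-like object typically means encoding it into an in-memory buffer first. A minimal sketch of that conversion, under the assumption that the library does something similar internally (the `to_filelike` helper is illustrative, not part of lmitf):

```python
import io

from PIL import Image


def to_filelike(image: Image.Image) -> io.BytesIO:
    """Encode a PIL Image into an in-memory PNG buffer (illustrative helper)."""
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    buffer.seek(0)  # rewind so the API client reads from the start
    buffer.name = "image.png"  # some clients infer the MIME type from the name
    return buffer


# Example: a 64x64 red square becomes an uploadable file-like object
square = Image.new("RGB", (64, 64), color="red")
payload = to_filelike(square)
```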

Key Features

  • Image Generation: Create images from text prompts with the create method

  • Image Editing: Edit existing images with the edit method

  • Template Integration: Works with template-based image-generation prompts

Usage Examples

Image Generation

from lmitf import BaseLVM

lvm = BaseLVM()
result = lvm.create(
    prompt="A beautiful landscape with mountains and a river",
    model='gpt-image-1',
)

Image Editing

# Edit an existing image
edited_result = lvm.edit(result, "Add flying cats in the sky")
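Because edit (like create) returns a PIL Image, the result can be saved or post-processed with ordinary Pillow calls. A small self-contained sketch, where a solid-color image stands in for a real result from BaseLVM.edit:

```python
import io

from PIL import Image

# Stand-in for an edited result; in real use this comes from BaseLVM.edit(...)
edited_result = Image.new("RGB", (1024, 1024), color="steelblue")

# Save to disk with edited_result.save("out.png"), or encode into an
# in-memory PNG buffer for further processing:
buffer = io.BytesIO()
edited_result.save(buffer, format="PNG")
```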

Template-based Image Generation

# Use predefined templates (TemplateLLM import path assumed)
import lmitf
from lmitf import TemplateLLM

template = lmitf.prompts.lvm_prompts['character_ref']
template_lvm = TemplateLLM(template)

# Generate image with template parameters
result = template_lvm.call(
    CharacterName="Hero",
    RefCharacter="base64_image_data",
    Size="1024x1024",
    Character="warrior",
    Style="fantasy",
    GenPrompt="Create a fantasy warrior scene"
)
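Under the hood, a prompt template of this kind is essentially a string with named placeholders filled in by the call parameters. A minimal sketch of that mechanism, where both the template text and the `fill_template` helper are illustrative, not the library's actual implementation:

```python
# Illustrative template with named placeholders, in the style of lvm_prompts
template = (
    "Create an image of {CharacterName}, a {Character} in {Style} style. "
    "Target size: {Size}. Instructions: {GenPrompt}"
)


def fill_template(template: str, **params: str) -> str:
    """Substitute named parameters into the template (hypothetical helper)."""
    return template.format(**params)


prompt = fill_template(
    template,
    CharacterName="Hero",
    Character="warrior",
    Style="fantasy",
    Size="1024x1024",
    GenPrompt="Create a fantasy warrior scene",
)
```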

Method Reference

create()

Generate images from text prompts.

Parameters:

  • prompt (str): Text description of the desired image

  • model (str): Image-generation model to use (e.g., 'gpt-image-1')

  • size (str): Output image size (default '1024x1024')

Returns:

  • The generated image as a PIL Image

edit()

Edit existing images with text prompts.

Parameters:

  • image (Image | list[Image]): Image, or list of images, to edit

  • prompt (str): Description of the desired changes

  • model (str): Image-editing model to use (e.g., 'gpt-image-1')

  • size (str): Output image size (default '1024x1024')

  • input_fidelity (str): How closely to preserve the input image (default 'low')

Returns:

  • The first edited image as a PIL Image

Configuration

Environment Setup

export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"

Manual Configuration

lvm = BaseLVM(
    api_key="your-api-key",
    base_url="https://your-endpoint.com/v1"
)

Best Practices

  1. Clear Prompts: Be specific about the subject, style, and composition you want

  2. Template Usage: Use predefined templates for consistent, repeatable results

  3. Model Selection: Choose a model that balances quality, speed, and cost for your task