A lightweight alternative to Hunyuan's prompt rewrite model for video generation, designed to match their training data patterns and prompt structure.
The HunyuanVideo model is using this model for prompting (https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers), resulting in known dataset captions that have the following:
- ShareGPT4V annotations
- InternVL-SFT captioning patterns
- Structured JSON format with specific components
- 14 defined camera movement types
While Hunyuan provides their prompt rewrite model, it requires significant compute resources as it's based on Hunyuan-Large. This tool provides a lightweight alternative that:
- Matches their structured format
- Uses their specific camera movement types
- Maintains their two-mode system:
- Normal: Focus on accuracy and comprehension
- Master: Enhanced composition, lighting, and camera movement
- Can be used with any OpenAI API
# Clone the repository
git clone [repository-url]
cd hunyuan-prompt-generator
# Install requirements
pip install -r requirements.txt
# Set up your OpenAI API key
export OPENAI_API_KEY='your-key-here'
Generate a normal mode prompt:
python cli.py -t "A cat playing in the garden" --mode normal
Generate a master mode prompt:
python cli.py -t "A cat playing in the garden" --mode master
Save output to JSON:
python cli.py -t "A cat playing in the garden" --format json --output prompts/output.json
Input Options:
--text, -t TEXT Text input to generate prompt from
--image, -i IMAGE NOT IMPLEMENTED YET (tbd)
Model Options:
--mode {normal,master} Prompt generation mode (default: normal)
--model MODEL LLM model to use (default: openai gpt-4-turbo)
Output Options:
--output, -o OUTPUT Output file path
--format {text,json} Output format (default: text)
--save-components Save intermediate structured components
Normal Mode:
A playful cat explores a sun-drenched garden, camera pans right to follow movement.
Realistic style with natural daylight. Peaceful and serene atmosphere.
Master Mode:
A graceful feline explores a meticulously composed garden scene with balanced foreground elements.
Smooth tracking shot with shallow depth of field, professional natural lighting with subtle rim highlights.
Careful attention to motion continuity and spatial composition.
Our prompt generation follows Hunyuan's structured approach:
- Converts input to structured JSON components
- Applies specific camera movement vocabulary
- Maintains professional video production elements
- Follows their training data hierarchy
The system supports two modes matching Hunyuan's approach:
- Normal Mode: Focuses on accurate interpretation of user intent
- Master Mode: Emphasizes composition, lighting, and camera movement for higher visual quality
Supports the same 14 camera movement types used in Hunyuan's training:
- Static shot
- Pan (up/down/left/right)
- Tilt (up/down/left/right)
- Zoom (in/out)
- Dolly tracking
- Around (left/right)
- Handheld shot
hy-promptfun/
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
├── .gitignore
├── examples/
│ └── example_prompts.json
├── src/
│ └── hyprompt/
│ ├── __init__.py
│ ├── cli.py
│ ├── constants.py
│ ├── generator.py
│ └── llm_clients.py
└── tests/
├── __init__.py
└── test_generator.py