
Synthetic Input Generation

Generate realistic test inputs using personas and AI-powered synthesis.

What is Synthetic Input Generation?

Synthetic input generation creates realistic test data programmatically, so you don't have to write each test case by hand. Using personas as templates, FluxLoop generates diverse inputs that mimic how real users interact with your AI agent.

Manual vs Synthetic Testing

Manual Testing:

# tests/manual_inputs.yaml
inputs:
- "How do I start?"
- "What are the features?"
- "How much does it cost?"
# ... writing 100+ inputs by hand 😩

Synthetic Testing:

# personas.yaml
personas:
- novice_user
- expert_user
- frustrated_user

# Generate 100 diverse inputs automatically
$ fluxloop inputs generate --count 100
✅ Generated 100 inputs from 3 personas

Benefits

1. Scale Testing

Generate hundreds or thousands of test inputs in minutes:

# Manual: Days to write 1000 inputs
# Synthetic: Minutes to generate 1000 inputs
fluxloop inputs generate --count 1000

2. Uncover Edge Cases

Personas generate unexpected variations you wouldn't think of manually:

# Manual inputs (predictable)
- "How do I reset my password?"
- "How do I change my password?"

# Synthetic inputs (diverse)
- "password reset???"
- "i forgot my pwd help"
- "Can't login. Need to change credentials ASAP"
- "where is option for changing authentication"
- "🔐 reset?"

3. Maintain Test Coverage

As your product evolves, regenerate inputs automatically:

# Product adds new features
$ update_personas.sh

# Regenerate tests
$ fluxloop inputs generate --refresh
✅ Generated 150 inputs covering new features

4. Realistic Diversity

Personas ensure inputs reflect real user diversity:

  • Language variations
  • Expertise levels
  • Emotional states
  • Communication styles

How It Works

1. Define Personas

Create personas representing your users:

# personas/novice_user.yaml
persona_id: novice_user
attributes:
  expertise: beginner
  patience: high
  technical_background: none

communication:
  formality: casual
  verbosity: brief
  emoji_usage: occasional

See Persona Design for details.

2. Generate Inputs

Use personas to generate synthetic inputs:

fluxloop inputs generate --persona novice_user --count 50

FluxLoop uses an LLM to generate realistic inputs matching the persona:

generated_inputs:
  - id: input_001
    persona: novice_user
    text: "um... how do i even start? 😅"
    metadata:
      complexity: simple
      tone: uncertain

  - id: input_002
    persona: novice_user
    text: "is there a tutorial or something?"
    metadata:
      complexity: simple
      tone: curious

3. Review and Refine

Review generated inputs:

# List generated inputs
fluxloop inputs list

# View specific input
fluxloop inputs show input_001

Refine if needed:

# Regenerate with different temperature
fluxloop inputs generate --temperature 0.8

# Add constraints
fluxloop inputs generate --topic "authentication"

4. Create Bundles

Bundle inputs for testing:

# Create test bundle
fluxloop bundles create --name "regression_v1"

5. Run Tests

Test your agent with generated inputs:

fluxloop test --bundle regression_v1

Generation Strategies

Basic Generation

Generate inputs from a single persona:

fluxloop inputs generate \
--persona novice_user \
--count 50

Multi-Persona Generation

Generate inputs from multiple personas:

fluxloop inputs generate \
--personas novice_user,expert_user,frustrated_user \
--count 150

Without explicit weights, the count is split evenly across the personas (here, 50 inputs each).

Weighted Generation

Weight personas by frequency:

fluxloop inputs generate \
--personas novice_user:60,expert_user:30,frustrated_user:10 \
--count 100

Generates:

  • 60 inputs from novice_user
  • 30 inputs from expert_user
  • 10 inputs from frustrated_user
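
The weighted split above works out cleanly because the weights sum to 100. As a minimal sketch of how such a split could be computed when they do not, the function below uses the largest-remainder method. The name `allocate_counts` is hypothetical and this is not FluxLoop's implementation; it only illustrates the arithmetic.

```python
def allocate_counts(weights: dict[str, float], total: int) -> dict[str, int]:
    """Split `total` across personas in proportion to `weights`.

    Hypothetical helper (not part of FluxLoop). Uses the largest-remainder
    method so the per-persona counts always add up to `total`.
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")

    # Exact (possibly fractional) share for each persona.
    exact = {name: total * w / weight_sum for name, w in weights.items()}
    # Round every share down first.
    counts = {name: int(share) for name, share in exact.items()}

    # Hand the leftover inputs to the personas with the largest remainders.
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    print(allocate_counts(
        {"novice_user": 60, "expert_user": 30, "frustrated_user": 10}, 100
    ))
```

Rounding each share independently can leave the total one or two inputs off; assigning the remainder explicitly avoids that.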

Topic-Constrained Generation

Generate inputs about specific topics:

fluxloop inputs generate \
--persona expert_user \
--topic "API authentication" \
--count 30

Example outputs:

- "What's the OAuth2 flow?"
- "Rate limits for /auth/token?"
- "Refresh token rotation strategy?"

Context-Aware Generation

Provide context for generation:

fluxloop inputs generate \
--persona support_agent \
--context "User is trying to integrate the API" \
--count 40

Generates inputs relevant to API integration context.

Configuration

Input Config File

Define generation settings:

# configs/input.yaml
generation:
  provider: openai # or anthropic
  model: gpt-4o
  temperature: 0.7
  max_tokens: 100

personas:
  - id: novice_user
    count: 60
  - id: expert_user
    count: 30
  - id: frustrated_user
    count: 10

constraints:
  topics:
    - authentication
    - user_management
    - billing

  max_length: 200
  min_length: 10
  language: english

Use config:

fluxloop inputs generate --config configs/input.yaml

Generation Parameters

Parameter       Description                         Default
--provider      LLM provider (openai, anthropic)    openai
--model         Model name                          gpt-4o
--temperature   Sampling temperature (0.0-1.0)      0.7
--count         Number of inputs to generate        50
--topic         Topic constraint                    None
--context       Additional context                  None

  • Higher temperature = more diverse, creative inputs
  • Lower temperature = more predictable, conservative inputs
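
To see why temperature changes diversity, it helps to look at how sampling temperature is commonly applied in language models: token scores are divided by the temperature before being turned into probabilities. The sketch below illustrates that general mechanism with made-up scores. It is not FluxLoop code, and the function name is an assumption.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, scaled by a sampling temperature.

    Illustrative only: low temperatures sharpen the distribution (the top
    choice dominates), high temperatures flatten it (more variety).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    for t in (0.2, 0.7, 1.0):
        probs = softmax_with_temperature(scores, t)
        print(t, [round(p, 3) for p in probs])
```

At a low temperature nearly all probability mass lands on the top-scoring candidate, which is why outputs become repetitive; raising it spreads the mass across alternatives.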

Quality Control

Review Generated Inputs

Always review generated inputs before using them in production:

# Generate and review
fluxloop inputs generate --count 50 --output inputs/review.yaml

# Review in editor
code inputs/review.yaml

# Approve
fluxloop inputs approve inputs/review.yaml

Filter Low-Quality Inputs

Automatically filter:

# config.yaml
generation:
  filters:
    min_length: 10
    max_length: 500
    remove_duplicates: true
    remove_offensive: true
    language: english
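
As a rough illustration of what the length and duplicate filters might do, here is a minimal sketch in Python. The function `filter_inputs` is hypothetical and does not reflect FluxLoop's internals; the offensive-content and language filters are left out because they need a classifier rather than a simple rule.

```python
def filter_inputs(texts: list[str], min_length: int = 10, max_length: int = 500) -> list[str]:
    """Drop inputs that are too short, too long, or duplicates.

    Hypothetical helper. Duplicates are detected after lowercasing and
    collapsing whitespace, so "How do I start?" and "how do I  START?"
    count as the same input.
    """
    seen: set[str] = set()
    kept: list[str] = []
    for text in texts:
        cleaned = text.strip()
        if not (min_length <= len(cleaned) <= max_length):
            continue
        key = " ".join(cleaned.lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(cleaned)
    return kept


if __name__ == "__main__":
    print(filter_inputs(["How do I start?", "how do I  START?", "hi"]))
```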

Manual Curation

Combine synthetic and manual inputs:

# inputs/curated.yaml
inputs:
  # Synthetic (auto-generated)
  - text: "How do I start?"
    source: synthetic
    persona: novice_user

  # Manual (hand-written edge case)
  - text: "What if I have 10,000,000 users?"
    source: manual
    importance: critical
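
If you assemble a curated set like this in a script, the `source` field makes it easy to check the mix. The sketch below is one possible approach, assuming inputs are plain strings; `merge_inputs` and `source_share` are hypothetical names, not FluxLoop APIs.

```python
from collections import Counter


def merge_inputs(synthetic: list[str], manual: list[str]) -> list[dict]:
    """Combine manual and synthetic inputs, tagging each with its source.

    Hypothetical helper. Manual inputs come first, and a synthetic input
    that repeats a manual one (ignoring case) is dropped.
    """
    manual_keys = {text.lower() for text in manual}
    merged = [{"text": text, "source": "manual"} for text in manual]
    merged += [
        {"text": text, "source": "synthetic"}
        for text in synthetic
        if text.lower() not in manual_keys
    ]
    return merged


def source_share(inputs: list[dict]) -> dict[str, float]:
    """Return the fraction of inputs contributed by each source."""
    counts = Counter(item["source"] for item in inputs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: n / total for source, n in counts.items()}
```

Calling `source_share(merge_inputs(synthetic, manual))` gives a quick check on whether hand-written edge cases still make up a meaningful share of the set.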

Validation

Validate generated inputs:

fluxloop inputs validate

Checks:

  • ✅ No duplicates
  • ✅ Appropriate length
  • ✅ Matches persona
  • ✅ No offensive content
  • ✅ Language correctness

Advanced Techniques

Chain-of-Thought Generation

Generate inputs with reasoning:

generation:
  chain_of_thought: true

Example:

input:
  thought_process: |
    User is a novice trying to complete their first task.
    They're uncertain and need reassurance.
    They likely don't know technical terms.

  generated_text: "is it ok if i click here? will it break anything?"

Produces more realistic, nuanced inputs.

Few-Shot Examples

Provide examples to guide generation:

generation:
  few_shot_examples:
    - persona: expert_user
      examples:
        - "What's the rate limit for POST /api/v2/users?"
        - "Any bulk update endpoints?"
        - "Webhook retry policy?"

Ensures generated inputs match desired style.
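
Few-shot prompting generally works by placing the examples directly in the prompt sent to the model. The sketch below shows one way such a prompt could be assembled. The wording and the function `build_few_shot_prompt` are assumptions for illustration; the prompt FluxLoop actually sends is not documented here.

```python
def build_few_shot_prompt(persona: str, examples: list[str], count: int) -> str:
    """Assemble a generation prompt that includes example inputs.

    Hypothetical prompt format: persona line, bulleted examples, then the
    instruction asking for new inputs in the same style.
    """
    lines = [
        f"You are simulating the persona '{persona}'.",
        "Example inputs from this persona:",
    ]
    lines += [f"- {example}" for example in examples]
    lines.append(f"Write {count} new inputs in the same style, one per line.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_few_shot_prompt(
        "expert_user",
        ["What's the rate limit for POST /api/v2/users?", "Webhook retry policy?"],
        5,
    ))
```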

Conversation Simulation

Generate multi-turn conversations:

fluxloop inputs generate \
--multi-turn \
--max-turns 5 \
--persona frustrated_user

Generates realistic conversation flows:

conversation:
  - turn: 1
    input: "This isn't working"
  - turn: 2
    input: "I already tried that!"
  - turn: 3
    input: "Can I just talk to a human?"

Domain-Specific Generation

Use domain vocabulary:

generation:
  domain: healthcare
  vocabulary:
    - "patient records"
    - "HIPAA compliance"
    - "clinical notes"
    - "medication history"

Generates domain-appropriate inputs:

- "How do I access patient records securely?"
- "Is this HIPAA compliant?"
- "Where are clinical notes stored?"

Best Practices

1. Start Small

❌ Don't generate 1000 inputs immediately

fluxloop inputs generate --count 1000  # Overwhelming

✅ Start with small batches

fluxloop inputs generate --count 50
# Review and refine
fluxloop inputs generate --count 50
# Iterate

2. Balance Personas

❌ Don't focus on one persona

personas:
  novice_user: 100% # Misses expert scenarios

✅ Reflect real user distribution

personas:
  novice_user: 40%
  intermediate_user: 35%
  expert_user: 25%

3. Validate Outputs

❌ Don't use generated inputs blindly

fluxloop inputs generate --count 500
fluxloop test # Without review!

✅ Always review first

fluxloop inputs generate --count 500
fluxloop inputs review
fluxloop inputs approve
fluxloop test

4. Combine with Manual Inputs

❌ Don't rely 100% on synthetic data

# All synthetic, misses critical edge cases

✅ Mix synthetic and manual

inputs:
  synthetic: 80% # Bulk coverage
  manual: 20% # Critical edge cases

5. Iterate Based on Results

❌ Don't generate once and forget

# Generated 6 months ago, never updated
fluxloop test --bundle old_bundle

✅ Regenerate regularly

# Update personas based on new learnings
fluxloop personas update

# Regenerate inputs
fluxloop inputs generate --refresh

# Test
fluxloop test

6. Version Control

Store inputs in version control:

git add inputs/
git commit -m "Update synthetic inputs for v2 API"

Benefits:

  • Track changes
  • Collaborate with team
  • Rollback if needed

Common Pitfalls

Over-Generation

❌ Generating too many similar inputs:

# 100 nearly identical inputs
- "How do I start?"
- "How do I begin?"
- "How to start?"
- ...

✅ Use diversity constraints:

generation:
  diversity:
    min_similarity: 0.3 # Prevent too-similar inputs
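
To make the idea of a similarity constraint concrete, here is a minimal sketch that drops near-duplicates using Python's standard `difflib.SequenceMatcher`. The function name, the `max_similarity` parameter, and the choice of metric are all assumptions; FluxLoop may measure similarity differently (for example with embeddings), and its threshold semantics may not match this sketch.

```python
from difflib import SequenceMatcher


def drop_near_duplicates(texts: list[str], max_similarity: float = 0.8) -> list[str]:
    """Greedily keep inputs that are not too similar to any already-kept input.

    Hypothetical helper. Similarity is the character-level ratio from
    difflib (0.0 = nothing in common, 1.0 = identical), compared
    case-insensitively.
    """
    kept: list[str] = []
    for text in texts:
        candidate = text.lower()
        is_distinct = all(
            SequenceMatcher(None, candidate, other.lower()).ratio() < max_similarity
            for other in kept
        )
        if is_distinct:
            kept.append(text)
    return kept


if __name__ == "__main__":
    print(drop_near_duplicates(
        ["How do I start?", "How do I start??", "password reset???"]
    ))
```

Character-level matching is cheap but only catches surface-level repetition; paraphrases such as "How do I begin?" need a semantic measure to be flagged.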

Unrealistic Inputs

❌ Generated inputs that no real user would ask:

- "Please enumerate the comprehensive list of all functionalities..."

✅ Ground in real user language:

- "what can this do?"

Ignoring Context

❌ Generating inputs without product context:

- "How do I purchase a widget?"
# (Your product doesn't have widgets)

✅ Provide product context:

generation:
  context: |
    Product: AI agent testing platform
    Features: Synthetic testing, scenarios, bundles
    No: widgets, physical products, e-commerce

Cost Optimization

LLM API Costs

Generation uses LLM API calls. Optimize costs:

Use Smaller Models

generation:
  model: gpt-4o-mini # Cheaper than gpt-4o
  # or
  # model: claude-3-haiku # Cheaper than sonnet

Batch Generation

# Efficient: One API call, 100 inputs
fluxloop inputs generate --count 100

# Inefficient: 100 API calls
for i in {1..100}; do
  fluxloop inputs generate --count 1
done

Cache Common Patterns

generation:
  caching: true # Reuse similar generations

Monitor Costs

fluxloop inputs generate --count 100 --dry-run
# Estimated cost: $0.15
# Proceed? (y/n)
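
For a back-of-the-envelope estimate outside the CLI, the arithmetic is simple: each API call pays for its prompt tokens once, and every generated input pays for its output tokens. The sketch below assumes per-million-token pricing; the function name is hypothetical and the prices in the example are placeholders, not real rates, so substitute your provider's current pricing.

```python
import math


def estimate_generation_cost(
    count: int,
    batch_size: int,
    prompt_tokens: int,
    tokens_per_input: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Estimate the cost of generating `count` inputs.

    Hypothetical helper. `batch_size` is how many inputs one API call
    produces; the prompt is paid for once per call.
    """
    calls = math.ceil(count / batch_size)
    input_tokens = calls * prompt_tokens
    output_tokens = count * tokens_per_input
    return (
        input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    ) / 1_000_000


if __name__ == "__main__":
    # Placeholder prices (1.0 and 2.0 per million tokens), not real rates.
    batched = estimate_generation_cost(100, 100, 500, 100, 1.0, 2.0)
    unbatched = estimate_generation_cost(100, 1, 500, 100, 1.0, 2.0)
    print(f"batched: {batched:.4f}, one call per input: {unbatched:.4f}")
```

The comparison also shows why batching helps: the output cost is the same either way, but the prompt is paid for once per call.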

Troubleshooting

Low-Quality Outputs

Problem: Generated inputs are nonsensical

Solution: Adjust temperature

# Too high temperature (1.0) = nonsense
fluxloop inputs generate --temperature 0.6 # More focused

Repetitive Outputs

Problem: All inputs are too similar

Solution: Increase temperature or add diversity constraint

fluxloop inputs generate --temperature 0.8 --diversity-threshold 0.4

Off-Topic Outputs

Problem: Inputs don't match your product

Solution: Add context and constraints

generation:
  context: "AI agent testing platform"
  constraints:
    topics:
      - testing
      - agents
      - synthetic_data
    forbidden_topics:
      - cooking
      - sports

API Errors

Problem: Generation fails with API errors

Solution: Check API key and rate limits

export OPENAI_API_KEY=your_key
fluxloop inputs generate --provider openai
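
If you call a provider from your own scripts and hit rate limits, retrying with exponential backoff is a common remedy. The sketch below is a generic pattern, not FluxLoop behavior; `call_with_retries` is a hypothetical helper, and in real code you would catch only the provider's rate-limit or transient error types instead of every exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying with exponential backoff when it raises.

    Delays double each time: base_delay, 2x, 4x, ... The last failure is
    re-raised once `max_attempts` is reached. `sleep` is injectable so the
    function can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:  # assumption: narrow this to transient errors in real code
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Wrap the function that makes the API request, for example `call_with_retries(lambda: send_request(payload))`, where `send_request` stands in for your own code.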

Related Commands

  • personas - Manage personas
  • inputs - Manage generated inputs
  • bundles - Bundle inputs for testing
  • test - Test with generated inputs

Next Steps

Examples

See FluxLoop Examples for:

  • Complete generation workflows
  • Domain-specific examples (SaaS, E-commerce, Healthcare)
  • Advanced techniques
  • Cost optimization strategies