Instructions

Instructions define how the model should respond to different input patterns. They are structured using Input and Output components that clearly separate the model's input requirements from its expected response format.

Every instruction is governed by the protocol's context snippets parameter, which sets how many context snippets each instruction receives.

Instruction Structure

Instructions are built using three main components:

  • Input: Defines the structure for the model's input, including TokenSets. See Input for detailed information.
  • Output: Defines the structure and format of the model's response, including the response TokenSet and final tokens. See Output for detailed information.
  • Context: Optional background context strings that help the model understand the domain and situation. Context is provided when creating the Instruction.

Instruction

The Instruction class combines an InstructionInput and an InstructionOutput to create a complete instruction that defines both what the model receives and how it should respond.

Instruction Parameters

class Instruction:
    def __init__(
        self,
        input: InstructionInput,
        output: InstructionOutput,
        context: List[str] | None = None,
        name: str | None = None,
    ): ...
  • input: An InstructionInput instance that defines the input structure
  • output: An InstructionOutput instance that defines the response structure
  • context: Optional list of strings providing background context for the instruction. This context helps the model understand the domain and situation. Can be None or an empty list if no context is needed.
  • name: Optional name for the Instruction. Defaults to a generated name based on the input and output.

Creating Instructions

Basic Instruction

# Create TokenSets
cat_pondering = mtp.TokenSet(tokens=(tree, cat, ponder))
cat_grinning = mtp.TokenSet(tokens=(tree, cat, grin))

# Create Input and Output
instruction_input = mtp.InstructionInput(
    tokensets=[cat_pondering]
)

instruction_output = mtp.InstructionOutput(
    tokenset=cat_grinning,
    final=mtp.FinalToken("Continue")
)

# Create an instruction
instruction = mtp.Instruction(
    input=instruction_input,
    output=instruction_output,
    name="cat_pondering_instruction"
)

Instruction with Context

# Create Input and Output
# cat_pondering and cat_grinning are the TokenSets from the basic example;
# alice_talk is assumed to be a second TokenSet defined elsewhere.
instruction_input = mtp.InstructionInput(
    tokensets=[cat_pondering, alice_talk]
)

instruction_output = mtp.InstructionOutput(
    tokenset=cat_grinning,
    final=mtp.FinalToken("Continue")
)

# Create instruction with context
instruction = mtp.Instruction(
    input=instruction_input,
    output=instruction_output,
    context=[
        "Alice was beginning to get very tired of sitting by her sister on the bank.",
        "The Cheshire Cat appeared in the tree, grinning mysteriously."
    ],
    name="alice_cat_conversation"
)

Important: Instruction Context Limits

  • Maximum of 10 context lines per instruction
  • Each context line must not exceed 300 characters
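
The two limits above can be checked before an Instruction is created. The following is a minimal sketch in plain Python; `check_context` and the two constants are hypothetical helpers written for illustration and are not part of the mtp package, which performs its own validation.

```python
from typing import List, Optional

MAX_CONTEXT_LINES = 10        # limit stated above
MAX_CONTEXT_LINE_CHARS = 300  # limit stated above


def check_context(context: Optional[List[str]]) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if the context fits."""
    problems: List[str] = []
    if not context:  # None and an empty list are both allowed
        return problems
    if len(context) > MAX_CONTEXT_LINES:
        problems.append(
            f"{len(context)} context lines exceeds the maximum of {MAX_CONTEXT_LINES}"
        )
    for index, line in enumerate(context):
        if len(line) > MAX_CONTEXT_LINE_CHARS:
            problems.append(
                f"context line {index} has {len(line)} characters "
                f"(maximum {MAX_CONTEXT_LINE_CHARS})"
            )
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every over-long line in a single pass.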

Adding Samples to Instructions

add_sample() parameters:

  • input_snippets: List of Snippets or strings to add to the Instruction as input. Its length must match the number of TokenSets in the Input.
  • output_snippet: The model's response snippet (can be a Snippet or string)
  • value: Optional numerical value (required if the Output's final token is a FinalNumToken)
  • final: Optional FinalToken to use for this sample (required if the Output has multiple final token options)
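
The rules in the parameter list can be read as a small set of checks. The sketch below restates them in plain Python so they are easy to test in isolation. `check_sample_args` and all of its arguments are hypothetical and do not belong to the mtp API. It also assumes that a FinalNumToken's range includes both bounds, which the text here does not state.

```python
from typing import List, Optional, Sequence, Tuple


def check_sample_args(
    num_input_tokensets: int,
    input_snippets: Sequence[object],
    num_final_options: int = 1,
    final_given: bool = False,
    numeric_range: Optional[Tuple[float, float]] = None,
    value: Optional[float] = None,
) -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems: List[str] = []

    # One input snippet per TokenSet in the Input.
    if len(input_snippets) != num_input_tokensets:
        problems.append(
            f"expected {num_input_tokensets} input snippets, got {len(input_snippets)}"
        )

    # A sample must name its final token when the Output offers several.
    if num_final_options > 1 and not final_given:
        problems.append("final is required when the Output has multiple final tokens")

    # numeric_range is passed only when the final token is a FinalNumToken.
    if numeric_range is not None:
        low, high = numeric_range
        if value is None:
            problems.append("value is required for a FinalNumToken")
        elif not low <= value <= high:  # assumption: both bounds are inclusive
            problems.append(f"value {value} is outside the range {low}-{high}")

    return problems


# One TokenSet, one snippet, numeric final token with range 0-10.
print(check_sample_args(1, ["How do you know I am mad?"], numeric_range=(0, 10), value=7))
```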

Basic Sample

# These samples use the single-TokenSet instruction from "Basic Instruction",
# so each one passes exactly one input snippet.

# Add sample with strings (automatically converted to snippets)
instruction.add_sample(
    input_snippets=["Why do I keep vanishing and reappearing so suddenly?"],
    output_snippet="Because it amuses me, and it keeps everyone wondering whether I'm truly here at all."
)

# Add sample with explicit snippets
sample_context = cat_pondering.create_snippet(
    string="Why do I keep vanishing and reappearing so suddenly?"
)
sample_response = cat_grinning.create_snippet(
    string="Because it amuses me, and it keeps everyone wondering whether I'm truly here at all."
)

instruction.add_sample(
    input_snippets=[sample_context],
    output_snippet=sample_response
)

Multiple Final Tokens

When an InstructionOutput has multiple final token options, you must specify which final token to use for each sample:

# Output with multiple final options
instruction_output = mtp.InstructionOutput(
    tokenset=cat_grinning,
    final=[mtp.FinalToken("Appear"), mtp.FinalToken("Vanish")]
)

instruction = mtp.Instruction(
    input=instruction_input,
    output=instruction_output
)

# Must specify final token for each sample
instruction.add_sample(
    input_snippets=["Then it doesn't matter which way you go."],
    output_snippet="Oh sure, if you only walk long enough.",
    final=mtp.FinalToken("Appear")  # Required when multiple finals exist
)

Numeric Output with FinalNumToken

When using a FinalNumToken, you must provide a numeric value:

# Output with FinalNumToken
numeric_output = mtp.InstructionOutput(
    tokenset=cat_grinning,
    final=mtp.FinalNumToken("Madness", min_value=0, max_value=10)
)

instruction = mtp.Instruction(
    input=instruction_input,
    output=numeric_output
)

# Must provide value within the FinalNumToken's range
instruction.add_sample(
    input_snippets=["How do you know I am mad?"],
    output_snippet="You must be, or you would not have come here.",
    value=7  # Required and must be within 0-10 range
)

Instruction Patterns

Conversational Pattern

# Conversational instruction
conversation_input_tokenset = mtp.TokenSet(tokens=(speaker, context))
conversation_output_tokenset = mtp.TokenSet(tokens=(responder, response))

conversation_input = mtp.InstructionInput(
    tokensets=[conversation_input_tokenset]
)

conversation_output = mtp.InstructionOutput(
    tokenset=conversation_output_tokenset,
    final=mtp.FinalToken("Continue")
)

conversation_instruction = mtp.Instruction(
    input=conversation_input,
    output=conversation_output
)

Question-Answer Pattern

# Q&A instruction
question_tokenset = mtp.TokenSet(tokens=(question, context))
answer_tokenset = mtp.TokenSet(tokens=(answer, response))

qa_input = mtp.InstructionInput(
    tokensets=[question_tokenset]
)

qa_output = mtp.InstructionOutput(
    tokenset=answer_tokenset,
    final=mtp.FinalToken("Complete")
)

qa_instruction = mtp.Instruction(
    input=qa_input,
    output=qa_output
)

Multi-Step Pattern

Complex instructions with multiple context steps:

# Multi-step instruction
step1_tokenset = mtp.TokenSet(tokens=(step1, context))
step2_tokenset = mtp.TokenSet(tokens=(step2, context))
final_response_tokenset = mtp.TokenSet(tokens=(final, response))

multi_step_input = mtp.InstructionInput(
    tokensets=[step1_tokenset, step2_tokenset]
)

multi_step_output = mtp.InstructionOutput(
    tokenset=final_response_tokenset,
    final=mtp.FinalToken("Complete")
)

multi_step_instruction = mtp.Instruction(
    input=multi_step_input,
    output=multi_step_output,
    context=["Background information for the multi-step process"]
)

Best Practices

  1. Clear Context: Give each Instruction context that helps the model understand the situation
  2. Appropriate Responses: Ensure InstructionOutput responses match the expected behavior
  3. Consistent Patterns: Use consistent instruction patterns throughout your protocol
  4. Adequate Samples: Provide enough samples to train the model effectively (minimum of 3 per instruction)
  5. Proper Token Usage: Use the correct token types for each instruction component. See Tokens for token types and TokenSets for TokenSet usage.
  6. Separate Concerns: Keep Input and Output definitions separate for better organization and reusability
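
The minimum of 3 samples per instruction is easy to overlook in a protocol with many instructions. As a rough sketch, assuming you keep your own tally of how many samples you have added under each instruction name, a helper like the one below can flag the instructions that still fall short. `instructions_needing_samples` and the tally dictionary are hypothetical and not part of mtp.

```python
from typing import Dict, List

MIN_SAMPLES_PER_INSTRUCTION = 3  # minimum stated in the list above


def instructions_needing_samples(sample_counts: Dict[str, int]) -> List[str]:
    """Return the names of instructions with fewer than the minimum samples."""
    return sorted(
        name
        for name, count in sample_counts.items()
        if count < MIN_SAMPLES_PER_INSTRUCTION
    )


# Tally kept by the caller while adding samples (illustrative values).
counts = {"cat_pondering_instruction": 3, "alice_cat_conversation": 1}
print(instructions_needing_samples(counts))
```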

Instruction Validation

The MTP system ensures that:

  • All TokenSets in InstructionInput are properly defined
  • InstructionOutput TokenSets do not contain NumTokens or NumListTokens
  • Instructions define structured response patterns
  • All samples match the defined instruction structure
  • Final tokens are appropriate for the instruction type
  • Numeric values are within FinalNumToken ranges when used
  • Instruction context does not exceed 10 lines per instruction
  • Each instruction context line does not exceed 300 characters