
Input

The Input component of an Instruction defines what information the model receives. It consists of TokenSets that structure the input patterns.

InstructionInput

The InstructionInput class defines the input structure for an instruction. It encapsulates the TokenSets that define input patterns.

InstructionInput Parameters

class InstructionInput:
    def __init__(self, tokensets: List[TokenSet]):
        ...
  • tokensets: A required list of TokenSets that define the input patterns. The number of TokenSets must match the protocol's instruction_context_snippets parameter, and must be the same across all instructions.

What's Allowed in InstructionInput

  • Any TokenSet combination: Input TokenSets can contain Basic Tokens, NumTokens, and NumListTokens in any combination
  • Multiple TokenSets: InstructionInput can contain multiple TokenSets to define complex multi-step input patterns
  • Consistent TokenSet count: All instructions in a protocol must have the same number of TokenSets in their Input
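
The consistent-count rule above can be sketched as a small standalone check. This is only an illustration, not the library's own validation: the function name check_consistent_tokenset_count is hypothetical, and each TokenSet is modelled as a plain tuple of token names rather than an mtp.TokenSet.

```python
from typing import Dict, List, Sequence


def check_consistent_tokenset_count(
    instructions: Dict[str, Sequence[Sequence[str]]],
    expected_count: int,
) -> List[str]:
    """Return the names of instructions whose input TokenSet count differs from expected_count.

    Hypothetical helper: expected_count plays the role of the protocol's
    instruction_context_snippets parameter.
    """
    return [
        name
        for name, tokensets in instructions.items()
        if len(tokensets) != expected_count
    ]


# Each TokenSet is modelled as a tuple of token names (a stand-in for mtp.TokenSet).
instructions = {
    "cat_then_alice": [("Tree", "Cat", "Ponder"), ("Tree", "Alice", "Talk")],
    "only_cat": [("Tree", "Cat", "Ponder")],
}

mismatched = check_consistent_tokenset_count(instructions, expected_count=2)
print(mismatched)
```

An empty result means every instruction supplies the same number of input TokenSets; any listed name points at an instruction that would need its input adjusted.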

Creating InstructionInput

# Assumes the library is available as `mtp` and that the tokens
# tree, cat, ponder, alice, and talk have already been defined.

# Create TokenSets for input patterns
cat_pondering = mtp.TokenSet(tokens=(tree, cat, ponder))
alice_talk = mtp.TokenSet(tokens=(tree, alice, talk))

# Create an InstructionInput with TokenSets
instruction_input = mtp.InstructionInput(
    tokensets=[cat_pondering, alice_talk]
)

# Create a simple InstructionInput with a single TokenSet
simple_input = mtp.InstructionInput(
    tokensets=[cat_pondering]
)

# Create an InstructionInput with multiple TokenSets for complex interactions
multi_input = mtp.InstructionInput(
    tokensets=[cat_pondering, alice_talk, cat_pondering]
)
Tip: To add background context to an instruction, pass the context parameter to the Instruction class when creating the instruction. See Instructions for examples.

Adding Samples to InstructionInput

When adding samples to an instruction, the number of entries in the input_snippets parameter must match the number of TokenSets in the InstructionInput. For each TokenSet:

  • TokenSets without NumTokens or NumListTokens: You can pass strings directly, and the system will automatically convert them to snippets. Alternatively, you can call TokenSet.create_snippet() and pass the resulting snippet for consistency.
  • TokenSets with NumTokens: You must create snippets using TokenSet.create_snippet() and provide the numbers parameter with the numeric value.
  • TokenSets with NumListTokens: You must create snippets using TokenSet.create_snippet() and provide the number_lists parameter with a list of numbers matching the required length.
Important: Snippet Length Limit

Each input snippet (string) must not exceed 300 characters. This limit applies to all snippets used in instruction inputs, whether passed as strings or created explicitly using TokenSet.create_snippet().
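
As a rough pre-check before adding samples, the 300-character limit can be tested on the raw strings. The sketch below is standalone and does not call the library; the constant MAX_SNIPPET_CHARS and the function oversized_snippets are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple

MAX_SNIPPET_CHARS = 300  # limit stated in the documentation above


def oversized_snippets(snippets: Sequence[str]) -> List[Tuple[int, int]]:
    """Return (index, length) for every snippet string longer than the limit."""
    return [
        (index, len(text))
        for index, text in enumerate(snippets)
        if len(text) > MAX_SNIPPET_CHARS
    ]


candidate_snippets = ["Why do I keep vanishing?", "x" * 301]
too_long = oversized_snippets(candidate_snippets)
print(too_long)
```

A string of exactly 300 characters passes this check, since the text says a snippet must not exceed 300 characters.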

# Assumes `instruction` is an existing Instruction whose input TokenSets
# match the snippets passed in each add_sample call below.

# For TokenSets without numeric tokens, use strings directly
instruction.add_sample(
    input_snippets=["Why do I keep vanishing?", "Can you tell me a way?"],
    output_snippet="Oh sure, if you only walk long enough."
)

# For TokenSets with NumTokens, create snippets with numeric values
emotion_token = mtp.NumToken("Emotion", min_value=0, max_value=10)
emotion_tokenset = mtp.TokenSet(tokens=(tree, alice, talk, emotion_token))

emotion_snippet = emotion_tokenset.create_snippet(
    string="Can you tell me a way?",
    numbers=5  # Required: numeric value for the NumToken
)

# For TokenSets with NumListTokens, create snippets with number lists
coordinates_token = mtp.NumListToken("Coordinates", min_value=-1000, max_value=1000, length=3)
coordinates_tokenset = mtp.TokenSet(tokens=(tree, cat, talk, coordinates_token))

coordinates_snippet = coordinates_tokenset.create_snippet(
    string="Then it doesn't matter which way you go.",
    number_lists=[100, 200, -50]  # Required: list matching the length
)

# Use the snippets when adding samples
# (here the instruction's input is built from emotion_tokenset and coordinates_tokenset)
instruction.add_sample(
    input_snippets=[emotion_snippet, coordinates_snippet],
    output_snippet="Oh sure, if you only walk long enough."
)

Input Validation

The MTP system ensures that:

  • All TokenSets in an InstructionInput are properly defined
  • The number of TokenSets matches the protocol's instruction_context_snippets parameter
  • All instructions in a protocol have the same number of TokenSets in their Input
  • Input snippets match the structure defined by the TokenSets when adding samples
  • Each input snippet (string) does not exceed 300 characters in length
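
To make the sample-level checks concrete, here is a minimal standalone sketch of two of them: the snippet count must equal the TokenSet count, and a TokenSet with numeric tokens needs an explicitly created snippet rather than a plain string. It is not the library's implementation. SketchTokenSet, SketchSnippet, and validate_input_snippets are hypothetical stand-ins that only mirror the rules described in this section.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass(frozen=True)
class SketchTokenSet:
    """Hypothetical stand-in for a TokenSet; records only whether it has numeric tokens."""
    name: str
    has_numeric_tokens: bool = False


@dataclass(frozen=True)
class SketchSnippet:
    """Hypothetical stand-in for the object returned by TokenSet.create_snippet()."""
    string: str


def validate_input_snippets(
    tokensets: Sequence[SketchTokenSet],
    input_snippets: Sequence[Union[str, SketchSnippet]],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the sample passes."""
    if len(input_snippets) != len(tokensets):
        return [f"expected {len(tokensets)} input snippets, got {len(input_snippets)}"]

    errors = []
    for tokenset, snippet in zip(tokensets, input_snippets):
        # Plain strings are only acceptable for TokenSets without numeric tokens.
        if tokenset.has_numeric_tokens and isinstance(snippet, str):
            errors.append(f"{tokenset.name}: numeric TokenSet needs an explicit snippet")
    return errors


tokensets = [
    SketchTokenSet("alice_talk"),
    SketchTokenSet("emotion", has_numeric_tokens=True),
]

wrong_count = validate_input_snippets(tokensets, ["only one"])
plain_string = validate_input_snippets(tokensets, ["Hi", "Can you tell me a way?"])
valid = validate_input_snippets(tokensets, ["Hi", SketchSnippet("Can you tell me a way?")])
print(wrong_count, plain_string, valid)
```

Returning a list of problems instead of raising on the first one is a design choice for this sketch; it lets a caller report every issue in a sample at once.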