Overview #
FlashLearn is a modular framework designed to define, orchestrate, and process tasks for generative AI models (such as GPT‑4) using a skill concept. The framework operates on data represented as a list of dictionaries (where each dictionary corresponds to a “row”) and supports multiple input and output modalities such as text, audio, and image.
Each skill encapsulates a JSON function definition and provides a consistent interface for:
- Building request tasks from key-value data.
- Orchestrating asynchronous execution.
- Parsing model responses.
- Aggregating data for complex tasks—all while supporting multimodal inputs.
Key concepts include:
- Content Block Creation: Transforming dictionary entries into system and user messages.
- Strict JSON Schemas: Enforcing function output formats.
- Data Aggregation and Retry Logic: Dynamically learning and validating new skills.
- Debug Utilities: e.g., flattening blocks for token estimation.
Module and Class Details #
Core Concepts #
- Input Data: Always in the form of a list of dictionaries (each representing a row).
- Modalities: Each dictionary key may be accompanied by a modality:
- Default:
"text"
- Alternatives:
"audio"
,"image_url"
,"image_base64"
- Default:
- Task Structure: Tasks are dictionaries with a unique identifier and a request body. The request includes:
- A model name.
- System and user messages.
- Function tools (with JSON function definitions).
- Output parameters (if needed).
BaseSkill #
File: flashlearn/skills/base_skill.py
Description:
An abstract base class for any skill that FlashLearn can execute. It enforces a consistent interface for:
- Creating tasks from a list-of-dictionaries (each dictionary represents a data row).
- Building function definitions for JSON function calling.
- Managing asynchronous task execution and usage statistics.
Constructor Inputs:
model_name
(str): Name of the model used for the skill.system_prompt
(str; default =""
): Instruction for the system message.full_row
(bool; default =False
): Indicates if the entire row should be used.client
(FlashLiteLLMClient instance; default provided): The client interface for the generative model.
Key Methods:
-
create_tasks(df, **kwargs) → List[Dict[str, Any]]
- Inputs:
df
: A list of dictionaries. Each dictionary represents a row with key/value pairs.**kwargs
: Additional keyword arguments.
- Behavior:
Abstract method; child classes must override this to create tasks from the input data.
- Inputs:
-
_build_function_def() → Dict[str, Any]
- Behavior:
Abstract method that returns the JSON schema for the function call. Child classes must provide a concrete implementation.
- Behavior:
-
save(filepath: str = None) → Dict[str, Any]
- Input:
filepath
(str; optional): The file path where the JSON is saved. Defaults to “<ClassName>.json
” if not specified.
- Returns:
The dictionary that was saved (model_name is excluded).
- Input:
-
run_tasks_in_parallel(...)
- Inputs:
tasks
(list): List of task dictionaries.save_filepath
(str, optional): Path to save results or progress.max_requests_per_minute
(int): Request throttling.max_tokens_per_minute
(int): Token throughput throttling.max_attempts
(int): Maximum retry attempts.token_encoding_name
(str): Token encoding identifier (e.g.,"cl100k_base"
).return_results
(bool): Specifies whether to return final results.request_timeout
(int): Timeout for each request.
- Returns:
A tuple (final_results
,final_status_tracker
) and updates internal token counters.
- Inputs:
-
estimate_tasks_cost(tasks: list) → float
- Input:
tasks
: A list of tasks.
- Returns:
A cost estimate (derived from token usage multiplied by a rate).
- Input:
BaseDataSkill #
File: flashlearn/skills/base_data_skill.py
Description:
Extends BaseSkill
to provide utilities for dictionary-based data processing. This class supports:
- Conversion of key-value pairs into content blocks with appropriate modalities.
- Creation of system and user messages from dictionaries.
- Flattening of content blocks for debugging purposes.
- Merging of output modality settings into the request body.
- Parsing of function call responses.
Key Methods:
-
build_output_params(modality: str) → Dict[str, Any]
- Input:
modality
(str): Desired output modality. Supports"audio"
,"image"
, or defaults to"text"
.
- Returns:
- For
"audio"
:{ "modalities": ["text", "audio"], "audio": {"voice": "alloy", "format": "wav"} }
- For
"image"
:
{ "modalities": ["image"] }
- Otherwise:
{ "modalities": ["text"] }
- For
- Input:
-
build_content_blocks(row: Dict[str, Any], column_modalities: Dict[str, str] = None) → List[Dict[str, Any]]
- Inputs:
row
: A dictionary representing one data row.column_modalities
: (Optional) Mapping of dictionary keys to modalities (e.g.,"text"
,"audio"
,"image_url"
,"image_base64"
). Defaults to"text"
for unspecified keys.
- Behavior:
Iterates through the row and:- Skips empty values.
- Creates blocks based on the modality:
"text"
→{ "type": "text", "text": raw_value }
"audio"
→{ "type": "input_audio", "input_audio": {"data": raw_value, "format": "wav"} }
"image_url"
→{ "type": "image_url", "image_url": {"url": raw_value} }
"image_base64"
→ Adds JPEG/PNG prefix heuristically.- Any unknown modality falls back to “text”.
- Returns:
A list of content block dictionaries.
- Inputs:
-
flatten_blocks_for_debug(blocks: List[Dict[str, Any]]) → str
- Input:
blocks
: List of content block dictionaries.
- Behavior:
Converts blocks into a human-readable string:"text"
→ Actual text."image_url"
→[IMAGE_URL]
"input_audio"
→[AUDIO]
- Others →
"[<TYPE in uppercase>]"
.
- Returns:
A newline-separated string of block representations.
- Input:
-
create_tasks(df: List[Dict[str, Any]], column_modalities: Dict[str, str] = None, output_modality: str = "text", **kwargs) → List[Dict[str, Any]]
- Inputs:
df
: List of dictionaries (each representing a row).column_modalities
: Mapping of keys to modalities.output_modality
: Specifies the output type (e.g.,"text"
,"audio"
,"image"
).**kwargs
: Additional parameters (e.g.,columns
for signature compatibility).
- Behavior:
- Merges additional output parameters if needed.
- Processes each row:
- Converts row into content blocks.
- Skips rows without valid content.
- Constructs a request payload with:
- A system message (based on
system_prompt
). - A user message (with content blocks and flattened text).
- Tools from
_build_function_def()
. - A tool choice indicator.
- A system message (based on
- Returns:
A list of tasks. Each task is a dictionary with:"custom_id"
: Row index as a string."request"
: The complete request payload.
- Inputs:
-
parse_result(raw_result: Dict[str, Any]) → Any
- Input:
raw_result
: The complete response from the model.
- Behavior:
By default returns theraw_result
without modification. Intended for override.
- Input:
-
parse_function_call(raw_result: Dict[str, Any], arg_name="categories") → Any
- Inputs:
raw_result
: The response dictionary.arg_name
(optional): The key to extract (e.g.,"categories"
).
- Behavior:
- Looks for a top-level
"function_call"
key in the model response. - Uses
ast.literal_eval
to parse a Python representation of the JSON. - Returns the value associated with
arg_name
orNone
on failure.
- Looks for a top-level
- Returns:
The parsed argument value orNone
.
- Inputs:
-
_build_function_def() → Dict[str, Any]
- Behavior:
Returns a minimal default function definition with:"name": "basic_function"
"description": "A simple function call placeholder."
"strict": True
"parameters"
: JSON schema requiring a"result"
property of type"string"
.
- Note:
Child classes should override this for specialized tasks.
- Behavior:
ClassificationSkill #
File: flashlearn/skills/classification.py
Description:
This skill classifies text inputs (each provided as a dictionary) into one or more preconfigured categories.
- Features:
- Supports single- and multi-category selections.
- Overrides
_build_function_def()
to generate a strict JSON schema:- For single-category: Expects a string with enumerated valid categories.
- For multi-category: Expects an array of strings, optionally with a
maxItems
constraint.
- Overrides
parse_result()
to normalize the output (wrapping a string result in a list, if necessary).
Constructor Inputs:
model_name
(str): Model identifier.categories
(List[str]): A list of acceptable category strings.max_categories
(int; default = 1): Maximum number of categories to return (use-1
for unlimited).system_prompt
(str; default =""
): Prompt for the system message.client
(default FlashLiteLLMClient instance): The LLM client.
Key Methods:
-
_build_function_def() → Dict[str, Any]
- Behavior:
- If
max_categories == 1
, creates a schema with:"type": "string"
"enum"
of valid categories.
- Otherwise, returns a schema with:
"type": "array"
, enumerating valid categories.- Adds
"maxItems"
ifmax_categories > 0
.
- If
- Returns:
A JSON schema (wrapped in a dict with key"function"
).
- Behavior:
-
parse_result(raw_result: Dict[str, Any]) → List[str]
- Behavior:
- Uses
parse_function_call()
witharg_name="categories"
. - Returns an empty list if
None
. - Wraps a single string in a list if necessary.
- Uses
- Returns:
A list of categories.
- Behavior:
DiscoverLabelsSkill #
File: flashlearn/skills/discover_labels.py
Description:
This skill aggregates all rows into one unified user message for discovering common labels in an entire dataset.
Constructor Inputs:
model_name
(str): Model identifier.label_count
(int; default = -1): Limits the number of labels (if positive);-1
allows unlimited labels.system_prompt
(str; default =""
): Base system prompt.client
(default FlashLiteLLMClient instance): The LLM client.
Key Methods:
-
create_tasks(df: List[Dict[str, Any]], column_modalities: Dict[str, str] = None, output_modality: str = "text", columns: List[str] = None, **kwargs) → List[Dict[str, Any]]
- Inputs:
df
: List of dictionaries (each a data row).column_modalities
: (Optional) Dict mapping keys to modalities.output_modality
(str): Specifies the model’s output type.columns
(List[str]; optional): Maintained for signature consistency.
- Behavior:
- Aggregates content blocks from all rows.
- If
label_count > 0
, appends: “You may select up to X labels.” to the system prompt; otherwise, indicates unlimited labels. - Constructs a single aggregated task. If no content blocks are generated (i.e., empty rows), returns an empty list.
- Returns:
A single-item list containing one task dictionary.
- Inputs:
-
_build_function_def() → Dict[str, Any]
- Behavior:
Constructs a function definition that expects an array of string labels. - Returns:
The JSON schema wrapped in a function-def dict.
- Behavior:
-
parse_result(raw_result: Dict[str, Any]) → Any
- Behavior:
Returns the result ofparse_function_call(raw_result, arg_name="labels")
or an empty list. - Returns:
A list of labels.
- Behavior:
GeneralSkill #
File: flashlearn/skills/general_skill.py
Description:
A flexible, general-purpose skill that accepts a custom JSON function definition at initialization.
Constructor Inputs:
model_name
(str): Model identifier.function_definition
(Dict[str, Any]): Custom JSON function definition describing the expected output.system_prompt
(str; default ="You are a helpful assistant."
): Instruction for the system message.columns
(List[str]; optional): If provided, restricts which keys are used in task creation.client
(default FlashLiteLLMClient instance): The LLM client.
Key Methods:
-
create_tasks(...)
- Inputs:
df
: List of dictionaries (data rows).column_modalities
: (Optional) Mapping from keys to modalities.output_modality
(str): Desired output modality.columns
(List[str], optional): Falls back toself.columns
if not provided.**kwargs
: Additional parameters.
- Behavior:
Invokes the parent class’screate_tasks
method after setting fallback columns if necessary.
- Inputs:
-
_build_function_def() → Dict[str, Any]
- Behavior:
Returns the custom function definition passed during initialization.
- Behavior:
-
parse_result(raw_result: Dict[str, Any]) → Any
- Behavior:
Returns theraw_result
unchanged. Intended for override if further parsing is necessary.
- Behavior:
-
load_skill(config: Dict[str, Any], model_name="gpt-4o-mini", client=FlashLiteLLMClient()) → GeneralSkill
- Inputs:
config
: A dictionary containing keys like"skill_class"
,"system_prompt"
,"function_definition"
, and"columns"
.model_name
(str): The model identifier (default"gpt-4o-mini"
).client
: The LLM client instance.
- Behavior:
Instantiates and returns a newGeneralSkill
instance using the provided configuration. - Returns:
An initializedGeneralSkill
object.
- Inputs:
LearnSkill #
File: flashlearn/skills/learn_skill.py
Description:
LearnSkill orchestrates the process of “learning” a minimal function definition from sample data. It is responsible for:
- Managing the LLM client.
- Handling concurrency and batch submission.
- Implementing retry logic for validating function definitions.
- Estimating task costs.
Constructor Inputs:
model_name
(str; default ="gpt-4o"
): The model identifier.verbose
(bool; default =True
): Enables or disables verbose logging.client
(default FlashLiteLLMClient instance): The LLM client.
Key Methods:
-
learn_skill(df: List[Dict[str, Any]], task: str = "", columns: List[str] = None, model_name: str = "gpt-4o-mini", column_modalities: Dict[str, str] = None, output_modality: str = "text", retry: int = 5) → GeneralSkill or None
- Inputs:
df
: List of dictionaries representing data rows.task
(str): A brief description of the intended function.columns
(List[str], optional): The keys to include. If not provided, aggregates all available keys.model_name
(str): The model to use for generating the function definition.column_modalities
: Mapping from keys to modalities for constructing user message blocks.output_modality
(str): Desired output mode (e.g.,"text"
,"audio"
, or"image"
).retry
(int): Number of retry attempts for generating and validating the function definition.
- Behavior:
- Constructs system and user messages using sample data.
- Performs a two-pass process:
- Extracts a minimal function definition from the model’s response.
- Validates the definition with a second call.
- Retries on failure for a total of
retry
attempts.
- Returns:
AGeneralSkill
instance with the learned function definition, orNone
if learning fails.
- Inputs:
-
_extract_function_call_arguments(completion) → Any
- Input:
completion
: The model response object containingchoices → message → tool_calls
.
- Behavior:
- Extracts the
"arguments"
string from the tool call. - Uses
ast.literal_eval
to convert the string to a Python dictionary. - If nested JSON with a
"function_definition"
key is present, parses and wraps it with"strict": True
. - If parsing fails, returns either the raw string or an error message.
- Extracts the
- Returns:
The parsed function definition or an error indicator.
- Input: