Parallel execution in FlashLearn is designed to maximize workflow throughput by orchestrating API calls concurrently. Each task, already constructed as a structured API call definition, is passed as keyword arguments directly to the client’s method (e.g., client.chat.completions.create(**kwargs)). Every task therefore encapsulates all the information needed for its API call, which reduces overhead and simplifies error handling.
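As a rough illustration, a single task can be nothing more than a dictionary of keyword arguments in the shape the chat completions endpoint expects; the exact fields FlashLearn attaches (model name, message content, and so on) depend on how the skill was defined, so the values below are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative task shape: a plain dict of keyword arguments.
# "model" and "messages" are the minimum the chat completions endpoint needs;
# the concrete values here are made up for the example.
task = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "Classify the sentiment of the text."},
        {"role": "user", "content": "I love this product!"},
    ],
}

# Because the task is already a complete call definition, the executor can
# hand it to the client without any translation step.
response = client.chat.completions.create(**task)
print(response.choices[0].message.content)
```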
FlashLearn’s built-in concurrency mechanisms let you process up to 1000 calls per minute, so large batches of tasks are handled efficiently. This abstraction makes it easy to scale your LLM workflows without managing individual API calls by hand.
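The snippet below is only a conceptual sketch of that kind of throttling, not FlashLearn’s actual scheduler: it caps in-flight requests with a thread pool and paces submissions so the aggregate rate stays near a configured ceiling (roughly 1000 calls per minute, i.e. one submission every 60 ms).

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

CALLS_PER_MINUTE = 1000
MIN_INTERVAL = 60.0 / CALLS_PER_MINUTE  # ~0.06 s between submissions

def run_with_throttle(call_fn, tasks, max_workers=50):
    """Submit each task to a thread pool, pacing submissions to stay under the cap."""
    results = [None] * len(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_index = {}
        for i, task in enumerate(tasks):
            future_to_index[pool.submit(call_fn, **task)] = i
            time.sleep(MIN_INTERVAL)  # naive pacing; a real scheduler would use a token bucket
        for future in as_completed(future_to_index):
            results[future_to_index[future]] = future.result()
    return results
```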
When you’re ready to execute these tasks concurrently, simply call:
skill.process_tasks_in_parallel(tasks)
This single call takes care of distributing tasks, managing rate limits, and gathering results, letting you focus on higher-level pipeline logic rather than the intricacies of API orchestration.
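Continuing the snippet above, a typical pipeline step captures the return value and persists it for downstream stages; note that the exact shape of the returned results is an assumption here:

```python
import json

# `skill` and `tasks` come from the earlier pipeline steps; only
# process_tasks_in_parallel is taken from this section, and the results are
# assumed to be JSON-serializable.
results = skill.process_tasks_in_parallel(tasks)

# Persist the gathered outputs so later stages can consume them.
with open("results.json", "w") as f:
    json.dump(results, f, indent=2)
```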