Creates a new ExecuTorch LLM instance.
Parameters for the instance.
Source of the LLM model.
Source of the tokenizer.
Source of the tokenizer config.
Callback reporting download progress as a fraction between 0 and 1.
Callback invoked with the final, complete response string.
Chat configuration forwarded to ExecuTorch.
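The parameters above can be sketched as a single options object. This is a minimal sketch under assumptions: the field names (`modelSource`, `tokenizerSource`, `tokenizerConfigSource`, `onDownloadProgress`, `onResponse`, `chatConfig`) are illustrative, not the library's actual API.

```typescript
// Hypothetical parameter shape; field names are assumptions, not the
// actual react-native-executorch API.
interface LLMParams {
  modelSource: string;            // source of the LLM model
  tokenizerSource: string;        // source of the tokenizer
  tokenizerConfigSource: string;  // source of the tokenizer config
  onDownloadProgress?: (fraction: number) => void; // 0-1
  onResponse?: (full: string) => void;             // final full response
  chatConfig?: Record<string, unknown>;            // forwarded to ExecuTorch
}

const params: LLMParams = {
  modelSource: "https://example.com/model.pte",
  tokenizerSource: "https://example.com/tokenizer.json",
  tokenizerConfigSource: "https://example.com/tokenizer_config.json",
  onDownloadProgress: (p) =>
    console.log(`downloaded ${(p * 100).toFixed(0)}%`),
};
```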
Generates a completion from a list of messages, streaming tokens to a callback.
Conversation history for the model.
Token-level streaming callback.
Promise that resolves to the full generated string.
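The streaming contract described above can be illustrated with a stub. This is a hedged sketch: `Message`, `StubLLM`, and `demo` are invented names standing in for the real ExecuTorch-backed class, so the token-streaming behavior can be shown without native dependencies.

```typescript
// Hypothetical message shape; the real library's type may differ.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Stub standing in for the real ExecuTorch-backed class.
class StubLLM {
  async generate(
    messages: Message[],
    onToken: (token: string) => void
  ): Promise<string> {
    const tokens = ["Hello", ", ", "world", "!"];
    for (const t of tokens) onToken(t); // token-level streaming callback
    return tokens.join("");            // resolves to the full string
  }
}

async function demo(): Promise<string> {
  const llm = new StubLLM();
  const streamed: string[] = [];
  const full = await llm.generate(
    [{ role: "user", content: "Say hello" }],
    (t) => streamed.push(t)
  );
  // The concatenated streamed tokens equal the resolved full string.
  return full;
}
```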
Interrupts the current generation. Note: the current ExecuTorch interrupt is synchronous, so awaiting this method does not guarantee that generation has stopped.
Loads the model and config via react-native-executorch and applies the chat configuration.
Promise that resolves to the same instance.
Unloads the underlying module. Note: the current ExecuTorch unload is synchronous, so awaiting this method does not guarantee that unloading has completed.
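The lifecycle described above (load resolving to the same instance, interrupt and unload wrapping synchronous native calls) might look like this sketch. `StubModule` and `lifecycle` are assumed names; the stub only mirrors the documented contract, not the real native implementation.

```typescript
// Stub mirroring the documented lifecycle: load() resolves to the same
// instance; interrupt()/unload() wrap synchronous native calls.
class StubModule {
  loaded = false;

  async load(): Promise<this> {
    this.loaded = true;
    return this; // resolves to the same instance, enabling chaining
  }

  interrupt(): void {
    // Synchronous in current ExecuTorch: returning does not guarantee
    // that generation has actually stopped.
  }

  unload(): void {
    this.loaded = false; // synchronous; the same caveat applies
  }
}

async function lifecycle(): Promise<boolean> {
  const mod = await new StubModule().load();
  const sameInstance = mod instanceof StubModule && mod.loaded;
  mod.interrupt();
  mod.unload();
  return sameInstance && !mod.loaded;
}
```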
ExecuTorch-based implementation of LLM for React Native.