Local generative AI capabilities using Google MediaPipe.
Non-blocking local LLM inference using quantized models.
Currently supports Android only.
Generative AI models are large and should not be bundled in the APK. In production, the model should ideally be downloaded from a server on user request; an illustrative sketch of this appears after the development steps below. For development, we manually download the preferred model to a PC and push it to an Android device over adb.
Gemma models in a MediaPipe-compatible format can be downloaded directly from Kaggle (no conversion needed):
# Export your Kaggle username and API key
# export KAGGLE_USERNAME=
# export KAGGLE_KEY=
curl -L -u $KAGGLE_USERNAME:$KAGGLE_KEY \
  -o ~/Downloads/gemma-2b-it-cpu-int4.tar.gz \
  https://www.kaggle.com/api/v1/models/google/gemma/tfLite/gemma-2b-it-cpu-int4/1/download
# Extract model
tar -xvzf ~/Downloads/gemma-2b-it-cpu-int4.tar.gz -C ~/Downloads
Other models need to be converted/quantized first. Check out the links below for how to download and convert models to a MediaPipe-compatible format.
- https://developers.google.com/mediapipe/solutions/genai/llm_inference#models
- https://developers.google.com/mediapipe/solutions/genai/llm_inference/android#model
For testing on Android, push the downloaded model to a physical device (with developer mode enabled) using the commands below.
# Clear directory to remove previous models
adb shell rm -r /data/local/tmp/llm/
# Create directory to save model
adb shell mkdir -p /data/local/tmp/llm/
# Push model to device
cd ~/Downloads
adb push gemma-2b-it-cpu-int4.bin /data/local/tmp/llm/gemma-2b-it-cpu-int4.bin
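In production, one approach is to download the model at runtime into the app's storage and then point modelPath at it. The sketch below is illustrative only: it assumes expo-file-system for the download, uses a placeholder MODEL_URL, and assumes setModelOptions (described later) accepts a plain filesystem path; verify these against your setup.
import * as FileSystem from 'expo-file-system';
import { setModelOptions } from 'react-native-local-gen-ai';

// Placeholder for your own hosting location.
const MODEL_URL = 'https://example.com/models/gemma-2b-it-cpu-int4.bin';

export async function ensureModelDownloaded(): Promise<string> {
  const dest = `${FileSystem.documentDirectory}gemma-2b-it-cpu-int4.bin`;

  // Skip the download if the model is already on disk.
  const info = await FileSystem.getInfoAsync(dest);
  if (!info.exists) {
    await FileSystem.downloadAsync(MODEL_URL, dest);
  }

  // Assumption: modelPath takes a plain path, so strip the file:// scheme.
  // Other options are left at their documented defaults.
  const modelPath = dest.replace('file://', '');
  setModelOptions({
    modelPath,
    randomSeed: 0,
    topK: 40,
    temperature: 0.8,
    maxTokens: 512,
  });
  return modelPath;
}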
yarn add react-native-local-gen-ai
#or
npm i react-native-local-gen-ai
Update minSdkVersion to 24 in the android/build.gradle file.
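In a typical React Native template this value lives in the buildscript ext block; a minimal sketch (your file will contain other values as well):
// android/build.gradle
buildscript {
    ext {
        // Keep your template's other ext values; only minSdkVersion changes.
        minSdkVersion = 24
    }
}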
Invoke the chatWithLLM async method with your prompt.
import { chatWithLLM } from 'react-native-local-gen-ai';
// non-blocking prompting !!
const response = await chatWithLLM("hello !");
console.log(response)
// Response
! 👋
I am a large language model, trained by Google.
I can talk and answer your questions to the best of my knowledge.
What would you like to talk about today? 😊
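A minimal sketch of calling chatWithLLM from a component; the component, state names, and layout below are illustrative, not part of the library API:
import React, { useState } from 'react';
import { Button, Text, View } from 'react-native';
import { chatWithLLM } from 'react-native-local-gen-ai';

export default function ChatScreen() {
  const [answer, setAnswer] = useState('');
  const [busy, setBusy] = useState(false);

  const ask = async () => {
    setBusy(true);
    try {
      // Inference is non-blocking, so the UI stays responsive while the model runs.
      setAnswer(await chatWithLLM('hello !'));
    } finally {
      setBusy(false);
    }
  };

  return (
    <View>
      <Button title={busy ? 'Generating...' : 'Ask'} onPress={ask} disabled={busy} />
      <Text>{answer}</Text>
    </View>
  );
}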
[Optional] Override model options
import { useEffect } from 'react';
import { setModelOptions } from 'react-native-local-gen-ai';

/* The default model path is
   /data/local/tmp/llm/gemma-2b-it-cpu-int4.bin.
   For other model variants, update modelPath
   before invoking chatWithLLM. */
useEffect(() => {
  setModelOptions({
    modelPath: '/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin',
    randomSeed: 0,    // Default 0
    topK: 30,         // Default 40
    temperature: 0.7, // Default 0.8
    maxTokens: 1000,  // Default 512
  });
}, []);
For GPU models, add an entry to the application manifest file (android/app/src/main/AndroidManifest.xml) to use OpenCL.
<application>
  <!-- Add this for GPU inference -->
  <uses-library
    android:name="libOpenCL.so"
    android:required="false" />
</application>
Use Expo local app development instead of Expo Go. The rest of the steps remain the same.
More info on Expo local app development can be found here:
https://docs.expo.dev/guides/local-app-development/
npx expo run:android
https://github.com/nagaraj-real/expo-chat-llm
https://github.com/nagaraj-real/react-native-local-genai/tree/main/example
See the contributing guide to learn how to contribute to the repository and the development workflow.
MIT
Made with create-react-native-library
Uses Google MediaPipe under the hood.