Text Generation
Transformers
TensorBoard
Safetensors
English
retrain-pipelines
function-calling
LLM Agent
code
unsloth
conversational
Eval Results (legacy)
Instructions to use retrain-pipelines/function_caller_lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use retrain-pipelines/function_caller_lora with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="retrain-pipelines/function_caller_lora")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("retrain-pipelines/function_caller_lora", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use retrain-pipelines/function_caller_lora with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "retrain-pipelines/function_caller_lora"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "retrain-pipelines/function_caller_lora",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
docker model run hf.co/retrain-pipelines/function_caller_lora
- SGLang
How to use retrain-pipelines/function_caller_lora with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "retrain-pipelines/function_caller_lora" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "retrain-pipelines/function_caller_lora",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "retrain-pipelines/function_caller_lora" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "retrain-pipelines/function_caller_lora",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Unsloth Studio
How to use retrain-pipelines/function_caller_lora with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
    curl -fsSL https://unsloth.ai/install.sh | sh

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for retrain-pipelines/function_caller_lora to start chatting
Install Unsloth Studio (Windows)
    irm https://unsloth.ai/install.ps1 | iex

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for retrain-pipelines/function_caller_lora to start chatting
Using HuggingFace Spaces for Unsloth
    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for retrain-pipelines/function_caller_lora to start chatting
Load model with FastModel
    # Install Unsloth from pip (shell):
    pip install unsloth

    # Load the model (Python):
    from unsloth import FastModel

    model, tokenizer = FastModel.from_pretrained(
        model_name="retrain-pipelines/function_caller_lora",
        max_seq_length=2048,
    )
- Docker Model Runner
How to use retrain-pipelines/function_caller_lora with Docker Model Runner:
docker model run hf.co/retrain-pipelines/function_caller_lora
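The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of any library, and the sketch assumes a server from one of the sections above is already running locally.

```python
import json
import urllib.request

MODEL_ID = "retrain-pipelines/function_caller_lora"


def build_chat_request(base_url, user_message, model=MODEL_ID):
    """Build (without sending) a POST request for the chat-completions route.

    Hypothetical helper: it mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response body.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs a live server, so it is left commented out):
# req = build_chat_request("http://localhost:8000", "What is the capital of France?")
# print(send_chat_request(req))
```

Port 8000 matches the vLLM example and port 30000 the SGLang one, so only `base_url` should need to change between the two.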
source-code for model version v0.14_20250319_202714437_UTC - retrain-pipelines 0.1.1
c7ed025 verified
| absl-py==2.1.0 | |
| accelerate==1.1.1 | |
| aiohappyeyeballs==2.4.3 | |
| aiohttp==3.10.10 | |
| aiosignal==1.3.1 | |
| airportsdata==20241001 | |
| annotated-types==0.7.0 | |
| anyio==4.8.0 | |
| asttokens==2.4.1 | |
| async-timeout==4.0.3 | |
| attrs==24.2.0 | |
| bitsandbytes==0.44.1 | |
| boto3==1.35.58 | |
| botocore==1.35.58 | |
| certifi==2024.8.30 | |
| charset-normalizer==3.4.0 | |
| click==8.1.7 | |
| cloudpickle==3.1.0 | |
| colorama==0.4.6 | |
| comm==0.2.2 | |
| contourpy==1.3.1 | |
| cuda-python==12.6.2 | |
| cudf-polars-cu12==24.10.1 | |
| cycler==0.12.1 | |
| datasets==3.1.0 | |
| debugpy==1.8.8 | |
| decorator==5.1.1 | |
| dill==0.3.8 | |
| diskcache==5.6.3 | |
| docker==7.1.0 | |
| docker-pycreds==0.4.0 | |
| docstring_parser==0.16 | |
| exceptiongroup==1.2.2 | |
| executing==2.1.0 | |
| fastapi==0.115.8 | |
| fastjsonschema==2.20.0 | |
| filelock==3.16.1 | |
| fonttools==4.54.1 | |
| frozenlist==1.5.0 | |
| fsspec==2024.9.0 | |
| gitdb==4.0.11 | |
| GitPython==3.1.43 | |
| graphviz==0.20.3 | |
| grpcio==1.68.1 | |
| h11==0.14.0 | |
| hf_transfer==0.1.8 | |
| httptools==0.6.4 | |
| huggingface-hub==0.27.1 | |
| idna==3.10 | |
| iniconfig==2.0.0 | |
| interegular==0.3.3 | |
| ipykernel==6.29.5 | |
| ipython==8.29.0 | |
| ipywidgets==8.1.5 | |
| jedi==0.19.2 | |
| Jinja2==3.1.4 | |
| jmespath==1.0.1 | |
| joblib==1.4.2 | |
| jsonschema==4.23.0 | |
| jsonschema-specifications==2024.10.1 | |
| jupyter_client==8.6.3 | |
| jupyter_core==5.7.2 | |
| jupyterlab_widgets==3.0.13 | |
| kiwisolver==1.4.7 | |
| lark==1.2.2 | |
| libcudf-cu12==24.10.1 | |
| litserve==0.2.6 | |
| llvmlite==0.43.0 | |
| lxml==5.3.0 | |
| Markdown==3.7 | |
| markdown-it-py==3.0.0 | |
| MarkupSafe==3.0.2 | |
| matplotlib==3.9.2 | |
| matplotlib-inline==0.1.7 | |
| mdurl==0.1.2 | |
| metaflow==2.10.0 | |
| metaflow-card-html==1.0.2 | |
| mpmath==1.3.0 | |
| multidict==6.1.0 | |
| multiprocess==0.70.16 | |
| nbformat==5.10.4 | |
| nest-asyncio==1.6.0 | |
| networkx==3.2.1 | |
| numba==0.60.0 | |
| numpy==1.26.4 | |
| nvidia-cublas-cu11==11.11.3.6 | |
| nvidia-cublas-cu12==12.4.5.8 | |
| nvidia-cuda-cupti-cu11==11.8.87 | |
| nvidia-cuda-cupti-cu12==12.4.127 | |
| nvidia-cuda-nvrtc-cu11==11.8.89 | |
| nvidia-cuda-nvrtc-cu12==12.4.127 | |
| nvidia-cuda-runtime-cu11==11.8.89 | |
| nvidia-cuda-runtime-cu12==12.4.127 | |
| nvidia-cudnn-cu11==9.1.0.70 | |
| nvidia-cudnn-cu12==9.1.0.70 | |
| nvidia-cufft-cu11==10.9.0.58 | |
| nvidia-cufft-cu12==11.2.1.3 | |
| nvidia-curand-cu11==10.3.0.86 | |
| nvidia-curand-cu12==10.3.5.147 | |
| nvidia-cusolver-cu11==11.4.1.48 | |
| nvidia-cusolver-cu12==11.6.1.9 | |
| nvidia-cusparse-cu11==11.7.5.86 | |
| nvidia-cusparse-cu12==12.3.1.170 | |
| nvidia-nccl-cu11==2.21.5 | |
| nvidia-nccl-cu12==2.21.5 | |
| nvidia-nvjitlink-cu12==12.4.127 | |
| nvidia-nvtx-cu11==11.8.86 | |
| nvidia-nvtx-cu12==12.4.127 | |
| nvtx==0.2.10 | |
| outlines==0.1.3 | |
| outlines_core==0.1.14 | |
| packaging==24.2 | |
| pandas==2.2.3 | |
| parso==0.8.4 | |
| peft==0.13.2 | |
| pexpect==4.9.0 | |
| pillow==11.0.0 | |
| platformdirs==4.3.6 | |
| plotly==5.24.0 | |
| pluggy==1.5.0 | |
| polars==1.8.2 | |
| prompt_toolkit==3.0.48 | |
| propcache==0.2.0 | |
| protobuf==3.20.3 | |
| psutil==6.1.0 | |
| ptyprocess==0.7.0 | |
| pure_eval==0.2.3 | |
| pyarrow==17.0.0 | |
| pycountry==24.6.1 | |
| pydantic==2.9.2 | |
| pydantic_core==2.23.4 | |
| pydot==1.4.2 | |
| Pygments==2.18.0 | |
| pylibcudf-cu12==24.10.1 | |
| pyparsing==3.2.0 | |
| pytest==8.3.3 | |
| python-dateutil==2.9.0.post0 | |
| python-dotenv==1.0.1 | |
| python-multipart==0.0.20 | |
| pytz==2024.2 | |
| PyYAML==6.0.2 | |
| pyzmq==26.2.0 | |
| referencing==0.35.1 | |
| regex==2024.11.6 | |
| requests==2.32.3 | |
| -e git+https://github.com/aurelienmorgan/retrain-pipelines.git@9bbdca8a19b421b90a2640a250d8549680898f9b#egg=retrain_pipelines&subdirectory=pkg_src | |
| rich==13.9.4 | |
| rmm-cu12==24.10.0 | |
| rpds-py==0.21.0 | |
| s3transfer==0.10.3 | |
| safetensors==0.4.5 | |
| scikit-learn==1.5.1 | |
| scipy==1.14.1 | |
| sentencepiece==0.2.0 | |
| sentry-sdk==2.19.0 | |
| setproctitle==1.3.4 | |
| shtab==1.7.1 | |
| six==1.16.0 | |
| smmap==5.0.1 | |
| sniffio==1.3.1 | |
| stack-data==0.6.3 | |
| starlette==0.45.3 | |
| sympy==1.13.1 | |
| tenacity==9.0.0 | |
| tensorboard==2.18.0 | |
| tensorboard-data-server==0.7.2 | |
| threadpoolctl==3.5.0 | |
| tokenizers==0.20.3 | |
| tomli==2.1.0 | |
| torch==2.5.0 | |
| tornado==6.4.1 | |
| tqdm==4.67.0 | |
| traitlets==5.14.3 | |
| transformers==4.46.2 | |
| triton==3.1.0 | |
| trl==0.12.0 | |
| typing_extensions==4.12.2 | |
| tyro==0.8.14 | |
| tzdata==2024.2 | |
| unsloth==2024.11.5 | |
| unsloth_zoo==2024.11.4 | |
| urllib3==2.2.3 | |
| uvicorn==0.34.0 | |
| uvloop==0.21.0 | |
| wandb==0.18.7 | |
| watchfiles==1.0.4 | |
| wcwidth==0.2.13 | |
| websockets==15.0 | |
| Werkzeug==3.1.3 | |
| widgetsnbextension==4.0.13 | |
| xformers==0.0.28.post2 | |
| xxhash==3.5.0 | |
| yarl==1.17.1 | |
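
To check a local environment against the pins listed above, the rows can be parsed into a name-to-version mapping and compared with what is installed. This is a minimal sketch, assuming the rows keep the `| name==version | |` shape shown here. `parse_pins` and `find_mismatches` are hypothetical helper names, and the editable `-e git+...` row is skipped on purpose because it carries no `==` pin.

```python
import re
from importlib import metadata

# Matches rows such as "| accelerate==1.1.1 | |" (or a bare "name==version")
# and captures the package name and the pinned version.
_PIN_ROW = re.compile(r"^\|?\s*([A-Za-z0-9_.\-]+)==([^\s|]+)\s*(?:\|.*)?$")


def parse_pins(rows):
    """Return {package_name: version} for every 'name==version' row."""
    pins = {}
    for row in rows:
        match = _PIN_ROW.match(row.strip())
        if match:  # rows without a '==' pin (e.g. '-e git+...') are skipped
            pins[match.group(1)] = match.group(2)
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling `find_mismatches(parse_pins(rows))` on the listing returns only the packages whose installed version differs from the pin, which can help when trying to reproduce the training environment.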