SharpInference.Cli
0.8.2-alpha.0.25
This is a prerelease version of SharpInference.Cli.
There is a newer prerelease version of this package available.
See the version list below for details.
dotnet tool install --global SharpInference.Cli --version 0.8.2-alpha.0.25
This package contains a .NET tool you can call from the shell/command line.
dotnet new tool-manifest
dotnet tool install --local SharpInference.Cli --version 0.8.2-alpha.0.25
#tool dotnet:?package=SharpInference.Cli&version=0.8.2-alpha.0.25&prerelease
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
nuke :add-package SharpInference.Cli --version 0.8.2-alpha.0.25
SharpInference.Cli
sharpi-cli — a command-line tool for LLM inference and image generation, powered by SharpInference. Reads GGUF models and runs transformer inference on CPU (AVX2/AVX-512 SIMD) or GPU (Vulkan / CUDA).
Install
dotnet tool install -g SharpInference.Cli
Or update:
dotnet tool update -g SharpInference.Cli
Usage
# Text generation (CPU)
sharpi-cli -m models/SmolLM2-1.7B-Instruct-Q4_K_M.gguf -p "Once upon a time" --temp 0.7
# All layers on GPU (Vulkan or CUDA, auto-selected)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf -p "Explain mmap" -g -1
# Interactive chat (omit -p to enter chat mode)
sharpi-cli -m models/Qwen3-8B-Q4_K_M.gguf
# Image generation (Z-Image-Turbo, requires CUDA)
sharpi-cli image \
-m models/z_image_turbo-Q5_K_M.gguf \
--vae models/z-image-turbo/vae \
--qwen-encoder models/Z-Image-AbliteratedV1.Q5_K_M.gguf \
--qwen-tokenizer models/z-image-turbo/tokenizer/tokenizer.json \
-p "a serene mountain lake at sunrise" -W 512 -H 512 --steps 4 -o out.png
Flag names are intentionally compatible with llama.cpp / llama-cli.
| Flag | Default | Description |
|---|---|---|
| -m, --model | auto-detect | Path to GGUF model file |
| -p, --prompt | (interactive) | Input prompt; omit to enter chat |
| -n, --n-predict | 512 | Maximum tokens to generate |
| --temp | 0.7 | Sampling temperature (0 = greedy) |
| --top-k | 40 | Top-k sampling |
| --top-p | 0.95 | Top-p nucleus sampling |
| --min-p | 0.05 | Min-p sampling |
| -g, --n-gpu-layers | 0 | Layers on GPU (0 = CPU only, -1 = all) |
| -c, --ctx-size | model default | Context / max sequence length |
| --tq | off | TurboQuant KV cache compression (3-bit, ~5× VRAM reduction) |
Run sharpi-cli --help for the full reference.
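The flags above can be combined in a single call. The wrapper below is a minimal sketch, not part of the tool: the function name `sharpi_greedy` and the `SHARPI_DRY_RUN` variable are hypothetical and exist only for illustration. It uses only flags listed in the table, and the dry-run branch prints the command instead of running it, so the composition can be checked before a model is downloaded.

```shell
# Hypothetical helper: greedy decoding, all layers on GPU, output capped at 256 tokens.
# SHARPI_DRY_RUN is an illustrative variable, not something sharpi-cli reads.
sharpi_greedy() {
  # $1 = path to a GGUF model, $2 = prompt text
  set -- -m "$1" -p "$2" -n 256 --temp 0 -g -1
  if [ -n "${SHARPI_DRY_RUN:-}" ]; then
    # Show the command instead of running it (quoting is not preserved in the printout).
    printf 'sharpi-cli %s\n' "$*"
  else
    sharpi-cli "$@"
  fi
}

SHARPI_DRY_RUN=1
sharpi_greedy models/Qwen3-8B-Q4_K_M.gguf "Explain mmap"
```

Unset `SHARPI_DRY_RUN` to actually run the tool. Setting `--temp 0` selects greedy decoding per the table, which is convenient when comparing CPU and GPU runs of the same prompt.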
Requirements
- .NET 10 runtime (the tool installs framework-dependent)
- x86-64 CPU with AVX2 support
- For GPU inference: Vulkan-capable GPU (any vendor) or NVIDIA GPU with CUDA 11.x / 12.x
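The first two requirements can be checked from a shell before installing. This is a Linux-only sketch that assumes CPU feature flags are exposed in `/proc/cpuinfo`; the helper name `has_avx2` is hypothetical. `dotnet --list-runtimes` is the standard .NET CLI command for listing installed runtimes.

```shell
# Linux-only sketch: /proc/cpuinfo lists CPU feature flags on x86-64.
has_avx2() {
  # $1 = cpuinfo file to inspect (defaults to /proc/cpuinfo)
  grep -qw avx2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_avx2; then
  echo "AVX2: yes"
else
  echo "AVX2: no (or /proc/cpuinfo is unavailable)"
fi

# List installed .NET runtimes; look for a Microsoft.NETCore.App 10.x entry.
if command -v dotnet >/dev/null 2>&1; then
  dotnet --list-runtimes
else
  echo "dotnet not found on PATH"
fi
```

On Windows or macOS the AVX2 check needs a different mechanism; only the `dotnet --list-runtimes` part carries over unchanged.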
License
MIT. Copyright (c) 2026 Pekka Heikura.
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.8.2-alpha.0.42 | 3 | 6/22/2026 |
| 0.8.2-alpha.0.41 | 22 | 6/22/2026 |
| 0.8.2-alpha.0.40 | 26 | 6/22/2026 |
| 0.8.2-alpha.0.39 | 26 | 6/22/2026 |
| 0.8.2-alpha.0.38 | 21 | 6/22/2026 |
| 0.8.2-alpha.0.37 | 28 | 6/22/2026 |
| 0.8.2-alpha.0.36 | 28 | 6/22/2026 |
| 0.8.2-alpha.0.35 | 26 | 6/22/2026 |
| 0.8.2-alpha.0.34 | 34 | 6/22/2026 |
| 0.8.2-alpha.0.33 | 38 | 6/22/2026 |
| 0.8.2-alpha.0.32 | 32 | 6/22/2026 |
| 0.8.2-alpha.0.31 | 34 | 6/22/2026 |
| 0.8.2-alpha.0.30 | 34 | 6/22/2026 |
| 0.8.2-alpha.0.29 | 44 | 6/22/2026 |
| 0.8.2-alpha.0.28 | 38 | 6/22/2026 |
| 0.8.2-alpha.0.27 | 47 | 6/22/2026 |
| 0.8.2-alpha.0.26 | 36 | 6/22/2026 |
| 0.8.2-alpha.0.25 | 39 | 6/22/2026 |
| 0.8.2-alpha.0.24 | 36 | 6/22/2026 |
| 0.8.1 | 50 | 6/19/2026 |