Build a production small language model in 90 seconds.
Describe what you want to classify. We build a 10 MB intent classifier with sub-millisecond p50 latency on CPU and a hosted API. Free tier available.
Upload labeled examples in a file with a text,intent header row · max 1 MB
Or start from a template
Model size: 10 MB (int8 quantized ONNX)
Latency: <1 ms (p50 on CPU)
Time to ship: 90 s (from prompt to API)
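
The upload hint above names a text,intent header and a 1 MB cap. As a rough sketch, a file could be checked locally before uploading. This assumes the file is UTF-8 CSV and that "1 MB" means 1,000,000 bytes; neither is stated on this page, and the function name check_training_csv is hypothetical rather than part of any published client.

```python
import csv
import io

# Assumption: "max 1 MB" is read as 10^6 bytes; the service may count differently.
MAX_BYTES = 1_000_000
# Header taken from the upload hint ("text,intent header").
EXPECTED_HEADER = ["text", "intent"]


def check_training_csv(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append(f"file is {len(data)} bytes, over the {MAX_BYTES}-byte limit")

    try:
        # utf-8-sig also strips a byte-order mark that spreadsheet exports often add.
        text = data.decode("utf-8-sig")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]

    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None or [h.strip() for h in header] != EXPECTED_HEADER:
        problems.append(f"header must be {','.join(EXPECTED_HEADER)}, got {header}")
        return problems

    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != 2 or not row[0].strip() or not row[1].strip():
            problems.append(f"row {row_no}: expected non-empty text and intent")
    return problems
```

Calling check_training_csv(open("intents.csv", "rb").read()) on a hypothetical intents.csv returns an empty list when the header, row shape, and size all pass, and one message per problem otherwise.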