model layer

free resources per host (to see where a new model can fit)

name	type	mode	conc	status	latency	last seen	endpoint	actions

job	model	status	enqueued	finished

time	model	backend	path	status	queued	latency	tokens (p/c)