Release notes
v1.2.2
- Fixes a bug for streaming requests that made optional parameters required if
stream_options: {"include_usage": true}
wasn't set
v1.2.0
- Add
arm64
support for thecontinuum-proxy
. Find information on how to use it in the Continuum-proxy section. - Token tracking is now automatically enabled for streaming requests by transparently setting
include_usage
in thestream_options
.
v1.1.0
- Increase peak performance by more than 40% through improved request scheduling
- Increase performance by about 6% through vLLM upgrade to
v0.6.1