Skip to main content
Version: Next

Release notes

v1.2.0

  • Add arm64 support for the continuum-proxy. Find information on how to use it in the Continuum-proxy section.
  • Token tracking is now automatically enabled for streaming requests by transparently setting include_usage in the stream_options.

v1.1.0

  • Increase peak performance by more than 40% through improved request scheduling
  • Increase performance by about 6% through vLLM upgrade to v0.6.1