Conversation

@aresnow1
Contributor

Support TensorRT-LLM backend.

  • Implement TRTModel with a generate method.
  • Expose launch_trt_model to the client.
  • Add documentation and an example.
@XprobeBot XprobeBot added this to the v0.6.3 milestone Nov 14, 2023
@aresnow1
Contributor Author

This PR needs a Python API for in-flight batching, and the TensorRT-LLM team says it will be implemented in an upcoming release.

@XprobeBot XprobeBot modified the milestones: v0.10.2, v0.10.3, v0.11.0 Apr 19, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1, v0.11.2 May 11, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.3, v0.11.4, v0.12.0, v0.12.1 May 31, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.1, v0.12.2 Jun 14, 2024
@XprobeBot XprobeBot modified the milestones: v0.12.2, v0.12.4, v0.13.0, v0.13.1 Jun 28, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.1, v0.13.2 Jul 12, 2024
@XprobeBot XprobeBot modified the milestones: v0.13.2, v0.13.4 Jul 26, 2024
@XprobeBot XprobeBot modified the milestones: v0.14, v0.15 Sep 3, 2024
@XprobeBot XprobeBot modified the milestones: v0.15, v0.16 Oct 30, 2024
@XprobeBot XprobeBot modified the milestones: v0.16, v1.x Nov 25, 2024
2 participants