A layer that computes logits from hidden_states.
 
  Bases: CustomOp
Process logits and apply logits processors from sampling metadata.
This layer does the following:

1. Gather logits from model hidden_states.
2. Scale logits if needed.
3. Apply logits processors (if any).
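The three steps can be sketched in plain Python for a single token position. This is an illustration under stated assumptions, not the vLLM implementation: `compute_logits` and its list-based arguments are hypothetical, the real layer works on tensors through `lm_head`, and the soft-cap formula (`soft_cap * tanh(logit / soft_cap)`) and its placement before scaling are assumptions about how the `soft_cap` parameter behaves.

```python
import math
from typing import Callable, List, Optional, Sequence

Logits = List[float]


def compute_logits(
    hidden_state: Sequence[float],
    lm_head_weight: Sequence[Sequence[float]],  # one row per vocabulary entry
    embedding_bias: Optional[Sequence[float]] = None,
    scale: float = 1.0,
    soft_cap: Optional[float] = None,
    processors: Sequence[Callable[[Logits], Logits]] = (),
) -> Logits:
    """Hypothetical single-position sketch of the layer's three steps."""
    # Step 1: project the hidden state onto the vocabulary (hidden @ W^T + bias).
    logits = [
        sum(w * h for w, h in zip(row, hidden_state)) for row in lm_head_weight
    ]
    if embedding_bias is not None:
        logits = [x + b for x, b in zip(logits, embedding_bias)]

    # Assumed soft-cap form: squash each logit into (-soft_cap, soft_cap).
    if soft_cap is not None:
        logits = [soft_cap * math.tanh(x / soft_cap) for x in logits]

    # Step 2: scale the logits only when a non-default factor is set.
    if scale != 1.0:
        logits = [x * scale for x in logits]

    # Step 3: apply each logits processor in order.
    for processor in processors:
        logits = processor(logits)
    return logits
```

For example, a processor such as `lambda l: [float("-inf")] + l[1:]` would ban token 0 after scaling, since processors run last in this sketch.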
Source code in vllm/model_executor/layers/logits_processor.py
  
 __init__(
    vocab_size: int,
    org_vocab_size: int | None = None,
    scale: float = 1.0,
    logits_as_input: bool = False,
    soft_cap: float | None = None,
) -> None
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| scale | float | A scaling factor to apply to the logits. | 1.0 | 
  
  Gather/all-gather the logits tensor across the model parallel group.
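The difference between a gather and an all-gather can be simulated without any distributed runtime. In the sketch below, `gather_logits`, `shards`, `rank`, `use_all_gather`, and `dst` are all hypothetical names, and each "rank" is just an index into a list of vocabulary shards. It assumes that, under a plain gather, only the destination rank receives the full logits, which is one plausible reason the methods on this page are typed as returning `Tensor | None`.

```python
from typing import List, Optional


def gather_logits(
    shards: List[List[float]],  # shards[i] = vocabulary slice held by rank i
    rank: int,                  # the rank whose view we are simulating
    use_all_gather: bool,
    dst: int = 0,               # destination rank for a plain gather
) -> Optional[List[float]]:
    """Simulate what one rank sees after gathering vocab-sharded logits."""
    # Concatenate the per-rank slices along the vocabulary dimension.
    full = [x for shard in shards for x in shard]
    if use_all_gather:
        # All-gather: every rank ends up with the full logits.
        return full
    # Gather: only the destination rank holds the result.
    return full if rank == dst else None
```

With `shards = [[0.1, 0.2], [0.3, 0.4]]`, rank 1 gets `None` under a plain gather and the full four-element list under an all-gather.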
  
 _get_logits(
    hidden_states: Tensor,
    lm_head: VocabParallelEmbedding,
    embedding_bias: Tensor | None,
) -> Tensor | None
  
 forward(
    lm_head: VocabParallelEmbedding,
    hidden_states: Tensor,
    embedding_bias: Tensor | None = None,
) -> Tensor | None