Modules:
| Name | Description | 
|---|---|
| activation | Custom activation functions. | 
| attention_layer_base | Base class for attention-like layers. | 
| batch_invariant |  | 
| fla |  | 
| fused_moe |  | 
| layernorm | Custom normalization layers. | 
| lightning_attn |  | 
| linear |  | 
| logits_processor | A layer that computes logits from hidden_states. | 
| mamba |  | 
| mla |  | 
| pooler |  | 
| quantization |  | 
| resampler | Shared resampler perceiver network used in multimodal models. | 
| rotary_embedding | Rotary Positional Embeddings. | 
| utils | Utility methods for model layers. | 
| vocab_parallel_embedding |  |