Jamba-v0.1 - AI Language Models Tool
Overview
Jamba-v0.1 is a hybrid SSM-Transformer large language model from AI21 Labs. It is a pretrained, mixture-of-experts (MoE) generative text model with 12B active parameters and 52B total parameters across its experts, and it supports a 256K-token context length. Designed for high throughput, Jamba-v0.1 is intended as a base model for fine-tuning into chat or instruction-following variants and for research into MoE and long-context use cases.
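As a quick orientation, the sketch below shows one plausible way to load the base model for plain text generation through the Hugging Face transformers API; the exact transformers version, optional kernels (e.g. mamba-ssm, flash-attn), and the GPU memory required should be checked against the model card rather than taken from this example.

```python
# Minimal generation sketch (assumes a transformers release with Jamba support
# and enough GPU memory to hold all 52B parameters).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's dtype
    device_map="auto",    # shard across available GPUs via accelerate
)

prompt = "In the recent Super Bowl LVIII,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```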
Key Features
- Hybrid SSM-Transformer architecture
- Mixture-of-experts: 12B active, 52B total across experts
- Supports 256K context length
- Pretrained generative text model
- Optimized for high throughput
- Suitable base for chat/instruct fine-tuning
Ideal Use Cases
- Fine-tune into instruction-following chat models
- Long-context document understanding and summarization
- High-throughput batch text generation (see the sketch after this list)
- Research on MoE and SSM-Transformer architectures
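For the high-throughput batch generation use case, one generic pattern is left-padded batched decoding, sketched below with hypothetical prompts; this is a plain transformers illustration rather than an AI21-provided recipe, and real throughput depends on the serving stack, batching strategy, and quantization choices.

```python
# Hypothetical batched-generation sketch: left padding lets a decoder-only
# model generate for several prompts in one forward pass.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as padding if no pad token is defined

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompts = [
    "Summarize the following contract clause:",
    "Write a one-sentence abstract for this report:",
    "List three key risks mentioned in this filing:",
]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
outputs = model.generate(**batch, max_new_tokens=64)
for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(text)
```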
Getting Started
- Open the model page on Hugging Face: https://huggingface.co/ai21labs/Jamba-v0.1
- Review the model card and available documentation
- Request access or download weights if provided
- Prepare datasets for fine-tuning on target tasks (a minimal LoRA sketch follows this list)
- Evaluate performance on representative workloads
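For the fine-tuning step, the sketch below outlines a LoRA-style run with the peft and trl libraries; the placeholder dataset, target modules, and hyperparameters are illustrative assumptions, and argument names differ across trl versions, so the model card and library documentation are the authoritative reference.

```python
# Hypothetical LoRA fine-tuning sketch (peft + trl); the dataset and
# hyperparameters are placeholders, and target_modules should be verified
# against the checkpoint's actual layer names.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

dataset = load_dataset("Abirate/english_quotes", split="train")  # placeholder dataset

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["embed_tokens", "x_proj", "in_proj", "out_proj"],  # verify against the model
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=lora_config,
    dataset_text_field="quote",   # column holding the training text
    max_seq_length=512,
    args=TrainingArguments(
        output_dir="./jamba-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        logging_steps=10,
    ),
)
trainer.train()
```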
Pricing
No pricing is listed on the model page; the weights and license terms are published there directly. For hosted or commercial access and pricing, check the model page or contact AI21 Labs.
Limitations
- Released as a pretrained base model, not a chat/instruct variant; it requires fine-tuning before it is suited to instruction-following or conversational use
- Although only 12B parameters are active per token, all 52B parameters must be held in memory; at bfloat16 precision that is roughly 104 GB of weights before activations and cache
Key Information
- Category: Language Models
- Type: AI Language Models Tool