FAST: Efficient Action Tokenization for Vision-Language-Action Models - AI Robotics Tool
Overview
FAST is an action tokenization method for robot action sequences; FAST+ is its pre-trained, universal tokenizer. It maps continuous action sequences into dense, discrete tokens. The repository provides the pre-trained FAST+ tokenizer and tooling to train custom tokenizers for autoregressive vision-language-action model training.
Key Features
- Maps robot action sequences to dense, discrete tokens
- Pre-trained FAST+ tokenizer available for immediate use
- Tools and scripts to train custom tokenizers on user data
- Designed for autoregressive vision-language-action model training
- Produces compact token sequences suitable for sequence modeling
Ideal Use Cases
- Tokenizing robot action datasets for model training
- Training autoregressive vision-language-action models
- Fine-tuning tokenizers on custom robot behavior data
- Compressing action sequences for sequence modeling and evaluation
Getting Started
- Visit the Hugging Face repository: https://huggingface.co/physical-intelligence/fast
- Load the provided pre-trained FAST+ tokenizer into your pipeline
- Prepare robot action sequences in the repository's expected format
- Run the supplied training scripts to train a custom tokenizer
- Convert tokenized sequences into inputs for autoregressive model training
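The preparation and loading steps above can be sketched as follows. This is a minimal illustration, not the repository's own code. It assumes the tokenizer takes action chunks shaped (batch, timesteps, action_dim) with values scaled to roughly [-1, 1], and that it loads through `AutoProcessor` with `trust_remote_code=True`, as the model card described at the time of writing. The helpers `make_action_chunks`, `quantile_normalize`, and `load_fast_tokenizer` are hypothetical names introduced here, and the 7-dimensional random trajectory is dummy data.

```python
import numpy as np


def make_action_chunks(trajectory: np.ndarray, horizon: int) -> np.ndarray:
    """Split a (T, action_dim) trajectory into non-overlapping chunks.

    Returns an array of shape (num_chunks, horizon, action_dim). Trailing
    timesteps that do not fill a whole chunk are dropped.
    """
    num_chunks = trajectory.shape[0] // horizon
    trimmed = trajectory[: num_chunks * horizon]
    return trimmed.reshape(num_chunks, horizon, trajectory.shape[1])


def quantile_normalize(chunks: np.ndarray, low_q: float = 0.01,
                       high_q: float = 0.99) -> np.ndarray:
    """Scale each action dimension so its low/high quantiles map to -1/+1.

    The 1%/99% quantiles are an assumption; values outside them are clipped.
    """
    flat = chunks.reshape(-1, chunks.shape[-1])
    lo = np.quantile(flat, low_q, axis=0)
    hi = np.quantile(flat, high_q, axis=0)
    scale = np.maximum(hi - lo, 1e-8)  # guard against constant dimensions
    normalized = 2.0 * (chunks - lo) / scale - 1.0
    return np.clip(normalized, -1.0, 1.0)


def load_fast_tokenizer():
    """Load the pre-trained tokenizer from the Hugging Face hub.

    Needs the `transformers` package and network access, so it is not
    called below. The call pattern is an assumption based on the model
    card; check the repository for the current usage.
    """
    from transformers import AutoProcessor  # imported lazily on purpose
    return AutoProcessor.from_pretrained(
        "physical-intelligence/fast", trust_remote_code=True
    )


# Dummy data: a hypothetical 7-dimensional trajectory of 1030 timesteps.
rng = np.random.default_rng(0)
trajectory = rng.normal(size=(1030, 7))

chunks = quantile_normalize(make_action_chunks(trajectory, horizon=50))
print(chunks.shape)  # (20, 50, 7)
```

With the tokenizer loaded, the model card's pattern is to call it on the prepared array to obtain token ids and to call its `decode` method to recover approximate actions. Treat both as assumptions to verify against the repository before relying on them.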
Pricing
No pricing is listed; the repository is available on Hugging Face.
Key Information
- Category: Robotics
- Type: AI Robotics Tool