AI Tools Collection

A curated collection of AI tools, gathered by AI agents.

Developer Tools

69 tools

Replicate

A platform that enables users to run and deploy custom AI models via API, streamlining model creation and scaling.
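
A minimal sketch of running a hosted model through the official replicate Python client; the model slug and prompt are illustrative, and the client expects a REPLICATE_API_TOKEN in the environment.

```python
# Assumes `pip install replicate` and REPLICATE_API_TOKEN set in the environment.
import replicate

# Illustrative model slug; any public model on Replicate can be substituted.
output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor fox in a winter forest"},
)
print(output)  # typically a list of generated file outputs / URLs
```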

Hugging Face

A robust AI platform where the machine learning community collaborates on models, datasets, and applications.

Onboard AI

An AI tool that analyzes GitHub repositories to rapidly provide insights about repository functionality, code locations, and potential modifications using GPT-driven chat.

AI Playground

An open-source AI PC starter application for image creation, stylizing, and a chatbot, designed for systems powered by Intel® Arc™ GPU. It supports various generative AI libraries including Stable Diffusion and Llama models.

Hugging Face Accelerate

A simple way to launch, train, and use PyTorch models on almost any device with support for distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP/DeepSpeed.
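
A toy training-loop sketch showing how Accelerate typically wraps existing PyTorch objects; the tiny model and random dataset are stand-ins just to keep the example self-contained.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Toy data and model so the loop runs end to end.
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
dataloader = DataLoader(dataset, batch_size=8)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

accelerator = Accelerator()  # picks device, distributed setup, and precision
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for x, y in dataloader:
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```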

Diffusers

A library implementing state-of-the-art diffusion models for image, video, and audio generation, supporting both PyTorch and FLAX frameworks.
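
A short text-to-image sketch with Diffusers; the SDXL checkpoint is only an example, and the snippet assumes a CUDA GPU is available.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.to("cuda")  # assumes a CUDA-capable GPU

image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```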

GitHub Copilot

An AI pair programming assistant that integrates with code editors to provide contextual autocompletion, code suggestions, and debugging help.

Hugging Face Spaces

A platform offering a variety of AI applications and tools across multiple domains such as image generation, text generation, speech synthesis, and more. Users can explore, create, and run various AI models and applications hosted within this directory.

Bolt.new

An AI-powered full-stack web development agent that operates entirely in the browser, allowing users to prompt, run, edit, and deploy applications with minimal local setup.

Crawl4AI

An open-source, LLM-friendly web crawler and scraper built for real-time performance, designed to extract and structure web data for AI applications.

Tabby

A self-hosted AI coding assistant designed to integrate with VSCode, offering chat-based code completions and an enhanced in-editor experience.

OpenVINO

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference across various platforms. It supports models trained with popular frameworks and enhances performance for deep learning tasks in computer vision, automatic speech recognition, and natural language processing.

Hugging Face Hub

The official Python client for the Hugging Face Hub, allowing users to interact with pre-trained models and datasets, manage repositories, and run inference on deployed models.
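
A brief sketch of common huggingface_hub calls; the repo IDs and task filter are illustrative.

```python
from huggingface_hub import HfApi, hf_hub_download

api = HfApi()

# List a few models for a given task.
for model in api.list_models(filter="text-classification", limit=5):
    print(model.id)

# Download a single file from a model repo into the local cache.
path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(path)
```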

bolt.diy

An open-source AI coding assistant that allows you to prompt, run, edit, and deploy full-stack web applications using various LLMs, with support for multiple model providers.

LangChain

A comprehensive framework for building context-aware applications powered by large language models, featuring standard interfaces for models, embeddings, and vector stores.
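
An illustrative chain built from LangChain's standard prompt and chat-model interfaces; the package split (langchain_openai, langchain_core) follows recent releases, and the model name plus OPENAI_API_KEY handling are assumptions.

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")  # reads OPENAI_API_KEY from the environment

chain = prompt | llm  # compose prompt -> model into a runnable chain
reply = chain.invoke(
    {"text": "LangChain standardizes interfaces for models, embeddings, and vector stores."}
)
print(reply.content)
```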

Hugging Face Transformers

A comprehensive library of pretrained models for text, vision, audio, video, and multimodal tasks, enabling fine-tuning and inference across many generative AI use cases.
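
A one-call inference sketch with the Transformers pipeline API; the checkpoint is just an example of a small sentiment model.

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("This catalog entry was easy to follow."))
```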

AI App Generator

Generate complete Next.js 14 apps from your AI idea with features like live sandbox testing, instant API setup, and full code ownership. Build, modify, and launch AI apps quickly and for free.

ReactAI

A free, open-source AI React component builder that generates functional React components quickly without requiring an API key, offering unlimited usage for developers, startups, and teams.

Exo

A tool to run your own AI cluster at home by partitioning models optimally across everyday devices, enabling distributed AI computation.

lucataco/ai-toolkit

A Cog implementation of ostris/ai-toolkit designed for training LoRA models (specifically for FLUX.1-dev) using a custom image dataset. Note that it is marked as deprecated in favor of ostris/flux-dev-lora-trainer.

Unsloth AI

Unsloth AI is an enterprise platform that accelerates fine-tuning of large language and vision models through innovative quantization techniques, delivering up to 2.2x faster training while using significantly less VRAM. The organization also publishes open-source tools and models and integrates with Hugging Face.

AI Dev Gallery

An open-source project by Microsoft for Windows developers to integrate AI capabilities into apps using local models and APIs. The tool includes over 25 interactive samples, source code in C#, and supports loading models from platforms like Hugging Face and GitHub.

Inference Endpoints by Hugging Face

A fully managed inference deployment service that allows users to easily deploy models (such as Transformers and Diffusers) from the Hugging Face Hub on secure, compliant, and scalable infrastructure. It offers pay-as-you-go pricing and supports a variety of tasks including text generation, speech recognition, image generation, and more.
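
A sketch of querying a deployed endpoint with huggingface_hub's InferenceClient; the endpoint URL and token are placeholders for your own deployment.

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="https://your-endpoint.endpoints.huggingface.cloud",  # placeholder URL
    token="hf_xxx",  # placeholder access token
)
print(client.text_generation("Write a haiku about GPUs.", max_new_tokens=50))
```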

AutoTrain

Hugging Face AutoTrain is an automated machine learning (AutoML) tool that allows users to train, evaluate, and deploy state-of-the-art ML models without writing code. It supports a range of tasks including text classification, image classification, token classification, summarization, question answering, translation, tabular data tasks, and LLM finetuning, with seamless integration into the Hugging Face ecosystem.

OpenAdapt

OpenAdapt is an open source Python library that enables AI-first process automation by interfacing large multimodal models (LMMs) with traditional desktop and web GUIs. It records user interactions (screenshots and inputs), tokenizes the recorded data, and uses transformer model completions to generate synthetic inputs, automating repetitive GUI workflows in a model-agnostic manner.

Continue

An open source platform for creating, sharing, and using custom AI code assistants integrated with IDE extensions, designed to enhance developer productivity.

ostris/ai-toolkit

An open‐source toolkit that provides various AI scripts centered around Stable Diffusion and model training. It includes a web UI for starting, stopping, and monitoring jobs, as well as support for training models such as FLUX.1-dev. The repository is implemented in Python (with requirements like PyTorch) and Node.js (for the UI), making it a valuable resource for developers working on AI model training and deployment.

Open-r1

A fully open reproduction of DeepSeek-R1 that supports training with reasoning traces and scales across multiple nodes using TRL’s vLLM backend.

bolt.diy

An open-source tool that lets developers prompt, run, edit, and deploy full-stack web applications using any large language model of their choice. It supports multiple providers like OpenAI, Anthropic, Ollama, and more, and is extendable via the Vercel AI SDK.

AI SDK

AI SDK is a free, open-source TypeScript toolkit that helps developers build AI-powered applications and agents using frameworks such as Next.js, React, Svelte, and Vue, as well as Node.js runtime. It provides a unified API to interact with various model providers like OpenAI, Anthropic, and Google.

DeepScaleR

DeepScaleR is an open-source project that democratizes reinforcement learning (RL) for large language models (LLMs). The repository provides training scripts, model checkpoints, detailed hyperparameter configurations, datasets, and evaluation logs to reproduce and scale RL techniques on LLMs, aimed at reproducibility and research in advanced AI training.

AI Engineer Toolkit

A collection of resources and projects designed to enhance AI development, including prompt optimization, LangChain workflows, and integrations with popular AI frameworks.

DeepSeek-Coder-V2-Lite-Instruct

An open-source Mixture-of-Experts code language model that provides advanced code intelligence, enabling functionalities comparable to GPT-4-Turbo for coding tasks.

Kiln

Kiln is a rapid AI prototyping and dataset collaboration tool that enables zero-code fine-tuning of large language models, synthetic data generation, evaluations, and team collaboration. It offers intuitive desktop apps for Windows, macOS, and Linux, along with an open-source Python library for integrating and managing AI workflows.

RLAMA

RLAMA is a powerful AI-driven document question-answering tool that connects to local Ollama models. It allows users to create, manage, and interact with Retrieval-Augmented Generation (RAG) systems for processing and querying documents via a CLI and API server.

Probe

Probe is an AI-friendly, fully local semantic code search engine designed for large codebases. It combines fast text search with code-aware parsing to extract complete code blocks, serving as a key building block for next generation AI coding tools.

TRL

TRL is a comprehensive open-source library that enables post-training of transformer language models using reinforcement learning techniques such as Supervised Fine-Tuning (SFT), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO). It integrates with Hugging Face’s Transformers ecosystem and supports efficient scaling with tools like Accelerate and PEFT.
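
A condensed supervised fine-tuning (SFT) sketch with TRL; the model and dataset names mirror TRL's documented examples, and exact arguments can vary between TRL versions.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # example chat dataset

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # small base model keeps the sketch cheap
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-output", max_steps=10),
)
trainer.train()
```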

FluxGym

A dead simple web UI for training FLUX LoRA models with low VRAM support, built on Gradio UI (forked from AI-Toolkit) and powered by Kohya Scripts. It simplifies the fine-tuning of LoRA models on systems with limited VRAM (12GB/16GB/20GB).

AutoDev

AutoDev is an AI-powered coding assistant integrated in IntelliJ IDEA. It provides multilingual support, auto code generation, bug-slaying assistance, and customizable prompts along with features for auto development, testing, documentation, and agent functionalities.

SkyThought

SkyThought is an open-source toolkit that provides data curation, training (including reinforcement learning enhancements), and evaluation pipelines for cost-effective large language model training (Sky-T1 series). It offers scripts for building, training, and evaluating models such as Sky-T1-32B-Preview, making it a valuable resource for AI developers.

Unsloth

Unsloth is an open-source tool that enables developers to finetune various large language models (such as Llama 4, DeepSeek-R1, Gemma 3, and others) more efficiently. It offers free notebooks, reduced memory usage through dynamic quantization, and faster training performance, making it easier to deploy optimized models to platforms like GGUF, Ollama, vLLM, and Hugging Face.
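
A rough loading-and-LoRA sketch following Unsloth's published notebooks; the 4-bit checkpoint name and LoRA hyperparameters are placeholders.

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # example pre-quantized checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient finetuning.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```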

Lighteval

An all-in-one evaluation toolkit for large language models, offering multiple entry points (CPU, GPU, distributed settings) to benchmark model performance.

PyTorch Image Models

A comprehensive library offering implementations and optimizations for modern image models, including transformers and efficient CNNs, maintained by Hugging Face.

ComfyUI-nunchaku

A ComfyUI plugin that integrates Nunchaku—an efficient inference engine for 4-bit neural networks quantized with SVDQuant—into the ComfyUI workflow. It enables enhanced performance through features like multi-LoRA, ControlNet support, FP16 attention, and compatibility with modern GPUs.

OpenVINO Toolkit

An open‐source toolkit for optimizing and deploying AI inference on common platforms such as x86 CPUs and integrated Intel GPUs. It offers advanced model optimization features, quantization tools, pre-trained models, demos, and educational resources to simplify production deployment of AI models.

NVIDIA NeMo

A scalable generative AI framework that supports multiple domains including ASR, TTS, and large language models, built and maintained by NVIDIA.

AI Toolkit for Visual Studio Code

Formerly known as Windows AI Studio, this toolkit extends VS Code with support for a broad range of AI models, streamlining prompt generation, code completion, and AI model discovery.

Microsoft AI Extension Pack

A curated collection of Visual Studio Code extensions designed to accelerate building generative AI applications and agents. It bundles essential tools such as the AI Toolkit, AI Foundry Extension, GitHub Copilot (and its Azure variant), and Data Wrangler, providing integrated support for intelligent code assistance and deployment within VS Code.

Vercel AI Toolkit

A TypeScript toolkit that helps developers build AI-powered applications using popular frameworks like Next.js, React, and more, complete with templates and integrations.

MCP Calculator

Xiaozhi MCP sample program is an open-source project demonstrating the Model Context Protocol (MCP), which enables language models to invoke external tools such as calculators, email operations, knowledge search, and more. It features secure WebSocket communication, real-time streaming, automatic reconnection, and a simple interface for tool creation.

GitHub Models

GitHub Models is an AI toolbox integrated directly into GitHub that lets developers experiment with and compare multiple industry-leading AI models through a single API key. It offers features such as side-by-side evaluations, prompt management as first-class code assets, and a models playground to tweak parameters and evaluate outputs—all within the GitHub workflow.

bolt.diy

bolt.diy is an open-source tool that lets developers prompt, run, edit, and deploy full-stack web applications using any LLM of their choice. It supports multiple providers (including OpenAI, Anthropic, Ollama, and others) via the Vercel AI SDK and is built as a community-driven alternative to proprietary solutions.

gitprompt

An AI-powered CLI git assistant that automatically stages files and creates commits with GPT-4.1-generated messages. It analyzes code changes, groups files intelligently, and supports diff analysis, safety checks, and interactive confirmations, boosting developer productivity during version control.

Aider

Aider is an AI pair programming tool for the terminal that leverages large language models to assist with coding tasks. It maps your codebase, supports multiple programming languages, integrates with git, and offers features like voice-to-code, IDE integration, and the ability to work with both cloud and local LLMs.

Seed-Coder

Seed-Coder is a family of lightweight open‐source code language models (LLMs) that come in base, instruct, and reasoning variants (each around 8B parameters). Developed by ByteDance Seed, the models are designed to curate code training data automatically and enhance code generation and reasoning tasks.

DeepWiki

DeepWiki is an AI-powered documentation generator that automatically converts GitHub repositories into comprehensive, wiki-style documentation. It analyzes repository code, README files, and configuration details to produce structured overviews and interactive diagrams, and it provides a conversational AI assistant for querying codebase details.

nanoVLM

A repository offering a streamlined approach to deploy vision-language models, providing inference capabilities with minimal code.

Claude Auto-Commit

Claude Auto-Commit is an open-source AI-powered tool that analyzes code changes using the Claude Code SDK to generate contextual and meaningful Git commit messages. It supports multi-language commit formatting, automatic staging, and optional auto-push, integrating seamlessly into developers’ workflows via OAuth-based authentication (requiring a Claude Pro/Max subscription).

xemantic-ai-tool-schema

A GitHub repository that provides a standardized schema for describing AI tools. It defines the structure and metadata for AI tool information, aimed at developers who want to maintain consistency when documenting or sharing AI tool data.

AI Release Notes

An AI-powered GitHub App that automatically generates comprehensive release notes using commit history and pull request descriptions. It integrates seamlessly with GitHub workflows and leverages OpenAI's APIs to summarize new features, bug fixes, and other changes.

PandasAI

PandasAI is a Python platform that makes data analysis conversational by allowing users to interact with their databases or datalakes (e.g., SQL, CSV, parquet) using natural language queries powered by LLMs and Retrieval-Augmented Generation (RAG). It supports integration in Jupyter notebooks, Streamlit apps, or via a client-server architecture, serving both technical and non-technical users.

ostris/flux-dev-lora-trainer

A Replicate-hosted tool for fine-tuning the FLUX.1-dev model using the ai-toolkit with a LoRA approach. Users can initiate training jobs on Nvidia H100 GPUs to obtain custom-trained weights via an automated, cloud-based workflow.

DeepEval

DeepEval is an open-source evaluation toolkit for AI models that provides advanced metrics for both text and multimodal outputs. It supports features like multimodal G-Eval, conversational evaluation using a list of Turns, and integrates platform support along with comprehensive documentation.

Edge AI Sizing Tool

A benchmarking tool designed to showcase and evaluate the scalability and performance of AI use cases on Intel-based edge devices. It offers a zero-code configuration interface to select inputs, accelerators, performance modes, and AI models, while providing real-time monitoring of system metrics such as CPU/GPU usage, memory consumption, and inference speed.

PR-Agent

An AI-powered tool that automates pull request analysis by providing feedback, suggestions, and code review insights. It supports multiple platforms (GitHub, GitLab, Bitbucket, Azure DevOps) and can be integrated via GitHub Actions, CLI, and hosted solutions.

Kimi-Dev

Kimi-Dev is an open-source coding LLM (Kimi-Dev-72B) designed for software engineering tasks such as automated code repair and test case generation. It uses large-scale reinforcement learning to autonomously patch repositories, ensuring that full test suites pass before accepting changes. The tool is available for download and deployment via GitHub and Hugging Face.

Kilo Code

Kilo Code is an open-source VS Code AI agent that helps with planning, building, and fixing code. It leverages natural language to generate code, automates repetitive tasks (including terminal commands and browser automation), refactors code, and offers multi-mode operation (Architect, Coder, Debugger). It integrates features from existing tools like Roo Code and Cline.

Konveyor AI (Kai)

Kai is an AI-enabled tool designed to simplify the modernization of application source code to new platforms. It utilizes static code analysis and large language models guided by Konveyor’s historical migration reports to generate targeted code transformation suggestions, continuously learning from past migrations to improve future recommendations.

AI Models

105 tools

Recraft V3

A text-to-image model (code-named red_panda) that can render long passages of text directly within generated images, supporting both raster and vector output formats.

Janus-1.3B

A unified multimodal AI model that decouples visual encoding to support both understanding and generation tasks.

Qwen2.5-7B

Qwen2.5-7B is a large language model designed for text generation, featuring improvements in coding, mathematics, instruction following, long text generation, and multilingual support. It supports context lengths up to 128K tokens and is intended for sophisticated NLP tasks.

BGE-M3

BGE-M3 is a versatile embedding model from the Beijing Academy of Artificial Intelligence that supports dense retrieval, multi-vector retrieval, and sparse retrieval for text embeddings. It is designed to work in over 100 languages and can handle inputs ranging from short sentences to long documents of up to 8192 tokens.
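
A dense-embedding sketch that loads BGE-M3 through sentence-transformers; the model also exposes sparse and multi-vector retrieval through its own FlagEmbedding library, which this snippet does not cover.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")
embeddings = model.encode([
    "What is BGE-M3?",
    "BGE-M3 is a multilingual embedding model.",
])
print(embeddings.shape)  # (number of sentences, embedding dimension)
```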

DeepSeek-V3

A large-scale language model optimized for BF16 and FP8 inference modes with support for AMD and NVIDIA GPUs, featuring pipeline parallelism via vLLM.

Llama 3

Llama 3 is an open access large language model (LLM) released by Meta, available in various configurations (8B and 70B parameters) with capabilities for fine-tuning and integrations into platforms like Hugging Face, Google Cloud, and Amazon SageMaker.

FLUX.1-dev

A 12-billion parameter rectified flow transformer that generates images from text, available under a non-commercial license with API access for advanced image synthesis.

UNfilteredAI-1B

A large-scale text generation model designed for creative and unconstrained content production without traditional filtering.

OmniGen

OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts, simplifying the image generation process without the need for additional network modules or preprocessing steps. It supports various tasks such as text-to-image generation, identity-preserving generation, image editing, and more.

YOLOv10

YOLOv10 is a real-time end-to-end object detection tool that improves upon previous YOLO versions through NMS-free training and a comprehensive architectural design to enhance efficiency and accuracy. It offers state-of-the-art performance across various model sizes and is implemented in PyTorch.

BLIP-2

BLIP-2 is an advanced visual-language model that allows zero-shot image-to-text generation, enabling tasks such as image captioning and visual question answering using a combination of pretrained vision and language models.

DeepSeek-VL2

A series of advanced vision-language models designed for multimodal understanding, available in multiple sizes to suit varying complexity and performance requirements.

YOLOv5

YOLOv5 is a popular open-source AI tool aimed at object detection, image segmentation, and image classification, leveraging PyTorch for model building and deployment. It supports various deployment formats including ONNX, CoreML, and TFLite, and is well-documented for ease of use in research and practical applications.

FLUX1.1 [pro]

A new text-to-image AI model capable of generating images six times faster than its predecessor, with higher quality, better prompt adherence, and more diversity in outputs. It includes a prompt upsampling feature that utilizes a language model to enhance prompts for improved image generation.

Shuttle-3

Shuttle-3 is a state-of-the-art language model designed for high-quality text generation, particularly suited for complex chat, multilingual communication, and reasoning tasks. It is fine-tuned from the Qwen-2.5-72b-Instruct model and aims to emulate high-quality prose similar to the Claude 3 models.

WizardLM

WizardLM is a state-of-the-art large language model designed for complex chat, multilingual tasks, reasoning, and agent functionalities. It features an AI-powered pipeline (Auto Evol-Instruct) that optimizes instruction datasets for improved performance across various domains and leverages Arena Learning for an expanded learning pool of challenging instruction data.

Aria

A multimodal AI model that combines vision, language, and coding tasks, designed to deliver state-of-the-art performance across diverse tasks.

ToolACE-8B

ToolACE-8B is a finetuned LLaMA-3.1-8B-Instruct model designed for automatic tool usage and generating diverse tool-learning data, achieving state-of-the-art performance on the Berkeley Function-Calling Leaderboard. It features a novel self-evolution synthesis process and a dual-layer verification system for accurate data generation.

Dynamic Speculation

A novel method developed by Intel Labs and Hugging Face that accelerates text generation by up to 2.7x using dynamic speculation lookahead in language models, integrated into the Transformers library.

DeepSeek-V2

DeepSeek-V2 is a state-of-the-art Mixture-of-Experts (MoE) language model designed for economical training and efficient inference, boasting 236B total parameters with excellent performance across various benchmarks and exceptional capabilities in text generation and conversational AI.

DeepSeek-Coder-V2

An open-source Mixture-of-Experts code language model that enhances code generation and reasoning capabilities for programming tasks. It supports an extended 128K context window and a wide array of programming languages, making it competitive with closed-source models like GPT-4-Turbo.

openai/whisper-large-v3-turbo

A finetuned, pruned version of Whisper large-v3 for automatic speech recognition and speech translation. This model reduces the number of decoding layers from 32 to 4 to achieve much faster inference, with only a minor quality trade-off. It supports 99 languages and integrates with Hugging Face Transformers for efficient transcription and translation.
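
A transcription sketch using the Transformers ASR pipeline; the audio path is a placeholder for any local file or URL.

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3-turbo")
result = asr("sample_audio.wav", return_timestamps=True)
print(result["text"])
```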

watt-tool-70B

watt-tool-70B is a fine-tuned large language model based on LLaMa-3.3-70B-Instruct, optimized for advanced tool usage and multi-turn dialogue. It is designed for AI workflow building tasks, excelling in function calling and tool selection, and achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL).

NSFW-gen-v2

An updated AI model that generates uncensored NSFW content, offering users the ability to produce imaginative and unfiltered outputs.

Bielik-11B-v2

Bielik-11B-v2 is an 11-billion parameter generative text model trained on Polish text corpora. Initialized from Mistral-7B-v0.2 and fine-tuned using advanced parallelization techniques, it offers robust text generation capabilities in Polish and English, as evidenced by its performance on multiple NLP leaderboards.

Marco-o1

An open-source large reasoning language model designed for complex real-world problems, leveraging chain-of-thought fine-tuning, Monte Carlo Tree Search, and self-reflection mechanisms to expand solution spaces and improve open-ended reasoning.

JanusFlow-1.3B

JanusFlow-1.3B is a unified multimodal model by DeepSeek that integrates autoregressive language models with rectified flow, enabling both multimodal understanding and image generation.

Stable Diffusion 3.5 Medium

A Multimodal Diffusion Transformer text-to-image generative model by Stability AI that offers improved image quality, typography, complex prompt understanding, and resource efficiency. It supports local or programmatic use via diffusers, ComfyUI, and API endpoints.

Llama-3.1-Tulu-3-8B

An instruction-following language model from AllenAI based on Llama 3.1, optimized for a wide range of NLP tasks including chat, math, and reasoning. It provides various fine-tuned versions (SFT, DPO, RLVR) along with extensive benchmarking and deployment guidance on Hugging Face.

Ultralytics YOLOv8

A state‐of‐the‐art object detection model by Ultralytics that provides robust capabilities for object detection, instance segmentation, and pose estimation. It offers both CLI and Python integrations with extensive documentation and performance metrics.
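
A detection sketch with the ultralytics package; the weights download automatically on first use, and the image path is a placeholder.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # nano variant, smallest weights
results = model("street_scene.jpg")   # placeholder image path

for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)  # class id, confidence, bounding box
```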

Microsoft Phi-4

Microsoft Phi-4 is a state-of-the-art open language model (14B parameters, dense decoder-only transformer) trained on a blend of synthetic, public domain, and academic data. It has undergone rigorous supervised fine-tuning and direct preference optimization to improve instruction adherence, reasoning, and safety, making it suitable for research and generative AI applications.

Phi-3-mini-4k-instruct

A 3.8B parameter, lightweight instruction-tuned language model by Microsoft built on the Phi-3 datasets. It is designed for robust text generation, logical reasoning, and multi-turn conversation with support for both 4K and 128K token contexts.

DeepSeek-VL2-small

DeepSeek-VL2-small is a variant of the DeepSeek-VL2 series, advanced mixture-of-experts vision-language models designed for multimodal tasks such as visual question answering, optical character recognition, document/table/chart understanding, and visual grounding.

Ultimate SD Upscale with ControlNet Tile

An advanced image upscaling model leveraging Stable Diffusion 1.5 and ControlNet Tile to enhance image quality. Accessible via an API on Replicate and optimized to run with Nvidia A100 GPUs.

MicroThinker-3B-Preview

MicroThinker-3B-Preview is a fine-tuned language model developed by huihui-ai, built from the Llama-3.2-3B-Instruct-abliterated base model. It is specifically optimized for enhanced reasoning capabilities and text generation, using the FineQwQ-142k dataset. The model card includes detailed training instructions and support for inference (e.g., via ollama).

DeepSeek-V3

A large AI model for advanced search and multi-token prediction, supporting various inference modes and parallelism across GPUs.

DeepSeek-R1

An open-source reasoning language model, available with API access, trained for advanced chain-of-thought reasoning and released alongside efficiently distilled smaller variants.

Janus-Series

An open-source repository from deepseek-ai that offers a suite of unified multimodal models (including Janus, Janus-Pro, and JanusFlow) designed for both understanding and generation tasks. The models decouple visual encoding to improve flexibility and incorporate advanced techniques like rectified flow for enhanced text-to-image generation.

Anything V4.0

A latent diffusion-based text-to-image model optimized for generating high-quality, detailed anime-style images. It leverages the Stable Diffusion pipeline and supports danbooru tags for improved prompt specificity. Hosted on Hugging Face, it also integrates with Diffusers and Gradio for demonstration and inference.

Stable Diffusion

A high-resolution image synthesis model that enables users to generate images from textual descriptions, supporting creative and design applications.

Qwen

Qwen is the large language model family developed by Alibaba Cloud, showcased on Hugging Face. It includes advanced language and multimodal models (e.g., Qwen2.5, Qwen2.5-VL) that support text generation, image-text interactions, and long-context processing. The organization page also links to interactive demos like Qwen Chat, highlighting its practical application in AI chat and content generation.

Ideogram-V2

Ideogram-V2 is an advanced image generation model that excels in inpainting, prompt comprehension, and text rendering. It is designed to transform ideas into captivating designs, realistic images, innovative logos, and posters. The model is accessible via an API on Replicate and offers unique features for creative image editing.

Qwen/QwQ-32B-Preview

An experimental preview release large language model developed by the Qwen Team, featuring 32.5B parameters. It is designed to advance AI reasoning and text generation, supporting extended context lengths (up to 32,768 tokens) and built using transformer architectures with RoPE, SwiGLU, and RMSNorm. The model is geared towards research and demonstrates strong capabilities in math and coding, despite noted limitations in language consistency and common sense reasoning.

BLOOM

BLOOM is a multilingual large language model with 176 billion parameters developed by the BigScience project. It generates text in 46 natural languages and 13 programming languages, and is designed for research and deployment under a Responsible AI License. The release includes access to intermediary checkpoints, optimizer states, and is integrated into the Hugging Face ecosystem.

Mochi 1

Mochi 1 is an open state-of-the-art video generation model by Genmo, featuring a 10 billion parameter diffusion model built on the novel Asymmetric Diffusion Transformer (AsymmDiT) architecture. It generates high-quality videos with high-fidelity motion and strong prompt adherence and is available via an API on Replicate.

Stable Diffusion 2-1

A state-of-the-art text-to-image generation model developed by Stability AI, capable of producing high-resolution images from textual descriptions.

OpenAI GPT 1

OpenAI GPT 1 is the first transformer-based language model developed by OpenAI. It is a causal transformer pre-trained on a large corpus for language modeling and is available for inference through both PyTorch and TensorFlow. The model card provides comprehensive details including training methodology, risks, limitations, and usage guidelines.

GPT-2

GPT-2 is a pretrained generative transformer model by OpenAI, designed for text generation. It is trained using a causal language modeling objective on a large corpus of English text and is available on Hugging Face. The model card provides detailed usage examples, training procedure, limitations, and evaluation results.

OpenAI GPT-4o

OpenAI GPT-4o is an advanced multimodal AI model available via the Azure OpenAI Service. It integrates text, image, and audio processing to offer efficient and cost-effective performance, surpassing GPT-4 Turbo with Vision in speed, cost, and non-English language support. It is designed for enhanced customer service, advanced analytics, and content innovation.

Shuttle 3 Diffusion

Shuttle 3 Diffusion is a text-to-image diffusion model that generates detailed and diverse images from textual prompts in just 4 steps. It offers enhanced image quality, improved typography, and resource efficiency, and can be integrated via API, Diffusers, or ComfyUI.

Recraft V3 SVG

A text-to-image generation model that produces high-quality scalable vector graphics (SVG) images, including logos, icons, and custom branded designs. It features precise text integration and design control, setting it apart from traditional raster-based models.

OpenLLaMA

An open-source reproduction of Meta AI’s LLaMA large language model, offering 3B, 7B, and 13B parameter models trained on the RedPajama dataset with both PyTorch and JAX weights under the Apache-2.0 license.

Yi

The Yi series is a set of open-source large language models developed from scratch by 01-ai. Designed as bilingual models, they offer strong performance in language understanding, commonsense reasoning, and chat tasks. The repository includes documentation on usage, fine-tuning, quantization, and deployment.

DeepSeek-MoE

DeepSeek-MoE 16B is a Mixture-of-Experts (MoE) language model featuring 16.4B parameters. It employs fine-grained expert segmentation and shared experts isolation to achieve comparable performance to larger models with only around 40% of the typical computations. The repository includes both base and chat variants along with evaluation benchmarks and integration instructions via Hugging Face Transformers.

img2prompt

An AI model that extracts approximate text prompts from input images, optimized for stable diffusion using a modified CLIP Interrogator method. It enables users to generate descriptive prompts that can be used to recreate or modify images.

Allegro

Allegro is an advanced open-source text-to-video generation model by RhymesAI. It converts simple text prompts into high-quality, 6-second video clips at 15 FPS and 720p resolution using a combination of VideoVAE for video compression and a scalable Diffusion Transformer architecture.

Stable Diffusion 3.5 Large

Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer text-to-image generative model developed by Stability AI. It generates images from text prompts with enhanced image quality, typography, and resource-efficiency. The model supports integration with Diffusers, ComfyUI, and other programmatic interfaces, and is available under the Stability Community License.

xinsir/controlnet-union-sdxl-1.0

A ControlNet++ model for text-to-image generation and advanced image editing. Built on Stable Diffusion XL, it supports over 10 control conditions and advanced features such as tile deblurring, tile variation, super resolution, inpainting, and outpainting. The model is designed for high-resolution, multi-condition image generation and editing.

Mochi 1 Preview

Mochi 1 Preview is an open, state-of-the-art text-to-video generation model by Genmo that leverages a 10 billion parameter diffusion model with a novel Asymmetric Diffusion Transformer architecture. It generates high-fidelity videos from text prompts and is available under an Apache 2.0 license.

olmOCR-7B-0225-preview

A preview release of AllenAI's olmOCR model, fine-tuned from Qwen2-VL-7B-Instruct using the olmOCR-mix-0225 dataset. It is designed for document OCR and recognition, processing PDF images by extracting text and metadata. The model is intended to be used in conjunction with the olmOCR toolkit for efficient, large-scale document processing.

Perplexity R1-1776

A post-trained variant of the DeepSeek-R1 reasoning model by Perplexity AI, designed to remove censorship and deliver unbiased, accurate, and fact-based responses while maintaining robust reasoning skills.

DeepSeek-R1 Distill Qwen 14B GGUF

A quantized (GGUF) variant of the DeepSeek-R1 reasoning model distilled from Qwen 14B. This model supports a massive 128k context length and is tuned for reasoning and chain-of-thought tasks. It is provided by the lmstudio-community on Hugging Face, incorporating optimizations from llama.cpp.

ModernBERT Embed

ModernBERT Embed is an embedding model derived from ModernBERT-base designed for generating sentence embeddings. It supports tasks such as sentence similarity and search through both full (768-d) and truncated (256-d) embedding outputs. The page provides comprehensive usage examples using SentenceTransformers, Transformers, and Transformers.js, indicating its integration into various frameworks.

Janus-Pro-1B

Janus-Pro-1B is a unified multimodal model by DeepSeek that decouples visual encoding for multimodal understanding and generation. It supports both image input (via SigLIP-L) for understanding and image generation using a unified transformer architecture.

Grok 3

Grok 3 is xAI's flagship language model, introduced as an upgrade to Grok 2. It features enhanced computational power (10–15× more than its predecessor), advanced reasoning capabilities including a 'Big Brain Mode' for tackling complex multi-step problems, and a DeepSearch feature that scans and synthesizes information from the internet and social platforms. It also supports multimodal inputs and improved coding accuracy, positioning it as a strong competitor to models like GPT-4o and Gemini. The model is accessible via subscription plans integrated within X's ecosystem.

Stable Virtual Camera

A 1.3B diffusion model for novel view synthesis that generates 3D consistent novel views and videos from multiple input images and freely specified target camera trajectories. It is designed for research and creative non-commercial use.

EleutherAI/gpt-neox-20b

A 20-billion parameter autoregressive transformer language model developed by EleutherAI using the GPT-NeoX library. It is designed primarily for research purposes, with capabilities for further fine-tuning and adaptation, and provides detailed technical specifications and evaluation results.

Dolphin 3.0 R1 Mistral 24B

A next-generation instruct-tuned text generation model optimized for coding, math, reasoning, and agentic tasks. Built on the Mistral-24B base, it is fine-tuned with extensive reasoning traces to support function calling and steerable alignment, offering users local deployment control.

Falcon 3 Family

A family of open-source, decoder-only large language models under 10 billion parameters developed by Technology Innovation Institute (TII). The Falcon 3 models offer enhanced math, scientific, and coding capabilities through innovative pretraining techniques and are available in multiple variants including base and instruct configurations.

Hunyuan3D 2.0

A diffusion-based model for generating high-resolution textured 3D assets, featuring a two-stage pipeline with a shape generation component (Hunyuan3D-DiT) and a texture synthesis component (Hunyuan3D-Paint). It supports both image-to-3D and text-to-3D workflows, and includes a user-friendly production platform (Hunyuan3D-Studio) for mesh manipulation and animation.

DeepSeek-R1-Distill-Qwen-1.5B

A distilled dense language model based on Qwen2.5-Math-1.5B that leverages the DeepSeek-R1 pipeline. It is designed for advanced reasoning, math, and code generation tasks, and is available under an MIT license with extensive evaluation metrics and deployment instructions on Hugging Face.

smollm

A family of lightweight AI models including SmolLM2 for language tasks and SmolVLM for vision-language tasks, optimized for efficiency.

Reka Flash 3

A 21B general-purpose reasoning model trained from scratch, designed for multi-round conversational tasks with a focus on deep reasoning.

FuseChat-7B-VaRM

FuseChat-7B-VaRM is a chat language model developed by FuseAI that fuses knowledge from multiple chat LLMs (NH2-Mixtral-8x7B, NH2-Solar-10.7B, and OpenChat-3.5-7B) using a fuse-then-merge strategy. It aims to integrate the strengths of diverse models into a single, memory-efficient LLM, achieving competitive performance on benchmarks like MT-Bench.

Florence-2-large

An advanced vision foundation model by Microsoft designed for a wide range of vision and vision-language tasks such as captioning, object detection, OCR, and segmentation. It uses a prompt-based, sequence-to-sequence transformer architecture pretrained on the FLD-5B dataset and supports both zero-shot and finetuned settings.

DeepSeek-R1-Distill-Qwen-14B

An open-source, distilled large language model derived from DeepSeek-R1 that is built on Qwen2.5-14B. It utilizes reinforcement learning techniques to achieve enhanced reasoning, chain-of-thought generation, and state-of-the-art performance benchmarks.

Kernel/sd-nsfw

A Stable Diffusion v1-5 NSFW REALISM model variant hosted on Hugging Face. It is a diffusion-based text-to-image generation model fine-tuned for generating photo-realistic images, including NSFW content, and is intended for research purposes. It can be used with the Diffusers library and offers options for both direct inference and fine-tuning.

Anything V5

A text-to-image diffusion model from the Anything series designed for anime-style image generation. The model is available in multiple variants (e.g., V5-Prt) and is optimized for precise prompt-based outputs. It leverages Stable Diffusion pipelines and is hosted on Hugging Face with detailed versioning and usage instructions.

prunaai/hidream-l1-dev

An optimized version of the hidream-l1-dev model built with the Pruna AI optimization toolkit. This model runs on Nvidia A100 GPUs, is available via an API on Replicate, supports rapid predictions (around 15 seconds per run), and has been executed over 28.5K times.

Stable Diffusion v1.5

A latent diffusion-based text-to-image generation model that produces photorealistic images from text prompts. It builds upon the Stable Diffusion v1.2 weights and is fine-tuned for improved classifier-free guidance. It can be used via the Diffusers library, ComfyUI, and other interfaces.

Llama 4 Maverick & Scout

A new generation of large language models from Meta released on Hugging Face. Llama 4 includes two Mixture-of-Experts models – Maverick (~400B total with 17B active parameters and 128 experts) and Scout (~109B total with 17B active parameters and 16 experts). Both support native multimodal inputs (text and images), extremely long context lengths (up to 10M tokens in Scout), and are integrated with Hugging Face transformers and TGI for easy deployment.

VLM-R1

VLM-R1 is a stable and generalizable R1-style large Vision-Language Model designed for visual understanding tasks such as Referring Expression Comprehension (REC) and Out-of-Domain evaluation. The repository provides training scripts, multi-node and multi-image input support, and demonstrates state-of-the-art performance with RL-based fine-tuning approaches.

FinGPT

FinGPT is an open‐source repository that provides financial large language models along with training scripts, fine‐tuning techniques, and benchmark datasets. It is designed to efficiently adapt LLMs for financial applications, democratizing financial data and supporting research through released models and accompanying academic papers.

DeepSeek-V2-Lite

DeepSeek-V2-Lite is a Mixture-of-Experts language model designed for economical training and efficient inference. With 16B total parameters and 2.4B activated parameters, it employs innovative techniques such as Multi-head Latent Attention (MLA) and DeepSeekMoE for performance gains. The model is available for both text and chat completions via Hugging Face and is optimized to run with a 40GB GPU using BF16 precision.

Jamba-v0.1

Jamba-v0.1 is a state-of-the-art, hybrid SSM-Transformer large language model developed by AI21 Labs. It is a pretrained, mixture-of-experts generative text model with 12B active parameters (52B total across experts), supporting a 256K context length. Designed for high throughput, it serves as a strong base for fine-tuning into chat/instruct versions.

OpenAI GPT 4.1 API

OpenAI's flagship GPT-4.1 API is a high-performance large language model optimized for real-world applications. It supports up to 1M tokens of context, offers improved coding, advanced instruction following, enhanced formatting, and robust long-context comprehension, making it ideal for building intelligent agents, processing extensive documents, and handling complex workflows.

STILL-3-Tool-32B

A 32.8B parameter text-generation model that integrates Python code to enhance the reasoning process via tool manipulation. It achieves 81.70% accuracy on AIME 2024, matching o3-mini and outperforming o1 and DeepSeek-R1. The model is open-sourced on Hugging Face, and its design focuses on improving reasoning capabilities by leveraging integrated tool use.

Stable Diffusion XL Base 1.0

A diffusion-based text-to-image generative model developed by Stability AI. This model uses a latent diffusion approach with dual fixed text encoders, and can be used standalone or combined with a refinement model for enhanced high-resolution outputs. It supports both direct image generation and img2img workflows leveraging SDEdit.

HiDream-I1

An open-source image generative model with 17B parameters, delivering state-of-the-art image generation quality, accompanied by a dedicated Hugging Face Space for experimentation.

spaCy Models

A GitHub repository by explosion that distributes pre-trained model packages for the spaCy NLP library. The repository provides model releases in .whl and .tar.gz formats for various NLP tasks (e.g., tagging, parsing, lemmatization, and named entity recognition) along with versioning and compatibility guidelines.
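
A usage sketch once a model package is installed (for example, via python -m spacy download en_core_web_sm); the sentence is arbitrary.

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Hugging Face was founded in New York in 2016.")
print([(ent.text, ent.label_) for ent in doc.ents])  # named entities
```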

Llama4

Llama4 is a large autoregressive Mixture-of-Experts (MoE) multimodal model developed by Meta. It comes in two variants: Maverick (17B active parameters out of ~400B total with 128 experts) and Scout (17B active parameters out of ~109B total with 16 experts). The models support native multimodal inputs (text and images), long context lengths (up to 10 million tokens in some versions), and advanced quantization and offloading techniques for efficient deployment.

OpenAI GPT-4o API

GPT-4o is OpenAI's flagship multimodal model, supporting text, image, and audio inputs and outputs. It offers real-time responsiveness, a 128K-token context window via the API, and high performance across reasoning, math, and coding tasks, making it well suited to real-time voice assistants, interactive multimodal document Q&A, and advanced code generation.
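
A chat-completion sketch with the official openai Python SDK (v1-style client); it assumes OPENAI_API_KEY is set in the environment.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize GPT-4o in one sentence."}],
)
print(response.choices[0].message.content)
```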

Phi-4-mini-instruct

Phi-4-mini-instruct is a 3.8B-parameter lightweight language model from Microsoft, built from the Phi-4 family. It is designed for high-quality reasoning and instruction-following tasks, supports a 128K token context length, and is optimized via supervised fine-tuning and direct preference optimization. It is intended for both commercial and research use in memory/compute constrained and latency-sensitive environments.

Shakker-Labs/AWPortraitCN2

A text-to-image model focused on generating portraits with Eastern aesthetics. The updated version expands character depiction across various age groups and themes including cuisine, architecture, traditional ethnic costumes, and diverse environments. It is based on the stable-diffusion/flux framework and released under a non-commercial license.

deepfake-detector-model-v1

A deepfake detection image classification model fine-tuned from google/siglip2-base-patch16-512. It leverages the SiglipForImageClassification architecture to classify images as either 'fake' (deepfakes) or 'real', and is intended for applications such as media authentication, content moderation, forensic analysis, and security.

Flux1.1 Pro – Ultra

Flux1.1 Pro – Ultra is an advanced text-to-image diffusion model by Black Forest Labs available on Replicate. It offers ultra mode for generating high-resolution images (up to 4 megapixels) at impressive speeds (around 10 seconds per sample) and a raw mode that produces images with a more natural, candid aesthetic.

Flux-uncensored

Flux-uncensored is a text-to-image diffusion model hosted on Hugging Face by enhanceaiteam. It leverages the stable-diffusion pipeline, LoRA, and the fluxpipeline to generate images from text prompts. The model is marked as 'Not-For-All-Audiences', indicating that it might produce sensitive content.

AM-Thinking-v1

AM-Thinking-v1 is a 32B dense language model built on Qwen 2.5-32B-Base, designed to enhance reasoning capabilities. It uses a post-training pipeline that includes supervised fine-tuning and dual-stage reinforcement learning, enabling strong performance in reasoning tasks like code generation, logic, and writing while operating efficiently on a single GPU.

jina-embeddings-v3

jina-embeddings-v3 is a multilingual multi-task text embedding model developed by Jina AI. Built on the Jina-XLM-RoBERTa architecture, it employs task-specific LoRA adapters to generate embeddings for various NLP tasks such as retrieval, classification, text-matching, and more. It supports rotary position embeddings for input sequences up to 8192 tokens and offers flexible, adjustable embedding dimensions.

Microsoft Phi-4-reasoning-plus

Phi-4-reasoning-plus is a state-of-the-art open-weight reasoning large language model developed by Microsoft. Finetuned from the base Phi-4 model with a mix of supervised fine-tuning on chain-of-thought traces and reinforcement learning, it is optimized for advanced reasoning tasks in math, science, and coding. The model features a dense 14B parameter decoder-only Transformer architecture with a 32k token context length and produces responses with a reasoning chain-of-thought followed by a summarization. It is intended for research and generative AI applications in constrained memory/latency settings.

FLUX.1

FLUX.1 is an open‐source state‐of‐the‐art text‐to‐image generation model developed by Black Forest Labs. It excels in prompt adherence, visual detail, and diverse output quality. Available via Replicate's API, FLUX.1 comes in three variants (pro, dev, schnell) with different pricing models.

Ideogram 3.0

Ideogram 3.0 is a text-to-image generation model available on Replicate that offers three variants—Turbo, Balanced, and Quality—for fast iteration, balanced output, and high-fidelity results, respectively. It delivers improved realism, enhanced text rendering, precise layout generation, and advanced style transfer, making it well suited to graphic design, marketing, and creative visual content.

Smaug-72B-v0.1

Smaug-72B-v0.1 is an open-source large language model for text generation developed by Abacus.AI. Based on Qwen-72B and finetuned using the novel DPO-Positive (DPOP) technique, it achieves high performance on benchmarks like MT-Bench and is the first open model to surpass an average score of 80% on the Open LLM Leaderboard.

DeepSeek-R1-Distill-Llama-8B

A distilled language model from the DeepSeek-R1 series built on the Llama-3.1-8B base. It is optimized for text generation and chain-of-thought reasoning tasks through reinforcement learning and selective fine-tuning, delivering competitive performance on math, code, and reasoning benchmarks.

Kimi-VL-A3B-Thinking

Kimi-VL-A3B-Thinking is an efficient open-source Mixture-of-Experts vision-language model specialized in long-context processing and extended chain-of-thought reasoning. With a 128K context window and only 2.8B activated LLM parameters, it excels in multimodal tasks including image and video comprehension, OCR, mathematical reasoning, and multi-turn agent interactions.

Media Generation

82 tools

Real-ESRGAN

An AI-powered image upscaling tool that enlarges images while enhancing details and reducing artifacts, often used for improving image resolution.

CodeFormer

A robust face restoration algorithm for enhancing old photos or AI-generated faces, available via Replicate for easy inference.

AI Image Upscaler With Super Resolution

An image upscaling tool using Real-ESRGAN, designed to improve image resolution and quality, available on Replicate.

DeepBrain AI Studios

An AI tool for creating realistic AI avatars and generating videos from text, enabling users to bypass manual scripting.

Submagic

An AI-powered video tool that automatically identifies the best moments in your videos and converts them into viral clips.

NSFWGenerator

An AI tool that generates and browses NSFW images through advanced algorithms.

AI Image & Photo Restoration

A collection of AI-powered tools on Replicate designed for restoring and enhancing images, including models like CodeFormer and others for upscaling, colorization, and noise removal.

GFPGAN

A practical AI tool for face restoration, capable of enhancing and restoring old and AI-generated faces, available for self-hosting via Docker.

OpenVoice

OpenVoice is a versatile instant voice cloning framework that allows users to generate speech in multiple languages using only a short audio clip from a reference speaker. The tool provides granular control over voice styles, such as emotion, accent, rhythm, pauses, and intonation, and supports zero-shot cross-lingual voice cloning, enabling users to clone voices across different languages without needing training data for those languages.

WhisperX

WhisperX is an Automatic Speech Recognition (ASR) tool that provides fast and accurate transcriptions with word-level timestamps and speaker diarization features, enhancing the capabilities of OpenAI's Whisper model.

Parler-TTS

A text-to-speech inference and training library for generating high-fidelity speech from text, offering an open-source solution for TTS applications.

AI Image Generator – Text to Image Models

A platform that hosts various AI models for generating images from text prompts using advanced techniques such as Stable Diffusion and FLUX.1, showcasing models with capabilities including realistic text generation, SVG creation, and high-quality image outputs.

MagicQuill

MagicQuill is an intelligent interactive image editing system that enables precise image modification through AI-powered suggestions and a user-friendly interface, featuring functionalities like local editing and drag-and-drop support.

Clarity AI Upscaler

Clarity AI Upscaler is an advanced image upscaling tool that utilizes Stable Diffusion processes to enhance and recreate details in images, providing users with the option to balance fidelity and creativity through parameters such as diffusion strength. The tool supports tiled diffusion techniques for handling large images and incorporates ControlNet for maintaining structural integrity while enhancing details.

SpeechBrain

An all-in-one open-source conversational AI toolkit based on PyTorch offering speech recognition, text-to-speech, speaker recognition, and more.

Whisper Large

A robust speech recognition model based on a Transformer architecture that supports multilingual transcription, speech translation, and language identification.

Adobe Firefly

Adobe Firefly is an AI art generator developed by Adobe, enabling users to create images, audio, vectors, and videos from text prompts. It integrates with Adobe Creative Cloud, enhancing workflows with generative AI capabilities such as Text-to-Image, Generative Fill, and more.

Retrieval-based Voice Conversion WebUI

An open-source web UI that enables voice conversion using retrieval-based methods, offering configurable options and support for different models.

InvokeAI

InvokeAI is an open-source creative engine based on Stable Diffusion models that empowers professionals, artists, and enthusiasts to generate high-quality visual media using AI-driven technologies. It features a user-friendly WebUI and serves as a foundation for various commercial and creative products.

Replica

An AI tool capable of replicating human voice characteristics to generate expressive, high-quality speech from text.

OpenVoice V2

OpenVoice V2 is an advanced text-to-speech model that provides instant voice cloning with accurate tone color reproduction and flexible voice style control. It supports zero-shot cross-lingual synthesis in multiple languages and has improved audio quality over its previous version. Released under the MIT License, it is geared towards both research and commercial use.

Whisper Large v3

A state-of-the-art automatic speech recognition and translation model trained on over 5 million hours of data, capable of robust zero-shot generalization.

Whisper by OpenAI

A robust, general-purpose speech recognition model capable of multilingual transcription, translation, and language identification, built using a transformer architecture.
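
A minimal usage sketch with the openai-whisper Python package; the model size and audio path are placeholders.

```python
import whisper

model = whisper.load_model("base")      # downloads weights on first use
result = model.transcribe("audio.mp3")  # language is auto-detected by default
print(result["text"])
```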

Stable Diffusion 3 Medium

A multimodal diffusion transformer model that generates images from textual descriptions with improvements in image quality, typography, and resource-efficiency for creative applications.
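
As an illustration, a short text-to-image sketch using the StableDiffusion3Pipeline from the diffusers library; the repository ID and generation settings are assumptions to verify against the model card, and the weights are gated behind a license acceptance.

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed repo id; gated weights
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a watercolor fox reading a newspaper, soft morning light",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("fox.png")
```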

ComfyUI

A powerful and modular GUI, API, and backend for diffusion models that allows users to design and execute advanced Stable Diffusion pipelines using a graph/node/flowchart-based interface. It supports image, video, and audio models, along with various optimizations.
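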

DeepFaceLab

Industry-leading software for creating deepfakes, used widely by creators to swap faces and generate realistic video manipulations.

SD.Next

SD.Next is an all-in-one AI generative image tool. It provides a robust diffusion-based framework for text-to-image generation, supporting multiple UIs and a wide range of models and platforms including CUDA, ROCm, DirectML, and more. It features advanced processing optimizations such as model compilation, quantization, and compression, as well as built-in queue management and an installer that handles updates.

ACE++

ACE++ is an instruction-based image creation and editing toolkit that uses context-aware content filling for tasks such as portrait generation, subject-driven image editing, and local editing. The tool supports diffusion-based models, provides installation instructions, demos, and guides for fine-tuning using LoRA, and is hosted on Hugging Face.

DALL·E mini by Craiyon

DALL·E mini (now known as Craiyon) is an AI-driven text-to-image generation tool that creates images based on text prompts. The tool is available as a running app on Hugging Face Spaces, allowing users to explore creative image generation directly from their browser.

OpenVoice

OpenVoice is an instant voice cloning tool developed by MIT and MyShell. It offers accurate tone color cloning, flexible voice style control (including emotion, accent, rhythm, pauses, and intonation), and supports zero-shot cross-lingual voice cloning. The V2 release improves audio quality, provides native multi-lingual support (English, Spanish, French, Chinese, Japanese, Korean), and is available under the MIT License for free commercial use.

ClearerVoice-Studio

An open-source, AI-powered speech processing toolkit offering state-of-the-art pretrained models and utilities for tasks such as speech enhancement, separation, super-resolution, and target speaker extraction.

AI Comic Factory

A Hugging Face Space that lets users create comics using AI; it generates comic panels and layouts from a single text prompt.

Bark

Bark is a transformer-based text-to-audio model by Suno that generates highly realistic, multilingual speech as well as music, background noise, and simple sound effects. It also produces nonverbal cues like laughing or sighing. The model is provided for research purposes with pretrained checkpoints available for inference.
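
A minimal sketch following Bark's documented Python interface; the prompt is illustrative and checkpoints are downloaded on first use.

```python
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # fetch and cache the model checkpoints
audio_array = generate_audio("Hello, my name is Suno. [laughs] And I like pizza.")
write_wav("bark_out.wav", SAMPLE_RATE, audio_array)
```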

minimax/video-01-director

An advanced AI video generation model that creates high-definition 720p videos (up to 6 seconds) with cinematic camera movements. It allows users to control camera movements through both bracketed commands and natural language descriptions.

CosyVoice

A multi-lingual large voice generation model which provides full-stack capabilities for inference, training, and deployment of high-fidelity voice synthesis.

Stable Diffusion web UI

An open-source web interface built with Gradio for interacting with Stable Diffusion. It provides features such as txt2img and img2img modes, inpainting, outpainting, upscaling, embedding management, and various advanced image generation tools, making it easy to experiment with and deploy Stable Diffusion.

GPT-SoVITS

A few-shot voice cloning and text-to-speech WebUI that can train a TTS model with just 1 minute of voice data. It supports zero-shot and few-shot TTS, cross-lingual inference, and includes integrated tools for voice separation, dataset segmentation, and ASR, making it easier to build and deploy custom TTS models.

CLIP Interrogator

A prompt engineering tool that leverages OpenAI's CLIP and Salesforce's BLIP to analyze an input image and generate optimized text prompts. These prompts can be used with text-to-image models like Stable Diffusion to produce creative art.
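
A brief sketch of the clip-interrogator Python package, assuming a local image file; the CLIP model name shown is the one commonly paired with Stable Diffusion 1.x.

```python
from PIL import Image
from clip_interrogator import Config, Interrogator

image = Image.open("photo.jpg").convert("RGB")
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
print(ci.interrogate(image))  # returns a text prompt describing the image
```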

LuminaBrush

A creative ML app hosted on Hugging Face Spaces that lets users explore and generate artistic images using community-built AI models.

EasyDeepNude

EasyDeepNude is an AI tool that implements a reimagined version of the controversial DeepNude project. It provides both a command-line interface (CLI) and a graphical user interface (GUI) to process and transform photos using deep learning models. The CLI version can be integrated into automated workflows, while the GUI version offers a user-friendly cropping system for easy use. Note: This is an early alpha release and may have compatibility issues.

Ideogram-v2-turbo

A fast text-to-image generation model ideal for quick ideation and providing rough compositional sketches.

Upscayl

Upscayl is a free and open-source AI-powered image upscaler that enlarges and enhances low-resolution images using advanced AI algorithms. It is available for Linux, macOS, and Windows, and requires a Vulkan compatible GPU.

Playground v2.5 – 1024px Aesthetic Model

A diffusion-based text-to-image generative model that produces highly aesthetic images at a resolution of 1024x1024 across various aspect ratios. It outperforms several state-of-the-art models in aesthetic quality and is accessible via an API on Replicate, with integration support for Hugging Face Diffusers.

Easel AI

An AI tool that offers advanced face swap and avatar generation, preserving user likeness and enabling creative image manipulations.

Wan2.1-T2V-14B

Wan2.1-T2V-14B is an advanced text-to-video generation model that offers state-of-the-art performance, supporting both 480P and 720P resolutions. It is part of the Wan2.1 suite and excels in multiple tasks including text-to-video, image-to-video, video editing, and even generating multilingual text (Chinese and English) within videos. The repository provides detailed instructions for single and multi-GPU inference, prompt extension methods, and integration with tools like Diffusers and ComfyUI.

FLUX.1 Redux

An adapter for FLUX.1 base models that generates slight variations of a given image, enabling creative refinements and flexible high-resolution outputs.

Coqui TTS

A deep learning toolkit for advanced Text-to-Speech generation, providing pretrained models across 1100+ languages, tools for training and fine-tuning models, and utilities for dataset analysis. Battle-tested in both research and production environments.

Hugging Face Speech-to-Speech

An open-sourced, modular speech-to-speech pipeline developed by Hugging Face that integrates Voice Activity Detection, Speech-to-Text, Language Models, and Text-to-Speech. It leverages models from the Transformers library (e.g., Whisper, Parler-TTS) and supports various deployment approaches including server/client and local setups.

Flux - FLUX.1 Models Inference Repo

Official inference repository by Black Forest Labs for FLUX.1 models. This repo provides minimal inference code for running image generation and editing tasks (e.g., text-to-image, in/out-painting, structural conditioning, and image variation) and includes instructions for local installation and TensorRT support.

coqui/XTTS-v2

A text-to-speech (TTS) voice generation model that enables high-quality voice cloning and cross-language speech synthesis using just a 6-second audio clip. It supports 17 languages, offers emotion and style transfer, improved speaker conditioning, and overall stability improvements over its previous version.
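
A short voice-cloning sketch using the Coqui TTS Python API, which hosts this model; the reference clip path is a placeholder and a GPU is assumed.

```python
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")
tts.tts_to_file(
    text="It took me quite a long time to develop this voice.",
    speaker_wav="reference_6s.wav",  # ~6-second clip of the voice to clone
    language="en",
    file_path="cloned_output.wav",
)
```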

FaceFusion

FaceFusion is an industry-leading face manipulation platform that enables advanced face swapping, deepfake creation, and lip-syncing. It features a command-line interface with various job management commands (batch-run, headless-run, etc.) and provides installers for Windows and macOS.

ghibli-easycontrol

An open-source model hosted on Replicate that transforms input images with a Ghibli-style aesthetic, offering high-quality, fast, and cost-effective image translation via an API.

topazlabs/image-upscale

An AI-powered, professional-grade image upscaling tool by Topaz Labs. It offers multiple enhancement models (Standard, Low Resolution, CGI, High Fidelity, Text Refine) to upscale images up to 6x with options for facial enhancement, making it ideal for improving various image types including digital art and text-heavy photos.

fofr/color-matcher

A model hosted on Replicate that performs color matching and white balance correction for images via an API. It allows users to automatically adjust image colors to achieve better balance.
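
For illustration, a call through the Replicate Python client; the input field names here are hypothetical and should be checked against the model's schema on Replicate.

```python
import replicate

# Hypothetical input keys -- verify against the model's API schema before use.
output = replicate.run(
    "fofr/color-matcher",
    input={
        "image": open("photo.png", "rb"),
        "reference_image": open("reference.png", "rb"),
    },
)
print(output)
```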

UniRig

UniRig is an AI-based unified framework for automatic 3D model rigging. It leverages a GPT-like transformer to predict skeleton hierarchies and per-vertex skinning weights, automating the traditionally time-consuming rigging process for diverse 3D assets including humans, animals, and objects.

Dia

A text-to-speech (TTS) model capable of generating ultra-realistic dialogue in one pass, providing real-time audio generation on enterprise GPUs.

Photoshop AI Tools Unlocked Edition

An AI-powered extension for Adobe Photoshop that unlocks advanced editing features including AI image enhancement, smart object removal, background manipulation, and custom filters. Designed for professionals and creative enthusiasts on Windows 10/11, it automates tedious tasks and elevates creative workflows.

Hunyuan3D-2.0

An AI application that creates high-resolution 3D models from images or text prompts, enabling multi-angle or descriptive 3D model generation.

Kling Lip Sync

Kling Lip Sync is an API that changes the lip movements of a person in a video to match supplied audio or text. It allows users to add lip-sync to any video, integrating video content with new audio inputs. The model sends data from Replicate to Kuaishou and offers pricing based on the seconds of video generated.

SV4D 2.0

SV4D 2.0 is an enhanced 4D diffusion model by Stability AI for high-fidelity novel-view video synthesis and 4D asset generation. It generates 48 frames (12 video frames across 4 camera views) from an input video and uses an autoregressive approach for longer video generation. Designed for research purposes, it offers improved fidelity, sharper motion details, and better spatio-temporal consistency compared to previous models.

google/lyria-2

Lyria 2 is an AI music generation model by Google that produces professional-grade 48kHz stereo audio from text-based prompts. It supports various genres and implements SynthID for audio watermarking, making it suitable for direct project integration.

HeyGem

HeyGem is an open-source AI avatar project that enables offline video synthesis on Windows. It precisely clones your appearance and voice to generate ultra-realistic digital avatars, allowing users to create personalized videos without an internet connection.

FLUX.1 Kontext

An experimental image blending tool that merges two input images into a single, cohesive output using AI-driven composition techniques.

FLUX.1 Kontext

FLUX.1 Kontext is a new image editing model from Black Forest Labs that leverages text prompts for precise image modifications, including color swaps, background edits, text replacements, style transfers, and aspect ratio changes. It features multiple variants (Pro, Max, and an upcoming Dev) along with a conversational interface (Kontext Chat) to simplify the editing process.

FLUX.1 Kontext – Text Removal

A dedicated application built on the FLUX.1 Kontext image editing model from Black Forest Labs that removes all text from an image. The tool is available on Replicate with API access and a playground for experimentation, showcasing its specialized text removal functionality.

FLUX Kontext max - Multi-Image List

An AI tool that combines multiple images using FLUX Kontext Max, a premium image editing model from Black Forest Labs. It accepts a list of images to creatively merge them and produce enhanced, text-guided composite outputs. The tool is available on Replicate and is designed for versatile image editing tasks, including creative compositing and improved typography generation.

FLUX.1 Fill [dev]

FLUX.1 Fill [dev] is a 12-billion parameter rectified flow transformer developed by Black Forest Labs designed for text-guided inpainting. It fills specific areas in an existing image based on a textual description, enabling creative image editing workflows. It comes with a non-commercial license and integrates seamlessly with diffusers.
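
A minimal inpainting sketch assuming the FluxFillPipeline integration in diffusers; the image and mask paths are placeholders, and the gated weights require accepting the license.

```python
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("room.png")      # image to edit
mask = load_image("room_mask.png")  # white pixels mark the area to repaint
result = pipe(prompt="a green velvet armchair", image=image, mask_image=mask).images[0]
result.save("room_filled.png")
```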

InvokeAI

InvokeAI is a creative engine built for Stable Diffusion models. It provides an industry-leading web-based UI to generate and refine visual media, available in both a community (self-hosted) edition and a professional (cloud-hosted) edition. It serves as the foundation for multiple commercial products, empowering professionals, artists, and enthusiasts with advanced AI-driven creative tools.

ComfyUI-RMBG

A custom node for ComfyUI that provides advanced image background removal and segmentation (including object, face, clothes, and fashion segmentation) by integrating multiple models like RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefNet, SAM, and GroundingDINO.

inswapper

inswapper is an open-source, one-click face swapper and restoration tool powered by insightface. It utilizes ONNX runtime for inference, along with integration of face restoration techniques (e.g., CodeFormer) to enhance image quality and produce realistic face swaps.

Veo 3

Veo 3 is an AI-powered video generation model from Google DeepMind that produces both visuals and native audio, including sound effects, ambient noise, dialogue, and accurate lip-sync. It delivers hyperrealistic motion and strong prompt adherence, and can even generate video game worlds, making it a versatile media generation tool.

Google Veo 3

Google Veo 3 is Google DeepMind’s flagship text-to-video generation model that produces high-fidelity cinematic videos from text prompts. It features native audio generation, dialogue and lip-sync capabilities, realistic physics-based visuals, and immersive game world creation, making it ideal for AI-driven multimedia content creation.

VCClient Real-time Voice Changer

An open‑source, AI‑powered real‑time voice conversion tool that uses various models (e.g., RVC, Beatrice v1/v2) to transform voices dynamically. It supports multiple platforms (Windows, Mac, Linux, Google Colab) and offers both standalone and networked configurations.

Wan2.1-I2V-14B-720P

An advanced Image-to-Video generation model from the Wan2.1 suite by Wan-AI that produces high-definition 720P videos from input images. It features state-of-the-art performance, supports multiple tasks including text-to-video, video editing, and visual text generation in both Chinese and English, and is optimized for consumer-grade GPUs.

Recraft V3

Recraft V3 (code-named red_panda) is a state-of-the-art text-to-image generation model that excels at creating high-quality images with long text integration and vector art support. It offers precise control over design elements, enabling users to position text and visual components exactly as intended, and supports brand style customization.

test-yash-model-4-new-2

A custom diffusion-based model designed for generating unique fashion designs from text prompts. The API reference page provides detailed parameters for controlling aspects like prompt strength, aspect ratio, model selection, and output format.

Minimax Speech 02 HD

A high-fidelity text-to-audio (T2A) tool that offers advanced voice synthesis, voice cloning, emotional expression, and multilingual capabilities, optimized for applications such as voiceovers and audiobooks.

Chatterbox TTS

Chatterbox is a state-of-the-art, open-source text-to-speech (TTS) model developed by Resemble AI. It features a 0.5B Llama backbone, unique emotion exaggeration control, ultra-stable inference with alignment, and is benchmarked against leading closed-source systems like ElevenLabs. It is production-grade and licensed under MIT.

Resemble Chatterbox TTS

Resemble Chatterbox is an open source, production-grade text-to-speech model by Resemble AI. It features unique emotion exaggeration control, instant voice cloning from short audio, built-in watermarking, and alignment-informed inference, making it ideal for creating expressive, natural speech for various applications.

IP-Adapter

IP-Adapter is a lightweight image prompt adapter developed by Tencent AI Lab that enables pre-trained text-to-image diffusion models to incorporate image prompts along with text prompts for multimodal image generation. With only 22M parameters, it offers comparable or improved performance compared to fine-tuned models and supports integration with various controllable generation tools.
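
As a sketch of the diffusers integration; the base model, adapter weight name, and scale are assumptions to check against the IP-Adapter documentation.

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # trade off image-prompt vs. text-prompt influence

style_image = load_image("style_reference.png")
out = pipe(prompt="a cat sitting in a garden", ip_adapter_image=style_image).images[0]
out.save("cat.png")
```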

Realistic Vision V6.0 B1 noVAE

Realistic Vision V6.0 "New Vision" is a beta diffusion-based text-to-image model focused on realism and photorealism. It is released on Hugging Face and provides detailed guidelines on resolutions, generation parameters, and recommended workflows (including using a VAE for quality improvements).

Shap-E

Shap-E is an official GitHub repository by OpenAI for generating 3D implicit functions conditioned on text or images. It provides sample notebooks and usage instructions for converting text prompts or images into 3D models, making it a practical tool for generating 3D objects.

Robotics

5 tools

Chat Interfaces

27 tools
HuggingChat - Models

HuggingChat provides access to a range of AI chat models, enabling users to interact with state-of-the-art language models for applications such as conversation and task-oriented interactions.

Vercel AI Chatbot

A full-featured, hackable Next.js AI chatbot built by Vercel, supporting multi-model integration and powered by the AI SDK.

NextChat

A lightweight and fast AI assistant designed for enterprise use, featuring customizable branding, resource integration, and permission controls.

Hugging Face Chat UI

An open-source codebase that powers customizable chat user interfaces, allowing deployment of chatbot instances integrated with various supported language models.

IdeasAI

An OpenAI-powered startup idea generator that uses an autoregressive deep-learning language model to generate innovative product and business ideas.

Khoj

An open-source, self-hostable 'AI second brain' providing personalized, context-aware assistance for knowledge management and research.

leon-ai/leon

An open-source personal assistant that runs on your own server, providing a customizable, self-hosted AI assistant experience.

LongShot AI

An AI platform for content creation and custom chatbot building, tailored for generating all sorts of written content.

ChatGPT Desktop Application

A native desktop application for ChatGPT available on macOS, Windows, and Linux, providing an enhanced chat experience with extended capabilities.

Jan

A local AI assistant powered by Cortex, designed to run completely offline on a variety of hardware, offering an open source alternative to ChatGPT.

PocketPal AI

PocketPal AI is a mobile application that brings offline language model-based AI assistance directly to your phone. It allows users to download, load, and interact with various small language models (SLMs) on both iOS and Android devices, with customizable inference settings and performance metrics.

Aria - AI Research Assistant (Zotero Plugin)

A Zotero plugin that leverages GPT-4 and GPT-4 Turbo to provide an AI-powered research assistant. It offers features such as drag-and-drop referencing, autocompletion, visual analysis via GPT-4 Vision, and conversational interactions, helping users manage and annotate their Zotero items efficiently.

Cherry Studio

Cherry Studio is a cross-platform desktop client that integrates multiple LLM providers (including major cloud services like OpenAI, Gemini, Anthropic, etc., along with local model support) and supports deepseek-r1. It offers pre-configured AI assistants, multi-model simultaneous conversations, document and data processing features, and practical integrations, making it a comprehensive tool for interacting with various AI models.

Chatbox Community Edition

An open-source, user-friendly desktop client for AI models/LLMs such as ChatGPT, Claude, Gemini, Ollama, and more. It offers features like local data storage, enhanced prompting, markdown and code highlighting, keyboard shortcuts, team collaboration, and cross-platform availability (Windows, macOS, Linux, iOS/Android and Web version).

PapersGPT For Zotero

An AI-powered Zotero plugin that enhances academic research by enabling users to interact with PDF documents through chat. It supports a variety of state-of-the-art language models (e.g., GPT-4.5, ChatGPT, Claude, Gemini, DeepSeek, and others) and integrates seamlessly with Zotero, offering local model deployment for privacy and efficiency.

XiaoZhi AI Chatbot

An open-source project to build your own AI friend using ESP32, SenseVoice, and LLMs like Qwen and DeepSeek. It integrates voice wake-up, speech recognition, multi-language chat, TTS, OLED/LCD display support, and configurable prompts for a hardware-based conversational AI device.

Hollama

A minimal web-UI for interacting with Ollama and OpenAI servers, featuring multi-server support, markdown rendering, code editor functionalities, and a responsive design for local and self-hosted use.

ChatGPT-On-CS (懒人客服)

An open-source intelligent customer service system based on large language models. It supports multi-channel integration (WeChat, Pinduoduo, Qianniu, Bilibili, Douyin, Weibo, Xiaohongshu, Zhihu, etc.), enabling text, voice, and image communication, auto-replies, and knowledge-base customization for enterprise AI applications.

Lobe Chat

Lobe Chat is an open-source, modern-design AI chat framework that enables one-click free deployment of private ChatGPT, Claude, Gemini, Ollama, DeepSeek, and Qwen based chat applications. It supports multi-AI provider integration, features such as chain of thought, branching conversations, knowledge base management (file upload), multi-modal interactions (including TTS/STT voice conversation and text-to-image generation), plugin systems with function calling, and more, making it a comprehensive solution for building private, customized chat interfaces.

Chainlit

Chainlit is an open‐source Python framework that enables developers to build production-ready conversational AI applications quickly. It provides a user-friendly interface, optimized step functions, and seamless integration with LLM tools, making it easier to create interactive chatbot experiences.
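
A minimal echo app, roughly following Chainlit's quickstart (run it with `chainlit run app.py`); a real application would call an LLM inside the handler.

```python
import chainlit as cl

@cl.on_message
async def main(message: cl.Message):
    # Echo the user's message back; replace this with an LLM call in practice.
    await cl.Message(content=f"You said: {message.content}").send()
```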

XiaoZhi AI Chatbot

An open-source AI chatbot hardware project built on ESP32 using SenseVoice and Qwen72B. It integrates offline voice wake-up, multi-language speech recognition, configurable TTS and LLM integration to serve as a physical AI chat companion for educational and experimental purposes.

Grok 中文版

A comprehensive guide and portal for the Chinese-localized edition of xAI's Grok 3 model. It provides mirror-site links so users in mainland China can access a chat interface with Grok 3 reasoning, deep search, and other advanced features without a VPN.

Second Me

Second Me is an open‐source platform that lets you train and deploy a personalized AI self. It uses hierarchical memory modeling and the Me-Alignment Algorithm to capture your memories and identity, enabling your AI to switch roles, collaborate on a decentralized network, and serve as a private, self-hosted personal assistant.

Open Assistant

Open Assistant is an open-source, chat-based assistant developed by LAION-AI. It is designed to understand user tasks, interact with third-party systems, and dynamically retrieve information, democratizing access to powerful large language models.

AI-DEBAT

AI-DEBAT is a Streamlit-based web app that lets users pit two AI models against each other in a turn-based debate. Users select from models like OpenAI GPT-3.5/4, Anthropic Claude 3, Google Gemini, and Hugging Face models, provide the respective API keys, and watch an interactive debate unfold, with the models not chosen for the debate acting as judges. It also allows downloading the final debate report.

GPT4All Web Search Beta

A beta release feature for GPT4All that integrates Brave Search API to enable real-time web search functionality within the GPT4All chat environment. The page provides step-by-step instructions on setting up the feature, obtaining an API key, and configuring the system prompt to allow the Llama 3.1 8B Instruct model to perform web searches.

LibreChat

LibreChat is an open-source AI chat interface tool that enables users to interact with various AI models. It features multi-agent collaboration, integration with cloud storage and real-time web search, along with various enhancements such as UI refresh, persistent code environment, and support for multiple AI models (e.g., GPT-4.1, Gemini 2.5, Grok 3).

Agent Frameworks

45 tools
AG2

AG2 (formerly AutoGen) is an open-source software platform designed for building AI agents and facilitating multi-agent interactions to solve complex tasks. It supports integration with many large language models and provides multiple orchestration patterns for AI agents, enabling flexible, efficient tool usage and human collaboration.

elizaOS

A framework for creating autonomous agents, featuring connectors for Discord, Twitter, and Telegram, support for various AI models, multi-agent functionality, document ingestion, and memory storage capabilities.

smolagents

A barebones library for running multi-step AI agents that supports both CLI and web-based interactions, enabling rapid prototyping of agent-driven workflows.

SuperAGI

A dev-first open source autonomous AI agent framework designed for building, managing, and running autonomous agents.

SmythOS

SmythOS lets you build and deploy AI agents without manual coding: describe your needs, and Agent Weaver assembles the agent automatically using the best-suited AI models and APIs, with integrations for OpenAI, Hugging Face, Amazon Bedrock, and more.

OpenHands

A platform for AI software development agents that simplifies interactions with code and automates task management through minimal-code interfaces.

DeepSeek-R1

An autonomous agent designed for deep local and web research, capable of generating detailed reports with citations for various topics.

Roo Code

Roo Code is an AI-powered autonomous coding agent that lives in your editor. It communicates in natural language, reads and writes files, executes terminal commands, automates browser actions, and can integrate with any OpenAI-compatible API/model. It adapts its personality via customizable modes, acting as a flexible coding partner, system architect, QA engineer, or product manager to help build software more efficiently.

GPT Researcher

An LLM-based autonomous agent that conducts deep local and web research on any topic and generates long reports with citations, with support for connecting to specialized data sources.

AutoGen

A programming framework for building, managing, and running multi-agent AI systems that assist with complex workflows and code generation tasks.
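
A two-agent sketch in the classic AutoGen style; class names and config fields vary across versions, and the model name and API key are placeholders.

```python
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4o-mini", "api_key": "YOUR_API_KEY"}]}  # placeholders

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The user proxy relays the task and executes any code the assistant writes.
user_proxy.initiate_chat(assistant, message="Write and run a script that prints the first 10 primes.")
```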

Dify

A production-ready platform for developing and orchestrating agentic workflows, supporting the creation and management of autonomous AI agents.

crewAI Tools

An open-source toolkit that provides a comprehensive guide and pre-built modules for integrating and creating custom tools for CrewAI agents. It includes implementations for file operations, web scraping, database interactions, API integrations, and AI-powered functionalities, supporting developers in enhancing AI agent capabilities.

OpenAI Realtime Agents

A demonstration repository showcasing advanced, agentic patterns built on top of OpenAI's Realtime API. It provides a Next.js/TypeScript example for prototyping multi-agent realtime voice applications, including sequential agent handoffs and state machine based interactions.

OpenManus

OpenManus is an open-source project that replicates the capabilities of the Manus AI agent by providing a modular, containerized multi-agent framework. It enables autonomous execution of complex tasks such as travel planning, data analysis, and content generation, and is built with Docker, Python, and JavaScript.

Goose

An open-source, extensible AI agent that goes beyond code suggestions: it can install, execute, edit, and test code, and works with any large language model.

Mem0

A memory management layer for AI agents that provides personalized, secure, and local memory storage to enhance conversational AI and assistant capabilities.

NVIDIA AgentIQ

An open-source toolkit for connecting and optimizing teams of AI agents by treating agent workflows as simple function calls, ensuring composability and scalable agent orchestration.

Auto-Deep-Research

An open-source, fully-automated personal AI assistant that serves as a cost-effective alternative to OpenAI's Deep Research. Built on the AutoAgent framework, it supports integration with various LLMs, function-calling interactions, file uploads, and a one-click launch for effortless research automation.

Nanobrowser

An open-source Chrome extension for AI-powered web automation. It employs a multi-agent system that dynamically self-corrects and adjusts its approach during web tasks, all running locally in your browser.

OpenAI Agents Python

A lightweight and powerful framework for multi-agent workflows built on LLMs, complete with built‐in agent tracing and management.
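
A minimal sketch in the style of the openai-agents quickstart (the package imports as `agents`); an OPENAI_API_KEY environment variable is assumed.

```python
from agents import Agent, Runner

agent = Agent(name="Assistant", instructions="You are a concise, helpful assistant.")
result = Runner.run_sync(agent, "Write a haiku about multi-agent workflows.")
print(result.final_output)
```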

Cline

Cline is an autonomous coding agent that integrates into your IDE and CLI, capable of creating and editing files, executing terminal commands, interacting with the browser, and leveraging the Model Context Protocol to extend its capabilities—all under human supervision.

OWL

OWL (Optimized Workforce Learning) is an open-source multi-agent collaboration framework built on top of the CAMEL-AI Framework. It enables dynamic agent interactions and integrates various toolkits (such as web search, file writing, terminal execution, and browser automation) to facilitate robust and efficient task automation across real-world domains.

AI Scientist-v2

An autonomous AI agent pipeline for conducting deep research and experimental analysis.

Archon

An AI agent capable of creating other AI agents using an advanced agentic coding workflow and a framework knowledge base.

Suna

An open-source generalist AI agent that includes capabilities like browser automation, file management, web crawling, command-line execution, website deployment, and API integration.

GUI-R1

GUI-R1 is a generalist R1-style vision-language action model designed for GUI agents that leverages reinforcement learning and policy optimization to automatically control and interact with graphical user interfaces across multiple platforms (Windows, Linux, macOS, Android, Web).

ADK Python

ADK Python is an open-source, code-first Python toolkit by Google for building, evaluating, and deploying sophisticated AI agents. It offers a modular framework for creating both single and multi-agent systems, enabling flexible integration, testing, and deployment across various environments, including cloud platforms such as Cloud Run and Vertex AI Agent Engine. It is optimized for the Gemini and Google ecosystem while remaining model- and deployment-agnostic.

Manus AI

Manus AI is an autonomous AI agent designed to execute complex tasks across multiple domains, including report writing, data analysis, content generation, and more. It features multi-modal capabilities, advanced tool integration (e.g., web browsers, code editors, database systems), and adaptive learning to optimize performance. The tool claims state-of-the-art performance on the GAIA benchmark, positioning itself as a competitive alternative to leading AI models.

Generative AI Toolkit

A lightweight library to build, deploy, trace, and evaluate LLM-based applications and agents throughout their entire lifecycle with AWS integration (e.g., Amazon Bedrock, DynamoDB, CloudWatch, AWS Lambda).

Swarm

An experimental, educational framework by OpenAI for lightweight multi-agent orchestration. Swarm enables agents to offload tasks through simple handoffs and demonstrates scalable, stateless agent interactions using the Chat Completions API. Note that it has been superseded by the production-ready OpenAI Agents SDK.

Agentic Browser

An open-source AI agent designed for web automation and scraping. It orchestrates specialized agents (Planner, Browser, and Critique) to automate browser interactions such as form filling, data extraction, e-commerce searches, and content retrieval via a natural language interface.

Pipecat

Pipecat is an open-source Python framework for building real-time voice and multimodal conversational agents. It orchestrates audio, video, AI services, and multiple transports (e.g., WebSockets, WebRTC) to enable developers to create voice assistants, AI companions, multimodal interfaces, interactive storytelling tools, and complex dialog systems.

JARVIS

JARVIS is an AI system developed by Microsoft that connects large language models with expert AI models from the ML community. It orchestrates task planning, model selection, and execution by leveraging multiple approaches (CLI, Gradio demo, web APIs) and supports integration with cloud and local deployments.

Langflow

Langflow is an open-source tool that offers a visual builder and built-in API server for designing, testing, and deploying AI-powered agents and workflows. It supports multi-agent orchestration, code customization in Python, and integrates with major LLMs, vector databases, and other AI tools.

clineAI

clineAI is an autonomous coding agent integrated into your IDE that can create/edit files, execute terminal commands, use the browser, and analyze your project’s file structure and source code. It operates with human-in-the-loop permission, ensuring safe execution, and leverages the Model Context Protocol (MCP) to expand its capabilities.

AgenticSeek

An autonomous AI agent framework designed to run locally or on remote servers, offering both CLI and web interfaces for interaction and task automation.

Dive

Dive is an open-source MCP Host Desktop Application that integrates with various large language models (LLMs) supporting function calling capabilities. It offers universal LLM support (including ChatGPT, Anthropic, Ollama, and more), cross-platform compatibility (Windows, macOS, Linux), and advanced features like custom instructions, API management, and auto-update mechanisms via the Model Context Protocol (MCP).

II-Agent

II-Agent is an open-source intelligent agent framework that streamlines and enhances workflows across multiple domains. It provides a CLI interface and a WebSocket-powered, React-based frontend, and integrates with several leading language model providers (e.g., Anthropic Claude and Google Gemini). It also includes performance evaluation on the GAIA benchmark.

OpenHands

OpenHands is an open‐source platform for software development agents powered by AI. It enables agents to perform tasks that a human developer can do, including modifying code, running commands, browsing the web, calling APIs, and even copying code snippets from sources like StackOverflow. The tool is deployable both via cloud (OpenHands Cloud) and locally using Docker.

smolagents

A lightweight Python library for building and running AI agents, offering support for various LLMs hosted on both the Hugging Face Hub and external inference APIs.
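
A quickstart-style sketch; the model class shown (`HfApiModel`) reflects earlier releases and may be named differently in newer versions, so treat the exact names as assumptions.

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# CodeAgent writes and executes Python code to solve the task step by step.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("Roughly how many seconds does it take light to travel from the Sun to Mars?")
```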

CrewAI

CrewAI is a fast and flexible Python framework for orchestrating role-playing, autonomous AI agents. It offers both high-level simplicity and low-level control for creating and managing multi-agent systems, with features like event-driven flows and an enterprise suite for secure, scalable AI automation.
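
A small single-agent crew as a sketch of the core API; the roles, goals, and task text are illustrative, and an LLM provider key is assumed to be configured in the environment.

```python
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Summarize recent developments in open-source TTS models",
    backstory="A diligent analyst who always cites sources.",
)
task = Task(
    description="Write a three-bullet summary of notable open-source TTS releases.",
    expected_output="Three concise bullet points.",
    agent=researcher,
)
crew = Crew(agents=[researcher], tasks=[task])
print(crew.kickoff())
```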

Plandex

Plandex is an open source, terminal-based AI coding agent designed for large projects and real-world tasks. It can plan and execute complex, multi-step coding workflows across dozens of files, supports a massive 2M token context (approximately 100K tokens per file), and uses tree-sitter for fast project mapping and syntax validation. It offers configurable autonomy ranging from full automated execution to fine-grained control with a cumulative diff review sandbox and automated debugging of terminal commands.

WebAgent (WebWalker & WebDancer)

WebAgent is an open-source autonomous agent framework by Alibaba Group for information seeking. It comprises two complementary systems: WebWalker, a benchmark for LLMs in web traversal (ACL 2025), and WebDancer, a native agentic search reasoning model (preprint). Utilizing the ReAct framework with a four-stage training paradigm including supervised fine-tuning and reinforcement learning, it is designed to handle long-horizon, multi-step web traversal and autonomous search tasks.

Gemini CLI

Gemini CLI is an open-source command-line AI workflow tool that brings Gemini’s multimodal AI capabilities directly into your terminal. It enables users to query and edit large codebases, generate new apps from PDFs or sketches, automate operational tasks (like handling pull requests or complex rebases), and integrate various tools using built-in Google Search support.

ottomator-agents

A GitHub repository hosting a collection of open-source AI agents built for the oTTomator Live Agent Studio platform. This repository contains various agents that perform tasks such as web research, content generation, and automation. It serves as a hub for deploying and experimenting with multiple AI agents and is actively maintained under the MIT license.

Infrastructure

17 tools
HUGS

Optimized, zero‐configuration inference microservices from Hugging Face designed to simplify and accelerate the deployment of open AI models via an OpenAI‐compatible API.

open-webui/open-webui

A user-friendly AI interface that supports multiple LLM runners (such as Ollama and OpenAI-compatible APIs) and features built-in support for retrieval augmented generation.

LocalAI

A free, open-source alternative to OpenAI's API, enabling local AI inferencing as a drop-in replacement with support for various models.
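
Because the API is OpenAI-compatible, the standard OpenAI client can be pointed at a local instance; the port and model alias below are assumptions that depend on your LocalAI configuration.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")  # local server, no real key
resp = client.chat.completions.create(
    model="gpt-4",  # alias mapped to a local model in the LocalAI config
    messages=[{"role": "user", "content": "Say hello from a locally hosted model."}],
)
print(resp.choices[0].message.content)
```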

Ollama

A self-hosted deployment tool for models like Llama 3.3 and DeepSeek-R1, enabling fast and local AI inference without relying on cloud APIs.
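
A minimal request against Ollama's local REST API, assuming the server is running on the default port and the model has already been pulled (e.g. with `ollama pull llama3.3`).

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.3", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
print(resp.json()["response"])
```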

Text Generation Inference

A toolkit for serving and deploying large language models (LLMs) for text generation via Rust, Python, and gRPC. It is optimized for inference and supports tensor parallelism for efficient scaling.
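
A sketch of a client call against a locally running TGI container; the port mapping and served model are deployment-specific assumptions.

```python
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={"inputs": "What is deep learning?", "parameters": {"max_new_tokens": 64}},
    timeout=60,
)
print(resp.json()["generated_text"])
```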

Self-hosted AI Starter Kit

An open-source Docker Compose template that quickly sets up a local AI and low-code development environment. Curated by n8n, it integrates essential tools such as the self-hosted n8n platform, Ollama for local LLMs, Qdrant for vector storage, and PostgreSQL, enabling secure self-hosted AI workflows.

99AI

An open-source, commercial-ready AI web platform offering a one-stop solution for integrating a variety of AI services—including AI chat, intelligent search, creative content generation, document analysis, mind mapping, and risk management. It supports private (on-premises) deployment, multi-user management, and commercial operations, making it suitable for enterprises, teams, or individual developers building custom AI services.

vLLM

A high-throughput, memory-efficient library for large language model inference and serving that supports tensor and pipeline parallelism.
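
A minimal offline-inference sketch with the vLLM Python API; the model ID is a placeholder for any Hugging Face causal LM that fits your hardware.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")  # placeholder model id
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain tensor parallelism in one paragraph."], params)
print(outputs[0].outputs[0].text)
```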

Xorbits Inference (Xinference)

Xorbits Inference (Xinference) is a versatile, open-source library that simplifies the deployment and serving of language models, speech recognition models, and multimodal models. It empowers developers to replace OpenAI GPT with any open-source model using minimal code changes, supporting cloud, on-premises, and self-hosted setups.

New API

An open-source, next-generation LLM gateway and AI asset management system that unifies various large model APIs (such as OpenAI and Claude) into a standardized interface. It provides a rich UI, multi-language support, online recharge, usage tracking, token grouping, model charging, and configurable reasoning effort, making it suitable for personal and enterprise internal management and distribution.

Text Embeddings Inference

An open-source, high-performance toolkit developed by Hugging Face for deploying and serving text embeddings and sequence classification models. It features dynamic batching, optimized transformers code (via Flash Attention and cuBLASLt), support for multiple model types, and lightweight docker images for fast inference.

LM Studio

LM Studio is a desktop application that enables users to run local and open large language models (LLMs) on their computer. Available for Mac and Windows, it provides an interface for discovering, downloading, and experimenting with local LLMs.

GPT-RAG

GPT-RAG is an enterprise-grade Retrieval-Augmented Generation (RAG) solution accelerator designed for integrating Azure Cognitive Search and Azure OpenAI services to power ChatGPT-style and Q&A experiences. It provides a modular architecture featuring data ingestion, an orchestrator (with options for Semantic Kernel functions or AutoGen-driven agentic workflows), and customizable front-end interfaces for efficient deployment in secure, enterprise environments.

GPT4All

A tool that enables running local large language models (LLMs) on consumer hardware, offering offline LLM inference capabilities.
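
A short sketch with the gpt4all Python bindings; the model file name is illustrative and is downloaded on first use.

```python
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # illustrative model file
with model.chat_session():
    print(model.generate("Name three uses for a local LLM.", max_tokens=200))
```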

ai-gateway

ai-gateway is an open-source API gateway that orchestrates AI model requests from multiple providers (e.g., OpenAI, Anthropic, Gemini). It includes features such as guardrails, cost control, custom endpoints, and detailed tracing (using spans), making it a backend tool for managing and routing AI API calls.

Replicate Playground

An interactive prototype on Replicate that allows users to compare AI models, rapidly prototype applications, and tweak parameters to refine results. It encourages rapid experimentation and fast feedback loops for AI model deployment and evaluation.

ClaraVerse

ClaraVerse is a privacy-first, fully local AI workspace that integrates multiple AI functionalities including Ollama LLM chat, tool calling, an agent builder, Stable Diffusion image generation, and n8n-style automation. It is designed to run entirely on your machine without any cloud backend or API keys, ensuring complete data privacy.

Resources

1 tool

Productivity

13 tools
Excalidraw

An open-source virtual whiteboard that allows users to create hand-drawn style diagrams, wireframes, and collaborate in real-time, featuring customizable tools and end-to-end encryption.

Watt Tool 8B

A model built to enhance AI workflows through precise tool selection and multi-turn dialogue, supporting complex task automation.

Paperless-AI

An automated document analyzer designed for the Paperless-ngx system, leveraging several AI APIs to analyze and tag documents.

Perplexica

Perplexica is an open-source AI-powered search engine that utilizes machine learning techniques such as similarity search and embeddings to process user queries and provide precise answers with cited sources. It offers different modes (Normal and Copilot) and supports integrations with local LLMs, making it a viable alternative to Perplexity AI.

Maxun

An open-source no-code web data extraction platform that lets users train a robot in minutes to automatically scrape websites and convert them into APIs and spreadsheets.

Omnitool

An open source AI desktop environment offering a unified, browser-based interface to interact with multiple leading AI models and services.

OfficeAI

An AI-powered office assistant tailored for Microsoft Office and WPS, offering quick solutions for tasks like formatting, formula selection, and other productivity tweaks.

Winpilot

Winpilot is an AI-driven Windows companion app that streamlines system customization and management on Windows 10/11. It enables users to modify settings, remove bloatware, retrieve system information, and perform various system tasks through a chat-based interface powered by AI.

Docling

A tool that prepares documents for generative AI pipelines, with support for audio transcription using models like Whisper.

Prompt Buddy

Prompt Buddy is a free Microsoft Teams Power App built on the Power Platform with Dataverse for Teams. It provides a dedicated space for teams to share, upvote, and discover AI prompts. The app is customizable, preloaded with Microsoft Copilot categories, and supports smooth updates without losing settings.

Blinko

Blinko is an open-source, self-hosted personal AI note tool built with TypeScript that prioritizes privacy. It allows users to instantly capture ideas as plain text with full Markdown support and leverages AI-powered Retrieval-Augmented Generation (RAG) for natural language note retrieval, all while ensuring data ownership through self-hosting.

Rytr Desktop

Rytr Desktop is a fully-featured desktop application for AI-assisted text creation and editing. It enables content generation, rewriting, tone customization, grammar improvement, and integrated plagiarism checking without subscriptions or feature limitations, targeting content creators across multiple languages.

SecureAI Tools

SecureAI Tools is an open-source solution that integrates with Paperless-ngx to enable users to create, manage, and chat with document collections. It provides features such as document collection management, configurable LLM provider options, and health checks, and is deployed via docker-compose.

Gaming Tools

10 tools
AI Aimbot

An open-source, Python-based AI-powered aiming tool designed for games that automates target acquisition.

AirHub-V2

A ROBLOX aimbot and wall hack tool incorporating advanced AI techniques to optimize targeting, offering features like universal aimbot functionality and enhanced visual aids.

Sunone Aimbot

An AI-powered aimbot for FPS games that leverages YOLOv8 and YOLOv10 models, PyTorch, and TensorRT to automatically detect and target enemies in various first-person shooter games.

Aviator Prediction App

A predictive tool that uses advanced algorithms to provide real‑time outcome predictions for the Aviator game. The app is available on Windows, iOS, and Android, and is designed to help users make more informed gameplay or betting decisions.

Open-Aimbot

Open-Aimbot is a universal open-source aim assist framework for Roblox, offering over 80 features such as detection bypasses, silent aim, configurable sensitivity, and a dynamic UI, designed for game exploitation and cheat development.

AIMr

AIMr is an AI-powered aimbot written in Python, designed for FPS games including Fortnite, Valorant, CS2, R6, COD, Apex, and more. It features advanced functionality such as recoil control, silent aim, prediction, and customizable visuals, and leverages modern AI technologies (e.g., YOLO) to remain undetected. It is available as an open-source project on GitHub, with a paid enhanced version available via Discord.

Aimmy-V2

Aimmy-V2 is an open-source, AI-based aim alignment tool designed to assist gamers with aiming in FPS games, especially those facing physical or accessibility challenges. It leverages DirectML, ONNX, and YOLOv8 for fast and efficient opponent detection, and includes features like auto-trigger, hot model/config swapping, and adjustable aiming settings.

Aviator Predictor

An AI-powered prediction tool for the Aviator betting game, featuring multi-platform support and an activation-code mechanism for premium features.

PrimeAIM

PrimeAIM is an AI-powered aim assist tool for shooter games that uses OpenCV for screen capturing and PyTorch with YOLOv5 for object/player detection. It allows features such as head/chest aiming, adjustable aim speed, customizable field-of-view (FOV) and ESP overlays, and leverages the Windows API for precise mouse control. It is designed for educational purposes and is developed in Python.

Aviator Predictor

An AI-powered prediction application for the popular Aviator game. Using an enhanced prediction algorithm and activation code system, it offers both free and premium features with multi-platform support (PC, iOS, Android) to provide accurate game outcome predictions and an improved user experience.

Computer Vision

8 tools
Ultralytics YOLO11

A suite of computer vision models for object detection, segmentation, pose estimation, and classification, integrated with Ultralytics HUB for visualization and training.
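
A short detection sketch with the ultralytics package; the checkpoint name and image URL are the usual quickstart placeholders.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # small pretrained detection checkpoint
results = model("https://ultralytics.com/images/bus.jpg")
for box in results[0].boxes:
    print(model.names[int(box.cls)], float(box.conf))  # class label and confidence
results[0].show()  # display the annotated image
```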

YOLOv8

A state-of-the-art computer vision model for object detection, segmentation, pose estimation, and classification tasks, designed for speed, accuracy, and ease of use.

YOLOv8

A state-of-the-art object detection, segmentation, and classification model known for its speed, accuracy, and ease of use in computer vision tasks.

LHM

LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds is an open‐source implementation for reconstructing and animating 3D human models from a single image. It offers GPU-optimized pipelines, Docker support, and integration with animation frameworks like ComfyUI.

ComfyUI-Florence2

A GitHub repository that integrates Microsoft’s Florence-2, an advanced vision foundation model, into ComfyUI. It enables prompt-based vision and vision-language tasks such as captioning, object detection, segmentation, and Document Visual Question Answering (DocVQA) on scanned documents.

Tesseract OCR

Tesseract OCR is an open-source optical character recognition engine that can recognize text from images. It supports over 100 languages, multiple image formats (PNG, JPEG, TIFF), and offers both an LSTM-based OCR engine and a legacy mode for character pattern recognition.
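
A common way to call it from Python is via the third-party pytesseract wrapper (the Tesseract binary itself must be installed separately); the image path is a placeholder.

```python
import pytesseract
from PIL import Image

text = pytesseract.image_to_string(Image.open("scan.png"), lang="eng")
print(text)
```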

Depth Anything V2

An interactive Hugging Face Space that leverages deep learning to generate depth maps from images. This tool extracts depth information from 2D images, which can be used for creative 3D effects, image editing, or further computer vision tasks.

New Plant Disease Detection

A Hugging Face Space application that uses AI/computer vision to detect plant diseases from images.

Research Tools

4 tools

Theorem Provers

2 tools

Security

2 tools

Chemoinformatics

1 tool