Explore AI Models
Browse 265 state-of-the-art foundation models. Filter by category, provider, or search by name.
265 models found
GPT-4o
OpenAI's most advanced multimodal model. Accepts text, image, and audio inputs with best-in-class reasoning. ~200B parameters (estimated).
DALL-E 3
Latest image generation model with improved prompt understanding and photorealistic outputs.
Whisper Large v3
Open-source speech recognition model. Robust multilingual transcription. 1.55B parameters.
Gemini 2.0 Flash
Google's fastest model with multimodal reasoning across text, image, and audio.
Claude 3.5 Sonnet
Anthropic's most intelligent model. Excels at complex reasoning, coding, and analysis.
Llama 3.1 405B
Meta's largest open-source model. State-of-the-art with 128K context. 405B parameters.
Mistral Large
Previous flagship model for reasoning and code generation. 70B parameters (estimated).
DeepSeek V3
671B MoE model with 37B active parameters. Rivals frontier models at a fraction of the cost.
Grok-2
xAI's frontier model with real-time information access and strong reasoning.
Command R+
Scalable enterprise model optimized for RAG, tool use, and multi-step agents. 104B parameters.
Stable Diffusion XL
Popular open-source image generation model with photorealistic output. 3.5B parameters.
Yi-Large
Flagship model with strong bilingual English-Chinese capabilities.
Yi-1.5 34B
Strong bilingual model excelling at coding and reasoning. 34B parameters.
Yi-1.5 9B
Compact bilingual model for diverse applications. 9B parameters.
Yi-Coder 9B
Specialized code model with strong performance for its size. 9B parameters.
StripedHyena Nous 7B
Alternative-architecture model using the Hyena operator. 7B parameters.
DBRX
Enterprise-grade MoE model optimized for efficiency. 132B total / 36B active parameters.
DBRX Instruct
Instruction-tuned DBRX for enterprise applications. 132B total parameters.
OpenELM 3B
Apple's efficient open-source language model family. 3B parameters.
Runway Gen-3 Alpha
Advanced video generation model with high fidelity and temporal coherence.
Pika 2.0
Text-to-video model with scene-level physics understanding.
Kling 1.5
Chinese video generation model with strong motion and physics simulation.
Mochi 1
Open-source text-to-video model with strong motion quality. 10B parameters.
LTX Video
Fast open-source video generation model based on DiT architecture. 2B parameters.
Midjourney v6
Industry-leading image generation with exceptional artistic quality and photorealism.
Adobe Firefly 3
Commercial-safe image generation trained on licensed content.
Ideogram 2.0
Image generation model with industry-best text rendering in images.
Playground v3
Image generation model optimized for graphic design and aesthetics.
GPT-4o Mini
Compact and cost-efficient variant of GPT-4o. Strong performance on everyday tasks. ~8B parameters (estimated).
GPT-4 Turbo
High-performance GPT-4 variant with 128K context window and improved instruction following. ~1.8T parameters (estimated).
GPT-4
OpenAI's flagship large language model with advanced reasoning. ~1.8T parameters (estimated MoE).
GPT-3.5 Turbo
Fast, affordable model optimized for chat and instruction-following. 175B parameters.
o1
OpenAI's reasoning model with chain-of-thought thinking for complex problem-solving. ~200B parameters (estimated).
o1-mini
Smaller reasoning model optimized for STEM and coding tasks. ~100B parameters (estimated).
o1-pro
Enhanced reasoning model with improved reliability for the most demanding tasks.
o3
Next-generation reasoning model with state-of-the-art benchmark performance.
o3-mini
Compact reasoning model balancing speed, cost, and strong reasoning.
o4-mini
Latest compact reasoning model with enhanced multimodal and agentic capabilities.
Whisper Large v3 Turbo
Faster variant of Whisper v3 with pruned decoder. 809M parameters.
Whisper Medium
Mid-size speech recognition model balancing accuracy and speed. 769M parameters.
Whisper Small
Compact speech recognition model for resource-constrained environments. 244M parameters.
Sora
Text-to-video generation model capable of creating realistic and imaginative scenes.
Codex
Code-focused model powering GitHub Copilot. Excels at code generation. 12B parameters.
GPT-4.1
Incremental improvement over GPT-4 with better coding performance and instruction following.
GPT-4.1 Mini
Compact variant of GPT-4.1 optimized for cost-effective deployments.
GPT-4.1 Nano
Ultra-compact GPT-4.1 variant designed for the fastest, cheapest inference at scale.
Gemini 2.5 Pro
Google's most advanced thinking model with enhanced reasoning and 1M token context.
Gemini 2.5 Flash
Fast, efficient Gemini model with thinking capabilities and multimodal understanding.
Gemini 1.5 Pro
Powerful multimodal model with 2M token context window for long-document understanding.
Gemini 1.5 Flash
Lightweight model optimized for speed and cost-efficiency with strong multimodal abilities.
Gemini 1.0 Ultra
Google's first Gemini flagship model for highly complex multimodal tasks.
Gemma 3 27B
Google's latest open-source model. State-of-the-art for its size class. 27B parameters.
Gemma 3 12B
Mid-size open-source Gemma 3 model balancing capability and efficiency. 12B parameters.
Gemma 3 4B
Compact open-source Gemma 3 model for mobile and edge deployment. 4B parameters.
Gemma 3 1B
Ultra-compact Gemma 3 model for on-device inference. 1B parameters.
Gemma 2 27B
Open-source text model with strong general capabilities. 27B parameters.
Gemma 2 9B
Efficient open-source model for balanced performance. 9B parameters.
Gemma 2 2B
Compact open-source model ideal for edge and mobile inference. 2B parameters.
CodeGemma 7B
Specialized code model based on Gemma for code generation and completion. 7B parameters.
PaLM 2
Google's advanced language model powering Bard and other AI features. 340B parameters (estimated).
Imagen 3
Highest-quality text-to-image model with photorealistic output and deep language understanding.
Veo 2
Most capable video generation model producing high-quality, realistic footage.
T5 XXL
Text-to-text transformer for NLP tasks, pre-trained on the C4 corpus. 11B parameters.
FLAN-T5 XXL
Instruction-tuned T5 model with strong zero-shot capabilities. 11B parameters.
BERT Large
Bidirectional encoder for NLU tasks. Foundation of modern NLP. 340M parameters.
SigLIP
Vision-language model for zero-shot image classification using sigmoid loss. 878M parameters.
Claude 4 Opus
Anthropic's most powerful model with unmatched reasoning, analysis, and creative capabilities.
Claude 4 Sonnet
Balanced Claude 4 model offering strong performance with improved speed and efficiency.
Claude 3.7 Sonnet
Hybrid thinking model combining instant responses with extended reasoning when needed.
Claude 3.5 Haiku
Fast, affordable Claude model for quick responses and high-throughput applications.
Claude 3 Opus
Previous flagship with exceptional performance on highly complex, open-ended tasks.
Claude 3 Sonnet
Balanced model with strong performance and practical speed for enterprise workloads.
Claude 3 Haiku
Fastest Claude 3 model for near-instant responses at scale.
Llama 4 Scout
Meta's latest MoE model with 17B active / 109B total parameters and 10M token context.
Llama 4 Maverick
Larger Llama 4 MoE variant: 17B active / 400B total parameters for complex reasoning.
Llama 3.3 70B
Cost-effective model delivering Llama 3.1 405B-level performance. 70B parameters.
Llama 3.1 70B
High-performance open-source model for commercial applications. 70B parameters.
Llama 3.1 8B
Compact Llama model for edge deployment and constrained environments. 8B parameters.
Llama 3 70B
Previous generation large Llama model with strong general performance. 70B parameters.
Llama 3 8B
Efficient previous-gen Llama model for everyday applications. 8B parameters.
Llama 2 70B
Large open-source model that kickstarted the open LLM revolution. 70B parameters.
Llama 2 13B
Mid-size Llama 2 model balancing capability and compute. 13B parameters.
Llama 2 7B
Compact foundational model for fine-tuning and experimentation. 7B parameters.
Code Llama 70B
Largest Code Llama variant for complex code generation. 70B parameters.
Code Llama 34B
Specialized coding model supporting code generation and infilling. 34B parameters.
Code Llama 7B
Compact code model for fast code completion and generation. 7B parameters.
Llama Guard 3
Safety classifier for content moderation in AI applications. 8B parameters.
Segment Anything 2
Foundation model for promptable visual segmentation in images and videos.
MusicGen Large
Music generation model creating audio from text descriptions. 3.3B parameters.
MusicGen Medium
Mid-size music generation model for text-to-music. 1.5B parameters.
SeamlessM4T v2
Multilingual multimodal translation model for speech and text. 2.3B parameters.
ImageBind
Embedding model binding 6 modalities: images, text, audio, depth, thermal, IMU.
Mistral Large 2
Mistral's flagship model for complex reasoning and multilingual tasks. 123B parameters.
Mistral Medium
Balanced model for business applications at moderate cost.
Mistral Small 3
24B parameter open-source model competitive with much larger alternatives.
Mistral Small
Efficient model for simple tasks, very low latency, and bulk processing.
Mistral Nemo
Open-source model co-developed with NVIDIA. 128K context. 12B parameters.
Mistral 7B
The original Mistral model that outperformed Llama 2 13B. 7.3B parameters.
Mixtral 8x22B
Large MoE model. 141B total / 39B active parameters. Strong multilingual.
Mixtral 8x7B
Efficient MoE model. 46.7B total / 12.9B active parameters.
Codestral 25.01
Latest specialized coding model for code generation and review. 22B parameters.
Codestral Mamba
Linear-time coding model using Mamba architecture. 7B parameters.
Pixtral Large
Frontier multimodal model for document understanding and image analysis. 124B parameters.
Pixtral 12B
Open-source multimodal model with vision capabilities. 12B parameters.
DeepSeek R1
Reasoning model using RL to achieve o1-level performance. 671B total / 37B active parameters.
DeepSeek R1 Distill Qwen 32B
R1 reasoning distilled into Qwen 2.5 32B architecture. 32B parameters.
DeepSeek R1 Distill Llama 70B
R1 reasoning distilled into Llama 3.1 70B architecture. 70B parameters.
DeepSeek R1 Distill Qwen 7B
R1 reasoning distilled into compact 7B architecture. 7B parameters.
DeepSeek R1 Distill Qwen 1.5B
Ultra-compact R1 distillation for edge deployment. 1.5B parameters.
DeepSeek V2.5
Previous generation MoE model with strong general capabilities. 236B total parameters.
DeepSeek Coder V2
Code model supporting 338 programming languages. 236B total / 21B active parameters.
DeepSeek Coder 33B
Specialized code generation model trained from scratch. 33B parameters.
DeepSeek Coder 6.7B
Compact code model for fast development assistance. 6.7B parameters.
Janus Pro 7B
Multimodal model unifying visual understanding and generation. 7B parameters.
Grok-3
xAI's most powerful model topping multiple benchmarks. Real-time information access.
Grok-3 Mini
Lightweight thinking model with controllable reasoning depth.
Grok-2 Mini
Compact Grok variant for fast, cost-effective inference.
Grok-1
xAI's first open-source model released as MoE. 314B total parameters.
Command A
Latest enterprise model with 256K context and multilingual agentic performance. 111B parameters.
Command R
Efficient enterprise model for production RAG workloads. 35B parameters.
Command R7B
Compact open-source model for fast enterprise inference. 7B parameters.
Embed v4
Latest embedding model for semantic search, classification, and clustering.
Embed v3
Multilingual embedding model supporting 100+ languages.
Aya 23 35B
Open-source multilingual model covering 23 languages. 35B parameters.
Aya 23 8B
Compact multilingual model for 23 languages. 8B parameters.
Aya Expanse 32B
Instruction-tuned multilingual model excelling in 23 languages. 32B parameters.
Stable Diffusion 3.5 Large
Latest Stable Diffusion with improved text rendering and composition. 8B parameters.
Stable Diffusion 3.5 Medium
Mid-size diffusion model balancing quality and speed. 2.5B parameters.
Stable Diffusion 1.5
Classic and widely-used image generation model with huge ecosystem. 860M parameters.
Stable Video Diffusion
Open-source video generation model for short clips from images. 1.5B parameters.
Stable Audio 2.0
AI music and sound generation from text descriptions. 1.1B parameters.
Stable LM 2 12B
Open-source language model for general-purpose text generation. 12B parameters.
Phi-4
Small language model excelling at reasoning and STEM tasks. 14B parameters.
Phi-3.5 MoE
Mixture-of-experts variant of Phi-3.5 for improved efficiency. 42B total / 6.6B active.
Phi-3.5 Mini
Compact model with 128K context for edge/mobile deployment. 3.8B parameters.
Phi-3 Medium
Mid-size Phi-3 model with strong reasoning capabilities. 14B parameters.
Phi-3 Mini
Compact language model for on-device AI applications. 3.8B parameters.
Phi-2
Efficient small model demonstrating emergent capabilities. 2.7B parameters.
Florence-2 Large
Unified vision model for captioning, detection, segmentation, and OCR. 770M parameters.
Florence-2 Base
Foundation vision model for diverse visual understanding tasks. 230M parameters.
Orca 2 13B
Research model trained with improved reasoning strategies. 13B parameters.
WizardLM 2 8x22B
MoE model with enhanced instruction following. 141B total parameters.
Nova Pro
Highly capable multimodal model balancing accuracy, speed, and cost.
Nova Lite
Fast, low-cost multimodal model processing images, video, and text.
Nova Micro
Text-only model with the lowest latency at very low cost.
Nova Canvas
Image generation model for creative and professional content creation.
Nova Reel
Video generation model for short-form content creation.
Titan Text Premier
Amazon's flagship text model for enterprise workloads.
Titan Embeddings v2
Text embedding model for search and retrieval applications.
Qwen 3 235B A22B
Largest Qwen 3 MoE model with hybrid thinking. 235B total / 22B active parameters.
Qwen 3 32B
Strong general-purpose model with hybrid thinking. 32B parameters.
Qwen 3 14B
Mid-size Qwen 3 model balancing power and efficiency. 14B parameters.
Qwen 3 8B
Compact Qwen 3 model for versatile applications. 8B parameters.
Qwen 3 4B
Small Qwen 3 model for edge deployment. 4B parameters.
Qwen 3 1.7B
Ultra-compact Qwen 3 model for mobile devices. 1.7B parameters.
Qwen 3 0.6B
Tiny Qwen 3 model for embedded and IoT applications. 0.6B parameters.
Qwen 2.5 72B
Large model competitive with Llama 3.1 405B on reasoning benchmarks. 72B parameters.
Qwen 2.5 32B
Strong mid-size model for diverse tasks. 32B parameters.
Qwen 2.5 14B
Efficient model for production deployments. 14B parameters.
Qwen 2.5 7B
Compact model for fast inference and fine-tuning. 7B parameters.
Qwen 2.5 3B
Small model for resource-constrained environments. 3B parameters.
Qwen 2.5 Coder 32B
Top open-source coding model rivaling GPT-4o on coding benchmarks. 32B parameters.
Qwen 2.5 Coder 7B
Compact code model for fast code completion. 7B parameters.
Qwen VL Max
Flagship vision-language model for document/chart understanding and visual QA.
Qwen VL Plus
Balanced vision-language model for image-based tasks.
QwQ 32B
Reasoning model by Alibaba matching o1-mini level performance. 32B parameters.
Yi-Lightning
Fast and powerful model rivaling frontier models at a fraction of cost.
Kolors
Open-source text-to-image model with strong Chinese text support. 8B parameters.
PixArt-Σ
Open-source DiT-based image generation with 4K resolution support. 900M parameters.
E5 Mistral 7B
LLM-based text embedding model with strong retrieval performance. 7B parameters.
BGE-M3
Multi-lingual, multi-granularity embedding model. 568M parameters.
GTE-Qwen2 7B
Text embedding model based on Qwen2 for retrieval tasks. 7B parameters.
NV-Embed v2
State-of-the-art generalist embedding model. 7B parameters.
Jina Embeddings v3
Multilingual multi-task embedding model with matryoshka support. 572M parameters.
Nomic Embed v1.5
Open-source embedding model with improved performance. 137M parameters.
RT-2
Vision-Language-Action model for robotic control. 55B parameters.
Octo
Open-source generalist robot policy for dexterous manipulation.
Jamba 1.5 Large
Hybrid SSM-Transformer model with 256K context for enterprise RAG. 398B total / 94B active.
Jamba 1.5 Mini
Compact Jamba for fast enterprise workloads. 52B total / 12B active parameters.
Nemotron 70B
Model fine-tuned with RLHF, excelling as a reward model and assistant. 70B parameters.
Nemotron 340B
Large-scale model optimized for enterprise synthetic data generation. 340B parameters.
NVLM 72B
Frontier multimodal LLM matching proprietary models on vision-language tasks. 72B parameters.
Llama 3.1 Nemotron 70B
NVIDIA-optimized Llama 3.1 70B with enhanced helpfulness. 70B parameters.
Falcon 180B
One of the largest open-source autoregressive models. 180B parameters.
Falcon 40B
Strong open-source model that once topped the Hugging Face Open LLM Leaderboard. 40B parameters.
Falcon 11B
Compact Falcon model for efficient deployment. 11B parameters.
Falcon 7B
Base Falcon model trained on the RefinedWeb dataset. 7B parameters.
Falcon 3 10B
Latest Falcon generation with improved training methodology. 10B parameters.
Falcon Mamba 7B
First large-scale pure Mamba architecture model. 7B parameters.
BLOOM
Open multilingual model supporting 46 languages, created by 1000+ researchers. 176B parameters.
BLOOMZ
Instruction-tuned BLOOM for cross-lingual zero-shot task generalization. 176B parameters.
StarCoder 2 15B
Open-source code LLM trained on The Stack v2, supporting 600+ programming languages. 15B parameters.
StarCoder 2 7B
Compact code model for fast code generation. 7B parameters.
StarCoder 2 3B
Small code model for local development tools. 3B parameters.
StarCoder
Original open-source code model trained on permissive data. 15.5B parameters.
SmolLM2 1.7B
Tiny but capable model for on-device AI. 1.7B parameters.
SmolLM2 360M
Ultra-compact model for embedded applications. 360M parameters.
SmolLM2 135M
Smallest capable instruction-following model. 135M parameters.
SmolVLM 2 2.2B
Compact vision-language model for image and video understanding. 2.2B parameters.
FLUX.1 [dev]
State-of-the-art open image generation with exceptional prompt following. 12B parameters.
FLUX.1 [schnell]
Fastest FLUX variant for real-time image generation. 12B parameters.
FLUX.1 [pro]
Premium FLUX model with highest quality outputs. 12B parameters.
Bark
Open-source TTS generating realistic speech, music, and sound effects.
Parler TTS Large
Controllable text-to-speech with natural speaker descriptions. 2.3B parameters.
Dia 1.6B
TTS model generating realistic dialogue with emotions. 1.6B parameters.
Moshi
Real-time speech-to-speech foundation model for natural conversation. 7B parameters.
Mars5 TTS
Novel two-stage TTS model excelling at prosody and expression.
F5-TTS
Flow-matching based text-to-speech with zero-shot voice cloning.
Pi
Personal AI assistant designed for emotional intelligence and supportive conversations.
Inflection 2.5
Delivers 97.5% of GPT-4's performance while retaining strong EQ capabilities.
Reka Core
Frontier multimodal model processing text, images, video, and audio natively.
Reka Flash
Fast multimodal model for cost-effective applications. 21B parameters.
Reka Edge
Compact multimodal model for edge deployment. 7B parameters.
Sonar Reasoning Pro
Advanced reasoning model with real-time web search and citations.
Sonar Pro
Enhanced search-augmented model with grounded, cited answers.
Sonar
Fast search-augmented model providing answers with web citations.
GLM-4 9B
Bilingual model with strong tool-use and 128K context. 9B parameters.
CogVideoX 5B
Open-source text-to-video model with strong generation quality. 5B parameters.
CogView 3 Plus
Text-to-image model with relay-based generation for high quality.
OpenELM 1.1B
Compact Apple language model for on-device AI. 1.1B parameters.
AIMv2 Large
Autoregressive vision encoder pre-trained with multimodal objectives. 304M parameters.
xLAM 2 70B
Large action model for function calling and multi-step agentic tasks. 70B parameters.
xLAM 2 8B
Compact action model excelling at tool use and function calling. 8B parameters.
CodeGen 2.5 7B
Multi-turn program synthesis model for code generation. 7B parameters.
Hunyuan Large
Tencent's largest open-source MoE model. 389B total / 52B active parameters.
HunyuanVideo
Open-source video generation model with strong physical understanding. 13B parameters.
Hunyuan3D 2.0
3D asset generation model creating high-quality meshes from images or text.
ERNIE 4.0
Baidu's flagship model with strong Chinese-English bilingual capabilities.
ERNIE-ViLG 2.0
Text-to-image model from Baidu with knowledge-enhanced generation.
MPT-30B
Open-source model trained from scratch with commercial license. 30B parameters.
MPT-7B
Commercially usable open-source base model. 7B parameters.
Pythia 12B
Research model suite for studying LLM training dynamics. 12B parameters.
GPT-NeoX 20B
Large open-source autoregressive model. 20B parameters.
GPT-Neo 2.7B
Early open-source GPT alternative trained on The Pile. 2.7B parameters.
MiniMax-01
Powerful model with 4M token context window. 456B total parameters.
MiniMax-Text-01
Lightning-fast text model with hybrid attention architecture.
Exaone 3.5 32B
Bilingual model excelling at instruction following. 32B parameters.
Exaone 3.5 7.8B
Compact bilingual model for diverse applications. 7.8B parameters.
Exaone 3.5 2.4B
Small but capable bilingual model. 2.4B parameters.
Vicuna 33B
Fine-tuned LLaMA model trained on ShareGPT conversations. 33B parameters.
Vicuna 13B
Popular fine-tuned chat model based on LLaMA. 13B parameters.
Vicuna 7B
Compact chat model fine-tuned on user conversations. 7B parameters.
Zephyr 7B
DPO-aligned model based on Mistral 7B with strong chat abilities. 7B parameters.
Nous Hermes 2 Mixtral 8x7B
Instruction-tuned Mixtral with strong general performance. 46.7B total parameters.
Nous Hermes 2 34B
Strong instruct model based on Yi-34B. 34B parameters.
OpenChat 3.5 7B
High-quality open model rivaling ChatGPT. 7B parameters.
Neural Chat 7B
Intel-optimized chat model based on Mistral 7B. 7B parameters.
Solar 10.7B
Depth-upscaled model with strong performance for its size. 10.7B parameters.
InternLM 2.5 20B
Strong open-source model with excellent tool use. 20B parameters.
InternLM 2.5 7B
Compact model with advanced reasoning and tool use. 7B parameters.
InternVL 2.5 78B
Open-source multimodal model matching GPT-4o on vision tasks. 78B parameters.
Baichuan 2 13B
Strong bilingual model with emphasis on Chinese NLP. 13B parameters.
Baichuan 2 7B
Compact bilingual model for Chinese-English tasks. 7B parameters.
ChatGLM3 6B
Bilingual chat model with function calling support. 6B parameters.
Aquila 2 70B
Bilingual model from Beijing Academy of AI. 70B parameters.
MAP-Neo 7B
Fully open-source bilingual model with transparent training. 7B parameters.
OLMo 2 13B
Fully open model with open data, weights, and code by Allen AI. 13B parameters.
OLMo 2 7B
Compact fully open model for research and deployment. 7B parameters.
Tülu 3 70B
State-of-the-art post-trained open model by Allen AI. 70B parameters.
Molmo 72B
Open multimodal model rivaling proprietary models on vision tasks. 72B parameters.
LeRobot
Open-source robotics framework with pretrained manipulation policies.
Med-PaLM 2
Medical-specialized model with expert-level clinical reasoning.
BioMistral 7B
Biomedical domain model fine-tuned on PubMed data. 7B parameters.
Codestral
Mistral's specialized coding model for code generation, review, and debugging.
Meditron 70B
Medical LLM adapted from Llama 2 for healthcare. 70B parameters.