Run open-source
AI models
with a single API

Serverless inference API for cutting-edge open-source models.
Improve your GenAI App by switching to better models with one line of code. OpenAI libraries compatible.

All the latest industry-leading models are served here as a production-ready API.

Thanks to community contributions, open-source models are catching up quickly and can even outperform closed models depending on the use case. We provide easy access to these cutting-edge models, from chat and image to embedding and reranking, through a simple API, so developers can use them the same way they use OpenAI's APIs.
mistralai/Mixtral-8x22B-Instruct-v0.1 · chat · Multilingual
google/gemma-7b-it · chat · English
meta-llama/Meta-Llama-3-70B-Instruct · chat · English
Snowflake/snowflake-arctic-instruct · chat · English
Qwen/Qwen1.5-110b-Chat · chat · English
BAAI/bge-m3 · embedding · Multilingual
WhereIsAI/UAE-Large-V1 · embedding · English
microsoft/multilingual-e5-large · embedding · Multilingual
cyberagent/calm2-7b-chat · chat · Japanese
Learn more →

OpenAI libraries compatible

If you are already using the OpenAI libraries, you can switch quickly by replacing just the "base_url" and "model" values.
Learn more →
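As a minimal sketch of that switch (the endpoint URL below is a placeholder, not the service's real address): because the request body follows the OpenAI chat-completions schema, only the base URL and model name change, shown here with just the Python standard library so the wire format is visible.

```python
import json
import urllib.request

# Hypothetical sketch: an OpenAI-compatible chat-completions request.
# The base URL is a placeholder assumption, not the real service endpoint.
BASE_URL = "https://api.example.com/v1"         # was: https://api.openai.com/v1
MODEL = "meta-llama/Meta-Llama-3-70B-Instruct"  # was e.g.: gpt-4o

def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request whose JSON body matches the OpenAI chat schema."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("Hello!", "YOUR_API_KEY")
print(req.full_url)  # https://api.example.com/v1/chat/completions
```

With the official openai Python client the same switch is just the `base_url` and `model` constructor/call arguments; the rest of the application code stays as-is.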

Seamless integration with Teammate AI Services

Teammate Infer is available by default on "Lang" and "Aug". You can quickly start developing your own GenAI apps and RAG pipelines without setting up API connections to platforms like OpenAI or Amazon Bedrock.

Privacy is completely protected

We’re NOT a model builder, but a HOST. We will not (and have no incentive to!) store or use your data to train new models. We believe this unique position gives GenAI app builders even more confidence in their privacy and autonomy.
Join our newsletter
© 2024 Teammate Pte. Ltd. All Rights Reserved.