
Falcon Models

Falcon is a family of generative large language models (LLMs) built to advance applications and use cases that help future-proof our world. Today, the Falcon 2, 180B, 40B, 7.5B, and 1.3B parameter models, together with our high-quality RefinedWeb dataset, form a suite of offerings.

Falcon 2

Falcon 2 11B | Falcon 2 11B VLM

Today, we have unveiled Falcon 2: we are proud to announce that it is open source, multilingual, and multimodal, and it is the only AI model with vision-to-language capabilities. The new Falcon 2 11B outperforms Meta's Llama 3 8B and performs on par with Google's leading Gemma 7B model, as independently verified by the Hugging Face leaderboard. Next, we are looking to add a 'Mixture of Experts' architecture to enhance Falcon 2's capabilities even further.
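
As an illustration, the minimal sketch below shows one way Falcon 2 11B might be run for text generation with the Hugging Face transformers library. The repository id tiiuae/falcon-11B, the precision, and the generation settings are assumptions for illustration; check the model card for the current id and license terms.

```python
# Minimal sketch: running Falcon 2 11B for text generation with Hugging Face
# transformers. The model id "tiiuae/falcon-11B" is an assumption; verify it on
# the Hugging Face model card before use.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-11B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 11B weights on a large GPU
    device_map="auto",           # let accelerate place layers across available devices
)

prompt = "The Falcon models were trained on"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```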

The download of Falcon 2 is subject to our Terms & Conditions and Acceptable Use Policy.


Falcon 7B

Falcon-7B is a 7B-parameter causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license.
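
For readers who want to try the model, the sketch below loads Falcon-7B through the transformers pipeline API. The repository id tiiuae/falcon-7b is the public Hugging Face checkpoint; the generation parameters are illustrative rather than recommended settings.

```python
# Minimal sketch: text generation with Falcon-7B via the transformers pipeline API.
# Generation parameters below are illustrative only. Older transformers releases
# may additionally require trust_remote_code=True for Falcon checkpoints.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

result = generator(
    "RefinedWeb is a large-scale web dataset that",
    max_new_tokens=64,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```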

Falcon 40B

Falcon 40B was the world's top-ranked open-source AI model when it launched. It has 40 billion parameters and was trained on one trillion tokens. For two months following its launch, Falcon 40B ranked #1 on Hugging Face's leaderboard for open-source large language models (LLMs). Offered completely royalty-free, weights included, Falcon 40B is revolutionary, helping democratize AI and make it a more inclusive technology.

The multilingual Falcon 40B LLM works well with English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish languages. The foundational LLM serves as a versatile base model that can be fine-tuned for specific requirements or objectives.
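
As one illustration of adapting the base model, the sketch below attaches LoRA adapters to Falcon 40B with the peft library for parameter-efficient fine-tuning. The target module name "query_key_value", the adapter rank, and the other hyperparameters are assumptions for illustration, not a recommended recipe.

```python
# Minimal sketch: parameter-efficient fine-tuning of Falcon 40B with LoRA adapters
# via the peft library. The target module name, rank, and dropout are illustrative
# assumptions; only the small adapter matrices are trained, not the full model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "tiiuae/falcon-40b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon's fused attention projection (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the full parameter count
```

The wrapped model can then be passed to a standard training loop or trainer of your choice, with the base weights kept frozen.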

With the launch of Falcon 40B, we issued a Call for Proposals inviting scientists, researchers, and innovators to submit inspiring use cases and applications; the most exceptional proposals receive an investment of training compute to work with the model and shape transformative solutions. The model used only 75 percent of GPT-3's training compute, 40 percent of Chinchilla AI's, and 80 percent of PaLM-62B's.

One of the core differences in the development of Falcon was the quality of the training data. The size of the pre-training data collected for Falcon 40B was nearly five trillion tokens gathered from public web crawls (~80%), research papers, legal text, news, literature, and social media conversations.

Since LLMs are particularly sensitive to the data they are trained on, our team built a custom data pipeline to extract high-quality pre-training data using extensive filtering and deduplication, implemented both at the sample level and at the string level.
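
The pipeline itself is not reproduced here, but the toy sketch below illustrates the two levels mentioned above: dropping exact-duplicate documents (sample level) and removing repeated spans across documents (string level). The heuristics are deliberately simplistic; the real pipeline behind RefinedWeb uses far more extensive filtering, fuzzy matching, and exact substring deduplication.

```python
# Toy illustration of sample-level and string-level deduplication; not the
# actual RefinedWeb pipeline, which is considerably more sophisticated.
import hashlib

def sample_level_dedup(documents):
    """Drop documents whose full text has already been seen (exact duplicates)."""
    seen, unique_docs = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique_docs.append(doc)
    return unique_docs

def string_level_dedup(documents, span=50):
    """Remove fixed-size character spans that repeat across documents (illustrative)."""
    seen_spans, cleaned = set(), []
    for doc in documents:
        kept = []
        for start in range(0, len(doc), span):
            chunk = doc[start:start + span]
            if chunk in seen_spans:
                continue  # skip a span already emitted elsewhere in the corpus
            seen_spans.add(chunk)
            kept.append(chunk)
        cleaned.append("".join(kept))
    return cleaned

corpus = ["the same page crawled twice", "the same page crawled twice", "a different page"]
print(string_level_dedup(sample_level_dedup(corpus)))
```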

The download of Falcon 40B is subject to our Terms & Conditions and Acceptable Use Policy.


Falcon 180B

Falcon 180B is a super-powerful language model with 180 billion parameters, trained on 3.5 trillion tokens. It currently sits at the top of the Hugging Face leaderboard for pre-trained open large language models and is available for both research and commercial use.

This model performs exceptionally well across reasoning, coding, proficiency, and knowledge tests, even outperforming competitors such as Meta's LLaMA 2.

Among closed-source models, it ranks just behind OpenAI's GPT-4, and it performs on par with Google's PaLM 2 Large, which powers Bard, despite being half that model's size.

The download of Falcon 180B is subject to our Terms & Conditions and Acceptable Use Policy.

Falcon Demo

Try out Falcon on this demo platform!

Access Falcon

Responsive, Intuitive, Rewarding