
Commit 6d59106

added open source lesson
1 parent 92351ea commit 6d59106

File tree

5 files changed: +79 -2 lines changed


16-open-source-models/README.MD

+77
@@ -0,0 +1,77 @@

## Introduction

The world of open-source LLMs is exciting and constantly evolving. This lesson aims to provide an in-depth look at open source models. If you are looking for information on how proprietary models compare to open source models, go to the ["Exploring and Comparing Different LLMs" lesson](../02-exploring-and-comparing-different-llms/README.md?WT.mc_id=academic-105485-koreyst). This lesson will also cover the topic of fine-tuning, but a more detailed explanation can be found in the ["Fine-Tuning LLMs" lesson](../18-fine-tuning/README.md?WT.mc_id=academic-105485-koreyst).

## Learning goals

- Gain an understanding of open source models
- Understand the benefits of working with open source models
- Explore the open models available on Hugging Face and the Azure AI Studio

## What are Open Source Models?

Open source software has played a crucial role in the growth of technology across various fields. The Open Source Initiative (OSI) has defined [10 criteria for software](https://opensource.org/osd?WT.mc_id=academic-105485-koreyst) to be classified as open source. The source code must be openly shared under a license approved by the OSI.

While the development of LLMs has similar elements to developing software, the process is not exactly the same. This has brought much discussion in the community on the definition of open source in the context of LLMs. For a model to be aligned with the traditional definition of open source, the following information should be publicly available:

- Datasets used to train the model.
- Full model weights produced as part of the training.
- The evaluation code.
- The fine-tuning code.
- Full model weights and training metrics.

There are currently only a few models that meet these criteria. The [OLMo model created by the Allen Institute for Artificial Intelligence (AllenAI)](https://huggingface.co/allenai/OLMo-7B?WT.mc_id=academic-105485-koreyst) is one that fits this category.

For this lesson, we will refer to the models as "open models" going forward, as they may not match the criteria above at the time of writing.

## Benefits of Open Models

**Highly Customizable** - Since open models are released with detailed training information, researchers and developers can modify the model's internals. This enables the creation of highly specialized models that are fine-tuned for a specific task or area of study. Some examples of this are code generation, mathematical operations, and biology.

**Cost** - The cost per token for using and deploying these models is lower than that of proprietary models. When building Generative AI applications, you should weigh performance against price for your use case when working with these models.

![Model Cost](./images/model-price.png)

Source: Artificial Analysis
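
To illustrate the price side of that trade-off, here is a minimal back-of-the-envelope sketch. The per-token prices and monthly volume below are hypothetical placeholders, not real quotes; plug in current figures from a source such as Artificial Analysis for your own comparison:

```python
# Hypothetical cost comparison sketch. The prices and usage volume below are made-up
# placeholders for illustration only, not real quotes for any specific model or provider.
PRICE_PER_1M_TOKENS_USD = {
    "open-model-hosted": 0.50,   # hypothetical price
    "proprietary-model": 10.00,  # hypothetical price
}

monthly_tokens = 50_000_000  # assumed monthly usage for the example

for model, price in PRICE_PER_1M_TOKENS_USD.items():
    cost = monthly_tokens / 1_000_000 * price
    print(f"{model}: ${cost:,.2f} per month")
```
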
**Flexibility** - Working with open models enables you to be flexible in terms of using different models or combining them. An example of this is the [HuggingChat Assistants](https://huggingface.co/chat?WT.mc_id=academic-105485-koreyst), where users can select the model being used directly in the user interface:

![Choose Model](./images/choose-model.png)

## Exploring Different Open Models

### Llama 2

[Llama 2](https://huggingface.co/meta-llama?WT.mc_id=academic-105485-koreyst), developed by Meta, is an open model that is optimized for chat-based applications. This is due to its fine-tuning method, which included a large amount of dialogue and human feedback. With this method, the model produces results that are more aligned with human expectations, which provides a better user experience.

Some examples of fine-tuned versions of Llama include [Japanese Llama](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b?WT.mc_id=academic-105485-koreyst), which specializes in Japanese, and [Llama Pro](https://huggingface.co/TencentARC/LLaMA-Pro-8B?WT.mc_id=academic-105485-koreyst), which is an enhanced version of the base model.
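
If you want to try an open chat model yourself, here is a minimal sketch using the Hugging Face `transformers` library. The model id `meta-llama/Llama-2-7b-chat-hf` is only an illustration and is gated, so you need to accept Meta's license on Hugging Face first; `device_map="auto"` assumes the `accelerate` package is installed:

```python
# Minimal sketch of running an open chat model with the Hugging Face transformers library.
# Assumes transformers, torch and accelerate are installed, and that you have accepted the
# license for the gated model id used below (any open text-generation model also works).
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # illustrative model id
    device_map="auto",                      # place weights on GPU if one is available
)

prompt = "Explain what an open model is in one sentence."
result = chat(prompt, max_new_tokens=64, do_sample=False)
print(result[0]["generated_text"])
```
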
### Mistral

[Mistral](https://huggingface.co/mistralai?WT.mc_id=academic-105485-koreyst) is an open model with a strong focus on high performance and efficiency. It uses the Mixture-of-Experts approach, which combines a group of specialized expert models into one system where, depending on the input, certain models are selected to be used. This makes the computation more effective, as models are only addressing the inputs they are specialized in.

Some examples of fine-tuned versions of Mistral include [BioMistral](https://huggingface.co/BioMistral/BioMistral-7B?text=Mon+nom+est+Thomas+et+mon+principal?WT.mc_id=academic-105485-koreyst), which is focused on the medical domain, and [OpenMath Mistral](https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1-hf?WT.mc_id=academic-105485-koreyst), which performs mathematical computation.
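
To make the routing idea concrete, here is a toy sketch in plain Python. It is not Mistral's actual implementation: a made-up router scores a set of made-up experts for each input, and only the top-scoring experts are run:

```python
# Toy illustration of Mixture-of-Experts routing: a router scores each expert for a given
# input and only the top-k experts are run. Experts and scores here are made up; a real
# router is a learned layer inside the model, not random scoring.
import random

EXPERTS = {
    "code":    lambda text: f"[code expert] handled: {text}",
    "math":    lambda text: f"[math expert] handled: {text}",
    "general": lambda text: f"[general expert] handled: {text}",
    "biology": lambda text: f"[biology expert] handled: {text}",
}

def route(text: str, top_k: int = 2) -> list[str]:
    scores = {name: random.random() for name in EXPERTS}          # stand-in for learned scores
    chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]  # keep only the best experts
    # Only the selected experts do any work, which is what keeps the computation efficient.
    return [EXPERTS[name](text) for name in chosen]

print(route("What is 2 + 2?"))
```
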
### Falcon

[Falcon](https://huggingface.co/tiiuae?WT.mc_id=academic-105485-koreyst) is an LLM created by the Technology Innovation Institute (**TII**). The Falcon-40B has 40 billion parameters and has been shown to perform better than GPT-3 while using a smaller compute budget. This is due to its use of the FlashAttention algorithm and multi-query attention, which enable it to cut down on memory requirements at inference time. With these reduced inference-time requirements, Falcon-40B is suitable for chat applications.

Some examples of fine-tuned versions of Falcon are [OpenAssistant](https://huggingface.co/OpenAssistant/falcon-40b-sft-top1-560?WT.mc_id=academic-105485-koreyst), an assistant built on open models, and [GPT4ALL](https://huggingface.co/nomic-ai/gpt4all-falcon?WT.mc_id=academic-105485-koreyst), which delivers higher performance than the base model.
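
As a rough back-of-the-envelope illustration of why multi-query attention reduces memory at inference time, the sketch below compares the key/value cache size when every attention head keeps its own keys and values versus when all heads share a single key/value head. The model dimensions are hypothetical and chosen only to show the ratio, not Falcon's actual configuration:

```python
# Back-of-the-envelope comparison of KV-cache memory for multi-head vs. multi-query attention.
# All dimensions below are hypothetical, picked only to illustrate the ratio.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    # Keys and values are both cached per layer, hence the factor of 2.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

layers, heads, head_dim, seq_len = 60, 64, 128, 2048  # hypothetical model shape

multi_head = kv_cache_bytes(layers, kv_heads=heads, head_dim=head_dim, seq_len=seq_len)
multi_query = kv_cache_bytes(layers, kv_heads=1, head_dim=head_dim, seq_len=seq_len)

print(f"multi-head  KV cache: {multi_head / 2**30:.1f} GiB")
print(f"multi-query KV cache: {multi_query / 2**30:.2f} GiB")
print(f"reduction factor: {multi_head / multi_query:.0f}x")
```
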
## How to Choose

There is no one answer for choosing an open model. A good place to start is by using the Azure AI Studio's filter-by-task feature. This will help you understand what types of tasks the model has been trained for. Hugging Face also maintains an LLM Leaderboard which shows you the best performing models based on certain metrics.

When looking to compare LLMs across the different types, [Artificial Analysis](https://artificialanalysis.ai/?WT.mc_id=academic-105485-koreyst) is another great resource:

![Model Quality](./images/model-quality.png)

Source: Artificial Analysis

If working on a specific use case, searching for fine-tuned versions that are focused on the same area can be effective. Experimenting with multiple open models to see how they perform according to your and your users' expectations is another good practice.
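
As one way to start that experimentation, here is a minimal sketch that uses the `huggingface_hub` library to shortlist popular models for a task. The task tag and sort order are illustrative, and each returned model id could be fed into the `transformers` pipeline shown earlier to compare outputs:

```python
# Sketch of programmatically shortlisting open models for a task with the huggingface_hub
# library. The task tag and sort order are illustrative; adjust them for your own use case.
from huggingface_hub import HfApi

api = HfApi()
candidates = api.list_models(
    filter="text-generation",  # task tag to filter on
    sort="downloads",          # surface widely used models first
    direction=-1,
    limit=5,
)

for model in candidates:
    # Each entry is a ModelInfo object; its id can be passed straight to transformers.pipeline.
    # Note: older versions of huggingface_hub expose this attribute as `modelId` instead.
    print(model.id)
```
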
## Next Steps

The best part about open models is that you can get started working with them pretty quickly. Check out the [Azure AI Studio Model Catalog](https://ai.azure.com?WT.mc_id=academic-105485-koreyst), which features a specific Hugging Face collection with the models we discussed here.

## Learning does not stop here, continue the Journey

After completing this lesson, check out our [Generative AI Learning collection](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) to continue leveling up your Generative AI knowledge!

3 image files added (268 KB, 69.6 KB, 79.3 KB)

README.md

+2 -2
@@ -81,9 +81,9 @@ Find spelling errors, code errors or have a suggestion? [Raise an issue](https:/
| 13 | [Securing Your Generative AI Applications](./13-securing-ai-applications/README.md?WT.mc_id=academic-105485-koreyst) | **Learn:** The threats and risks to AI systems and methods to secure these systems. | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
| 14 | [The Generative AI Application Lifecycle](./14-the-generative-ai-application-lifecycle/README.md?WT.mc_id=academic-105485-koreyst) | **Learn:** The tools and metrics to manage the LLM Lifecycle and LLMOps | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
| 15 | [Retrieval Augmented Generation (RAG) and Vector Databases](./15-rag-and-vector-databases/README.md?WT.mc_id=academic-105485-koreyst) | **Build:** An application using a RAG Framework to retrieve embeddings from a Vector Databases | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
- | 16 | Open Source Models and Hugging Face | **Build:** An application using open source models available on Hugging Face | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
+ | 16 | [Open Source Models and Hugging Face](./16-open-source-models/README.MD) | **Build:** An application using open source models available on Hugging Face | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
| 17 | [AI Agents](./17-ai-agents/README.md?WT.mc_id=academic-105485-koreyst) | **Build:** An application using an AI Agent Framework | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
- | 18 | [Fine-Tuning](./18-fine-tuning/README.md?WT.mc_id=academic-105485-koreyst) LLMs | **Learn:** The what, why and how of fine-tuning LLMs | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |
+ | 18 | [Fine-Tuning LLMs](./18-fine-tuning/README.md?WT.mc_id=academic-105485-koreyst) | **Learn:** The what, why and how of fine-tuning LLMs | [Learn More](https://aka.ms/genai-collection?WT.mc_id=academic-105485-koreyst) |

### 🌟 Special thanks
