Top 5 Must-Have LLM Open-Source Projects

Two bonus resources are available for those who read till the end.

Mrinal Walia
7 min read · Jan 29, 2024

Large Language Models (LLMs) have taken over the NLP and AI community, and many of the most capable tools built around them are open source. In this article, you will discover some of the best open-source LLM projects, each with a brief overview and its GitHub link.

List of Projects ⬇️

1. MetaGPT — 33.8K stars

MetaGPT is a multi-agent framework: you give it a one-line requirement, and it returns a PRD, design, tasks, repo, and more.

  1. MetaGPT takes a one-line requirement as input and produces user stories, competitive analysis, requirements, data structures, and APIs.
  2. Internally, MetaGPT includes agents acting as product managers, architects, project managers, and engineers. It models the entire software development process with carefully orchestrated standard operating procedures (SOPs).
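
The role hand-off described above can be pictured with a toy pipeline. This is only a minimal sketch in plain Python and not MetaGPT's API: the `Artifact` class, the role functions, and their canned outputs are hypothetical stand-ins for what would be LLM-backed agents in the real framework.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """Shared workspace that each role reads from and writes to."""
    requirement: str
    documents: dict = field(default_factory=dict)


# Each "role" below is a hypothetical stand-in for an LLM-backed agent.
def product_manager(art: Artifact) -> None:
    art.documents["prd"] = f"PRD for: {art.requirement}"


def architect(art: Artifact) -> None:
    art.documents["design"] = f"Design based on [{art.documents['prd']}]"


def project_manager(art: Artifact) -> None:
    art.documents["tasks"] = ["set up repo", "implement core", "write tests"]


def engineer(art: Artifact) -> None:
    art.documents["repo"] = {task: "done" for task in art.documents["tasks"]}


# Here the "standard operating procedure" is simply the fixed order of roles.
SOP = [product_manager, architect, project_manager, engineer]


def run_company(requirement: str) -> Artifact:
    art = Artifact(requirement)
    for role in SOP:
        role(art)  # each role builds on what earlier roles produced
    return art


result = run_company("Build a CLI snake game")
print(list(result.documents))
```

The point of the sketch is the ordering: each role consumes the documents produced before it, which is roughly how a one-line requirement can grow into a PRD, a design, a task list, and a repo.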

GitHub | Official Documentation

https://www.deepwisdom.ai/

2. Ollama — 33.4K stars

This project helps you get up and running with Llama 2, Mistral, and other large language models locally.

  • Ollama can import GGUF models through a Modelfile
  • Ollama can also import PyTorch models
  • Models from the Ollama library can be customized with a prompt
  • The library includes 25+ models for general use and modification
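
To make the Modelfile idea concrete, here is a small sketch that renders one from Python. The `FROM`, `PARAMETER`, and `SYSTEM` instructions are standard Modelfile instructions; the helper name `render_modelfile`, the GGUF path, and the model name in the comments are illustrative assumptions, not part of Ollama.

```python
def render_modelfile(gguf_path: str, system_prompt: str, temperature: float = 0.7) -> str:
    """Build the text of an Ollama Modelfile for a local GGUF model."""
    lines = [
        f"FROM {gguf_path}",                     # weights to import
        f"PARAMETER temperature {temperature}",  # sampling parameter
        f'SYSTEM """{system_prompt}"""',         # customize the model with a prompt
    ]
    return "\n".join(lines) + "\n"


# The path below is a placeholder for a GGUF file on your machine.
modelfile_text = render_modelfile("./my-model.gguf", "You are a concise assistant.")
print(modelfile_text)

# Typical next steps on the command line (the model name is an example):
#   ollama create my-model -f Modelfile
#   ollama run my-model
```

Saving the printed text to a file named Modelfile and running the two commands in the comments should register the customized model locally, assuming Ollama is installed.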

GitHub | Official Documentation

https://ollama.ai/

3. llama_index — 27.2K stars

LlamaIndex (formerly GPT Index) is a data framework for your LLM applications. It offers a natural language interface between humans and data.

Instead of having the LLM generate an answer immediately, LlamaIndex uses Retrieval-Augmented Generation (RAG): it first retrieves relevant information from your data sources, adds it to your question as context, and then asks the LLM to answer based on the enriched prompt.
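
The retrieve-then-ask flow can be sketched in a few lines of plain Python. This is a toy illustration, not the LlamaIndex API: the sample documents, the word-overlap retriever, and the function names are all assumptions made for the example, and a real system would typically use embeddings and an actual LLM call.

```python
import re

# Hypothetical knowledge base standing in for "your data sources".
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping to Europe usually takes 5 to 7 business days.",
    "Support is available by email from Monday to Friday.",
]


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list, top_k: int = 1) -> list:
    """Rank documents by word overlap with the question (toy retriever)."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, docs: list) -> str:
    """Add the retrieved text to the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."


question = "How long does shipping to Europe take?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)  # this enriched prompt is what would be sent to the LLM
```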

LlamaIndex does not restrict the use of LLMs. You can use them for auto-complete, chatbots, and more.

GitHub | Official Documentation

https://docs.llamaindex.ai/en/stable/index.html

4. FlowiseAI — 19.8K stars

FlowiseAI is an open-source visual tool for building customized LLM orchestration flows and AI agents. It provides an easy-to-use drag-and-drop UI for assembling your flow.

1. Iterate, fast: Its low-code approach lets you iterate quickly and move from testing to production.

2. LLM Orchestration: You can connect LLMs with memory, data loaders, cache, moderation, and many more:

  • Langchain
  • LlamaIndex
  • 100+ integrations

3. Agents & Assistants: You can create autonomous agents that can use tools to execute different tasks:

  • Custom Tools
  • OpenAI Assistant
  • Function Agent

4. API, SDK, Embed: You can extend FlowiseAI and integrate it into your applications using APIs, an SDK, and embedded chat:

  • APIs
  • Embedded Widget
  • React SDK

5. Open source LLMs: You can run FlowiseAI in an air-gapped environment with local LLMs, embeddings, and vector databases:

  • HuggingFace, Ollama, LocalAI, Replicate
  • Llama2, Mistral, Vicuna, Orca, Llava
  • Self-host on AWS, Azure, GCP

GitHub | Official Documentation

https://github.com/FlowiseAI/Flowise

5. MindsDB — 19.7K stars

The MindsDB team has extended SQL to simplify building AI tools that need access to real-time data to perform their tasks.

MindsDB allows you to:

  • Automate the process of fine-tuning Language Models directly from the data stored in your database!
      FINETUNE mindsdb.hf_model FROM postgresql.table;
  • Use your database to build RAG and Semantic Search systems!
      SELECT * FROM rag_model WHERE question='What product is best for treating a cold?';
  • Predict future behavior for forecasting models.
      SELECT * FROM binance.trade_data WHERE symbol = 'BTCUSDT';
  • Publish your AI Agent directly into end-user applications with zero infrastructure set up.
      CREATE AGENT my_agent USING model='chatbot_agent', skills = ['knowledge_base'];
  • Provide usage-based in-product suggestions for recommendation engines.
      CREATE CHATBOT slack_bot USING database='slack',agent='customer_support';
  • Orchestrate workflows to automate responses to specific conditions or events.
      CREATE TRIGGER data_updated ON mysql.customers_data (sql_code)

GitHub | Official Documentation

https://mindsdb.com/

6. PandasAI — 9.6K stars

PandasAI makes data analysis conversational by letting you chat with your data (SQL, CSV, pandas, polars, NoSQL, etc.).

It is a Python library that adds Generative AI capabilities to pandas, the popular data analysis and manipulation tool.

Note: This project was designed to be used in conjunction with pandas and is not a replacement for it.

Usage example:

  • You can ask PandasAI to find all the rows in a DataFrame where the value of a column is greater than 5, and it will return a DataFrame containing only those rows.
  • You can also ask PandasAI to draw graphs, clean data, impute missing values, and generate features.
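
For reference, the first request above corresponds roughly to the following plain pandas filter. This sketch deliberately uses pandas alone rather than the PandasAI API, and the column names and values are made up for illustration; it shows the kind of code a natural-language question would need to be translated into.

```python
import pandas as pd

# Hypothetical sample data for the illustration.
df = pd.DataFrame(
    {"product": ["A", "B", "C", "D"], "units_sold": [3, 8, 5, 12]}
)

# "Find all rows where units_sold is greater than 5" as a boolean mask.
filtered = df[df["units_sold"] > 5]
print(filtered)
```

The result is itself a DataFrame containing only the matching rows, which is what the usage example describes.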

GitHub | Official Documentation

https://github.com/gventuri/pandas-ai

🌟 BONUS RESOURCES 🌟

➡️ Open LLMs — 9.3K Stars — GitHub

This project is a comprehensive list of open LLMs for commercial use (e.g., Apache 2.0, MIT, OpenRAIL-M).

They have LLMs for training and fine-tuning, LLMs for code, and LLM datasets for pre-training, instruction-tuning, and alignment-tuning.

➡️ Gorilla — 8.9K Stars — GitHub | Docs

Gorilla is an API app store for LLMs, built by researchers at UC Berkeley and Microsoft. It is an LLM that knows how to make the right API calls, having been trained on large API datasets such as Torch Hub, TensorFlow Hub, and HuggingFace. Gorilla is licensed under Apache 2.0, and because it is fine-tuned on MPT and Falcon, you can use it commercially without obligations!

Quick links:

🚀 Try Gorilla in 60s in Google Colab

💻 Use Gorilla in your CLI with pip install gorilla-cli

🗞️ Check out their paper!

👋 Join their Discord community of more than 3500 active members!

https://github.com/ShishirPatil/gorilla

Do you have any questions?

Feel free to ask questions in the comments below or connect with me on my social media accounts. I’ll do my best to answer them.

LinkedIn | GitHub

I love writing about topics like OpenAI example projects, AGI tools, open-source LLMs, artificial intelligence courses, machine learning algorithms, and other upcoming AI technologies that pique my curiosity. Let’s explore this amazing tech universe together!

Mrinal Walia

Data Scientist and a Technical Writer! I will give you the best of Open-Source and AI.