
CleverBee: an open source AI research assistant that generates cited research reports

General Introduction

CleverBee is an open source AI research assistant hosted on GitHub and developed by SureScaleAI. It is designed to help users carry out research with large language models such as Gemini. CleverBee combines these models with automated web browsing to quickly collect, analyze, and summarize information and produce a cited research report. Users can draw on content from web pages, PDFs, YouTube videos, and academic resources. CleverBee supports both cloud and local models and suits academic research, business analysis, and similar scenarios. Its interface is built on Chainlit, is simple to operate, and is flexibly configurable, which makes it suitable for individuals and research teams.


Function List

  • Support several large language models, including Gemini, Claude, and local GGUF models.
  • Automate web browsing, extract HTML content, and clean it into structured data.
  • Generate research reports that include source citations and a note on limitations.
  • Support YouTube video subtitle extraction, PDF parsing, and PubMed academic search.
  • Provide real-time token usage and cost tracking to optimize resource management.
  • Customize models, browser behavior, and research tools through the config.yaml file.
  • Integrate a Chainlit interactive interface where users enter queries and view results.
  • Improve performance and reduce costs with NormalizingCache, a cache backed by a SQLite database.
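
The "extract HTML content and clean it into structured data" step can be illustrated with a small sketch. It is not CleverBee's own code: the class TextExtractor and the function clean_html are hypothetical names, and the sketch uses only Python's standard html.parser module to keep the page title and visible text while dropping script and style content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the page title and visible text, skipping script/style content."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip_depth = 0      # > 0 while inside a skipped tag
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title = text
        else:
            self.chunks.append(text)


def clean_html(html: str) -> dict:
    """Return a small structured record: the title plus one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "text": "\n".join(parser.chunks)}
```

A real pipeline would likely also keep links and headings so that citations can point back to their sources; this sketch only shows the cleaning idea.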

 

Usage Guide

Installation process

CleverBee is easy to install, and detailed instructions are available in its GitHub repository. The steps are as follows:

  1. Clone the repository
    Open a terminal and run the following command to clone the CleverBee repository:

    git clone https://github.com/SureScaleAI/cleverbee.git
    cd cleverbee
  2. Run the installation script
    Execute the installation script to configure the environment:

    bash setup.sh
    

    The script automates the following tasks:

    • Check for necessary dependencies (e.g. python3, jq, yq, and cmake).
    • Create a Python virtual environment (venv/).
    • Install the Python dependencies listed in requirements.txt.
    • Detect hardware and ask whether to enable local models (an NVIDIA GPU with at least 24GB of graphics memory is recommended).
    • If a local model is selected, install llama-cpp-python and enable CUDA acceleration (for NVIDIA GPUs).
    • Prompt the user to log in to Hugging Face (if using a local model).
    • Help configure the main inference model and the summarization model, and update config.yaml.
    • Download the selected local GGUF models (if applicable).
  3. Configure API keys
    If using a cloud model such as Gemini or Claude, an API key is required. The install script checks the model selection and instructs the user to add the key to the .env file:

    • Anthropic API key: obtained from the Anthropic Console, for Claude models.
    • Google Gemini API key: obtained from Google AI Studio, for Gemini models.
      After a key is added, the application must be restarted for it to take effect. Users can edit the .env file to update a key.
  4. Launch the application
    After the installation is complete, run the following command to start CleverBee:

    bash run.sh
    

    This will launch the Chainlit interactive interface, which can be accessed by the user through a browser.
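
Before launching, it can help to confirm that the keys from the API key step are actually present in the .env file. The following is a minimal sketch, not part of CleverBee: the variable names ANTHROPIC_API_KEY and GEMINI_API_KEY are assumptions (check the project's own .env template for the real names), and the helper functions are hypothetical.

```python
from pathlib import Path

# Assumed variable names for illustration only; the project may use different ones.
EXPECTED_KEYS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop optional quotes
    return values


def missing_keys(env_text: str, expected=EXPECTED_KEYS) -> list:
    """Return the expected keys that are absent or empty."""
    values = parse_env(env_text)
    return [name for name in expected if not values.get(name)]


def check_env_file(path: str = ".env") -> list:
    """Read the file if it exists and report missing keys."""
    env_path = Path(path)
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    return missing_keys(text)
```

Calling check_env_file() from the repository root returns a list of missing names. Only the key for the cloud provider actually in use needs to be set, so a non-empty list is not necessarily an error.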

System requirements

  • Operating system: macOS (Intel and Apple Silicon; Rosetta 2 required) and Linux.
  • Hardware: High-performance hardware is not required for cloud models; an NVIDIA GPU with at least 24GB of graphics memory is recommended for local models.
  • Dependencies: Make sure python3, git, jq, cmake, and node are installed. macOS users can install them via Homebrew:
    brew install python cmake git jq node
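
The dependency list above can be checked ahead of time. This is a small sketch that is independent of setup.sh (which performs its own checks); the function find_missing is a hypothetical helper, and it relies only on shutil.which from the Python standard library.

```python
import shutil

# Tools named in the dependency list above.
REQUIRED_TOOLS = ("python3", "git", "jq", "cmake", "node")


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which) -> list:
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without touching the system.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```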
    

Usage

The core function of CleverBee is to generate research reports with citations. Below is the detailed procedure:

  1. Launch Interface
    Run run.sh; the browser will then open the Chainlit interface. Here the user can enter a research question or topic, such as "Recent advances in quantum computing" or "Economic impact of sustainable energy".
  2. Enter a query
    Type a question into the input box. CleverBee automatically plans a research path and calls its web browsing, YouTube subtitle extraction, or academic search tools to gather information. Users can follow the progress of the research in real time.
  3. View Report
    When the research is complete, CleverBee generates a report containing the following:

    • Synthesis and summary: concise conclusions based on the information gathered.
    • Source citations: links or sources for all references.
    • Limitations note: describes possible limitations of the AI and reminds users to check the sources.
    • Token usage: shows the resources consumed by the model calls.
  4. Customize the configuration
    Users can edit the config.yaml file to adjust settings, for example:

    • Change the main inference model (Gemini 2.5 Pro is recommended).
    • Set agent behavior (e.g., web browsing depth).
    • Adjust token limits or caching policies.
      The configuration documentation is available at https://cleverb.ee/docs.
  5. Use the key features
    • YouTube subtitle extraction: Enter a link to a YouTube video and CleverBee automatically extracts the subtitles and integrates them into the report, suitable for analyzing lectures or interviews.
    • PDF parsing: Upload a PDF file and CleverBee extracts the text and summarizes the key points, suitable for academic papers or reports.
    • PubMed Search: Enter a medical-related topic and CleverBee searches the authoritative literature from the PubMed database.
    • Real-time cost tracking: The interface displays the token consumption per query to help users optimize their budget.
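
The arithmetic behind per-query cost tracking is simple: multiply token counts by a per-token price. The sketch below is illustrative and not CleverBee's implementation; the UsageTracker class is hypothetical, and the prices are placeholders rather than real provider rates.

```python
from dataclasses import dataclass

# Placeholder prices in USD per million tokens; NOT real provider rates.
PRICES_PER_MILLION = {
    "example-model": {"input": 1.00, "output": 4.00},
}


@dataclass
class UsageTracker:
    """Accumulate token counts across model calls and estimate the cost."""

    model: str
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add the token counts reported for one model call."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    def cost(self) -> float:
        """Estimated cost in USD for all recorded calls."""
        price = PRICES_PER_MILLION[self.model]
        total = (self.input_tokens * price["input"]
                 + self.output_tokens * price["output"])
        return total / 1_000_000
```

For example, recording 3,000 input tokens and 500 output tokens at the placeholder prices gives (3000 × 1.00 + 500 × 4.00) / 1,000,000 = 0.005 USD.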

Caveats

  • Local models have high hardware requirements; cloud models are recommended for the best performance.
  • Always check the sources in the report, because the AI may hallucinate.
  • The project is for non-commercial use and is released under the GNU Affero General Public License (AGPL).

 

Application scenarios

  1. Academic Research
    Students or researchers can use CleverBee to quickly gather academic papers, web articles, and videos to produce a fully cited literature review. For example, when researching "Artificial Intelligence Ethics", CleverBee can extract relevant literature from PubMed and academic websites.
  2. Business Analysis
    Business users can analyze market trends or competitor information. For example, enter "Electric Vehicle Market Forecast 2025" and CleverBee will browse industry reports and news to generate a data-driven summary.
  3. Personal Learning
    For casual users exploring complex topics such as "Applications of Blockchain Technology," CleverBee offers multiple perspectives, including YouTube tutorials and authoritative articles, to help users gain a comprehensive understanding.

 

Q&A

  1. What models does CleverBee support?
    CleverBee supports Gemini, Claude, and local GGUF models (such as DeepSeek R1 and Llama). Cloud models offer more stable performance, while local models suit privacy-sensitive use.
  2. How to reduce running costs?
    Use the NormalizingCache to avoid paying for duplicate queries, choose a low-cost model such as Gemini 2.5 Flash, and monitor token usage through the interface.
  3. Is the report reliable?
    Reports are based on real sources and include citations, but the AI may still hallucinate. Users are advised to verify key information.
  4. Is programming experience required?
    No programming experience is required. Installation scripts automate configuration and the Chainlit interface is easy to use.
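
The caching idea from question 2 can be sketched as follows. This is an assumption about how a normalizing cache on SQLite might work, not CleverBee's actual NormalizingCache: the QueryCache class and the normalize function are hypothetical, and the normalization rule (lowercasing and collapsing whitespace) is chosen only for illustration.

```python
import hashlib
import sqlite3


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivially different queries match."""
    return " ".join(query.lower().split())


class QueryCache:
    """Tiny SQLite-backed cache keyed by a hash of the normalized query."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    @staticmethod
    def _key(query: str) -> str:
        return hashlib.sha256(normalize(query).encode("utf-8")).hexdigest()

    def get(self, query: str):
        """Return the cached value, or None on a cache miss."""
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (self._key(query),)
        ).fetchone()
        return row[0] if row else None

    def set(self, query: str, value: str) -> None:
        """Store or overwrite the value for this query."""
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (self._key(query), value),
        )
        self.conn.commit()
```

With this design, "  Quantum   Computing " and "quantum computing" map to the same entry, so the second request can be served from the cache instead of triggering another paid model call.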