General Introduction
Deep Recall is an open-source, enterprise-grade memory framework for large language models (LLMs). It delivers hyper-personalized responses through efficient contextual retrieval and integration. The framework adopts a three-tier architecture consisting of a memory service, an inference service, and a coordinator, and supports GPU-optimized inference and vector database integration. Deep Recall suits both cloud and local deployments, with automated scaling to ensure high performance and reliability. It not only improves the model's context awareness but also generates customized responses based on user history and preferences, making it ideal for scenarios that require deeply personalized interaction.
Function List
- Efficient contextual retrieval: quickly extracts relevant information from historical user interactions.
- Personalized response generation: generates customized responses based on user preferences and historical data.
- GPU-optimized inference: accelerates inference on GPUs to improve processing speed.
- Vector database integration: supports efficient storage and querying of large-scale vector data.
- Automated scaling: dynamically adjusts resource allocation to match load.
- RESTful API support: provides a convenient interface for memory management and retrieval.
- Comprehensive monitoring and maintenance: built-in monitoring tools keep the system running stably.
- Security scanning system: ensures code security through dependency scanning, static code analysis, and more.
Usage Guide
Installation Process
To use Deep Recall, you need to install and configure the dependencies in an environment that supports Python. Here are the detailed installation steps:
- Clone the code repository
Run the following commands in a terminal to fetch the Deep Recall source code:
```bash
git clone https://github.com/jkanalakis/deep-recall.git
cd deep-recall
```
- Create a virtual environment
To avoid dependency conflicts, it is recommended that you create a Python virtual environment:
```bash
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows
```
- Install dependencies
Install the project's runtime and development dependencies:
```bash
pip install -r requirements.txt
pip install -r requirements-dev.txt
```
- Configure pre-commit hooks
To ensure code quality, install the pre-commit hooks:
```bash
pre-commit install
```
- Verify the installation
Once installation is complete, verify that the environment is configured correctly by running the test suite. Refer to the project's CONTRIBUTING.md for the test command:
```bash
pytest
```
Main Functions
1. Contextual retrieval and personalized responses
Deep Recall's core functionality is to generate personalized responses based on a user's history of interactions. The user invokes the memory service via a RESTful API, and the system retrieves the relevant context from a vector database and generates a response based on the current input. The steps are as follows:
- API call: send a POST request to the /memory/retrieve endpoint with the user ID and query. Example:
```bash
curl -X POST http://localhost:8000/memory/retrieve \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123", "query": "Recommend a movie"}'
```
- Response processing: the API returns JSON containing the retrieved context and the generated response, which developers can parse and display directly to users.
- Personalization configuration: adjust retrieval parameters such as the context window size or the similarity threshold in config/memory_config.json.
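For reference, a retrieval configuration might look like the sketch below. The exact schema is defined by the project; the keys shown here (context_window, similarity_threshold, top_k) are illustrative assumptions, not confirmed field names:

```json
{
  "context_window": 2048,
  "similarity_threshold": 0.75,
  "top_k": 5
}
```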
2. GPU-optimized inference
Deep Recall supports GPU-accelerated inference to dramatically increase processing speed. You need to ensure that CUDA and related drivers are installed on your system. Configuration Steps:
- Install GPU dependencies: during installation, ensure that the GPU-related libraries listed in requirements.txt, such as PyTorch, are installed correctly.
- Start the inference service: run the following from the project root:
```bash
python -m deep_recall.inference_service --gpu
```
- Verify GPU usage: confirm through the logs that the inference service is using GPU resources.
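As a quick sanity check before starting the service, a small helper like the sketch below can report which device PyTorch would use. It assumes only that PyTorch may or may not be installed; `select_device` is a hypothetical helper, not part of Deep Recall:

```python
import importlib.util


def select_device(prefer_gpu: bool = True) -> str:
    """Return 'cuda' when PyTorch with CUDA support is available, else 'cpu'."""
    if prefer_gpu and importlib.util.find_spec("torch") is not None:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


print(select_device())
```

If this prints `cpu` on a machine with a GPU, the CUDA build of PyTorch is likely missing.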
3. Vector database integration
Deep Recall uses a vector database to store user interaction data and supports efficient queries. Workflow:
- Initialize the database: run the initialization script to create the vector index:
```bash
python scripts/init_vector_db.py
```
- Import data: import user history into the database via the API or a script. Sample API call:
```bash
curl -X POST http://localhost:8000/memory/store \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123", "data": "The user likes science-fiction movies"}'
```
- Query data: use the retrieval API to query stored vector data on demand.
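Conceptually, the retrieval API ranks stored embeddings by similarity to the query embedding. A minimal sketch of that idea with NumPy, using toy two-dimensional vectors (this is not the project's actual implementation):

```python
import numpy as np


def top_k_similar(query: np.ndarray, stored: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k stored vectors most similar to the query (cosine similarity)."""
    sims = stored @ query / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query))
    return list(np.argsort(-sims)[:k])


# Toy example: three stored embeddings, one query vector.
stored = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.1])
print(top_k_similar(query, stored))  # index 0 ranks first: it points the same way as the query
```

A production vector database does the same ranking over millions of vectors using approximate nearest-neighbor indexes instead of a brute-force scan.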
4. Automated scaling
Deep Recall supports dynamic resource allocation for high-load scenarios. Set scaling policies, such as the maximum number of instances or load thresholds, in config/scaling_config.json. Then start the coordinator service:
```bash
python -m deep_recall.orchestrator
```
The coordinator automatically adjusts the number of inference service instances based on load.
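An illustrative scaling policy might look like the fragment below. The key names (min_instances, max_instances, cpu_load_threshold) are assumptions for illustration; consult the project for the actual schema:

```json
{
  "min_instances": 1,
  "max_instances": 8,
  "cpu_load_threshold": 0.8
}
```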
Featured Functions
Security Scanning System
Deep Recall has comprehensive security scanning tools built in to ensure code quality. How to use them:
- Run a dependency scan: check the Python dependencies for known vulnerabilities:
```bash
safety check
```
- Code security analysis: use Bandit to scan the code for security issues:
```bash
bandit -r deep_recall
```
- View the report: scan results are saved in JSON and Markdown formats in the reports/ directory for review.
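To gate a build on the results, the JSON report can be summarized programmatically. This sketch assumes Bandit's standard JSON output (a top-level `results` list whose entries carry an `issue_severity` field); the report path is illustrative:

```python
import json


def count_by_severity(report_path: str, severity: str = "HIGH") -> int:
    """Count Bandit findings at a given severity level in a JSON report."""
    with open(report_path) as f:
        report = json.load(f)
    return sum(1 for item in report.get("results", [])
               if item.get("issue_severity") == severity)
```

For example, a CI step could fail when `count_by_severity("reports/bandit.json")` is greater than zero.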
API Client Example
Deep Recall provides Python and JavaScript client libraries to simplify API integration. Sample Python code:
```python
from deep_recall_client import DeepRecallClient

client = DeepRecallClient("http://localhost:8000")
response = client.retrieve_memory(user_id="user123", query="Recommend a movie")
print(response["reply"])
```
Users can also refer to the React example front-end in the project to quickly build interactive interfaces.
Caveats
- Ensure that your network connection is stable; API calls may fail due to network issues.
- Back up the vector database regularly; refer to docs/backup.md to configure automatic backups.
- Edit config/security_config.json to customize security scanning rules.
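Since API calls may fail transiently, wrapping client calls in a small retry helper is a common mitigation. This is a generic sketch with hypothetical names, not a built-in Deep Recall feature:

```python
import time


def call_with_retries(fn, attempts: int = 3, base_delay: float = 0.1):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** attempt)
```

Usage: `call_with_retries(lambda: client.retrieve_memory(user_id="user123", query="Recommend a movie"))`.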
Application Scenarios
- Customer service bots
Deep Recall gives customer service bots a memory function: they record a user's historical questions and preferences and generate responses that better match the user's needs. For example, on e-commerce platforms, a bot can recommend products based on a user's past purchases.
- Personalized education platforms
In online education, Deep Recall stores a student's progress and interests to generate customized learning suggestions, for example practice questions appropriate for the student's level.
- Intelligent assistant development
Developers can use Deep Recall to build intelligent assistants that record user habits and provide contextually relevant suggestions, for example reminding users of meetings or tasks based on their schedule.
- Content recommender systems
Deep Recall suits content recommendation engines that analyze a user's browsing history to recommend relevant articles, videos, or products. For example, news platforms can push personalized content based on users' reading preferences.
- Enterprise knowledge management
Organizations can use Deep Recall to build internal knowledge bases, store employee interaction data, and quickly retrieve historical information. For example, technical support teams can use it to find past solutions.
FAQ
- What large models does Deep Recall support?
Deep Recall is compatible with a variety of open-source large models such as LLaMA, Mistral, and BERT; see docs/model_support.md for the full support list.
- How is data privacy ensured?
Deep Recall supports local deployment, with data stored on servers the user controls. Users can further protect data privacy with encrypted volumes or firewall configuration.
- Is a GPU required?
No. GPUs accelerate inference but are not required; Deep Recall also runs in CPU-only environments, albeit with slower processing. GPUs are recommended for high-load scenarios.
- How do I handle API call failures?
Check network connectivity and the API endpoint configuration. If the problem persists, review logs/service.log or contact the official support email.
- Does it support multilingual data?
Yes. Deep Recall's vector database supports multilingual text storage and retrieval, suiting internationalized application scenarios.