# Lightweight NotebookLM for TXT Files

A lightweight TXT document processing tool built on RAG (Retrieval-Augmented Generation). It combines upload, retrieval, and generation in a single workflow.

## Features

- Batch processing and intelligent parsing of TXT files
- FAISS-based vector storage and retrieval
- Hybrid retrieval combining BM25 and vector search
- Copy generation in multiple styles
- Hallucination detection
- Asynchronous task queue (Celery + Redis)
- Progress tracking and incremental updates
- Export to Markdown and DOCX

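
The hybrid retrieval listed above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the README does not say how the BM25 and vector scores are fused, so the weighted sum of min-max normalized scores, the `alpha` parameter, and every function name here are assumptions. In the real tool the vectors would come from the embedding model and FAISS, and the tokens from jieba.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    if n_docs == 0:
        return []
    # Guard against a corpus of empty documents (avoids division by zero).
    avg_len = sum(len(d) for d in docs_tokens) / n_docs or 1.0
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5):
    """Return document indices ordered by alpha * BM25 + (1 - alpha) * cosine."""
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    semantic = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

Setting `alpha` closer to 1 favors exact keyword matches, while values closer to 0 favor semantic similarity. Other fusion schemes, such as reciprocal rank fusion, would fit the same interface.
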
## System Requirements

- Python 3.8+
- A running Redis service
- An AI API key (OpenAI, Anthropic, or Tongyi Qianwen)

## Installation

1. Clone the project:
```bash
git clone <repository-url>
cd notebookls
```

2. Install the dependencies:
```bash
pip install -r requirements.txt
```

3. Configure the environment variables:
```bash
cp .env.example .env
# Edit the .env file and add your API keys
```

4. Start the Redis service (it must be installed separately).

5. (Recommended) Start all services with the one-step startup script:
```bash
./start_all.sh  # Linux/Mac
start_all.bat   # Windows
```

Alternatively, start each service separately:

a. Start the Celery worker:
```bash
./start_celery.sh  # Linux/Mac
start_celery.bat   # Windows
```

b. Start the main application:
```bash
./start.sh  # Linux/Mac
start.bat   # Windows
```

## Multi-Platform Model Support

The system supports several AI model providers:

### OpenAI
- Embedding model: `text-embedding-ada-002`
- Generation models: `gpt-3.5-turbo`, `gpt-4`

### Anthropic
- Embedding model: `claude-2` (note that `claude-2` is a text generation model and Anthropic's API has no embeddings endpoint, so embeddings may need to come from another provider)
- Generation model: `claude-2`

### Tongyi Qianwen (Qwen)
- Embedding model: `text-embedding-v1`
- Generation models: `qwen-turbo`, `qwen-plus`

### Relay Support
The system can call these APIs through a relay (proxy) service. Set a custom API base URL in the configuration to use one.

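
As a rough illustration of the relay support described above, the following sketch picks a custom base URL when one is set and falls back to the provider's default otherwise. The function name `resolve_api_base` and the provider identifiers `anthropic` and `qwen` are assumptions inferred from the variable prefixes in the sample `.env`; the project's own lookup logic may differ.

```python
import os

# Default endpoints, taken from the sample .env in this README.
DEFAULT_API_BASES = {
    "openai": "https://api.openai.com/v1",
    "anthropic": "https://api.anthropic.com/v1",
    "qwen": "https://dashscope.aliyuncs.com/api/v1",
}


def resolve_api_base(provider, env=None):
    """Return the API base URL for a provider, preferring a custom relay URL.

    `env` is any mapping of environment variables; it defaults to os.environ.
    This helper is hypothetical and only illustrates the fallback idea.
    """
    env = os.environ if env is None else env
    key = provider.lower()
    if key not in DEFAULT_API_BASES:
        raise ValueError(f"unknown provider: {provider!r}")
    custom = env.get(f"{key.upper()}_API_BASE", "").strip()
    # Drop a trailing slash so request paths can be appended consistently.
    return (custom or DEFAULT_API_BASES[key]).rstrip("/")
```

For example, with `OPENAI_API_BASE=https://relay.example.com/v1` set (a placeholder address), `resolve_api_base("openai")` returns the relay URL instead of the default endpoint.
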
## Configuration

Set the following parameters in the `.env` file:

```env
# OpenAI API settings
OPENAI_API_KEY=your_openai_api_key
OPENAI_API_BASE=https://api.openai.com/v1

# Anthropic API settings
ANTHROPIC_API_KEY=your_anthropic_api_key
ANTHROPIC_API_BASE=https://api.anthropic.com/v1

# Tongyi Qianwen (Qwen) API settings
QWEN_API_KEY=your_qwen_api_key
QWEN_API_BASE=https://dashscope.aliyuncs.com/api/v1

# Model provider settings
EMBEDDING_PROVIDER=openai
GENERATION_PROVIDER=openai

# Model settings
EMBEDDING_MODEL=text-embedding-ada-002
GENERATION_MODEL=gpt-3.5-turbo
```

These settings can also be changed at runtime on the settings page.

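
To show how the `.env` parameters above fit together, here is a minimal sketch that parses `KEY=VALUE` text and assembles the settings for the generation provider. The helper names `parse_env` and `generation_settings` are hypothetical, and the rule that the provider name maps to a `<PROVIDER>_API_KEY` prefix is an assumption based on the variable names shown. The project itself may load its configuration differently, for instance through a library such as python-dotenv.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def generation_settings(env):
    """Collect the settings for the configured generation provider.

    Assumes the provider name (e.g. "openai") maps to variables named
    <PROVIDER>_API_KEY and <PROVIDER>_API_BASE, as in the sample .env.
    """
    provider = env.get("GENERATION_PROVIDER", "openai").lower()
    prefix = provider.upper()
    api_key = env.get(f"{prefix}_API_KEY", "")
    if not api_key:
        raise ValueError(f"{prefix}_API_KEY is not set")
    return {
        "provider": provider,
        "model": env.get("GENERATION_MODEL", ""),
        "api_key": api_key,
        "api_base": env.get(f"{prefix}_API_BASE", ""),
    }
```

Failing early on a missing key surfaces configuration mistakes at startup rather than at the first generation request.
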
## Usage

1. Open `http://localhost:8001` in a browser
2. Upload TXT files
3. Enter a query or keywords
4. Choose a generation style
5. Click the generate button
6. Review and export the results

## Technical Architecture

- Backend: FastAPI
- Asynchronous tasks: Celery + Redis
- Vector store: FAISS
- Frontend: HTML + JavaScript
- Text processing: jieba word segmentation

## Contributing

Issues and pull requests that improve the project are welcome.

## License

MIT License