Tags: AI open source projects (page 52)

ChatTTS: a speech generation model that mimics the voice of a real person speaking (ChatTTS one-click acceleration package)

ChatTTS is a generative speech model designed for conversational scenarios. It generates natural, expressive speech, supports multiple languages and multiple speakers, and is suited to interactive dialogue. The model predicts and controls fine-grained prosodic features such as laughter, pauses, and interjections...

2024-09-05 | AI tools, AI open source project, AI Text-to-Speech

MoneyPrinterPlus: AI tool for generating short videos with one click, with free batch remixing

MoneyPrinterPlus is an open source project that uses AI to generate and remix short videos of all kinds with one click, and to publish them automatically to multiple video platforms such as Douyin, Kuaishou, Xiaohongshu, and WeChat Channels. The tool supports local and cloud-based voice models, including chatTTS, fasterwhisper, G...

TF-ID: a table/figure recognition tool for academic papers

TF-ID (Table/Figure IDentifier) is a family of object detection models specialized for extracting tables and figures from academic papers. The project was created by Yifei Hu and open-sourced on GitHub. TF-ID models are fine-tuned to recognize and extract tables and figures from academic papers...

2024-09-05 | AI tools, AI open source project

Chatbot UI: an open source AI chat app that mimics ChatGPT's interface and functionality

Chatbot UI is an open source project designed to help developers create personalized, intelligent conversational interfaces. The project provides a range of interface components and interactive features that can be integrated into an existing chatbot system to give users a smoother and smarter conversation experience. Chatbot UI ...

2024-09-05 | AI tools, AI open source project, AI Localized Chat App

GLIGEN GUI: Precise control over the position of image elements through an intuitive graphical interface based on ComfyUI

GLIGEN GUI is an intuitive graphical interface based on ComfyUI, designed to simplify use of the GLIGEN model, a text-to-image model that allows the position of objects in an image to be specified precisely. With GLIGEN GUI, the user builds a prompt by drawing boxes and entering text...

2024-09-05 | AI tools, AI image generation aids, AI open source project

Easy Voice Toolkit: AI Voice Toolkit for Local Deployment

Easy-Voice-Toolkit is a multifunctional toolkit built on open source speech projects. It provides a wide range of automated audio tools for speech recognition, speech transcription, speech conversion, dataset creation, and model training. Users can use these tools selectively or in sequence as needed...

2024-09-04 | AI tools, AI open source project, AI Text-to-Speech, AI voice cloning, AI Speech to Text

FaceFusion: Video Face Swap and Enhancement Tool | Audio-Synchronized Lip Movement in Video

FaceFusion is an advanced cloud platform with integrated face swapping and face enhancement features. It handles image-to-video and image-to-image face swapping with 5 professional models. In addition, it performs facial enhancement with 7 models, using 3 different models to boost...

2024-09-04 | AI tools, AI open source project, AI Video Face Swap

Kotaemon: an easy-to-deploy open source multimodal document Q&A tool

Kotaemon is an open source document Q&A tool designed to give end users and developers Q&A capabilities based on Retrieval Augmented Generation (RAG). Developed by Cinnamon, the project supports a variety of LLM API providers (e.g. OpenAI, AzureOpenAI, Cohere, etc.) as well as native...

2024-09-03 | AI tools, AI open source project, knowledge graph, Knowledge Retrieval and the RAG Framework

HivisionIDPhotos: an open source AI tool for creating ID photos

HivisionIDPhotos is an open source, lightweight AI tool for producing ID photos. It recognizes the scene in a user's photo, cuts out the subject, and generates a standard ID photo that meets a variety of specifications. The tool supports custom background colors and sizes, and beautification and automatic formal-wear replacement features are planned. With...

2024-09-03 | AI tools, AI open source project, AI keying to change background

Marker: an open source tool for quickly converting PDF to Markdown

Marker is a deep learning based document processing tool designed to convert PDF files to Markdown quickly and accurately. It supports a wide range of document types and is optimized in particular for books and scientific papers. Marker is able to remove redundant content such as headers and footers, format tables and...

2024-09-03 | AI tools, AI open source project, Document Extraction and Cleaning

SadTalker: Make Photos Talk | Lip-Synced Audio | Synthesized Lip-Sync Video | Free Digital Human

SadTalker is an open source tool that combines a single still portrait photo with an audio file to create realistic talking head videos for scenarios such as personalized messages and educational content. Its use of 3D modeling techniques such as ExpNet and PoseVAE excels at capturing the subtle facets...

2024-09-03 | AI tools, AI open source project, AI digital person, lip sync

VideoReTalking: Audio-Driven Lip Synchronization and Video Editing System

VideoReTalking is a system that generates lip-synchronized facial videos from input audio, producing high-quality, lip-synchronized output even when the emotions differ. The system breaks this goal down into three successive tasks: facial video generation with typical expressions...

2024-09-02 | AI tools, AI open source project, lip sync

MuseV + MuseTalk: Complete Digital Human Video Generation Framework | Portrait to Video | Pose to Video | Lip Synchronization

MuseV is a public project on GitHub that aims to generate virtual human videos of unlimited length and high fidelity. It is based on diffusion technology and offers Image2Video, Text2Image2Video, Video2Video, and many other features. The project provides the model structure, use cases, a quick start...

2024-09-02 | AI tools, AI open source project, AI digital person, lip sync

Unstructured: open source preprocessing for unstructured documents, a tool for unstructured data processing

Unstructured-IO provides a range of open source components for processing and preprocessing images and text documents such as PDF, HTML, and Word files. Its main goal is to simplify and optimize data processing workflows, especially in support of large language model (LLM) applications. Unstructured...

2024-09-01 | AI tools, AI open source project, Document Extraction and Cleaning

magic-html: extract main body content from HTML pages, output plain text/Markdown

magic-html is a Python library designed to simplify extracting the main body content from HTML. Whether dealing with complex HTML structures or simple web pages, the library aims to provide a convenient and efficient interface. It supports multimodal extraction, multiple layout extracto...

2024-09-01 | AI tools, AI open source project

WebPilot: Intelligent Web Information Processing Tool, Free API for Web Content Capture

WebPilot is a free and open source "web assistant" that lets you converse with any web page or run automated tasks on it. Instead of switching pages or copying and pasting, just select text or enter commands, and WebPilot will provide you with real-time information and smart...

2024-08-31 | AI tools, AI Open Services, AI open source project, AI search tools

DB-GPT: An AI Native Data Application Development Framework Integrating Multi-Model Management and Intelligent Data Processing

DB-GPT is an open source AI native data application development framework built using AWEL (Agentic Workflow Expression Language) and agent technologies. The project aims to build infrastructure for the large model field by developing several technical capabilities, including a multi-model management system (SMMF),...

2024-08-31 | AI tools, AI open source project, AI data analysis, Knowledge Retrieval and the RAG Framework

DreamTalk: Generate expressive talking videos with a single avatar image!

DreamTalk is a diffusion model-driven framework for generating expressive talking heads, jointly developed by Tsinghua University, Alibaba Group, and Huazhong University of Science and Technology. It is composed of three main parts: a denoising network, a style-aware lip expert, and a style predictor, and is able to generate a variety of audio input based on...

2024-08-31 | AI tools, AI open source project, AI digital person, lip sync

InstantID: upload a single image and transfer its portrait features to generate images in different styles

InstantID is a technique for generating images with personalized styles or poses in seconds from a single reference ID image, while maintaining a high level of fidelity. It employs a diffusion model-based solution by integrating facial images, landmark images with...

2024-08-30 | AI tools, AI image style control, AI open source project, AI Face Swap and Dress Up
