AI Personal Learning
Total 1020 articles

Tags: ai open source projects Page 25

bilive: Unattended live-stream recording, automatic slicing, and upload tool for Bilibili

General Introduction: bilive is a tool for recording Bilibili live streams. It provides fast recording, automatic slicing, danmaku (bullet comment) rendering, and subtitle generation. The tool runs on very low-spec machines, supports unattended 24/7 recording, automatically recognizes and renders danmaku and subtitles, and automatically slices and uploads them to B...

R1-V: Low-cost reinforcement learning for generalization in vision-language models

General Introduction: R1-V is an open source project that aims to improve vision-language models (VLMs) through low-cost reinforcement learning (RL). The project uses a verifiable reward mechanism to encourage VLMs to learn generalizable counting abilities. R1-V's 2B model is able to learn the counting ability in only 100 training steps...

CoT-Lab: An experimental dialogue tool for exploring iterative thinking in human-AI collaboration

CoT-Lab (Collaborative Thinking Laboratory) is an experimental interface for exploring new paradigms in human-AI collaboration. Based on cognitive load theory and active learning principles, CoT-Lab aims to foster deep cognitive alignment between humans and artificial intelligence (AI) by acting as a "Thinking Partner". The program is designed to slowly output...

PengChengStarling: A smaller, faster multilingual speech-to-text tool than Whisper-Large v3

General Introduction: PengChengStarling (PengCheng Labs) is a multilingual automatic speech recognition (ASR) toolkit that converts speech in different languages into the corresponding text. It is built on the icefall project and provides a complete speech recognition pipeline, including data processing, model training,...

SpeechGPT 2.0-preview: An end-to-end, humanlike spoken dialogue large model for real-time interaction

General Introduction: SpeechGPT 2.0-preview is the first humanlike real-time interaction system introduced by OpenMOSS, trained on millions of hours of speech data...

Goose: An open source, extensible programming agent that automates end-to-end programming tasks

General Introduction: Goose is an open source AI agent tool developed by Block, Inc. to help developers automate everyday development tasks. It supports a wide range of large language models (LLMs) and interacts with users via the command line or a desktop application. Goose performs everything from code writing and editing to testing and...
