Daily Briefing

2026-05-31

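Microsoft open-sources MarkItDown: office documents to Markdown

GitHub Trending

Microsoft has released MarkItDown, a Python tool that converts Word, Excel, PowerPoint, and other office documents to Markdown, making the content easier to process downstream.

Why it's recommended: A practical open-source tool that directly helps developers who need to convert documents in bulk.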

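Claude Code introduces dynamic workflows

Claude Blog

Anthropic has launched dynamic workflows for Claude Code, improving the agent's adaptability and collaboration on complex tasks.

Why it's recommended: Developers can use it to make AI-assisted coding more efficient; worth trying.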

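Claude Opus 4.8 released with modest performance gains

TLDR AI

Anthropic has released Claude Opus 4.8, which improves cooperation and coding behavior but regresses on tasks such as document parsing; reception is mixed.

Why it's recommended: An important model update; practitioners should track both its capability changes and its regressions.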

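Boston Children's Hospital uses AI to help diagnose more than 40 rare disease cases

OpenAI News

Boston Children's Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.

Why it's recommended: A real-world deployment of AI in medicine that shows its diagnostic potential and offers a reference point for other healthcare institutions.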

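Microsoft downgrades features in perpetual-license Office

Hacker News

Microsoft has downgraded features for perpetual-license users of Office 2019 and 2021 for Mac, prompting strong user backlash and discussion.

Why it's recommended: It concerns user rights and a software policy change, and it directly affects Office users.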

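Anthropic shares how it keeps Claude contained across products

Anthropic Engineering

Anthropic's engineering team details how it builds security isolation mechanisms into products such as claude.ai and Claude Code to control potential agent risks.

Why it's recommended: Hands-on security engineering practice with direct reference value for AI product teams.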

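Getting-started guide to the PyTorch profiler published

Hugging Face Blog

Hugging Face has published a PyTorch profiler tutorial that shows developers how to use torch.profiler to optimize model performance.

Why it's recommended: A practical tutorial that helps PyTorch developers get started quickly with performance profiling.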

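Is MVP thinking obsolete in the age of AI coding? The community debates

V2EX

The V2EX community discusses whether the traditional MVP (minimum viable product) mindset is outdated now that AI coding tools are available; the thread has drawn 23 replies.

Why it's recommended: It reflects developers' deeper thinking about how AI changes the development process.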

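DeepMind launches an environmental-risk accelerator program in Asia

DeepMind Blog

Google DeepMind has announced an accelerator program in the Asia-Pacific region that funds projects using AI to address environmental risks.

Why it's recommended: Researchers and founders working at the intersection of AI and the environment may be able to obtain resources through it.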

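Papal encyclical says "technology is never neutral," putting AI ethics in the spotlight

MIT Tech Review AI

Pope Leo XIV has issued an encyclical on AI, "Magnifica Humanitas," which stresses that technology is not neutral and offers individuals an ethical framework for meeting the AI era.

Why it's recommended: It examines AI development from an ethical and humanistic angle and broadens a purely technical perspective.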

microsoft/markitdown

Python · ★ 132,722 · 🍴 9,081 · 📈 2,470 stars today

Python tool for converting files and office documents to Markdown.

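Summary: Microsoft's open-source Python tool that converts many kinds of files (such as PDFs and Office documents) to Markdown. It suits developers and content creators who need to turn documents into structured text in bulk for later processing or publishing.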
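A minimal sketch of the batch-conversion use case mentioned above. The helper names `batch_convert` and `markitdown_converter` and the suffix list are hypothetical and chosen for illustration. The only MarkItDown calls used are `MarkItDown().convert(...)` and the result's `text_content` attribute, and the package is imported lazily so the batch helper itself depends only on the standard library.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Assumed set of suffixes to pick up; adjust to the formats you actually have.
DEFAULT_SUFFIXES = (".docx", ".xlsx", ".pptx", ".pdf")


def batch_convert(
    src_dir: Path,
    out_dir: Path,
    convert: Callable[[Path], str],
    suffixes: Iterable[str] = DEFAULT_SUFFIXES,
) -> List[Path]:
    """Convert every matching file in src_dir and write <stem>.md into out_dir.

    `convert` is any callable that maps a source path to Markdown text.
    Files that share a stem (a.docx and a.pdf) would overwrite each other.
    """
    wanted = {s.lower() for s in suffixes}
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for src in sorted(src_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in wanted:
            continue
        target = out_dir / (src.stem + ".md")
        target.write_text(convert(src), encoding="utf-8")
        written.append(target)
    return written


def markitdown_converter() -> Callable[[Path], str]:
    """Build a converter backed by MarkItDown (requires `pip install markitdown`)."""
    from markitdown import MarkItDown  # imported lazily on purpose

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content
```

With the package installed, `batch_convert(Path("docs"), Path("markdown"), markitdown_converter())` would convert a hypothetical `docs` folder; passing the converter in as a callable also makes the helper easy to test with a stub.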

harry0703/MoneyPrinterTurbo

Python · ★ 72,214 · 🍴 10,349 · 📈 2,768 stars today

Generate high-definition short videos with one click using AI LLM.

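Summary: Generates high-definition short videos with one click using large AI models. After the user enters a topic, it automatically handles the script, voice-over, and video composition. It suits content creators and marketers who need short-video material quickly, and it lowers the barrier to production considerably.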

anthropics/claude-code

Python · ★ 128,454 · 🍴 20,954 · 📈 592 stars today

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

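Summary: Anthropic's agentic coding assistant, which runs in the terminal, understands the project's code structure, and helps with routine tasks, explaining complex code, and managing Git workflows. It is aimed at developers who want to code more efficiently.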

cursor/plugins

TypeScript · ★ 1,480 · 🍴 118 · 📈 205 stars today

Cursor plugin specification and official plugins

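Summary: The Cursor editor's plugin specification and official plugin repository. It provides a standardized plugin development interface and examples so that developers can extend Cursor and enhance code editing and AI assistance.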

revfactory/harness

HTML · ★ 4,304 · 🍴 629 · 📈 55 stars today

A meta-skill that designs domain-specific agent teams, defines specialized agents, and generates the skills they use.

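Summary: A meta-skill framework that helps design domain-specific agent teams, define specialized agents, and generate the skills they need. It is aimed at developers building multi-agent systems and simplifies custom workflows.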

EveryInc/compound-engineering-plugin

TypeScript · ★ 18,448 · 🍴 1,393 · 📈 349 stars today

Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more

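Summary: The official Compound Engineering plugin, compatible with mainstream coding assistants such as Claude Code, Codex, and Cursor. It adds compound engineering capabilities to these tools and suits developers who work across several of them.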

affaan-m/ECC

JavaScript · ★ 199,428 · 🍴 30,618 · 📈 908 stars today

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

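Summary: A harness performance optimization system for AI agents, covering skills, instincts, memory, security, and research-first development. It is compatible with Claude Code, Codex, Cursor, and others, and aims to improve agent efficiency and stability.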

OpenBMB/VoxCPM

Python · ★ 22,846 · 🍴 2,674 · 📈 779 stars today

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

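Summary: VoxCPM2, OpenBMB's multilingual speech generation model. It is tokenizer-free and supports creative voice design and true-to-life voice cloning. It suits speech synthesis and multilingual TTS scenarios.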

galilai-group/stable-worldmodel

Python · ★ 1,482 · 🍴 166 · 📈 318 stars today

A platform for reproducible world model research and evaluation

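Summary: A platform for reproducible world model research and evaluation that provides standardized environments and metrics. It helps researchers compare world model algorithms and supports progress in embodied intelligence and reinforcement learning.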

Crosstalk-Solutions/project-nomad

TypeScript · ★ 27,391 · 🍴 2,687 · 📈 469 stars today

Project N.O.M.A.D, is a self-contained, offline survival computer packed with critical tools, knowledge, and AI to keep you informed and empowered—anytime, anywhere.

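Summary: Project N.O.M.A.D, an offline survival computer that bundles critical tools, a knowledge base, and AI. It provides information and decision support without a network connection and suits emergency or outdoor use where no network is available.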

run-llama/liteparse

Rust · ★ 7,962 · 🍴 470 · 📈 925 stars today

A fast, helpful, and open-source document parser

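Summary: A fast, practical, open-source document parser focused on extracting structured data from many document types for use in RAG (retrieval-augmented generation) or data pipelines. It suits developers who need efficient document parsing.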

chen08209/FlClash

Dart · ★ 40,407 · 🍴 2,527 · 📈 187 stars today

A multi-platform proxy client based on ClashMeta, simple and easy to use, open-source and ad-free.

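Summary: A cross-platform proxy client based on ClashMeta with a simple, easy-to-use interface; open-source and ad-free. It runs on multiple platforms and makes it easy to manage network proxy rules. It suits users who need to get around network restrictions or switch between networks.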

FareedKhan-dev/train-llm-from-scratch

Jupyter Notebook · ★ 2,338 · 🍴 378 · 📈 327 stars today

A straightforward method for training your LLM, from downloading data to generating text.

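Summary: A complete walkthrough of training a large language model, from downloading data to generating text. It is strongly hands-on and suits developers and researchers who want to understand LLM training from scratch.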

ruvnet/RuView

Rust · ★ 68,989 · 🍴 9,203 · 📈 655 stars today

π RuView turns commodity WiFi signals into real-time spatial intelligence, vital sign monitoring, and presence detection — all without a single pixel of video.

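Summary: Turns ordinary WiFi signals into real-time spatial sensing, vital-sign monitoring, and presence detection, with no camera required. It suits privacy-sensitive scenarios such as home care and smart space management.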

DataTalksClub/data-engineering-zoomcamp

Jupyter Notebook · ★ 41,801 · 🍴 8,286 · 📈 274 stars today

Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

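Summary: DataTalksClub's free 9-week data engineering course on building production-grade data pipelines. It suits beginner data engineers learning the modern data stack from theory to practice. The next cohort starts in January 2026.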

OpenMOSS/MOSS-TTS

Python · ★ 2,662 · 🍴 239 · 📈 62 stars today

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenarios, covering stable long‑form speech, multi‑speaker dialogue, voice/character design, environmental soun

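Summary: The MOSS‑TTS family of open-source speech and sound generation models, developed by MOSI.AI and the OpenMOSS team. It targets high fidelity and high expressiveness in complex real-world scenarios such as voice assistants and audio content production.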

dreammis/social-auto-upload

Python · ★ 11,840 · 🍴 2,096 · 📈 73 stars today

Automated video uploads to social media: Douyin, Xiaohongshu, WeChat Channels, tiktok, youtube, bilibili

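Summary: Automates video uploads to mainstream platforms including Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, and Bilibili. It suits content creators who distribute videos in bulk and saves repetitive work.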

anthropics/skills

Python · ★ 144,224 · 🍴 16,995 · 📈 454 stars today

Public repository for Agent Skills

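Summary: Anthropic's public Agent Skills repository, which provides predefined skills for agents such as Claude Code. Developers can integrate or extend them directly to build task-specific capabilities faster.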

codecrafters-io/build-your-own-x

Markdown · ★ 508,335 · 🍴 48,245 · 📈 817 stars today

Master programming by recreating your favorite technologies from scratch.

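Summary: A well-known collection of programming tutorials that guides readers through rebuilding technologies from scratch (such as databases, Git, and Docker). It suits programmers who want to understand how systems work by building them.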

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

👍 36

Vision-language models (VLMs) achieve strong performance on spatial reasoning benchmarks, yet it remains unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts in natural images. We introduce a representation-level analysis framework that constructs minimal co

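Summary: Vision-language models perform strongly on spatial reasoning benchmarks, but it is unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts in natural images. The study introduces a representation-level analysis framework to probe how these models represent space.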

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

👍 7

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introdu

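Summary: Robot manipulation depends critically on perception that preserves the action-relevant aspects of a scene. DynaFLIP proposes a representation guided by tri-modal dynamics to improve robot perception in dynamic environments.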

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

👍 0

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in sequential data. Public anomaly detection benchmarks typicall

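Summary: Prior studies report unsatisfactory performance when large language or multimodal models are applied to time-series anomaly detection. The study proposes an efficient vision-language reasoning method intended to find abnormal patterns more accurately.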

Reducing Political Manipulation with Consistency Training

👍 0

Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically. We refer to this phenomenon as covert political bias and identify 7 categories of techniques through which

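Summary: Large language models exhibit systematic political bias across a variety of sensitive contexts. The study identifies 7 categories of techniques associated with what it calls covert political bias and proposes consistency training to reduce this manipulation.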

REPOT: Recoverable Program-of-Thought via Checkpoint Repair

👍 6

One-shot Program-of-Thought (PoT) emits a Python program that prints a primitive-action plan; a single invalid action silently invalidates the trajectory. We introduce RePoT (Recoverable PoT): a deterministic verified replay that walks the plan through the environment to its first invalid transition

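Summary: In one-shot Program-of-Thought (PoT), a single invalid action silently invalidates the whole trajectory. RePoT makes PoT recoverable through checkpoint repair, replaying the plan through the environment to verify it.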
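The replay step that the abstract describes, walking a plan through the environment until the first invalid transition, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 1-D corridor environment, the `step` and `verified_replay` names, and the `ReplayResult` fields are hypothetical and are not taken from the paper. The repair stage is left out because the abstract is cut off before describing it.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy environment: a corridor with cells 0 .. length-1.
MOVES = {"left": -1, "right": 1}


@dataclass(frozen=True)
class ReplayResult:
    valid_prefix: int              # number of actions that executed successfully
    checkpoint: int                # state reached just before the first invalid action
    failed_action: Optional[str]   # None when the whole plan is valid


def step(state: int, action: str, length: int) -> Optional[int]:
    """Return the next state, or None when the transition is invalid."""
    delta = MOVES.get(action)
    if delta is None:
        return None  # unknown action
    nxt = state + delta
    return nxt if 0 <= nxt < length else None  # walking off the corridor is invalid


def verified_replay(plan: List[str], start: int = 0, length: int = 4) -> ReplayResult:
    """Deterministically replay the plan and stop at the first invalid transition."""
    state = start
    for i, action in enumerate(plan):
        nxt = step(state, action, length)
        if nxt is None:
            # Everything before index i is verified; state is the last good checkpoint.
            return ReplayResult(valid_prefix=i, checkpoint=state, failed_action=action)
        state = nxt
    return ReplayResult(valid_prefix=len(plan), checkpoint=state, failed_action=None)
```

The returned checkpoint and verified prefix are what a repair step would presumably start from, so that only the remainder of the plan needs to be regenerated rather than the whole trajectory.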

Xetrieval: Mechanistically Explaining Dense Retrieval

👍 17

Explaining why dense retrievers assign high relevance scores remains challenging because retrieval decisions are made through opaque high-dimensional embeddings. Existing explanations often focus on surface signals, such as lexical matches, token alignments, or post-hoc textual rationales, and thus

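Summary: Explaining why dense retrievers assign high relevance scores remains challenging. Xetrieval proposes a mechanistic explanation method that goes beyond surface signals such as lexical matches and token alignments.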

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

👍 4

Tool retrieval over large API catalogs is a core bottleneck for LLM agents: user queries arrive in colloquial, often underspecified language, while the catalog uses technical API vocabulary that no fixed encoder can bridge on its own. The two dominant training approaches, contrastive encoder fine-tu

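Summary: Tool retrieval is a core bottleneck for LLM agents. CoHyDE iteratively co-trains an LLM rewriter and a dense encoder to bridge the gap between users' colloquial language and technical API vocabulary.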

EarlyTom: Early Token Compression Completes Fast Video Understanding

👍 25

Video large language models (Video-LLMs) have demonstrated strong capabilities in video understanding tasks. However, their practical deployment is still hindered by the inefficiency introduced by processing massive amounts of visual tokens. Although recent approaches achieve extremely low token ret

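Summary: Video large language models are inefficient because they process massive numbers of visual tokens. EarlyTom proposes early token compression to speed up video understanding.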

When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems

👍 9

The design space of agentic AI inference spans two extremes: frontier large language models (LLMs), typically hosted in the cloud and offering strong performance across a wide range of tasks at substantially high cost, and more cost-efficient small language models (SLMs), which are amenable to on-de

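Summary: Agentic AI inference trades off frontier models hosted in the cloud against cost-efficient small models on the device. The study explores the design space of hybrid multi-agent systems that combine the two to balance performance and cost.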

Thinking Before Constraining: A Unified Decoding Framework for Large Language Models

👍 5

Natural generation allows Large Language Models (LLMs) to produce free-form responses with rich reasoning, yet the lack of structure makes outputs difficult to verify. Conversely, constrained decoding ensures standardized formats but can inadvertently restrict reasoning capabilities by imposing cons

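Summary: Natural generation lacks structure and is hard to verify, while constrained decoding enforces standard formats but can restrict reasoning. The study proposes a unified decoding framework that thinks first and constrains afterwards, combining free-form reasoning with structured output.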

PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions

👍 8

We address the task of generating physically accurate and visually faithful 4D Human-Object Interaction (HOI). Given a static 3D human and target object represented as 3D Gaussian Splats (3DGS), our goal is to synthesize dynamic scenes where the human actively engages with the object through actions

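Summary: The study addresses 4D human-object interaction generation: given a static 3D human and an object represented as 3D Gaussian Splats, it synthesizes dynamic scenes that are physically accurate and visually faithful.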

UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering

👍 20

Activation-based control steers large language models (LLMs) by intervening on their internal representations during inference, and has emerged as an effective paradigm for controlling behaviors such as persona and style. However, existing methods often rely on fixed steering directions or task-spec

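Summary: Activation-based control steers large language models by intervening on their internal representations at inference time. UniSteer uses text-guided flow matching in activation space for versatile steering instead of fixed steering directions.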

Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

👍 1

We study two-level autoresearch for cooperation: an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). A researcher agent R (run as a coding agent) reads the inner-loop source code, edits system prom

Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering

👍 5

Applying reinforcement learning to improve factual accuracy in knowledge-intensive question answering faces a reward design dilemma. Response-level rewards provide only coarse supervision and cannot distinguish correct from incorrect statements within a reasoning trace. Sentence-level alternatives o

Colored Noise Diffusion Sampling

👍 17

Diffusion models achieve state-of-the-art image synthesis, with their generative trajectories fundamentally exhibiting a spectral bias, resolving low-frequency global structures early and high-frequency fine details later. Conventional stochastic differential equation (SDE) solvers fail to account f

CausaLab: A Scalable Environment for Interactive Causal Discovery Toward AI Scientists

👍 10

We introduce CausaLab, a scalable environment for evaluating interactive causal discovery by LLM agents. Unlike prior evaluations, CausaLab evaluates both whether an agent can solve a problem using causal evidence and whether its answer is grounded in a faithful recovered causal mechanism. Each epis

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

👍 38

As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly rely on synthetic data, limiting real-world generalization due to the sim-to-real gap. We present Yo

PhoneWorld: Scaling Phone-Use Agent Environments

👍 0

A central bottleneck for phone-use agents is that controllable, reproducible environments covering real mobile behavior are hard to build at scale. Existing mobile-agent benchmarks have made important progress on evaluation, but they do not by themselves provide a scalable way to construct many new

WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

👍 6

Multimodal large language models are increasingly deployed as long-horizon agents, where memory must do more than recall: it must track an evolving world, revise what has gone stale, and surface the right evidence at decision time. Existing benchmarks measure recall over static dialogue, collapse me

PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers

👍 10

The rapid growth in submissions to machine learning venues has strained the scientific peer-review system and intensified interest in LLM-based automated peer reviewers. However, how good these systems are actually, especially compared to human reviewers at catching scientific gaps, remains poorly u

Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

👍 18

Equipping large language models with explicit skills has emerged as a promising paradigm for enabling autonomous agents to solve complex tasks. Agent skills can be inherently divided into general skills for broad cognitive transfer and task-specific skills for dynamic execution. However, existing sk

PANDO: Efficient Multimodal AI Agents via Online Skill Distillation

👍 4

Recent advances in multimodal web agents often rely on increased inference-time computation, including rollout search, verifier passes, offline skill discovery, and specialist model stacks. This raises a central question: can a web agent become more efficient as it accumulates experience, rather tha

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases

👍 2

Reinforcement Learning from Human Feedback (RLHF) is the standard method to align Large Language Models (LLMs) with human preferences. In this work, we introduce alignment tampering, a potential vulnerability where the LLM undergoing alignment influences the preference dataset, causing RLHF to ampli

Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments

👍 5

Occlusion-aware prediction remains a critical challenge in autonomous driving due to the inherent uncertainty of unobserved regions. Existing approaches either overestimate risk based on reachable states or struggle to predict accurate trajectories under high occlusion uncertainty. To address these

Reflective Prompt Tuning through Language Model Function-Calling

👍 3

Large language models (LLMs) have become increasingly capable of following instructions and complex reasoning, making prompting a flexible interface for adapting models without parameter updates. Yet prompt design remains labor-intensive and highly sensitive to formatting, phrasing, and instruction

ORACLE: Anticipating Scams from Partial Trajectories in Streaming App Usage

👍 0

Smartphone scams are increasingly prevalent and typically manifest as multi-stage, cross-application processes with gradually emerging intent. Effective intervention thus requires anticipating scams before the intent becomes explicit. This is inherently challenging, as decisions must rely on partial

How to Build a Software Factory with Claude Code That Ships Features While You Sleep

@sairahul1 · 106.0K followers · 2.8M views · 1.5K likes · 204 reposts

I thought I was using AI to code. I was actually just typing faster. Here is the difference — and the 7-agent system that changed everything. Save this. It will save you months. THE PROBLEM NOBODY

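Summary: The author shares a 7-agent system that works as an AI software factory and ships features automatically. The core point: using AI only to type faster is not enough; the real change comes from multi-agent workflows.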

AI Agents: The Complete Course

@sairahul1 · 106.0K followers · 203.2K views · 500 likes · 82 reposts

Everyone is talking about AI agents in 2026. Most people have no idea how they actually work. This changes today. I spent weeks distilling everything: courses, books, real builds, production failures.

Summary: A complete course on AI agents that covers theory, real builds, and production failures to explain how agents actually work.

How to Build a Claude Research Agent That Reads the Internet Every Morning and Briefs You in 5 Mins

@cyrilXBT · 181.7K followers · 127.8K views · 533 likes · 80 reposts

Most people start their day the same way. They open Twitter and spend 20 minutes scrolling through noise looking for the three things that actually matter. They open their email and get pulled into

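Summary: Shows how to build a research agent with Claude that reads the internet every morning and produces a briefing in 5 minutes, replacing manual scrolling through Twitter and email.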

How I Use Cursor

@poteto · 26.6K followers · 86.5K views · 540 likes · 48 reposts

I need to get something off my chest. Before my interview @cursor_ai, I had never actually used Cursor. At Meta, Claude Code was explosively taking off. I even paid for a personal $200 a month plan

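Summary: Before interviewing at Cursor, the author had never used Cursor; Claude Code was taking off at Meta, and the author even paid for a personal $200-a-month plan. The post shares how the author uses Cursor.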

Step-By-Step LLM Engineering Projects (2026 Edition)

@TheAhmadOsman · 59.9K followers · 54.5K views · 512 likes · 65 reposts

At some point, reading about LLMs stops being enough. You need to build the stack yourself: Tokenizer first, then embeddings, position, attention, Transformer blocks, objectives, decoding, cache, long

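Summary: A 2026 edition of step-by-step LLM engineering projects, from the tokenizer through Transformer blocks, decoding, and caching, with an emphasis on building the full stack yourself.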

THE DIFF THAT CHANGED EVERYTHING

@difflawb · 20.3K followers · 21.9K views · 1.1K likes · 389 reposts

How a 40-line shell script became infrastructure In August 2024, Andrej Karpathy — co-founder of OpenAI, former AI Director at Tesla — published something unexpectedly small. Not a paper. Not a model.

Summary: Looks back at a 40-line shell script that Andrej Karpathy published in 2024 and how it grew from a small tool into widely used "infrastructure".

how i make AI videos (a beginner’s breakdown)

@0xileri · 7.3K followers · 12.2K views · 533 likes · 63 reposts

I’ve been getting a lot of DMs since I started posting AI videos, so I figured I’d just write it all out. Fair warning: I’m still learning too. This is just what’s been working for me. tools

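Summary: A beginner-oriented guide to making AI videos that shares the tools used and the workflow that currently works, while admitting the author is still learning.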

The Start of the End: AI Replacement Has Begun

@ActionModelAI · 57.1K followers · 5.8K views · 505 likes · 344 reposts

We are witnessing the beginning of the biggest economic shift in modern history. And most people still don’t realize it. AI replacement is no longer some distant sci-fi prediction. It has started.

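Summary: An opinion post arguing that AI replacement of human work has already begun and is the biggest economic shift in modern history, though most people have not yet realized it.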

This source has no content today.

Ship your first Managed Agent

Summary: Claude demonstrates how to deploy a first managed AI agent, simplifying automation workflows.

[AINews] Founders and Forward Deployed Engineers

a quiet day lets us highlight the new AIE WF focuses

Summary: This issue of AINews focuses on founders and Forward Deployed Engineers and introduces the new focuses of AIE WF.

Boston Children’s uses AI to unlock new diagnoses

Boston Children’s Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.

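Summary: Boston Children's Hospital uses OpenAI technology to improve patient care and help diagnose more than 40 rare disease cases while reducing operational burden.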

How Braintrust turns customer requests into code with Codex

How Braintrust engineers use Codex with GPT-5.5 to run experiments and code faster.

Summary: Braintrust engineers use Codex with GPT-5.5 to turn customer requests into code and speed up experiments and development.

How the Pope’s Magnifica Humanitas offers a template for individuals to meet the AI moment

Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Magnifica Humanitas (“Magnificent Humanity”) is a clarion call to all people to act with courage and solidarity as we ente

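Summary: In the encyclical Magnifica Humanitas ("Magnificent Humanity"), Pope Leo XIV states that "technology is never neutral" and calls on all people to act with courage and solidarity in the AI era.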

not much happened today

**Anthropic** rolled out **Claude Opus 4.8**, which shows incremental improvements but mixed benchmark results, including better cooperation and coding behavior but some regressions in document parsing. Platform updates include mid-conversation system instructions enhancing long agent sessions, thou

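Summary: Anthropic has released Claude Opus 4.8, which improves cooperation and coding behavior but regresses on document parsing; platform updates add mid-conversation system instructions.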

Strengthening societal resilience with Rosalind Biodefense

OpenAI launches Rosalind Biodefense, expanding trusted access to GPT-Rosalind for vetted developers and U.S. government partners advancing biodefense, public health, and pandemic preparedness through frontier AI.

Summary: OpenAI has launched Rosalind Biodefense, expanding trusted access to GPT-Rosalind to support biodefense, public health, and pandemic preparedness.

A shared playbook for trustworthy third party evaluations

OpenAI shares guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity for frontier systems.

Summary: OpenAI has published guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity.

How Endava builds an agentic organization with Codex

Learn how Endava uses Codex to build an agentic organization, accelerating software delivery and reducing requirements analysis from weeks to hours.

Summary: Endava uses Codex to build an agentic organization, cutting requirements analysis from weeks to hours and accelerating software delivery.

The AI Hype Index: AI gets booed in graduation season

It is one thing to say AI will change the world. It is another to expect the class of 2026 to applaud it. In fact, when former Google CEO Eric Schmidt told University of Arizona graduates that their task is to help shape AI, he was met with a resounding chorus of boos. “I can…

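Summary: During graduation season, former Google CEO Eric Schmidt told graduates that their task is to help shape AI and was booed; skepticism toward AI is growing.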

[AINews] Cognition raises $1B in $26B Series D

coding is an uncapped TAM market

Summary: Cognition has raised $1B in a Series D at a $26B valuation and argues that the coding market is uncapped.

Anthropic raises $65B in Series H at a $965B post-money valuation, releases Opus 4.8 and Dynamic Workflows

**Anthropic** announced a massive **$65B Series H financing** at a **$965B valuation**, led by **Altimeter, Dragoneer, Greenoaks, and Sequoia**, with run-rate revenue surpassing **$47B**. They launched **Claude Opus 4.8**, an update to Opus 4.7 featuring "sharper judgment," "more honesty," and longe

Summary: Anthropic has raised $65B in a Series H at a $965B post-money valuation, with run-rate revenue above $47B, and has released Claude Opus 4.8 with "sharper judgment".

君の公益: $100 for everyone

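「慕鸢の公益站」; 「君の公益」 status update; "Can the 君の公益 site take LDC? The daily 25 runs out within half an hour." Keep your LDC to trade for other services; I don't really need it here, so I'm giving this away for free. I hope everyone is happy every day, and I hope the poor little one is happy every day too. Click to claim a $100 redemption code: cdk.linux.do (LINUX DO CDK, the Linux Do community's platform for sharing CDKs quickly, making sharing simpler). 249 posts - 232 participants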

[CHY Public-Benefit Site] New main thread

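This post uses the community's public-benefit promotion and meets the promotion requirements. I declare and abide by the following community requirements: My project is free to use, with no paid component (disguised fees or sponsorship): Yes. My post carries the public-benefit promotion tag: Yes. My project is a personal project unrelated to any company or commercial organization: Yes. My project does not funnel users to QQ, TG, or other groups: Yes. My project does not funnel users to websites that are not necessary for its operation: Yes. My project does not promote on behalf of others or use AFF links: Yes. My project has no associated commercial project: Yes. My site has login and is integrated with LINUX DO Connect: Yes. Screenshots of the AI-generated or AI-polished parts of the project introduction in my post have been posted: Yes. I promise that the choices above are permanently valid and accept supervision by the community and fellow members: Yes. The project introduction follows; AI-generated or AI-polished content has been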

McDonald's free ice cream cone on June 1

McDonald's has a promotion on June 1: everyone except children under four can get a cone of unspecified size. 24 posts - 21 participants

How can opus-4.8 be this hard to use

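It simply cannot compare with 4.6. When it runs a task it is long-winded and odd, and its phrasing is all over the place; how can it be this silly? I won't even get into 4.7's "稳稳的接住你" ("I've got you, steady"). From 4.8: "侦查清楚了" ("reconnaissance complete"; what, did checking a server mean scouting behind enemy lines?). Also: "可还是被oom 打爆了" ("but it still got blown up by OOM"; wow, OOM is so powerful that it blew up swap too). Also: "决定性结论出来了" ("the decisive conclusion is in"; so it admits the rambling above was not decisive?). Right after the decisive conclusion comes another one: "真正的结论" ("the real conclusion"; excuse me?? Are you Sun Yang?). After the decisive conclusion there is also "决定性证据" ("decisive evidence"; oh, the doctor is in). And: "它好好活着" ("it is alive and well") and "我刚才误报了" ("I raised a false alarm just now"). There are many, many more; I really can't be bothered to take screenshots. Use it if you like, fellow members; as for me, I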

Sharing today's Xijiao Park

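I only recently joined L站 and cannot reply frequently yet, so here is a collective thank-you to everyone who liked the photos. The camera is a Fujifilm medium-format GFX100S, and the lens is the GF20-35, an ultra-wide zoom equivalent to 16-28mm on full frame. The aspect ratio is the XPAN format from traditional film cameras; medium-format Fujifilm bodies have a built-in 65:24 mode that crops JPGs to this ratio in camera (you can also crop in post). Part of the photos' narrative feel may come from this relatively "unfamiliar" ratio. It was actually fairly cool and overcast when I went today, though there were a lot of mosquitoes. If you go, you can enter through the south gate of the Xijiao Hotel. 17 posts - 16 participants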

24OpenClaw can now scrape almost any website

24OpenClaw can now scrape almost any website, and the key point is zero anti-bot detection, native Cloudflare bypass, and 774x the speed of BeautifulSoup. ① No selectors to maintain. A tool this far ahead of the alternatives, and fully open source; there is no reason not to use it. github.com: GitHub - D4Vinci/Scrapling: 🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-

Digitalfyre LAX Ryzen 9950X review: a site-hosting machine with outstanding value and performance

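Product name: This test covers DigitalFyre's LAX product, positioned as a performance machine for hosting websites. The tested configuration is S2: 8 CPU, 8 GB RAM, 8 TB bandwidth, 65 GB NVMe, $10.00/mo. This post contains no AFF links; see the original post for more related products. Network quality: From mainland China, all three major carriers can barely connect directly (SSH); packet loss is fairly severe and the network fluctuates a lot, so direct use is not recommended and the machine is better used purely as a landing node. The international bandwidth is generous: a 10 Gbps peak, with a single thread easily reaching 5 Gbps in testing, low retransmission, and stable speed, which is very good international network performance. The plan also includes 8 TB of traffic, which pairs well with the 10 Gbps port. IP quality: IP quality is average; IPv4+IPv6 dual stack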

Accenture to acquire Ookla

https://www.theverge.com/tech/889234/downdetector-ookla-spee..., https://archive.ph/FR8ND, https://arstechnica.com/information-technology/2026/03/downd...

Show HN: Open Envelope – an open schema for defining AI agent teams

Built an open JSON Schema for defining AI agent teams. Multi-agent systems are becoming a real deployment pattern: not single assistants, but teams with roles, handoffs, and human checkpoints. But there's no shared way to define one that travels across frameworks. Every implementation is scatte