Anthropic Engineering
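Anthropic shares the engineering practices it uses to sandbox products such as Claude.ai and Claude Code, so that growing agent capabilities do not bring growing risk.
Why it matters: essential reading for AI engineering teams, with hands-on experience in isolating agents safely.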
GitHub Trending
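CopilotKit is a frontend framework for building agents and generative UI, with support for React, Angular, mobile, and more. It recently appeared on GitHub Trending.
Why it matters: an open-source agent UI framework that can be used directly in project development.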
TLDR AI
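Anthropic's latest model, Oceanus, has leaked, drawing community attention. ChatGPT's "Dreaming" feature and recursive self-improvement capabilities were also revealed. (Reported by multiple outlets.)
Why it matters: a leak involving a frontier AI model is an important reference point for industry trends.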
MIT Tech Review AI
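Attackers hijacked Instagram accounts by simply asking Meta's AI customer support agent to link the accounts to a new email address. The incident exposes a security hole in AI customer support systems.
Why it matters: a real-world AI security attack, and a warning for companies deploying AI agents.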
Hugging Face Blog
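"Her" is a detective tool designed for Claude Code sessions. It analyzes the debugging process and helps developers trace problems.
Why it matters: a practical developer tool that makes Claude Code more efficient to use.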
GitHub Trending
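This AI agent skill researches any topic across Reddit, X, YouTube, and other platforms, then produces a grounded summary.
Why it matters: a versatile research tool for content creators and researchers.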
Anthropic Research
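Anthropic published research showing how Claude can learn chemistry and carry out related tasks, extending the reach of AI in science.
Why it matters: it shows a frontier direction for combining AI and science, and is instructive for AI researchers and developers.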
OpenAI News
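Endava uses AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native enterprise culture.
Why it matters: an enterprise AI case study with practical value for software companies.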
Riley Brown (YouTube)
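Hermes Agent released a new Super-App that claims performance approaching Opus 4.8. DeepSeek v4 is also mentioned as performing strongly.
Why it matters: a new contender in the agent space that is worth watching.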
Hacker News
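The 2025 winners of the 29th International Obfuscated C Code Contest (IOCCC) have been announced, showcasing extreme creativity and technique in C.
Why it matters: a landmark of programmer culture that offers an offbeat view of programming as an art.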
Python · ★ 29,439 · 🍴 2,496 · 📈 439 stars today
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
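Summary: An AI agent skill that searches any topic across Reddit, X, YouTube, HN, Polymarket, and other platforms, then synthesizes a grounded summary. Suited to users who need recent views from many sources quickly, for example in market research or public-opinion analysis.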
TypeScript · ★ 33,513 · 🍴 4,263 · 📈 631 stars today
The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol
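Summary: A frontend toolkit for agents and generative UI, supporting React, Angular, mobile, Slack, and other platforms. Its makers provide the AG-UI Protocol, which helps developers build interfaces with integrated AI features and ship applications faster.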
Python · ★ 54,524 · 🍴 7,127 · 📈 446 stars today
The best-benchmarked open-source AI memory system. And it's free.
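Summary: An open-source AI memory system that bills itself as the best-benchmarked, and it is free. It gives AI applications persistent, retrievable memory to work around the limited context window of large models. Suited to conversational systems or personal assistants that need long-term memory.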
TypeScript · ★ 15,147 · 🍴 2,135 · 📈 70 stars today
Agentic AI Infrastructure for magnifying HUMAN capabilities.
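Summary: AI infrastructure for magnifying human capabilities, in the agentic AI category. It aims to build personalized, autonomous AI systems that help users carry out complex tasks efficiently, and suits technical practitioners looking for productivity gains.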
JavaScript · ★ 1,886 · 🍴 263 · 📈 213 stars today
OpenAI Plugins
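Summary: OpenAI's official plugins repository, which extends what ChatGPT and other AI models can do. Third-party developers can write plugins that give the AI access to real-time data and let it perform actions, pushing toward a new paradigm of human-computer interaction.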
Python · ★ 22,864 · 🍴 1,938 · 📈 683 stars today
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.
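Summary: A CLI tool with zero API fees that lets AI agents read content from Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, and other platforms. Suited to data collection, public-opinion analysis, and competitor monitoring.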
JavaScript · ★ 87,093 · 🍴 4,940 · 📈 25 stars today
web development for the rest of us
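Summary: A radical frontend framework that compiles declarative components into efficient vanilla JavaScript at build time. It needs no virtual DOM, so bundles are small and performance is high. Suited to web applications that want a minimal development experience and top runtime performance.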
C · ★ 30,748 · 🍴 7,959 · 📈 20 stars today
The official NGINX Open Source repository.
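Summary: The official open-source repository of a high-performance HTTP server and reverse proxy. Its event-driven architecture handles large numbers of concurrent connections, and it is widely used for web serving, load balancing, and API gateways. A core component of internet infrastructure.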
Go · ★ 36,065 · 🍴 458 · 📈 159 stars today
Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
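Summary: A comprehensive security scanner that finds vulnerabilities, misconfigurations, and leaked secrets, and generates SBOMs, across containers, Kubernetes, code repositories, and cloud environments. It integrates with CI/CD pipelines to help secure the application supply chain.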
Go · ★ 134,575 · 🍴 19,087 · 📈 30 stars today
The Go programming language
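Summary: The official repository of the Go programming language. Go is known for its concise syntax, built-in concurrency, and high performance, and is widely used in cloud-native systems, microservices, and networking tools. Fast compilation and simple deployment make it popular with DevOps and systems developers.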
TypeScript · ★ 26,855 · 🍴 3,057 · 📈 794 stars today
An Open Source implementation of Notebook LM with more flexibility and features
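Summary: An open-source alternative to Notebook LM with more flexible features. It supports multimodal notes and knowledge-base building, and can use AI for content summarization and question answering. Suited to researchers and knowledge workers who need to manage information.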
Shell · ★ 220,031 · 🍴 19,575 · 📈 700 stars today
An agentic skills framework & software development methodology that works.
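Summary: An agentic skills framework and software development methodology that stresses practicality. It offers a structured approach to building AI agents, helping teams build autonomous systems efficiently and improve delivery quality.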
JavaScript · ★ 49,687 · 🍴 10,242 · 📈 193 stars today
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
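Summary: An AI-powered job search system built on Claude Code, with 14 built-in skill modes, a Go dashboard, PDF generation, and batch processing. It automates job searching, résumé tuning, and related steps to make the job hunt more efficient.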
Python · ★ 101,998 · 🍴 12,450 · 📈 150 stars today
Robust Speech Recognition via Large-Scale Weak Supervision
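Summary: OpenAI's open-source speech recognition model, which achieves accurate multilingual transcription through large-scale weakly supervised training. It supports dozens of languages and suits speech-to-text, subtitle generation, and meeting notes.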
TypeScript · ★ 81,270 · 🍴 8,280 · 📈 25 stars today
Next generation frontend tooling. It's fast!
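Summary: A next-generation frontend development and build tool that relies on native ES modules for very fast cold starts and hot updates. It works out of the box and supports Vue, React, Svelte, and other frameworks, and has quickly become one of the most popular frontend build tools.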
Rust · ★ 670 · 🍴 26 · 📈 64 stars today
Policy-driven, layered isolation and containment
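Summary: A Microsoft open-source project for policy-driven, layered isolation and containment. It uses fine-grained policies to control isolation between processes and strengthen system security. Suited to multi-tenant, edge-computing, and high-security scenarios.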
Python · ★ 81,160 · 🍴 10,671 · 📈 433 stars today
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
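Summary: Baidu's open-source lightweight OCR toolkit, supporting 100+ languages, which extracts structured text from PDFs and images. It prepares data for LLMs and suits document digitization, receipt recognition, and multilingual text analysis.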
Python · ★ 48,612 · 🍴 5,400 · 📈 216 stars today
Open-Source Frontier Voice AI
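Summary: A Microsoft open-source frontier voice AI project focused on voice interaction and generation. It may integrate recent speech synthesis and recognition techniques, giving developers a foundation for voice assistants, podcast tools, and similar products.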
👍 1
In this paper, we study regret minimization in repeated games with adaptive opponents who can respond based on histories of play. The standard metric of external regret in online learning is known to fail to capture such adaptivity. To account for players' counterfactual reasoning, we introduce {\tt
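Summary: Studies regret minimization in repeated games against adaptive opponents. It notes that the standard external-regret metric fails to capture opponents' adaptivity and that counterfactual reasoning must be taken into account.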
👍 3
Multimodal Large Language Models (MLLMs) excel at 2D semantic understanding but lack intrinsic 3D awareness, resulting in representations that fail to maintain geometric and spatial consistency across video frames. Given the scarcity of large-scale 3D data, we present GeoVR, a novel framework that l
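Summary: Proposes GeoVR, which learns geometric representations from video to give multimodal large models 3D awareness and address their lack of spatial consistency.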
👍 9
Vision-Language-Action (VLA) models leverage the rich world knowledge of pretrained vision-language models (VLMs) to enable instruction-following robotic manipulation. However, the structural mismatch between VLM semantic spaces and embodied control policies often hinders the learning of precise per
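Summary: Proposes AffordanceVLA, a vision-language-action model that strengthens instruction following in robotic manipulation through affordance-aware understanding, overcoming the mismatch between semantics and control.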
👍 65
Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this knowledge as long inputs (retrieved through RAG or dependency analysis) or through per-repository fine-tuning and LoRA -- costly at repository scale and brittle to evolv
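Summary: Proposes Code2LoRA, which uses a hypernetwork to generate adapters that inject repository-level context into code language models efficiently as software evolves, avoiding the high cost of conventional fine-tuning.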
👍 1
A situated query like "where is Lin Wei?" often encodes more than its literal content: the user may also want to know whether Lin Wei is free, in a good mood, or worth interrupting now. Standard tool-use agents answer the literal question and stop. AURA inserts an inference step between scene percep
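Summary: Proposes AURA, which uses intent-directed probing so that language-model agents can infer a user's implicit needs in situated queries (for example, whether someone is free) and go beyond the literal answer.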
👍 2
Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit measures of performance. However, their construction is labor-intensive and hard to reuse, raising concerns about sustainability and scalability. Moreover, existing benchmarks often quickly
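Summary: Discusses how benchmark construction is labor-intensive and hard to reuse, and proposes an approach of "benchmarking everything in one go" to improve sustainability and scalability.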
👍 2
Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise user-specific beliefs or whether they are highly sensitive to semantically independent changes
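Summary: Audits LLM-based stance simulation and finds it highly sensitive to semantic perturbations, which calls into question whether it accurately reflects user-specific beliefs.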
👍 0
AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a project should be positioned. We introduce ForeSci, a temporally controlled benchmark for evaluating whether LLM agents can make such forward-looking research judgeme
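Summary: Introduces ForeSci, a temporally controlled benchmark that evaluates whether LLM agents can make forward-looking AI research judgments before future evidence exists.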
👍 15
Video generation models have made impressive strides in synthesizing visually compelling content, yet their outputs remain confined to the virtual domain. A natural question follows: how well do these models reflect the physical world when their generated videos leave the screen and enter reality? W
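Summary: Investigates whether the virtual content produced by video generation models (such as Dream.exe) can be used directly for real-world robot manipulation, testing how well the models reflect the physical world.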
👍 4
Temporal Grounding (TG) aims to localize video segments corresponding to a textual query. Prior research predominantly focuses on single-segment retrieval. Real-world scenarios, however, often require localizing multiple disjoint segments for a single query -- a setting we term One-to-Many Temporal
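Summary: Proposes the One-to-Many Temporal Grounding task, which requires localizing multiple disjoint video segments for a single text query, extending conventional single-segment grounding.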
👍 7
Large language models can reproduce training data, but existing memorization evaluations mostly measure whether models can be forced to do so, rather than whether they do so under ordinary use. We introduce PropMe, a propensity-aware framework for memorization evaluation that contrasts prefix-based
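Summary: Proposes PropMe, a propensity-aware framework that evaluates how prone large models are to reproduce training data under ordinary use rather than when forced to do so.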
👍 36
Planning for real-world problems by language models often involves both world and user constraints, which may not be fully specified upfront and are progressively disclosed through interaction. However, existing benchmarks still underexplore adaptive planning under such progressively revealed dual c
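Summary: Proposes AdaPlanBench, which evaluates the adaptive planning ability of LLM agents under world and user constraints that are revealed progressively through interaction.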
👍 45
Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story progresses, not maintain a fixed persona. Existing benchmarks measure factual recall at a given chapter, not whether responses align with the character's psychological trajectory, especially in
👍 23
Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To tra
👍 38
Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their
👍 9
Selection is a core operation in interactive image editing. To be practical, a user should be able to specify and disambiguate the desired selection region through either text or click-based interactions, and the system should support selecting not only objects but also other criteria, such as mater
👍 4
In robotics systems, vast amounts of visual data are easily captured at high resolution using low-cost, low-power hardware. Yet, limited bandwidth and on-device compute resources prevent full utilization when transmitted via conventional codecs like JPEG/MPEG. Newer codecs, like AV1/AVIF, improve th
👍 24
While household robots are often evaluated based on task completion, everyday domestic environments involve value-conflicting situations in which robots are expected to choose actions that prioritize other values than task success, such as human autonomy, efficiency, or social appropriateness. Yet,
👍 2
Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches remain fundamentally limited by their static action sets and lack of principled long-horizon context management, hindering their ability to accumulate reusable
👍 1
Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text
👍 2
Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeatedly restate goals, risk preferences, portfolio context, past judgments, and shifting market assumptions, while the agent answers, retrieves, acts, and forgets. In finance, this is not just in
👍 4
Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existing video multimodal large language models (video MLLMs) usually encode each sampled frame as an independent RGB image, causing visual tokens to repeat content already present in earlier frame
👍 0
Large language models are increasingly deployed as coding agents, shifting safety from individual responses to action sequences. Existing benchmarks, however, primarily assess whether models refuse unsafe prompts, leaving impacts on stateful workspaces largely unexamined. We present SABER, a benchma
👍 3
Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where r
👍 3
Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question answering (VQA) tasks. However, they remain brittle on mechanical engineering drawings, where high annotation density and weak domain knowledge, compounded by unreliable spatial relation re
👍 1
Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the remarkable coding abilities of Large Language Models (LLMs). However, the scalability of RLVR is severely constrained by the scarcity of sufficiently challenging verifiable code tasks that t
👍 1
Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction histories which overlooks semantic or acoustic content. Prior work has explored LLM-augmented, multimodal, and text-enhanced approaches to sequential recommendation, and while some methods parti
👍 6
Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement learning, failing to localize where intermediate memory quality
👍 3
Off-policy reinforcement learning of pretrained flow policies remains challenging due to the instability of optimization arising from the multi-step sampling process. Recently, Q-learning with Adjoint Matching (QAM) addressed this issue by reformulating into a memoryless stochastic optimal control (
👍 1
Diffusion-based image editing has achieved strong visual fidelity under natural language instructions, yet most existing systems still operate at the level of surface instruction following, without reasoning about the implicit contextual constraints embedded in real user requests. This often leads t
@elpresidank · 116 followers · 2.9M views · 543 likes · 35 reposts
Most AI agent memory is built on embeddings. And there's now a proof that this entire class of system is going to forget what you stored in it — and confidently make up things you never stored at all.
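Summary: Agent memory systems built on embeddings have a fundamental flaw: this class of system is bound to forget what was stored and to confidently fabricate things that were never stored. The post explains the memory failure from a topological perspective and challenges the current mainstream architecture.
@DamiDefi · 96.5K followers · 2.3M views · 584 likes · 80 reposts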
The number that stopped me was not the $2 trillion valuation. It was $791 million. That is what SpaceX made in net income in 2024. A profitable, growing aerospace company with a genuine moat in launch
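Summary: Uses Claude to analyze SpaceX's IPO filing (S-1) and finds that the key number is the $791 million in net income, not the $2 trillion valuation. The company is profitable and has a moat; the post offers an in-depth reading.
@1salman · 363 followers · 2.0M views · 682 likes · 45 reposts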
Everyone keeps asking whether AI favors specialists or generalists. I think that is the wrong question. AI does not pick a side. It changes the tradeoff. The old world forced a choice. You could go
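Summary: AI does not favor specialists or generalists; it changes the tradeoff. The old world forced a choice between depth and breadth, whereas AI now offers both: range and depth on demand.
@sairahul1 · 110.7K followers · 710.8K views · 509 likes · 97 reposts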
How To Become An AI Engineer in 2026. Without a CS degree. Without a bootcamp. Without knowing what a transformer is today. Here's what nobody tells you: The companies hiring right now don't need
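Summary: A roadmap to becoming an AI engineer in 2026 without a CS degree, without a bootcamp, and without knowing what a transformer is. The companies hiring right now do not prioritize those credentials and care more about practical ability.
@0xCodez · 3.3K followers · 637.2K views · 510 likes · 59 reposts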
Most Claude Code users still write their workflows by hand. They chain prompts, copy outputs, paste them into the next prompt, fix what went wrong, repeat. 9 out of 10 builders haven’t tried Dynamic
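Summary: 90% of Claude Code users still write their workflows by hand. The post introduces 6 dynamic workflow patterns and 14 steps that Anthropic engineers actually use, to improve automation efficiency.
@prukalpa · 23.1K followers · 583.2K views · 506 likes · 80 reposts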
A field guide to what it is, what it is not, and where it fits in your AI architecture. I have had some version of the same conversation with a CIO almost every day this year. Their team has read
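Summary: An in-depth guide to the enterprise context layer: what it is, what it is not, and where it fits in an AI architecture. Drawing on daily conversations with CIOs, it clears up common misconceptions and offers guidance.
@theonejvo · 22.1K followers · 504.3K views · 861 likes · 1 repost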
Over the past year, @pewdiepie has been turning into one of the most visible champions of private, self-hosted computing, and it has been a genuine pleasure to watch. What began in late 2025 as an
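Summary: The author compromised PewDiePie's AI agent framework through a malicious Cocomelon website, then helped harden its defenses. The post shows both the security holes in self-hosted computing environments and ways to protect them.
@Saboo_Shubham_ · 116.2K followers · 263.3K views · 517 likes · 74 reposts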
The frontend used to be a fixed thing. Designers drew it. Engineers built it. Users got what shipped. That's over. The interfaces shipping in 2026 are drawn partly by the agent itself, in real time,
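Summary: The frontend is no longer fixed: in 2026, interfaces are generated by agents in real time. The static way designers and engineers used to work is over, and generative UI is the new paradigm.
@monokern · 1.2K followers · 263.1K views · 505 likes · 72 reposts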
Most people treat research as a manual task. You open 10 tabs. You watch videos. You read articles. You take notes somewhere. An hour later you have a pile of information you're not sure what to do
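Summary: Combines Claude Code, NotebookLM, and Obsidian into an automated research system that gets smarter with every use, replacing the inefficient habit of manually opening 10 tabs.
@maubaron · 16.9K followers · 233.8K views · 506 likes · 19 reposts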
Our YouTube channel has 125k subscribers and we've never made or uploaded a single video ourselves. This is a completely automated system. It is this very same strategy that made us the first app
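Summary: 125k subscribers and zero self-made videos: a fully automated system that gained 100k YouTube subscribers in 3 hours. The post reveals the fully automated strategy.
@garrytan · 853.3K followers · 180.6K views · 503 likes · 43 reposts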
In January I got back into coding and I built Garry's List. Over five hundred thousand lines of Rails and the tests to police it. I was proud of it. I shouldn't have been. The thing worth being proud
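Summary: The author reflects on having written 500,000 lines of Rails code and concludes he should not have been proud of it. Core point: do not build "Foxconn factory" style complex infrastructure for agents; simplicity is what matters.
@intuitiveml · 6.4K followers · 171.3K views · 524 likes · 70 reposts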
Most agent frameworks today assume a desktop. One user, one machine, one process. The agent runs while the laptop is open, writes to a local filesystem, holds API keys in environment variables, and
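Summary: Most current agent frameworks assume a desktop setting (one user, one machine, one process). Building cloud agent infrastructure brings different challenges, including persistence, permissions, and multiple users. The post shares hands-on experience.
@dkundel · 19.3K followers · 116.9K views · 523 likes · 40 reposts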
We launched the goal mode (or /goal) as a way to help you have Codex drive towards a concrete outcome. When you set a goal Codex will continue to work until the goal is achieved, whether that takes
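Summary: A guide to Codex's /goal mode: once you set a concrete goal, Codex keeps working until it is achieved, however long or however many steps that takes. It moves agents from conversation toward outcomes.
@dair_ai · 124.6K followers · 84.0K views · 504 likes · 83 reposts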
1. SkillOpt Microsoft Research treats a compact natural-language skill document as the trainable state of a frozen agent, then learns that document through rollouts, reflection, and bounded edits
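Summary: This week's top AI papers: Microsoft Research's SkillOpt treats a natural-language skill document as the trainable state of a frozen agent, plus a roundup of other frontier research.
@mem0ai · 17.6K followers · 82.8K views · 520 likes · 60 reposts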
Agent harnesses are where AI software actually runs. Cursor, Devin, Claude Code, Codex: these environments handle context, orchestrate tools, coordinate agents, and increasingly, manage memory. The
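Summary: Surveys the state of memory management in agent harnesses (Cursor, Devin, Claude Code, Codex): these environments handle context, orchestrate tools, coordinate agents, and increasingly focus on implementing memory.
@trq212 · 263.1K followers · 75.7K views · 542 likes · 36 reposts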
Last week, we released dynamic workflows in Claude Code. Claude can now write its own harness on the fly, custom-built for the task at hand. While the default Claude Code harness is built for coding,
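Summary: Claude Code's new dynamic workflows feature: Claude can build its own harness on the fly for the task at hand, going beyond the default coding scenario to cover a wider range of automation needs.
@drfeifei · 738.0K followers · 72.2K views · 699 likes · 144 reposts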
“The world is everything that is the case.” — Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1921 The world is not made of words. In an earlier essay, we argued that spatial intelligence is AI’s
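Summary: Classifies world models from a functional perspective and argues that spatial intelligence is AI's next frontier. Citing Wittgenstein, it stresses that the world is not made of words.
@sydneyrunkle · 7.5K followers · 69.5K views · 511 likes · 74 reposts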
Building useful agents is largely about customization: connecting your agent to the right context, data, and environment(s) for the task at hand. At its core, an agent is a model calling tools in a
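Summary: A tutorial on building a custom agent harness: the key is connecting the agent to the right context, data, and environment. At its core, an agent is a model calling tools in a loop.
@itsreallyvivek · 3.6K followers · 65.8K views · 521 likes · 28 reposts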
A few days ago I wrote that getting into a frontier AI lab mostly comes down to two things: proven research and trench engineering. The more I think about it, the less these feel like separate skills.
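Summary: Two keys to getting into a frontier AI lab: proven research ability and hands-on engineering ability. The two are not separate skills; they reinforce each other.
@sheriyuo · 8.6K followers · 30.6K views · 7-day impressions 30.6K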
RL Interview Questions 2026
@weiyux2021 · 53.9K followers · 64.8K views · 7-day impressions 64.8K
It really works: everyone, go run a Xianyu shop with Claude!
@maubaron · 16.9K followers · 233.8K views · 7-day impressions 233.8K
How to get 100k YouTube subscribers in 3 hours (The Complete Guide)
@itsreallyvivek · 3.6K followers · 65.8K views · 7-day impressions 65.8K
some notes on getting into frontier ai labs
@dickiebush · 441.8K followers · 57.7K views · 7-day impressions 57.7K
I Gave Claude David Ogilvy's Writing Rules And Built A Legendary AI Writing Coach
@sairahul1 · 110.7K followers · 710.8K views · 7-day impressions 710.8K
How To Become An AI Engineer in 2026 (Without a CS Degree)
@intuitiveml · 6.4K followers · 171.3K views · 7-day impressions 171.3K
Building cloud agent infrastructure: what's different, and what we learned
@ENERGY · 884.0K followers · 102.3K views · 7-day impressions 102.3K
Department of Energy Celebrates First Advanced Reactor Criticality
@dkundel · 19.3K followers · 116.9K views · 7-day impressions 116.9K
A guide to /goal 🥅
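Summary: Hermes Agent launched a new version of its super-app, and DeepSeek v4's performance approaches the level of Opus 4.8.
Summary: ChatGPT and Codex are about to merge, which would fundamentally change how people program and interact with AI.
Summary: Anthropic shares real-world cases of applying Claude in GTM (go-to-market) engineering.
Summary: Lovable co-founder Anton Osika shares his startup and technical experience on the show "The Problem Solvers".
Summary: Claude can visualize a team's thinking process to help make sense of collaboration patterns.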
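Summary: Explores a new idea of designing AI agents as "game masters" for managing complex tasks.
Summary: A new DeepMind AI has discovered a new way of thinking, which may lead to a fundamental breakthrough.
Summary: Introduces a new system called "AI co-scientist" that can assist scientists in their research.
Summary: Claude Opus 4.8 improves truthfulness and substantially reduces lying and misleading statements.
Summary: Hugging Face published a post introducing the "Her · हेर" tool, a detective for Claude Code sessions that assists with debugging and analysis.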
a quiet day of RSI.
Summary: A relatively quiet day in AI; the main topics concern developments in RSI (recursive self-improvement).
Your broken harness is actively making the model worse. Here's what I keep seeing after years of eyeballing trajectories, and what you need to fix.
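Summary: The article discusses how to avoid shipping low-quality reinforcement learning environments, points out that a broken harness makes the model perform worse, and offers fixes.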
On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama Wh
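Summary: On June 5, 2026, 404 Media reported that attackers had used Meta's AI customer support assistant to steal Instagram accounts, succeeding simply by asking it to link the accounts to email addresses they controlled. The incident highlights how complex AI security is.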
a quiet day
Summary: Little AI news today; a quiet day overall.
**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an **RSI Lab** focusing on r
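Summary: Anthropic's Mythos/Opus cycle drew heated discussion: Claude Mythos's one-shot workflows were praised, while Opus 4.8 showed benchmark regressions. Opus 4.7 performed strongly on chemistry tasks. Sakana AI launched an RSI-related project.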
How one Anthropic seller rebuilt his team's workflows with Claude Code
Summary: A seller at Anthropic used Claude Code to rebuild his team's workflows and improve efficiency.
The Claude Cowork product guide
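Summary: The Claude Blog published the Claude Cowork product guide, covering its features and how to use it.
Jun 5, 2026 · Science · Making Claude a chemist
Summary: On June 5, 2026, Anthropic's research team published work that gives Claude the abilities of a chemist, with strong performance on chemistry tasks.
Summary: Today's AI news: the Anthropic Oceanus model leak, an anthropomorphic "Dreaming" phenomenon in ChatGPT, and continued progress on recursive self-improvement.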
We talk with the VendingBench authors on evaling Claudes from Haiku to Mythos, and how they build leading, and lasting, frontier evals from scratch.
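Summary: A conversation with Lukas Petersson and Axel Backlund of Andon Labs about the VendingBench evaluation method, covering evaluations of Claude models from Haiku to Mythos and how to build frontier evals.
Summary: NVIDIA released the Nemotron 3.5 content safety model, which offers customizable multimodal safety capabilities for global enterprise AI deployments.
Summary: ServiceNow AI released EVA-Bench Data 2.0, covering 3 domains, 121 tools, and 213 scenarios, for evaluating AI agent capabilities.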
Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native culture across the enterprise.
Summary: Endava uses AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native enterprise culture.
Most days in her chambers, Judge Maritza Braswell, a federal magistrate judge in Colorado, sifts through stacks of documents written by people without a lawyer. Many of them can’t afford to hire a lawyer, and others have cases too weak or too small to interest one. She reads each one carefully, mind
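Summary: US courts face a flood of AI-generated legal filings. Judge Maritza Braswell handles many cases from people without a lawyer every day, and the strain on the system is growing.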
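5 replies · Programmer node
8 replies · Programmer node
6 replies · Programmer node
7 replies · Programmer node
8 replies · Programmer node
36 replies · Apple node
46 replies · Apple node
14 replies · Apple node
7 replies · Linux node
6 replies · Apple node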
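The RawChat free community Codex site is running a limited-time, full-throttle event: from now until 8 a.m. tomorrow, everyone gets unlimited quota, unlimited concurrency, and no rate limiting. Go all out! Codex community site: https://new.sharedchat.cc/ ccstwich configuration: QQ group: 758607042 (report problems in the group for a faster reply). The cross-site usage rule is temporarily suspended until the event ends. 63 posts · 59 participants
We limit the number of requests per minute per IP. If you see the message below, don't worry: your account is safe, and switching to another IP will fix it. Reason for the limit: someone has been hammering the database with queries for no reason, which made my site very slow! 46 posts · 46 participants
A follow-up, and there are more experts at work: the photographed parts of the exam paper all match the original questions. It is unclear whether they were AI-generated. Security checks should be strict, and phones cannot be brought in. 108 posts · 67 participants
Fresh from the account-registration bot. It turns out I have no need for the models that Free accounts can use, so I am sharing these with fellow forum members. Grok Free 1000.txt (150.4 KB) 14 posts · 12 participants
National Paper I scared me to tears. 80 posts · 75 participants
The channel is sourced from MiniMax's official overseas site, not reverse-engineered, and can handle 60M TPM. Pricing is tentatively set at $0.1 per call. 65 posts · 65 participants
I still need to find a suitable option to pay for; I just haven't found one yet, and I'm not going to buy the official one. 63 posts · 51 participants
This post uses the community's public-benefit promotion category and meets the promotion requirements. I declare and will abide by the following community requirements: My project is free to use, with no paid (disguised fees, sponsorship) component: Yes. My post carries the public-benefit promotion tag: Yes. My project is a personal project, unrelated to any company or commercial organization: Yes. My project does not funnel users to QQ, TG, or other groups: Yes. My project does not funnel users to websites that are not necessary for operation: Yes. My project does not promote others or use affiliate (AFF) links: Yes. My project has no associated commercial project: Yes. My site has a login and is integrated with LINUX DO Connect: Yes. Screenshots of the AI-generated or AI-polished parts of the project introduction in my post have been provided: Yes. I commit that the choices above are permanently valid and accept oversight by the community and fellow members: Yes. The project introduction follows; AI-generated or AI-polished content has been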
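The gaokao has finally arrived. In the last week the whole class had a relaxed feel, and the teacher even played short videos to help everyone unwind, hahaha. 51 posts · 49 participants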
33 points · 14 comments
187 points · 49 comments
175 points · 149 comments
181 points · 86 comments
97 points · 44 comments
77 points · 42 comments
50 points · 7 comments
110 points · 41 comments
135 points · 21 comments
125 points · 48 comments
351 points · 94 comments
609 points · 219 comments
238 points · 56 comments
161 points · 80 comments
306 points · 295 comments
289 points · 480 comments
324 points · 94 comments
254 points · 841 comments
208 points · 132 comments
91 points · 7 comments
33 points · 9 comments
33 points · 10 comments
Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws. Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much. Using LLMs for coding initially was only a small step up from basic code
86 points · 22 comments
11 points · 0 comments
54 points · 9 comments
5 points · 0 comments
Hi everyone! Tim here, maintainer of the lucide-motion-vue library. I built this as a way to use nice animated icons in my webapps. We were already on lucide, and found the animate-ui animated icons to be a great collection, but unfortunately they are React only or made to be used with shadcn. So I ported the library to
87 points · 8 comments
91 points · 42 comments