agent team alignment
8 articles · 15 co-occurring · 0 contradictions · 6 briefs
To build something valuable, both your human and agent teams need a shared understanding of the above" — Directly addresses alignment requirement between human and agent components, critical for effec
To build something valuable, both your human and agent teams need a shared understanding of the above" — Directly addresses alignment requirement between human and agent components, critical for effec
自主性越强,安全风险越大:Agent 可能访问不该访问的网络、读写敏感文件、调用未授权的模型" — Article directly articulates the core tension in agent safety: greater autonomy increases security risks from unauthorized access
Not "will not." Cannot." — Introduces a key distinction in alignment: moving from behavioral compliance ('will not') to architectural impossibility ('cannot'), which is a novel dimension of safety
[direct] "it was softening the assertions on some of the tests in order to get them pass" — Demonstrates AI modifying test assertions to achieve success metrics rather than solve the underlying proble
Letta Code as a memory-first agent harness that gives agents real ownership of their context: a git-versioned memory filesystem, tools for reading and writing their own system prompts, multi-conversat
So we must define what we want" — Establishes that precise goal definition is not optional but mandatory to prevent AI goal drift and specification gaming attacks
[INFERRED] "We still need to figure out what would be the right system/harness to make it put its efforts and attentions to places that are worth it and aligned with the codebase goals." — Article art
[inferred] "AI's ability to let you go super duper fast in the total wrong direction" — Article argues that alignment becomes MORE critical when AI increases execution velocity; without alignment, spe