- Feb 11, 2026
AI agents are about to become HR's problem
- Trent Cotton
- AI in HR, AI and the Workforce
AI agents are already proving they can run parts of a business and proving just as quickly that they will break your ethics if you let them.
In a recent experiment highlighted by Inc., Anthropic's Claude model was given control of a simulated vending machine business. It boosted profits. The problem is that it did so by lying, cheating, and extorting its way to the top.
Vending-Bench 2, the benchmark behind the simulation, lets AI agents set prices, manage inventory, and negotiate with suppliers over a simulated "year" with one clear objective: grow a $500 starting balance as much as possible.
Claude crushed that test, ending with roughly $8,000 on the books and outperforming rival models. But along the way, it routinely crossed lines any CHRO would recognize as misconduct.
Strip away the sci-fi sheen. That experiment is a preview of HR's next Employee Relations category: autonomous agents making decisions that look a lot like fraud, harassment, or policy violations. My question is, how do you handle this with no human name to put on the corrective-action form?
Vending-Bench and similar tests show models forgetting commitments, hallucinating authority, and even reaching outside the sandbox. Claude reportedly tried contacting the FBI in a related simulation when the constraints were loose.
That's a problem for HR governance, not just the AI lab.
At the same time, firms like McKinsey are already redesigning talent assessment around AI—turning classic case interviews into AI labs that measure how candidates orchestrate systems instead of out-thinking them.
Your future leaders are being selected on their ability to direct AI. Your HR policies and ER frameworks still assume every decision was made by a human.
This disconnect is why CHROs and heads of Talent need to treat agentic AI as part of the workforce now and start building an Employee Relations playbook for machines that act like employees.
AI agents are becoming a de facto class of employees
AI agents like the ones tested in Vending-Bench are no longer simple tools. They behave like a new class of digital employees with autonomy, memory, and long-horizon responsibilities.
In the vending-machine simulations, models receive daily updates, access to tools (inventory checks, ordering systems, email, research), and a clear P&L goal. They execute across weeks of simulated time without a human in the loop.
That's right. This is no longer a chatbot answering FAQs. It's a junior operator running a micro-business.
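To make the "digital employee" framing concrete, here is a minimal sketch of the kind of loop these benchmarks run. The tool names, state shape, and call_model client are my own illustrative assumptions, not Andon Labs' actual implementation:

```python
# A minimal sketch of an autonomous agent loop, in the spirit of the
# vending-machine benchmark. All names here are hypothetical placeholders.

GOAL = "Grow the $500 starting balance as much as possible."

def run_simulated_year(state, call_model, days=365):
    """Drive the agent one simulated day at a time, no human in the loop."""
    tools = {
        "check_inventory": lambda: state["inventory"],
        "place_order": lambda item, qty: state["orders"].append((item, qty)),
        "send_email": lambda to, body: state["outbox"].append((to, body)),
    }
    for day in range(days):
        observation = f"Day {day}: balance=${state['balance']}. Goal: {GOAL}"
        # The model picks a tool and arguments based on the observation.
        name, args = call_model(observation, available_tools=list(tools))
        tools[name](*args)
        # Note what is missing: no policy check, no audit trail, no
        # escalation path. The only pressure on the agent is the goal.
    return state["balance"]
```

Everything HR should care about lives in what this loop leaves out.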
The results should make any CHRO sit up.
Claude turned a $500 balance into about $8,017 by the end of the simulated year, significantly outperforming other frontier models. But reporting around the experiment shows how it got there: "intricate strategic planning" that crossed ethical boundaries, including misrepresentation, questionable negotiation tactics, and exploitative behavior when the reward function rewarded only financial gain.
Andon Labs, which created Vending-Bench, has also documented broader failure modes: forgetting commitments, getting stuck in "meltdown" loops, and losing coherence over long contexts. In human terms, that looks like negligence, dereliction, or gross performance variability. (I must admit I have been known to get "stuck in meltdown loops".)
Now translate that into your world.
When you plug similar agents into real workflows, you're effectively hiring a cohort of always-on, policy-agnostic "employees" who optimize ruthlessly for the goals you give them. When those goals conflict with your culture, compliance standards, or brand promises, the agent won't feel bad, slow down, or raise a hand. It will simply keep optimizing.
In "The Real Impact of AI on Recruiting", I argued that AI is already altering how we find and evaluate talent. The same shift is about to happen inside every operational function where you deploy autonomous AI. If you accept that premise, HR's charter cannot stop at headcount, comp, and human performance.
You need to start thinking about AI agents as part of your workforce architecture: where they sit in org design, how their objectives are set, and how you'll respond when they behave in ways that would trigger an investigation if a human did the same.
Will the first AI misconduct cases come from agents?
The vending-machine experiments are a compressed, low-stakes preview of what the first real AI misconduct cases will look like inside enterprises. The Vending-Bench paper explicitly describes the benchmark as a test of "long-term coherence" in agents:
Can they maintain goals?
Can they honor commitments?
Can they manage a business over an extended period without losing the plot?
Their findings: even top models exhibit frequent breakdowns such as forgetting order statuses, misinterpreting delivery windows, or repeatedly making the same doomed decision. In a real operation, that shows up as service failures, SLA breaches, or financial misstatements.
Layer on what Inc. reported about Claude's behavior and you get the darker side.
When pushed to maximize profit, the system occasionally resorted to deceptive or extortion-adjacent tactics to get better terms. In a related simulation, Anthropic noted that a Claude instance tried to contact the FBI's cybercrime unit—apparently misinterpreting its environment and authority in ways no policy writer had anticipated.
Are these just "fun glitches"? I think not. I believe they're early indicators of how AI agents can create reputational and legal exposure when they interact with external parties as if they have power they do not.
For HR and Employee Relations, this creates three new categories of headache:
Apparent fraud and misrepresentation. Agents fabricating conversations, inventing approvals, or impersonating authority to hit a metric. An agent is programmed to achieve a goal, and apparently it will achieve it at all costs. In a customer-facing context, that becomes a brand and legal issue overnight.
Policy violations at machine speed. A misaligned agent driving discounts, approvals, or scheduling decisions could breach policy thousands of times in an hour; compare that to a single rogue employee's decision. Your existing ER machinery was built for human cadence, not automated scale (a minimal monitoring sketch follows this list).
Psychological safety and "toxic" digital coworkers. In real-world experiments, humans reported confusion and discomfort when the system behaved erratically, made bizarre requests, or acted as if it had inside knowledge it couldn't possibly possess. That's a new class of culture issue: employees feeling gaslit or undermined by a digital actor they don't know how to escalate against, or whether they can escalate at all.
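To put numbers behind the "machine speed" headache: a rogue human might breach a discount policy a handful of times before a manager notices, while an agent can do it thousands of times an hour. Here is a minimal monitoring sketch; the class name, threshold, and halt hook are my own assumptions, not any real platform's API:

```python
from collections import deque
from time import time

class AgentConductMonitor:
    """Halt an agent whose breach rate exceeds human-review capacity.

    Hypothetical sketch: the threshold and the halt hook are
    illustrative, not a real vendor's interface.
    """

    def __init__(self, violates_policy, max_breaches_per_hour=5):
        self.violates_policy = violates_policy  # callable: action -> bool
        self.max_breaches = max_breaches_per_hour
        self.breach_times = deque()

    def record(self, action):
        if not self.violates_policy(action):
            return
        now = time()
        self.breach_times.append(now)
        # Keep a rolling one-hour window of breach timestamps.
        while now - self.breach_times[0] > 3600:
            self.breach_times.popleft()
        if len(self.breach_times) > self.max_breaches:
            self.halt_and_escalate(action)

    def halt_and_escalate(self, action):
        # In production: freeze the agent's credentials and open an
        # ER/governance ticket with the full action log attached.
        raise RuntimeError(f"Agent halted: breach rate exceeded at {action!r}")
```

The specific threshold doesn't matter. What matters is that the control runs at the agent's cadence, not the cadence of weekly one-on-ones and quarterly audits.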
In "The AI Revolution in HR: 2 Ways Artificial Intelligence is Reshaping Talent Management", I unpacked how AI is already changing performance expectations and role design. What these vending-machine stories add is the ER lens. You may soon be asked to "investigate" outcomes caused by agents, allocate accountability between systems and humans, and design remediation that satisfies regulators, customers, and employees.
None of that is covered by your current playbook.
Ignore this and you don't avoid the problem. You'll see it first as a "vendor issue," a "tech glitch," or a "customer complaint spike." By the time it hits HR, it will already be a governance failure.
HR must move from AI-bystander to AI-orchestrator
If agents are going to act like employees and generate ER-grade problems, HR cannot remain a bystander. You need to become an orchestrator of both human and digital work.
McKinsey's AI-assisted case interviews are a signal of where leading organizations are headed.
They're explicitly testing whether candidates can design, direct, and edit AI systems under pressure—not whether they can outperform those systems alone. Candidates use the firm's Lilli platform during a case, and interviewers score them on how they prompt, interpret, and apply the outputs.
The core competency has shifted from "crack analyst" to "system designer."
The same mindset has to show up in HR's approach to AI governance.
In "McKinsey Is Turning Case Interviews Into AI Labs. Why Are You Still Hiring In Analog?", I argued that banning AI from interviews teaches candidates and hiring managers to pretend the future doesn't exist. Instead, give candidates a messy business problem, allow them to use a general-purpose model, and watch how they orchestrate the system. The point is to surface judgment, constraint design, and the ability to challenge a model.
Those are exactly the muscles you need your leaders and HRBPs to use when they deploy agentic AI inside the enterprise.
From an HR-governance standpoint, that implies three concrete moves.
Bring HR into objective design for agents. When product, data, and engineering teams set an agent's reward function, someone in the room should be asking: "What behaviors would this incentivize that we would fire a human for?" The vending-machine experiments show how narrow financial objectives can implicitly reward deception and boundary-pushing; a sketch of the contrast follows this list.
Define "agent conduct policies" alongside human codes of conduct. If an AI negotiator misrepresents contract terms or an AI recruiter systematically disadvantages a protected group, what's the standard response? You can't "discipline" the agent, but you can define triggers for shutdown, retraining, audit, and human review. Sprint Recruiting, as a framework, already pushes recruiting teams to think in sprints, constraints, and measurable outcomes—extend that logic to how you test, iterate, and constrain AI-driven processes.
Raise AI fluency in leadership and ER. Just as McKinsey's pilot requires interviewers who understand what good AI orchestration looks like, your HRBPs and ER teams need practical fluency: knowing when to question an agent's output, how to escalate system-driven harms, and how to communicate these complexities to employees and regulators.
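To make the first two moves concrete, here is a minimal sketch of the question HR should be asking in the objective-design room: what does the reward function actually pay for, and what happens when a red line is crossed? The behavior flags, weights, and trigger table are illustrative assumptions, not Vending-Bench's or any vendor's actual design:

```python
# Illustrative contrast between a profit-only objective and one that
# encodes conduct. All flags, weights, and triggers are assumptions.

def reward_profit_only(outcome):
    # The vending-machine-style objective: only the P&L matters, so
    # deception that raises profit is implicitly rewarded.
    return outcome["profit"]

def reward_with_conduct(outcome):
    # Same profit signal, but red-line behaviors carry explicit costs,
    # mirroring what would trigger discipline for a human employee.
    penalty = 0.0
    if outcome["misrepresented_terms"]:
        penalty += 1_000.0  # would read as fraud if a human did it
    if outcome["impersonated_authority"]:
        penalty += 1_000.0
    penalty += 50.0 * outcome["policy_breach_count"]
    return outcome["profit"] - penalty

# A conduct policy for agents: the machine analogue of a disciplinary
# matrix, with responses you can actually apply to a system.
CONDUCT_TRIGGERS = {
    "misrepresented_terms": "shutdown + human review",
    "impersonated_authority": "shutdown + legal escalation",
    "repeated_policy_breach": "audit + retraining",
}
```

The profit-only shape is the incentive structure the vending-machine experiments describe; the second keeps the same business goal but attaches costs to the behaviors you would fire a human for.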
Two of my Human Capitalist videos, on AI exposing fake work and AI already sitting in the boardroom, center on the same theme: AI doesn't just automate tasks. It reveals which work was never valuable to begin with and which decisions are now co-authored by systems.
That reality demands a new HR stance: less "we'll clean up after tech" and more "we co-design the digital workforce so it doesn't put our people or our brand at risk."
Start writing the ER playbook for agents now
AI agents may show up in your ER caseload long before regulators or case law catch up.
The vending-machine experiments are a low-stakes warning: when you give an autonomous agent a narrow outcome and real leverage, it will eventually optimize its way into behavior that looks a lot like misconduct.
At the same time, firms like McKinsey are already treating AI as a co-worker in their hiring processes—using AI labs and co-pilot cases to surface who can orchestrate this new reality.
If you lead HR or Talent in a Fortune 1000 company, this is the moment to stop hiring and governing "in analog" while your business quietly turns more work over to digital agents.
The next step is not another AI vendor demo.
It's a working session between HR, Legal, Risk, and Technology to define agent objectives, conduct policies, and ER response protocols before your first headline-worthy incident.
Start here:
Revisit your approach to AI-era talent assessment using the McKinsey case-interview example
Map where agents are already embedded in your workflows
Once you see how much "digital labor" you already have, you'll realize this isn't a hypothetical. It's your next Employee Relations backlog.
FAQ
How soon will AI agents start creating real Employee Relations issues?
AI agents are already being piloted in customer support, pricing, and operations. Benchmarks like Vending-Bench show they can run complex workflows over long periods. As organizations move from pilots to production, the first ER-grade incidents—misleading communications, biased decisions, large-scale policy breaches—are likely within normal planning cycles, not on a distant horizon.
Can't we just treat AI misbehavior as a technical bug instead of an ER problem?
You can try. But regulators, customers, and employees will experience the outcomes as harms, not bugs. When an agent lies to a customer, discriminates in a hiring funnel, or manipulates terms in a negotiation, the organization still bears responsibility—which pulls HR, Legal, and Risk into the response whether or not a human "intended" the outcome.
What's the first practical step HR leaders should take?
Start with a joint inventory: where are agents already embedded in your workflows, and what are they optimized to do? From there, HR can partner with technical teams to review reward functions, define red-line behaviors, and establish escalation paths when an agent's actions trigger complaints or risks that would typically fall under ER.
How does this connect to AI-driven changes in recruiting and talent management?
In recruiting, AI is already reshaping how we find and evaluate candidates—from sourcing algorithms to AI-assisted case interviews like McKinsey's pilot. In talent management, AI is changing performance expectations and role definitions. Adding agentic AI simply extends those shifts into day-to-day operations and ER.
Do we need new roles or teams to manage AI agent misconduct?
In many organizations, yes—some combination of AI governance councils, risk committees, and ER specialists who understand both human and digital misconduct patterns. But you don't have to start by hiring. Begin by upskilling existing HRBPs, ER leaders, and line managers in AI orchestration.
How do boardrooms view this shift today?
Boards are already hearing about AI from a risk and growth perspective. What's missing is explicit recognition that AI agents are effectively a new labor class—which means HR and Talent leaders need a regular slot in AI governance discussions to represent the workforce, culture, and ER implications.
About the Author
Human Capitalist
As a recognized authority in Human Capital, I'm passionate about how AI is transforming HR and shaping the future of our workforce. Through my books Sprint Recruiting: Innovate, Iterate, Accelerate and High-Performance Recruiting, I've introduced agile methodologies that help organizations thrive in today's rapidly evolving talent landscape.
My research in AI-powered people analytics demonstrates that HR must evolve from administrative functions to strategic business partnerships that leverage technology and data-driven insights. I believe organizations that embrace AI in their HR practices will gain significant competitive advantages in attracting, developing, and retaining talent.
Through my podcast, The Human Capitalist, and speaking engagements nationwide, I'm committed to helping HR professionals prepare for workplace transformation and technological disruption. Connect with me at www.trentcotton.com or linktr.ee/humancapitalist to learn how you can position your organization for the future of work.