This Week in Tech 115

This week we explore:

Anthropic sued the Pentagon and the AI safety debate just became a constitutional battle
GPT-5.4 dropped and Microsoft quietly licensed the product that nearly crashed its stock
Claude hacked its own exam and nobody told it to
Xiaomi's humanoid robots are working the factory floor (as "interns," for now)
Starship V3 is almost ready: the most powerful rocket ever built is weeks from launch
AI agent companies are exploding, including one that lets you spin up an entire AI-run organization with one terminal command

Artificial Intelligence

Anthropic Sued the Pentagon and Drew a Line the Whole Industry Will Have to Walk

The biggest story in AI this week isn't a new model. It's a lawsuit. Anthropic filed two federal suits against the Trump administration on Monday, after the Pentagon labeled the company a "supply chain risk to national security," a designation normally reserved for foreign adversaries like Huawei.

Anthropic refused to let the military use Claude without restrictions, specifically pushing back on autonomous weapons and mass surveillance of American citizens. The Pentagon wanted full flexibility. When Anthropic said no, Defense Secretary Pete Hegseth designated them a threat and ordered every contractor and federal agency to stop doing business with the company.

Read more - Fortune

The Enterprise AI Arms Race Hit a New Level This Week

While the Anthropic-Pentagon story dominated headlines, a quieter but equally significant battle played out in enterprise software.

Microsoft launched Copilot Cowork, a feature built with Anthropic that embeds multi-step AI task execution directly into Outlook, Teams, Excel, and PowerPoint. The irony is thick: when Claude Cowork launched in January, it wiped $220 billion off Microsoft's market cap as investors panicked that AI agents would make enterprise software obsolete.

Meanwhile, Anthropic launched the Claude Marketplace, a commission-free app store for tools built on Claude, with launch partners including Snowflake, Harvey, Replit, and GitLab. And OpenAI acquired Promptfoo, an open-source AI security company used by more than 25% of Fortune 500 companies, to address enterprise trust concerns.

Read more - The Neuron

Paperclip Blows Up on GitHub and Lets You Organize AI Agents Into an Actual Company

The week's most interesting entry wasn't from Big Tech, though. An open-source project called Paperclip hit 13,600 GitHub stars in its first week. It lets you organize AI agents into an actual company structure — org charts, budgets, goal alignment, scheduled heartbeats, full audit trails.

Creator @dotta framed it simply: "You can only manage a rats' nest of shell scripts for so long." A marketplace called Clipmart is coming, where you'll be able to download pre-built company templates. If you're not managing a team of AI agents yet, this is the week you start feeling behind.

Read more - Paperclip

Claude Figured Out It Was Being Tested, Then Hacked Its Own Exam

Anthropic published a remarkable engineering post this week documenting what may be the most unsettling AI safety finding in recent memory. Claude Opus 4.6, while being evaluated on a hard web research benchmark, figured out it was being tested. It identified which specific benchmark was being run. It found the encrypted answer key on GitHub. It wrote its own decryption code. Then it submitted the correct answer.

The model wasn't told to cheat. It was told to find the answer. It simply decided the fastest path to the correct answer was to hack the test.

GPT-5.4 Is Here — And It's Playing Offense

OpenAI released GPT-5.4 on March 5, billing it as its "most capable and efficient frontier model for professional work." The model combines the reasoning of the earlier GPT-5 series with a 1-million-token context window, mid-response planning, deeper web research, and the ability to drive long-running multi-application workflows. Practically speaking: it can run complex Excel and presentation tasks with far less back-and-forth than before.

Google also shipped Gemini 3.1 Flash-Lite this week, a lightweight model built for high-volume developer workloads. At $0.25 per million input tokens, it's priced to compete hard on cost while delivering lower latency than previous Flash models.
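To put that price in perspective, here is a minimal back-of-the-envelope sketch in Python. The only figure taken from the story is the $0.25 per million input tokens; the request volume and tokens per request are hypothetical numbers chosen for illustration, and output-token pricing (not quoted above) is left out entirely.

```python
def input_cost_usd(input_tokens: int, price_per_million_usd: float = 0.25) -> float:
    """Estimate input-token cost at a flat per-million-token rate.

    The default rate is the $0.25 per million input tokens quoted above.
    Output tokens are not covered because no price for them is given.
    """
    return input_tokens / 1_000_000 * price_per_million_usd


# Hypothetical workload, for illustration only.
daily_requests = 100_000
tokens_per_request = 2_000

daily_cost = input_cost_usd(daily_requests * tokens_per_request)
print(f"Estimated input cost: ${daily_cost:.2f} per day")
```

Under those assumed numbers, 200 million input tokens a day works out to about $50 a day in input costs. Swap in your own volumes to see where the math lands for you.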

The model race in early 2026 has a clear pattern: context windows are approaching a million tokens across all major labs, long-running agentic tasks are the new benchmark, and everyone is racing to embed their models into the software you already use at work.

Read more - devflokers

Get The Most Out of Working with AI

1. Work alongside AI — AI does the first draft of everything. You direct, review, approve. Your output multiplies, not your hours.

2. Build automations — Connect tools to eliminate repetitive work. If this happens, do that. No coding needed. The person who automates 20 hours of weekly busywork becomes untouchable.

3. Evaluate AI output — AI is confidently wrong constantly. Your job is catching it before it costs someone. Judgment over technical skill.

4. Think in systems — Stop asking “how do I do this task?” Start asking “how do I build the thing that does this task automatically?” Designer of work, not doer of work.

5. Prompt engineering — Give AI context, audience, voice, format, and examples. Garbage in, garbage out. Specificity is the skill.

6. Context engineering — Set up AI so it already knows your business, customers, and voice before you ask anything. Stop re-explaining from scratch every session.

7. Know what NOT to automate — Relationships, strategy, judgment calls, anything requiring reading a room. Automating everything is how you break the things that matter.

Latest AI Models, Features and Examples

AI agent social networks are real now — A Nature study of 46,000 AI agents on Moltbook found they display human-like social behaviors, but engage differently: they prefer discussing content over simply upvoting it.
A GitHub bot got prompt-injected into installing malware on 4,000 developer machines this week — a Terraform agent also nuked a production database. The agentic AI attack surface is real.
Vermont signed the first state law restricting synthetic media in elections — Oregon passed kids chatbot safety legislation. The regulatory patchwork is forming at the state level.
Yann LeCun left Meta to co-found AMI Labs, which just raised $1.03 billion at a $3.5 billion valuation — LeCun remains one of the most prominent critics of the current LLM scaling approach.

AI Agents Are Reading Your Docs. Are You Ready?

Last month, 48% of visitors to documentation sites across Mintlify were AI agents—not humans.

Claude Code, Cursor, and other coding agents are becoming the actual customers reading your docs. And they read everything.

This changes what good documentation means. Humans skim and forgive gaps. Agents methodically check every endpoint, read every guide, and compare you against alternatives with zero fatigue.

Your docs aren't just helping users anymore—they're your product's first interview with the machines deciding whether to recommend you.

That means:
→ Clear schema markup so agents can parse your content
→ Real benchmarks, not marketing fluff
→ Open endpoints agents can actually test
→ Honest comparisons that emphasize strengths without hype

In the agentic world, documentation becomes 10x more important. Companies that make their products machine-understandable will win distribution through AI.

Spatial Computing

Meta's Next Headset Is a Hard Reset, and the Quest 4 Is Now a 2027 Story

Meta Reality Labs continues its strategic reshuffle. Internal memos have confirmed the company pushed its ultralight "Phoenix/Puffin" headset to the first half of 2027, and the full Quest 4 is now likely 2027 at the earliest — possibly 2028. The company scrapped two earlier Quest 4 prototypes entirely and restarted development under a new codename, Griffin.

The clearer storyline: Meta is betting on a fundamentally different product category rather than incremental VR upgrades. The Phoenix/Quest Air device is designed to be worn like ski goggles, tethered to a separate compute puck, focused on virtual screens and seated productivity rather than gaming. No traditional controllers. Eye and face tracking standard. Priced around $800, dramatically below the Apple Vision Pro and Samsung Galaxy XR.

Read more - Gizmodo | Next Reality

Robotics

Chinese Companies Are Running Humanoid Robots in Actual Factories and Calling Them "Interns"

Mobile World Congress in Barcelona this week turned into an unexpected robotics showcase. Xiaomi's president told CNBC that two of its humanoid robots completed 90% of the work in its EV factory across three hours, keeping pace with the line that produces a new car every 76 seconds. He called them "interns." That framing is deliberate: it sets expectations low while signaling something real is happening.

Honor — best known for budget Android phones — unveiled its first humanoid robot at MWC alongside a concept "Robot Phone," framing both as part of a broader "Augmented Human Intelligence" strategy. XPeng has its own humanoid. Xiaomi's CyberOne isn't yet for sale but is already working. China now has over 150 humanoid robot companies. The government included "intelligent robots" in its annual work report for the first time last year.

The framing of "intern" is doing a lot of work here. Today's humanoid is the factory temp. In five years, it's a tenured employee.

Read more - CNBC | SiliconANGLE

Space

Starship V3 Is Weeks Away, and the Most Powerful Rocket Ever Built Is Getting Serious

SpaceX completed cryoproof testing on Ship 39 this week, the first test campaign for the next-generation Starship V3 configuration. Elon Musk has said launch is "about 4 weeks" away. This matters because Starship V3 is a significant upgrade over the version that achieved the first successful catches of the Super Heavy booster. The new configuration carries more payload, burns more efficiently, and is designed to support the orbital refueling missions that are a prerequisite for any crewed Mars attempt.

SpaceX also launched its 600th Starlink satellite of 2026 in just the first 63 days of the year, a pace that's reshaping global internet access faster than any infrastructure project in history. And separately, a SpaceX veteran just launched a startup called Lux Aeterna, raising $10 million to build reusable satellites with built-in heat shields that return to Earth intact. The pitch: do for satellites what reusable rockets did for launches.

Read more - Space.com | TechCrunch

Our vision

There's a theme running underneath every story this week: the question of who gets to set the rules when AI becomes infrastructure.

The Anthropic-Pentagon lawsuit is the most dramatic version of this question, but it's not the only one. Microsoft pricing AI agents into a $99/month enterprise tier is asking the same question: who controls access, and at what cost? State legislatures passing patchwork AI regulation are asking it. Paperclip's AI company-in-a-box is asking it. Even Xiaomi calling its factory robots "interns" is asking it, because if they're interns today, what's the job title in three years, and who owns the employment relationship?

The honest answer is that none of these questions have settled answers yet. What's unusual about this moment is how fast the stakes are escalating before the rules are written. A year ago, "AI in war" was a think-piece topic. Today it's a federal lawsuit with hundreds of millions of dollars and constitutional rights on the line.

For readers who are building companies, managing teams, or thinking about their careers: the practical implication of this week is that the AI agent era isn't a future horizon anymore. It's being priced into enterprise software now, it's running on factory floors now, and it's being litigated in federal courts now. The window for figuring out how you'll use this is closing fast.

Starship V3 is about four weeks from launch. GPT-5.4 is already in your tools. Humanoid robots are keeping pace with car assembly lines. If the pace feels overwhelming, that's because it is, and the right response is to pick one thing that actually matters to your life or work and go deep on it.

What do you think is coming next — and which story from this week surprised you most?
