From MCP and Vibe Coding to Harness Engineering: How Did AI Native Engineering Evolve in One Year

Presenters

Birgitta Böckeler

Source

InfoQ podcast

The AI Evolution in Software Development: From Autocomplete to Autonomous Agents 🚀

The world of AI and software development is moving at breakneck speed. What felt like science fiction just a year ago is now becoming commonplace. In this conversation with Birgitta Böckeler, a Distinguished Engineer at Thoughtworks, we look at the rapid evolution of AI-powered coding tools and the emerging concepts shaping our development workflows.

From Autocomplete to Agentic Modes: A Year of Transformation 💡

Remember last year’s buzz around “vibe coding”? It was all about generating code snippets and stitching them together, feeling like a more integrated Stack Overflow. Tools like Cursor were leading the charge, with others playing catch-up.

Fast forward to today, and the landscape has shifted dramatically. Birgitta notes that what she termed “agentic modes” are now far more sophisticated. Last year, the focus was on concepts like autocomplete and rudimentary agents. Now, Claude Code has arguably become the most popular coding assistant, setting the pace for innovation. While Cursor remains relevant, its team is increasingly adopting ideas pioneered by Claude Code.

This shift highlights a key trend: the rapid development and adoption of AI in software delivery.

IDE vs. Terminal: A Matter of Preference and Power 💻

The debate between IDE-based and terminal-based coding agents continues. Claude Code’s popularity has led some to credit its terminal-first approach alone. However, Birgitta argues that the underlying technology is equally crucial.

- Terminal-based agents offer the significant advantage of headless operation, enabling integration into pipelines and background execution.
- IDE-based agents, like Cursor, provide a more visual and interactive experience. Birgitta finds the graphical interface invaluable for understanding what the agent is doing and for features like easily rolling back conversations.

Ultimately, the choice often comes down to personal preference and the specific situation. Designers, for instance, may find IDEs more intuitive. However, the rise of CLIs for tools like Cursor and GitHub Copilot means terminal-based capabilities are now accessible across the board, blurring the lines between the two.

Context Engineering: The Art of Guiding AI 🧠

A pivotal concept emerging is context engineering, which involves meticulously tuning the information an AI model receives to achieve better results. This is crucial for coding agents, where feeding them coding conventions, architectural guidelines, and business context significantly impacts their output.

- Context interfaces are how we provide AI with the tools it needs. These include built-in agent functions (like editing files or code search) and external tools.
- MCP servers, once popular for integrating with tools like Figma, are now often being replaced by skills. Skills allow for packaging multiple files, including scripts that can call APIs, offering a more efficient and flexible way to provide context.
- Skills are more efficient on the context window, as the agent only loads resources when it deems them relevant. This “lazy loading” conserves valuable context space.
- The shift towards skills and CLIs also aligns with the principle of not running unnecessary local processes when a system-level tool (like a CLI) already exists and can be invoked.
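
The “lazy loading” idea can be illustrated with a small sketch. Everything here is hypothetical and is not the API of any real agent: the `Skill` and `ContextBuilder` names are invented for illustration, and the keyword-overlap relevance check stands in for a decision that a real agent would make itself. The point is only that short descriptions are always in context, while the heavier skill body is loaded on demand.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """Hypothetical skill: a cheap description plus a loader for the heavy body."""
    name: str
    description: str
    load_body: Callable[[], str]  # only called when the skill looks relevant


@dataclass
class ContextBuilder:
    skills: List[Skill]
    cache: Dict[str, str] = field(default_factory=dict)

    def index(self) -> str:
        """Always-present part of the context: one short line per skill."""
        return "\n".join(f"{s.name}: {s.description}" for s in self.skills)

    def build(self, task: str) -> str:
        """Return the index plus the bodies of skills relevant to this task."""
        words = set(task.lower().split())
        bodies = []
        for skill in self.skills:
            # Toy relevance check (assumption): keyword overlap with the description.
            if words & set(skill.description.lower().split()):
                if skill.name not in self.cache:
                    self.cache[skill.name] = skill.load_body()
                bodies.append(self.cache[skill.name])
        return "\n\n".join([self.index(), *bodies])


load_calls: List[str] = []  # records which skill bodies were actually loaded


def make_loader(name: str, body: str) -> Callable[[], str]:
    def loader() -> str:
        load_calls.append(name)
        return body
    return loader


skills = [
    Skill("figma-export", "export design frames from figma",
          make_loader("figma-export", "FIGMA STEPS ...")),
    Skill("db-migrations", "write database schema migrations",
          make_loader("db-migrations", "MIGRATION STEPS ...")),
]
builder = ContextBuilder(skills)
context = builder.build("add a database column")
```

In this run only the `db-migrations` body is loaded; the Figma skill contributes a single index line and nothing else.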

Harness Engineering: Building Confidence in AI’s Output 🛠️

As AI agents gain more autonomy, ensuring confidence in their output becomes paramount. This is where harness engineering comes into play.

A “harness” encompasses everything except the model itself, orchestrating the agentic experience. In the context of coding agents, harness engineering involves:

- Feed-forward: Providing the agent with upfront information like coding conventions and architectural context to increase its chances of success.
- Feedback: Offering immediate, automated feedback to the agent for self-correction. This includes static code analysis, test suite results, and compiler errors.

Tools like language servers can enable IDE-style refactorings, moving beyond simple text diffs. While much of the current focus is on maintainability and internal code quality, Birgitta envisions extending harness engineering to areas like architecture fitness and even behavioral validation.
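
As a rough illustration of feed-forward and feedback working together, the sketch below runs a loop around a stand-in agent. All names (`run_harness`, `syntax_sensor`, `behaviour_sensor`, `scripted_agent`) are invented for this example, and the “agent” is a scripted function rather than a model; Python's built-in `compile` plays the role of the compiler and a single assertion plays the role of the test suite.

```python
from typing import Callable, List, Tuple

Sensor = Callable[[str], List[str]]  # takes source code, returns feedback messages


def syntax_sensor(source: str) -> List[str]:
    """Stand-in for a compiler: report syntax errors."""
    try:
        compile(source, "<agent-output>", "exec")
    except SyntaxError as err:
        return [f"syntax error on line {err.lineno}: {err.msg}"]
    return []


def behaviour_sensor(source: str) -> List[str]:
    """Stand-in for a test suite: check what add() actually returns."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        if namespace["add"](2, 3) != 5:
            return ["test failed: add(2, 3) should be 5"]
    except Exception as err:
        return [f"test errored: {err!r}"]
    return []


def run_harness(agent, task: str, guides: List[str],
                sensors: List[Sensor], max_rounds: int = 3) -> Tuple[str, int]:
    """Feed guides forward, then loop on sensor feedback until all sensors pass."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        source = agent(task, guides, feedback)
        feedback = []
        for sensor in sensors:
            feedback = sensor(source)
            if feedback:  # stop at the first failing sensor
                break
        if not feedback:
            return source, round_no
    raise RuntimeError(f"no passing result after {max_rounds} rounds")


# Scripted stand-in for a model: a syntax error, then a bug, then a fix.
attempts = iter([
    "def add(a, b) return a + b",
    "def add(a, b):\n    return a - b",
    "def add(a, b):\n    return a + b",
])
seen_feedback: List[List[str]] = []


def scripted_agent(task: str, guides: List[str], feedback: List[str]) -> str:
    seen_feedback.append(list(feedback))
    return next(attempts)


final_source, rounds = run_harness(
    scripted_agent,
    task="implement add(a, b)",
    guides=["use snake_case", "keep functions pure"],  # feed-forward
    sensors=[syntax_sensor, behaviour_sensor],         # feedback
)
```

The cheap sensor runs first, so the agent hears about the syntax error before any tests are executed; the behavioural check only fires once the code compiles.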

The Challenge of Behavior and the Future of Validation 🎯

A significant challenge remains in engineering harnesses for behavior. While generating tests and achieving a green test suite is a start, it doesn’t fully capture nuanced behavior. Birgitta highlights the need for more sophisticated feedback mechanisms, potentially involving AI-driven code reviews or advanced testing techniques like mutation testing.
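
Mutation testing, mentioned above, can be shown in miniature. The sketch below is a toy, not a real mutation testing tool: the function under test, the single `-` to `+` mutation, and the two suites are all invented for illustration. It shows why a green test suite alone says little: a weak suite stays green even when the code's behavior is deliberately broken.

```python
import ast

SOURCE = """
def price_with_discount(price, discount):
    return price - discount
"""


class SubToAdd(ast.NodeTransformer):
    """Mutation operator: replace every binary '-' with '+'."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.BinOp:
        self.generic_visit(node)
        if isinstance(node.op, ast.Sub):
            node.op = ast.Add()
        return node


def load_function(tree: ast.Module, name: str):
    """Compile a module AST and return one function from it."""
    namespace: dict = {}
    exec(compile(tree, "<module>", "exec"), namespace)
    return namespace[name]


def weak_suite(fn) -> bool:
    # A zero discount cannot tell '-' from '+'.
    return fn(100, 0) == 100


def strong_suite(fn) -> bool:
    return fn(100, 0) == 100 and fn(100, 20) == 80


def mutant_survives(suite) -> bool:
    """True if the suite still passes on the mutated code (a bad sign)."""
    mutant_tree = ast.fix_missing_locations(SubToAdd().visit(ast.parse(SOURCE)))
    mutant = load_function(mutant_tree, "price_with_discount")
    return suite(mutant)


original = load_function(ast.parse(SOURCE), "price_with_discount")
weak_is_green = weak_suite(original)
strong_is_green = strong_suite(original)
weak_misses_mutant = mutant_survives(weak_suite)
strong_misses_mutant = mutant_survives(strong_suite)
```

Both suites are green on the original code, but only the stronger one kills the mutant. A surviving mutant is a signal a harness could feed back to the agent as “your tests do not pin down this behavior”.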

The resurgence of formal validation methods from academia, such as deterministic simulation testing, is promising. However, bridging the gap to pragmatic, empirical solutions is key. At the speed AI now produces code, validation tooling has to keep pace.

Navigating the Codebase: Context and Privacy Concerns 🌐

Feeding our existing codebases into AI models as context is essential. Current approaches include:

- Code search tools: Many agents offer basic text-based search (grep, glob).
- Embeddings: Converting code into embeddings for local semantic search, as seen in tools like Cursor.
- Language servers: Enabling more sophisticated code navigation and refactoring.
- Integrated code search: Products like Amp by Sourcegraph and GitHub Copilot integrate advanced code search across multiple repositories.
- Graph-based analysis: Loading codebases into graphs, enriched with data from tickets, Slack history, and wikis, to provide richer semantic context.
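
The simplest of these approaches, text search with grep and glob, can be sketched in a few lines. This is an illustrative stand-in, not how any particular agent implements its search tool: the `search` function and the in-memory `files` mapping are assumptions made so the example is self-contained.

```python
import fnmatch
import re
from typing import Dict, List, Tuple


def search(files: Dict[str, str], glob: str, pattern: str) -> List[Tuple[str, int, str]]:
    """Return (path, line number, line) for regex hits in files matching the glob."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(files):
        if not fnmatch.fnmatchcase(path, glob):  # glob step: filter by file name
            continue
        for line_no, line in enumerate(files[path].splitlines(), start=1):
            if regex.search(line):               # grep step: filter by content
                hits.append((path, line_no, line.strip()))
    return hits


# Hypothetical mini codebase held in memory instead of on disk.
files = {
    "src/billing/invoice.py": "def total(items):\n    return sum(i.price for i in items)\n",
    "src/billing/README.md": "total() sums item prices\n",
    "src/cart.py": "from billing.invoice import total\n",
}

hits = search(files, "*.py", r"\btotal\b")
```

Text search finds both the definition and the import, but it only matches names. It cannot tell whether two occurrences refer to the same symbol, which is the gap that language servers, embeddings, and graph-based approaches aim to close.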

However, the question of code privacy remains a significant concern. While settings exist to control how code is used for training, the default configurations and the lack of clear opt-in mechanisms have sparked widespread debate. Birgitta hopes for more transparency and clearer communication from AI providers.

Predictions and the Road Ahead 🛣️

While Birgitta avoids firm predictions, she anticipates:

- Filling in the gaps of harness engineering: Expect more specialized tools and techniques to emerge.
- Uncertainty around costs: The current AI landscape is heavily subsidized, making future cost predictions difficult. IPOs from companies like Anthropic may shed more light.
- More public stories of AI coding fails: These are crucial for learning and understanding the limitations and risks of AI in development.
- More elaborate and specialized tooling: Expect APIs designed specifically for agents and more empowered documentation to foster a richer ecosystem.

The journey of AI in software development is far from over. It’s a dynamic, evolving space where continuous learning and adaptation are essential. The key lies in strategically leveraging these powerful tools while maintaining a critical eye and focusing on building quality, reliable software.
