Vibe Coding in Defense Software

“There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. [...] I ‘Accept All’ always, I don’t read the diffs anymore.” (Andrej Karpathy, 2025)

Executive Summary: Andrej Karpathy’s description of vibe coding captures a major shift in software development. In traditional software engineering, developers write code line by line, review changes carefully, and build systems step by step. In vibe coding, users describe the system they want to build in natural language, while generative AI translates that intent into code, interfaces, data models, tests, or documentation. In this sense, software development is moving from a code-centric activity toward an intent-driven engineering process.

For the defense industry, this transformation creates significant opportunities. Command-and-control systems, logistics, maintenance, cybersecurity, decision support, and autonomous platforms are becoming increasingly software-intensive. As a result, the speed of software development is no longer only a matter of engineering productivity; it is also part of strategic adaptation capacity. However, the fact that AI-generated code works does not necessarily mean that it is secure, correct, or sustainable. Therefore, vibe coding should not be treated as uncontrolled automation in defense software, but as a capacity multiplier that must be combined with human oversight, security testing, and risk-based governance.

From Code to Intent: What Makes Vibe Coding Different?

The history of software development can be understood as the gradual abstraction of communication between humans and machines. From machine code to high-level programming languages, and from AI-assisted coding tools to vibe coding, each stage has introduced a new layer of abstraction. The key difference today is that not only code, but also intent, is becoming a software interface.

In the vibe coding approach, a developer or domain expert describes what the system should do in natural language. The AI system can then transform that description into working code, a user interface, a data model, test scenarios, or technical documentation. As a result, the human role shifts from writing every line of code to defining the problem, setting boundaries, and evaluating the output.

This shift does not mean that software engineering skills disappear. Instead, the center of gravity changes. In traditional software development, the key question is “How do I write this code?” In vibe coding, the key question becomes “How do I describe what I want clearly, correctly, testably, and securely?” Ambiguous intent produces ambiguous code, and ambiguous code may create security and maintainability risks.

**Figure 1:** Evolution from code-centric development to intent-driven workflows.

LLM, AI Coding, Vibe Coding, and Agentic Coding

To position vibe coding correctly, it is important to distinguish it from related concepts. A large language model, or LLM, is the technological engine behind this transformation. It generates text or code by predicting the next token based on context. However, an LLM is not itself a software development method; it is the underlying capability that enables new development workflows.

AI coding refers to the integration of LLMs into development environments for tasks such as code completion, bug fixing, function generation, or explanation. In this stage, the developer remains the primary author, while AI acts as an assistant or co-pilot.

Vibe coding goes one step further. The user expresses intent in natural language, and the AI system generates code, interfaces, or application structures based on that intent. In this model, the human role moves from direct authorship toward supervision and evaluation.

The next stage can be described as agentic coding. In agentic coding, AI does not only generate code; it can also plan, use tools, run tests, interpret errors, and iterate toward a solution. This evolution does not remove the human from the process. On the contrary, it changes the human role from writer to reviewer, supervisor, and orchestrator. This distinction is especially important for defense software: while vibe coding may be useful for low-risk internal tools, AI-generated outputs should not be directly accepted in safety-critical or mission-critical systems.
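The generate, test, interpret, and iterate cycle described above can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not any vendor's agent: `propose_fix` is a hypothetical stand-in for a model call, the "test suite" is a single assertion, and the `needs_human_review` flag is an assumed convention that the loop itself never clears.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AgentResult:
    code: str
    attempts: int
    tests_passed: bool
    needs_human_review: bool = True  # assumed convention: the agent never clears this


def run_tests(code: str) -> Optional[str]:
    """Execute candidate code plus one small test; return an error message or None."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # illustration only; a real agent needs a sandbox
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def propose_fix(previous: str, error: Optional[str]) -> str:
    """Hypothetical stand-in for a model call."""
    if error is None:
        return "def add(a, b):\n    return a - b\n"  # first draft, deliberately flawed
    return "def add(a, b):\n    return a + b\n"      # revised after seeing the error


def agent_loop(generate: Callable[[str, Optional[str]], str],
               max_attempts: int = 3) -> AgentResult:
    code, error = "", None
    for attempt in range(1, max_attempts + 1):
        code = generate(code, error)   # generate or revise
        error = run_tests(code)        # run tests and capture the failure, if any
        if error is None:
            return AgentResult(code, attempt, tests_passed=True)
    return AgentResult(code, max_attempts, tests_passed=False)


result = agent_loop(propose_fix)
print(result.attempts, result.tests_passed, result.needs_human_review)
```

The iteration cap and the review flag are the two points where a human supervisor keeps control: the loop cannot run indefinitely, and passing tests alone does not mark the output as accepted.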

**Figure 2:** Relationship between LLMs, AI coding, vibe coding, and agentic coding.

Why Speed Matters in Defense Software

AI-assisted coding tools can provide significant productivity gains in software development. In a controlled experiment on GitHub Copilot, Peng et al. reported that developers using Copilot completed a programming task 55.8% faster than the control group. This finding suggests that AI-assisted coding tools can offer meaningful time advantages, especially in prototyping and repetitive development tasks.

Larger-scale field studies have also examined the impact of generative AI on software developers. Cui et al., drawing on field experiments conducted at Microsoft, Accenture, and an anonymous Fortune 100 company, report that developers given access to a generative AI coding assistant completed roughly 26% more tasks, which indicates measurable productivity effects in high-skilled software development work.

For defense software, speed is not merely a technical convenience. In a rapidly changing threat environment, slow software cycles may create operational obsolescence. Software capability is directly linked to operational effectiveness in areas such as command-and-control, logistics, maintenance, cybersecurity, decision support, and autonomous systems.

Therefore, AI-assisted development can provide significant value in low-risk components such as internal tools, maintenance dashboards, reporting modules, test panels, and prototype interfaces. The real value of vibe coding is not only that it produces code faster, but also that it shortens the path from idea to prototype and enables domain experts to participate more directly in the software development process.

Working Code Is Not Always Safe Code

The most critical dimension of vibe coding is security. Large language models do not “understand” code with human-like engineering responsibility. They generate likely outputs based on statistical patterns learned from training data. As a result, code that appears to work may still contain vulnerabilities, incorrect assumptions, missing validation, weak access control, or maintainability problems.
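A small example makes the gap between "works" and "secure" concrete. The sketch below uses Python's standard `sqlite3` module; the table, the function names, and the input strings are invented for illustration. Both lookup functions return the right answer for ordinary input, but only one of them treats user input strictly as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "operator"), ("bob", "admin")])


def find_user_unsafe(name: str) -> list:
    # Works for ordinary input, but splices untrusted text into the SQL statement.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Parameterized query: the input is bound as a value, never parsed as SQL.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(find_user_unsafe("alice"))          # looks correct in a quick demo
print(len(find_user_unsafe(malicious)))   # returns every row in the table
print(len(find_user_safe(malicious)))     # returns no rows
```

A functional test that only calls the lookup with "alice" would pass for both versions, which is why a passing demo says little about missing validation.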

Academic studies highlight this concern. Pearce et al. analyzed the security of code generated by GitHub Copilot across 89 different security scenarios and 1,689 generated programs. They found that approximately 40% of the generated programs contained security vulnerabilities. This indicates that security review is not optional in AI-assisted software development; it is a necessary part of the process.

Similarly, Perry et al. found that participants using AI coding assistants could write less secure code in some tasks, while also being more likely to believe that their code was secure. This points to another important risk in AI-assisted development: not only technical vulnerabilities, but also a false sense of security.

In defense software, “the code works” cannot be an adequate acceptance criterion. Code must be secure, testable, traceable, and auditable. AI-generated outputs should therefore be evaluated through human review, automated testing, security checks, and institutional acceptance criteria.
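One way to make these criteria operational is an explicit acceptance gate that refuses a change unless every kind of evidence is present. The sketch below is an assumption-laden illustration: the `Evidence` fields, the criterion names, and the requirement identifier are hypothetical and would map onto an organization's own pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    tests_passed: bool
    security_scan_clean: bool
    reviewer: str        # empty string: no human has signed off
    requirement_id: str  # empty string: change cannot be traced to a requirement


def unmet_criteria(evidence: Evidence) -> list[str]:
    """Return the acceptance criteria that are not yet satisfied."""
    unmet = []
    if not evidence.tests_passed:
        unmet.append("automated testing")
    if not evidence.security_scan_clean:
        unmet.append("security checks")
    if not evidence.reviewer:
        unmet.append("human review")
    if not evidence.requirement_id:
        unmet.append("traceability")
    return unmet


# Code that "works" (tests pass) but lacks a clean scan and a reviewer is rejected.
working_only = Evidence(tests_passed=True, security_scan_clean=False,
                        reviewer="", requirement_id="REQ-104")
print(unmet_criteria(working_only))
```

An empty list is the only accepting state, so "the code works" contributes one criterion out of four rather than deciding the outcome.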

Risk-Based Adoption and Governance

For vibe coding to be used safely in defense software, software components cannot all be treated as having the same criticality. Each use of AI-assisted code generation should instead be classified according to the data sensitivity, operational impact, and system risk of the component it targets.

In low-risk areas, vibe coding can be highly useful for documentation, test scenario generation, reporting, internal tools, and prototyping. In medium-risk areas, such as internal portals, logistics integrations, or maintenance tracking systems, AI-generated outputs may be used, but human oversight, access control, error handling, and security testing must be mandatory. In high-risk areas such as flight control software, cryptographic components, safety-critical control code, and operational decision mechanisms, direct vibe coding is not appropriate.
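The three tiers above can be expressed as a small classification table. The sketch below is illustrative only: the rule that a component is as risky as its riskiest dimension, the baseline control for the low tier, and the example components' scores are assumptions, while the medium-tier controls and the high-tier prohibition follow the text.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Controls per tier. MEDIUM follows the text; LOW is an assumed baseline.
CONTROLS = {
    Risk.LOW: ["human review"],
    Risk.MEDIUM: ["human oversight", "access control",
                  "error handling", "security testing"],
}


def classify(data_sensitivity: Risk, operational_impact: Risk,
             system_risk: Risk) -> Risk:
    """Assumed rule: a component is as risky as its riskiest dimension."""
    return max(data_sensitivity, operational_impact, system_risk)


def policy(tier: Risk) -> str:
    if tier is Risk.HIGH:
        return "direct AI generation not permitted"
    return "AI generation allowed with: " + ", ".join(CONTROLS[tier])


# Hypothetical scores for three example components.
dashboard = classify(Risk.LOW, Risk.LOW, Risk.LOW)
logistics = classify(Risk.MEDIUM, Risk.MEDIUM, Risk.LOW)
flight_control = classify(Risk.LOW, Risk.HIGH, Risk.HIGH)
for tier in (dashboard, logistics, flight_control):
    print(tier.name, "->", policy(tier))
```

Taking the maximum means a component with non-sensitive data can still land in the high tier because of its operational impact, as with the flight control example.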

However, risk-based adoption cannot rely on human oversight alone. AI-assisted development must also be supported by technical governance layers. In this context, Rules, Skills, and MCP become important.

Rules provide a persistent policy layer for AI systems. They can define which data must not be used, which coding standards must be followed, which files cannot be modified, which security checks are mandatory, and which outputs require human approval. This ensures that AI does not operate only according to the user’s immediate prompt, but also within institutional engineering and security standards.
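A rules layer of this kind can be pictured as a check that runs against every proposed change, independent of the prompt that produced it. The sketch below is a hypothetical format, not the configuration syntax of any particular tool: the `RULES` dictionary, the path patterns, and the check names are invented for illustration.

```python
from fnmatch import fnmatch

# Hypothetical institutional rules; real tools define their own formats.
RULES = {
    "protected_paths": ["crypto/*", "flight_control/*"],
    "mandatory_checks": {"static_analysis", "unit_tests"},
}


def violations(changed_files: list[str], completed_checks: list[str]) -> list[str]:
    """List every rule that a proposed change set breaks."""
    found = []
    for path in changed_files:
        for pattern in RULES["protected_paths"]:
            if fnmatch(path, pattern):
                found.append(f"protected path modified: {path}")
    missing = RULES["mandatory_checks"] - set(completed_checks)
    for check in sorted(missing):
        found.append(f"missing check: {check}")
    return found


proposed = ["tools/report.py", "crypto/keys.py"]
print(violations(proposed, completed_checks=["unit_tests"]))
```

Because the rules live outside the prompt, a user cannot relax them by rephrasing a request; the change is blocked until the violation list is empty or a human explicitly approves an exception.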

Skills can be understood as reusable expertise modules for specific tasks. For example, separate skills can be defined for security review, test generation, performance optimization, documentation, or code review. This allows the AI system to operate within a defined expert framework instead of being prompted from scratch for every task.

MCP, or Model Context Protocol, can be considered a controlled interface between AI systems and institutional tools, data sources, and development environments. It helps define which systems AI can access, with what permissions, and within which data boundaries. For defense software, this layer is critical for balancing speed with security.
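The permission-boundary idea can be illustrated with a simple allowlist gateway. The sketch below does not use an MCP SDK and does not reproduce the protocol's messages; `ToolGateway`, the tool names, and the action names are hypothetical and only show the principle of default-deny access with an audit trail.

```python
from dataclasses import dataclass, field


@dataclass
class ToolGateway:
    allowed: dict[str, set[str]]  # tool name -> permitted actions
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def request(self, tool: str, action: str) -> bool:
        # Default deny: unknown tools and unlisted actions are refused.
        permitted = action in self.allowed.get(tool, set())
        self.audit_log.append((tool, action, permitted))  # record every attempt
        return permitted


gateway = ToolGateway(allowed={
    "issue_tracker": {"read"},
    "test_runner": {"read", "execute"},
})
print(gateway.request("test_runner", "execute"))   # listed tool, listed action
print(gateway.request("issue_tracker", "write"))   # listed tool, unlisted action
print(gateway.request("prod_database", "read"))    # unlisted tool
print(len(gateway.audit_log))
```

Logging denied requests alongside permitted ones supports the traceability goal: reviewers can see what the AI system attempted as well as what it was allowed to do.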

Therefore, reliable vibe coding cannot be achieved simply by using a powerful model. Reliability emerges from the combined design of rules, reusable expertise, controlled tool access, testing processes, and human approval mechanisms. With these layers, vibe coding becomes more than a fast code generation practice; it becomes an auditable, traceable, and institutionally governed engineering workflow.

**Figure 3:** Governance layers for secure AI-assisted defense software.

Conclusion

Vibe coding represents an important transformation in software development: the shift from code to intent. When used appropriately, it can shorten the path from idea to prototype, bring domain experts closer to software production, and increase the capacity of engineering teams.

However, the speed offered by vibe coding cannot replace security and quality processes. The fact that AI-generated code works does not mean that it is secure or sustainable. For the defense industry, the key question is not simply “Should AI be used?” The real question is: with which data, at which risk level, and under which governance model should it be used?

In conclusion, vibe coding can become a powerful capacity multiplier for defense software development. Yet this capacity becomes valuable only when combined with security, traceability, human oversight, and institutional governance. The future of software development will not be determined only by how much code AI can generate, but by how effectively humans can guide, evaluate, and govern that generation.

References

Peng, S., Kalliamvakou, E., Cihon, P., & Demirer, M. (2023). The Impact of AI on Developer Productivity: Evidence from GitHub Copilot. arXiv:2302.06590.
Pearce, H., Ahmad, B., Tan, B., Dolan-Gavitt, B., & Karri, R. (2022). Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions. IEEE Symposium on Security and Privacy.
Perry, N., Srivastava, M., Kumar, D., & Boneh, D. (2023). Do Users Write More Insecure Code with AI Assistants? ACM Conference on Computer and Communications Security (CCS).
Cui, K. Z., Demirer, M., Jaffe, S., Musolff, L., Peng, S., & Salz, T. (2025). The Effects of Generative AI on High-Skilled Work: Evidence from Three Field Experiments with Software Developers. Working Paper.
Karpathy, A. (2025). Post on “vibe coding.” X (formerly Twitter), February 2, 2025.