Research Article |
|
Corresponding author: Kunter Orpak ( kunterorpak@gmail.com ) Academic editor: Annemarie Oord
© 2025 Kunter Orpak.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits to copy and distribute the article for non-commercial purposes, provided that the article is not altered or modified and the original author and source are credited.
Citation:
Orpak K (2025) Generative AI and cybersecurity: Exploring opportunities and threats at their intersection. Maandblad voor Accountancy en Bedrijfseconomie 99(4): 221-230. https://doi.org/10.5117/mab.99.149299
|
Generative AI, particularly large language models (LLMs), is reshaping the cybersecurity landscape by enabling both innovative defense mechanisms and novel forms of attack. This article explores the dual role of generative AI in both offensive and defensive cybersecurity operations. While GenAI offers significant advancements in defensive capabilities, it is also being leveraged by nation-state actors to enhance the sophistication and success rates of cyberattacks. The article analyzes how LLMs are applied in offensive engagements such as red teaming, penetration testing, and threat intelligence, while also identifying emerging technical, operational, and strategic risks associated with their deployment. Special attention is given to the cybersecurity challenges of generative AI systems themselves, highlighting limitations in conventional frameworks and proposing governance-oriented mitigations such as model evaluation, human-in-the-loop oversight, GenAI-specific red teaming, and the structured dissemination of threat intelligence derived from GenAI-enabled security practices.
Generative AI, cybersecurity, AI risk management, LLM security, AI governance, cyber threat intelligence, adversarial AI attacks, penetration testing, AI in offensive security, AI in cyber defense, AI red teaming, AI compliance in audit
As generative AI systems rapidly integrate into business and IT environments, internal auditors, internal control specialists, and IT audit professionals should understand their cybersecurity implications. This article explores the intersection of generative AI and cybersecurity, providing insights into both opportunities and risks. By examining AI-driven offensive and defensive security applications, associated threats, and mitigation strategies, the article equips professionals with the knowledge to assess and manage AI-related cyber risks in organizations.
Generative AI differs from other AI models primarily in its ability to generate novel content rather than just analyzing or acting on existing data. Traditional AI models typically use specific data to solve specific problems and generate specific answers based on input data. In contrast, generative AI models, like large language models (LLMs), are capable of creating new and original content by mapping input information into a high-dimensional latent space and driving stochastic behavior to produce novel outputs even with the same input stimuli (
In this article, “offensive cybersecurity” refers to proactive security testing methods such as penetration testing, red teaming and threat simulation, aimed at identifying vulnerabilities before malicious actors exploit them. In contrast, “defensive cybersecurity” encompasses technologies and processes focused on prevention, detection, response, and recovery from cyber threats.
This article is structured as follows: Section 2.1 explores the role of Generative AI in offensive cybersecurity, while Section 2.2 examines its applications in cyber defense. Section 3 highlights key risks associated with AI-driven security practices, followed by Section 4 discussing governance and mitigation strategies. Finally, the conclusion summarizes insights for internal audit, IT audit and internal control professionals.
The relationship between Generative AI and cybersecurity can be outlined across four distinct categories (
Generative AI has significantly impacted offensive cybersecurity by increasing the sophistication and scale of cyber threats by allowing more complex and varied types of cyber-attacks (
Traditional cybersecurity offensive testing methods, like red teaming and penetration testing, are often time and resource-intensive, necessitating the adoption of specialized tools and algorithms for improved efficiency. Integrating LLMs into the red team testing process offers new opportunities to enhance efficiency, precision, and cost-effectiveness by automating complex tasks, improving decision-making, and providing real-time insights during engagements (
The use of LLMs in the threat intelligence phases of offensive security tests significantly enhances the accuracy and speed of information extraction and analysis. LLMs can automate the extraction and summarization of important information from large datasets, such as historical cyber incident reports, thereby improving the accuracy of threat intelligence and the ability to forecast future threats (
LLMs, particularly ChatGPT, have high potential to enhance cyberattacks done by individuals with entry-level skills (
LLM agents are valuable tools for the reconnaissance phase of penetration tests (
Through neural machine translation, LLMs can effectively generate syntactically and semantically correct software exploits from natural language descriptions, though minor errors prevent full automation, indicating great potential (
LLMs can effectively facilitate cyber offensive attacks, specifically generating viruses and polymorphic malwares (
There are some early successful examples of LLM applications illustrating the potential of LLM-based automation to transform cybersecurity by reducing manual effort, enhancing accuracy, and enabling comprehensive threat assessment. PTHelper (Gracia and Sánchez-Macián 2024) streamlines the penetration testing process by automating transitions between phases, using modules for scanning, exploiting, natural language processing, and reporting, demonstrating effectiveness in both black-box and controlled environments. PentestGPT (
Figure
In the context of cyber defense, LLMs excel in tasks such as threat detection, vulnerability analysis, and automated defense mechanisms (
LLMs hold transformative potential in the field of cybersecurity defensive operations, offering significant advancements across various applications including threat intelligence, cybersecurity risk monitoring, vulnerability management, static malware analysis, dynamic debugging, anomaly detection and behavior analysis, web content security, phishing and spam detection, digital forensic, fuzz testing, program repairing, secure code generation, honeypots, and incident response and recovery (
A recent comprehensive review by
In the current cyber security landscape various real-world cybersecurity products are being used in the cybersecurity operations that leverage Generative AI to enhance security measures (
Figure
The risks associated with Generative AI can be classified under 3 categories particularly operational, technical and lastly systemic and strategic risk. These categories are derived from a synthesis of academic literature in this article.
As shown in Figure
While LLMs have significant potential in cybersecurity, particularly in threat intelligence process, they are not yet perfectly accurate due to hallucination where LLMs generate false information (
First major challenge is the potential for adversarial attacks, where malicious actors can exploit LLM models by feeding them deceptive inputs to manipulate their outputs (
Model theft is another critical risk, where unauthorized entities gain access to and replicate AI models, undermining proprietary technologies and security protocols (
Although many developers have adopted AI technology in their workflows and generally find the code provided by AI to be usable and fairly accurate, there is caution regarding AI-generated code being insecure and inaccurate in coding and scripting practices (
Despite the deployment of undisclosed defenses by service providers, LLM agents are vulnerable to jailbreak attacks, where malicious prompts can manipulate these models to bypass their safeguards and generate harmful or sensitive content (
An important concern regarding the systemic risks of LLMs is about their self-replication capability. AI systems driven by LLMs such as Meta’s Llama31-70B-Instruct and Alibaba’s Qwen25-72B-Instruct were demonstrated to be able to autonomously create separate copies of themselves, which could lead to the uncontrolled proliferation of AI systems, potentially forming independent networks that might act against its usage purpose (
The inherent black-box nature of LLMs presents significant challenges in understanding and controlling their operations, which raises critical concerns about transparency and accountability (
Overreliance on content generated by LLMs poses significant risks, particularly due to the difficulty in detecting incorrect or misleading information produced by these models (
The wide use of LLM based agents in the IT landscape has widened the exploit surface available to attackers. This expansion is driven by several risk factors outlined in Section 3 of this article, including vulnerabilities inherent to generative AI models, the increased feasibility of adversarial and jailbreak attacks, and the misuse of LLMs for generating potentially harmful or exploitable code. It is necessary to address cybersecurity risks specific to generative AI systems, on the top of traditional cybersecurity practices.
LLMs can amplify existing security risks and introduce new ones, emphasizing the need for a thorough understanding of the system’s capabilities and applications. About the amplified cybersecurity risks, Microsoft has published its early lessons learned from its red teaming of 100 generative AI products (
Figure
A critical enabler of this is AI governance, which plays a key role in ensuring secure and ethical AI deployment. It addresses systemic concerns such as algorithmic bias, data privacy, transparency, and responsible use of AI technologies (
Human oversight in Gen AI operations remains equally vital, since it involves assessing Gen AI safety questions that require emotional intelligence and understanding the full range of interactions users might have with Gen AI systems (
Furthermore, given that LLMs can amplify existing security risks and introduce new ones, Gen AI red teaming is a crucial practice for assessing the safety and security of Gen AI systems, as it pushes beyond model-level safety benchmarks by emulating real-world attacks against end-to-end systems (
Finally, aligning various cybersecurity efforts, including Gen AI red teaming, with real-world risks is indispensable, which necessitates the dissemination of insights and threat intelligence gathered from extensive cybersecurity practices (
A new dimension in securing LLMs is the integration of LLMs into red teaming practices, such as an automated red teaming LLM agent that simulates adversarial conversations with LLMs, leveraging multiple adversarial prompting techniques, allowing for scalable and efficient stress-testing of known vulnerabilities, thus freeing human testers to explore new risk areas (
Generative AI, particularly LLMs, has rapidly emerged as a powerful tool in cybersecurity, benefiting both cyber defenders and adversaries. On one hand, cybersecurity professionals leverage LLMs to enhance penetration testing, red teaming, and threat intelligence-driven security tests, enabling faster, more sophisticated, and cost-effective offensive security operations. On the other hand, malicious actors exploit the same technology to automate cyberattacks, craft advanced phishing campaigns, and develop polymorphic malware, expanding the cyber threat landscape. The cybersecurity community maintains a balanced perspective on the adoption of LLMs, recognizing both their value in strengthening defense operations and the significant challenges, risks, and potential for misuse they introduce. By openly addressing both the offensive and defensive capabilities of generative AI, this article aims to equip professionals with the knowledge to anticipate threats, not to support their misuse. Responsible innovation and risk-informed governance remain essential.
As AI-driven cyber security applications evolve, so do the risks and regulatory challenges associated with their use. Generative AI introduces vulnerabilities such as model exploitation, adversarial attacks, jailbreak exploits, and biased or unreliable outputs, which could undermine security efforts if not properly managed. While AI provides remarkable efficiencies, it also increases organizations’ exploit surfaces, requiring new control frameworks and continuous risk assessment.
For internal auditors, IT auditors, and internal control professionals, generative AI is not just an IT concern but a governance and risk management issue. To mitigate the risks associated with generative AI, these professionals can play a key role by assessing whether appropriate AI governance frameworks are in place and integrating AI-specific risks into enterprise risk management and audit plans. These professionals should understand the implications of generative AI in cybersecurity, ensuring that organizations harness AI’s benefits while mitigating its risks. By balancing innovation with security, they can contribute to the responsible adoption of AI, strengthen ethical AI governance, and ensure compliance with evolving regulatory standards. As generative AI continues to shape the cybersecurity domain, the key challenge will be ensuring AI remains an asset rather than a liability. By proactively addressing the risks and opportunities of AI in security, professionals across cybersecurity, audit, and internal control fields can play a pivotal role in securing the AI-driven future.
This article aimed to highlight the intersection between generative AI and cybersecurity, focusing on both opportunities and associated risks. Future research could further explore how audit, risk, and internal control functions can enhance the cybersecurity assurance of GenAI systems. This includes examining control frameworks, audit methodologies, and regulatory compliance strategies tailored to the unique characteristics of AI-based technologies.
K. Orpak RE CISSP CCSP CISA CIA ISO27001LA CSX-F CDPO CFSA CCSA – Kunter, Senior Supervision Officer – DORA TLPT / TIBER-EU Test Manager, Dutch Authority for the Financial Markets (AFM). PhD Researcher, Faculty of Economics and Business, University of Amsterdam. This article has been written within the scope of his academic affiliation with the University of Amsterdam.
The author confirms having no financial interests or conflicts of interest related to the subject matter or materials discussed in this article.
The author acknowledges the use of ChatGPT-4o, an advanced language model, to assist in the linguistic refinement and structural improvements of this manuscript. The tool was used solely for linguistic and structural refinement; all conceptual contributions, critical analysis, and findings are entirely the author’s own.