<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "https://mab-online.nl/nlm/tax-treatment-NS0.dtd">
<article xmlns:tp="http://www.plazi.org/taxpub" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">69</journal-id>
      <journal-id journal-id-type="index">urn:lsid:arphahub.com:pub:8D21F818-6EEF-540F-91C7-D50E3E5A13E0</journal-id>
      <journal-title-group>
        <journal-title xml:lang="en">Maandblad voor Accountancy en Bedrijfseconomie</journal-title>
        <abbrev-journal-title xml:lang="en">MAB</abbrev-journal-title>
      </journal-title-group>
      <issn pub-type="ppub">0924-6304</issn>
      <issn pub-type="epub">2543-1684</issn>
      <publisher>
        <publisher-name>Amsterdam University Press</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5117/mab.100.169964</article-id>
      <article-id pub-id-type="publisher-id">169964</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>MAB-scriptieprijs</subject>
        </subj-group>
        <subj-group subj-group-type="scientific_subject">
          <subject>Corporate governance (Corporate governance)</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Hidden in plain sight: Unraveling compensation disclosure bloat with generative AI and its impact on executive compensation</article-title>
      </title-group>
      <contrib-group content-type="authors">
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Burduli</surname>
            <given-names>Lizi</given-names>
          </name>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
        <contrib contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Kramer</surname>
            <given-names>Stephan</given-names>
          </name>
          <email xlink:type="simple">skramer@rsm.nl</email>
          <uri content-type="orcid">https://orcid.org/0000-0001-6055-8223</uri>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
      </contrib-group>
      <aff id="A1">
        <label>1</label>
        <addr-line content-type="verbatim">RSM Erasmus University, Rotterdam, Netherlands</addr-line>
        <institution>RSM Erasmus University</institution>
        <addr-line content-type="city">Rotterdam</addr-line>
        <country>Netherlands</country>
        <uri content-type="ror">https://ror.org/057w15z03</uri>
      </aff>
      <author-notes>
        <fn fn-type="corresp">
          <p>Corresponding author: Stephan Kramer (<email xlink:type="simple">skramer@rsm.nl</email>).</p>
        </fn>
        <fn fn-type="edited-by">
          <p>Academic editor: Paula Dirks</p>
        </fn>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2026</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>21</day>
        <month>04</month>
        <year>2026</year>
      </pub-date>
      <volume>100</volume>
      <issue>2</issue>
      <fpage>69</fpage>
      <lpage>78</lpage>
      <uri content-type="arpha" xlink:href="http://openbiodiv.net/9C3B275A-DCC2-562E-8C24-E533A44FCB87">9C3B275A-DCC2-562E-8C24-E533A44FCB87</uri>
      <history>
        <date date-type="received">
          <day>27</day>
          <month>08</month>
          <year>2025</year>
        </date>
        <date date-type="accepted">
          <day>28</day>
          <month>01</month>
          <year>2026</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>Lizi Burduli, Stephan Kramer</copyright-statement>
        <license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/" xlink:type="simple">
          <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits to copy and distribute the article for non-commercial purposes, provided that the article is not altered or modified and the original author and source are credited.</license-p>
        </license>
      </permissions>
      <abstract>
        <label>Abstract</label>
        <p>Whether compensation contract design reflects efficient contracting or rent extraction is an ongoing debate in academic research and public discourse. We contribute to this debate by examining whether textual bloat in compensation contract disclosures is associated with excess CEO compensation. We construct a measure of bloat, defined as irrelevant, boilerplate, and redundant content, by summarizing firms’ Compensation Discussion and Analysis sections with a large language model for a sample of S&amp;P 1500 firms during 2011–2018. In line with our hypotheses, we find a positive association between bloat and excess CEO compensation. We find no empirical evidence that governance characteristics explain the magnitude of bloat in firms’ compensation disclosures. Our findings suggest that bloated disclosures can be used as an instrument to obscure compensation levels that are unrelated to the economics of the firm.</p>
      </abstract>
      <kwd-group>
        <label>Keywords</label>
        <kwd>Generative AI</kwd>
        <kwd>LLM</kwd>
        <kwd>executive compensation</kwd>
        <kwd>bloat</kwd>
        <kwd>corporate governance</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec sec-type="Relevance to practice" id="SECID0EXC">
      <title>Relevance to practice</title>
      <p>As reflected in the EU Shareholder Rights Directive (Directive 2007/36/EC), European regulators seek to increase transparency and shareholder involvement, improve the oversight of directors’ remuneration, and facilitate the flow of information. The findings of this study support these objectives by demonstrating that the substance and understandability of compensation disclosures matter more for effective monitoring than their length. In the Netherlands’ stakeholder-oriented governance model, generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0E4C">AI</abbrev> tools can help boards, auditors, and investors evaluate whether compensation disclosures genuinely communicate important information.</p>
    </sec>
    <sec sec-type="1. Introduction" id="SECID0EBD">
      <title>1. Introduction</title>
      <p>The widening pay gap between CEOs and workers in recent decades underscores a central debate in corporate governance research: whether executive compensation reflects efficient contracting or managerial rent extraction. The efficient contracting view is based on classical agency theory (<xref ref-type="bibr" rid="B27">Jensen and Meckling 1976</xref>) and posits that boards design pay packages to align executive and shareholder interests. Conversely, the managerial power or rent extraction view (<xref ref-type="bibr" rid="B3">Bebchuk and Fried 2003</xref>) argues that powerful executives can leverage their influence over the board to extract rents, i.e., receive compensation packages that exceed what is justified by their talent or the firm’s economics.</p>
      <p>One potential mechanism to achieve this is obfuscation through bloat, i.e., adding redundant, overly complex, and irrelevant information to disclosures about compensation. In the United States, the Compensation Discussion and Analysis (<abbrev xlink:title="Compensation Discussion and Analysis" id="ABBRID0ERD">CD&amp;A</abbrev>) section of the proxy statement is the primary source of investors’ information about executive compensation packages.</p>
      <p>Prior research has attempted to capture obfuscation in other disclosure contexts by using traditional textual analysis methods, such as readability or sentiment (e.g., <xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B37">Loughran and McDonald 2014</xref>). While these studies provide valuable first insights into the textual accessibility of disclosures, they are not intended to measure contextual irrelevance.</p>
      <p>Recent advancements in generative Artificial Intelligence (<abbrev xlink:title="Artificial Intelligence" id="ABBRID0EBE">AI</abbrev>) and Large Language Models (<abbrev xlink:title="Large Language Models" id="ABBRID0EFE">LLMs</abbrev>) offer an innovative methodological solution to this problem. Unlike older models, <abbrev xlink:title="Large Language Models" id="ABBRID0EJE">LLMs</abbrev> are pre-trained on vast datasets, enabling them to grasp the contextual significance and filter out irrelevant content in a manner that approximates human-like judgment. This allows for a novel and more precise measure of obfuscation. By tasking an LLM with summarizing a document to its essential core, we can quantify bloat as the proportion of the original text that is discarded as irrelevant or redundant. To date, no study has applied an LLM-based bloat measure to compensation disclosures to examine its link with excess pay. We address this gap by answering the following research question:</p>
      <p><italic>How can a measure of compensation disclosure bloat be developed using generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0ERE">AI</abbrev>, and what is its relationship with excess executive compensation</italic>?</p>
      <p>Using 7,786 firm-year observations from S&amp;P 1500 companies between 2011 and 2018, we use a Large Language Model to construct summaries of CD&amp;A sections and measure bloat as the difference between the length of the original document and the length of its summary, scaled by the length of the original document. We quantify excess compensation using the methodology of <xref ref-type="bibr" rid="B12">Core et al. (2008)</xref> and conduct robustness checks using the approach by <xref ref-type="bibr" rid="B31">Larcker et al. (2011)</xref>.</p>
      <p>Our results show a significant positive association between bloat and excess compensation. This association remains robust after controlling for traditional textual metrics, which suggests our measure provides an incrementally informative dimension to the study of compensation design. We find no empirical evidence that governance mechanisms, as proxied by the board gender diversity percentage and by the CEO serving as chair of the board, are significant determinants of bloat. Taken at face value, this is at odds with the idea that poorly governed firms that award the CEO abnormally high pay packages should have more bloated compensation disclosures (<xref ref-type="bibr" rid="B11">Core et al. 1999</xref>; <xref ref-type="bibr" rid="B2">Basu et al. 2007</xref>). However, another potential explanation is that both proxies used in this study are relatively invariant over time and therefore have little explanatory power in our empirical design with a tight fixed effects structure. More research is needed to shed light on this relationship.</p>
      <p>We contribute to the literature as follows. First, this study is one of the first attempts to develop a bloat measure from CD&amp;A sections using generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EMF">AI</abbrev>. While <xref ref-type="bibr" rid="B28">Kim et al. (2024)</xref> are the first to develop the concept of bloat and apply it to a different set of disclosures, we document the explanatory power of bloat in an executive compensation context. A key advantage of <abbrev xlink:title="Large Language Models" id="ABBRID0EUF">LLMs</abbrev> is that they ‘understand’ the textual content by contextualizing it against the sections in which that text is embedded. We further contribute to the literature on natural language processing in accounting by showing that this approach complements existing textual metrics. While traditional measures primarily use dictionary-based methods to capture surface-level linguistic features such as readability, textual complexity, or speech patterns (<xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B37">Loughran and McDonald 2014</xref>; <xref ref-type="bibr" rid="B20">Francis et al. 2020</xref>), our measure draws on the attention mechanism inherent to <abbrev xlink:title="Large Language Models" id="ABBRID0EEG">LLMs</abbrev> to capture contextual irrelevance, a dimension that is incrementally informative over existing approaches. Our results have direct implications for regulators, such as the SEC, by highlighting that longer or more complex executive compensation disclosures do not necessarily enhance transparency, and for investors, who can use generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EIG">AI</abbrev> models to analyze the most important parts of the compensation disclosure sections faster and more easily.</p>
    </sec>
    <sec sec-type="2. Literature review" id="SECID0EMG">
      <title>2. Literature review</title>
      <sec sec-type="2.1. Optimal contracting, rent extraction, and excess executive compensation" id="SECID0EQG">
        <title>2.1. Optimal contracting, rent extraction, and excess executive compensation</title>
        <p>A central question in corporate governance research is whether executive pay reflects efficient contracting or managerial rent extraction. The efficient contracting view, rooted in classical agency theory (<xref ref-type="bibr" rid="B27">Jensen and Meckling 1976</xref>), posits that boards design compensation packages to align managerial incentives with those of shareholders. From this perspective, pay levels simply reflect a competitive market for talent and economic determinants like firm size and value (<xref ref-type="bibr" rid="B21">Gabaix and Landier 2008</xref>).</p>
        <p>In contrast, the managerial power or rent extraction view (<xref ref-type="bibr" rid="B3">Bebchuk and Fried 2003</xref>) argues that executive compensation is not a solution to the agency problem, but a consequence of it. Specifically, it posits that executives can earn excessive levels of compensation due to their power over the board. In line with this view, previous research has shown that excess compensation is associated with weak governance, as proxied by entrenched boards or CEO-chair duality (<xref ref-type="bibr" rid="B11">Core et al. 1999</xref>).</p>
        <p>A central challenge to the rent extraction view is explaining how this practice persists in the face of mandatory compensation disclosures intended to enable investor monitoring. One potential mechanism is obfuscation, i.e., increasing the redundancy, complexity, and length of the provided information to draw attention away from key details and make it difficult for investors with limited attention and cognitive processing constraints to draw accurate conclusions (<xref ref-type="bibr" rid="B6">Blankespoor et al. 2019</xref>, <xref ref-type="bibr" rid="B5">2020</xref>). In other words, if investors cannot easily process the disclosed compensation information, they are less likely to thoroughly evaluate undeserved excess pay.</p>
        <p>In the United States, investors mainly rely on the CD&amp;A section of the proxy statement to understand executive compensation packages. While mandated by the SEC to improve clarity and transparency (<xref ref-type="bibr" rid="B39">SEC 2011</xref>), the CD&amp;A in practice is often long and complex. This raises the question whether CD&amp;A sections contain bloat, i.e., irrelevant, boilerplate, and redundant content, and whether the extent of bloat is associated with excess compensation. Although earlier studies have attempted to identify obfuscation in other disclosure contexts using traditional textual analysis methods, such as readability or sentiment (<xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B30">Laksmana et al. 2012</xref>; <xref ref-type="bibr" rid="B32">Lawrence 2013</xref>), these approaches are not designed to assess contextual irrelevance.</p>
      </sec>
      <sec sec-type="2.2. Bloat and generative AI" id="SECID0EFAAC">
        <title>2.2. Bloat and generative AI</title>
        <p>Financial disclosures have grown substantially in length over the past two decades (<xref ref-type="bibr" rid="B34">Lesmy et al. 2019</xref>), driven mainly by regulation as well as by firms voluntarily adding extra information. While increased disclosure should theoretically enhance transparency and reduce information asymmetry (<xref ref-type="bibr" rid="B35">Leuz and Verrecchia 2000</xref>), in practice, excessive length may reduce the informativeness and usefulness of financial reporting by overwhelming readers with irrelevant information. This phenomenon, where disclosures become unnecessarily wordy and include redundant information, is referred to as bloat (<xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>). Bloat is especially problematic given that individuals have limited cognitive resources to process information (<xref ref-type="bibr" rid="B1">Albuquerque et al. 2024</xref>) and investors have limited attention for analyzing complete financial data (<xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>), meaning that they cannot evaluate every detail, particularly when disclosures are repetitive or strategically ambiguous.</p>
        <p>Traditional textual analysis methods, such as document readability and length (e.g., <xref ref-type="bibr" rid="B37">Loughran and McDonald 2014</xref>), sentiment analysis (e.g., <xref ref-type="bibr" rid="B13">De Amicis et al. 2021</xref>), and topic modelling (e.g., <xref ref-type="bibr" rid="B16">Dyer et al. 2017</xref>), provide valuable insights into corporate reports. However, these tools use various dictionaries and rule-based approaches to determine the overall sentiment of the text without considering the relationship between words and sentences (<xref ref-type="bibr" rid="B33">Leippold 2023</xref>). As a result, they capture different aspects of textual complexity but are not designed to assess contextual relevance.</p>
        <p>The introduction of generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EXBAC">AI</abbrev> represents a methodological shift and overcomes existing limitations through contextual understanding and reasoning. Unlike traditional NLP models, <abbrev xlink:title="Large Language Models" id="ABBRID0E2BAC">LLMs</abbrev> are pre-trained on a large set of data, meaning that the model learns patterns in language, such as grammar, word associations, sentence structures, and facts (<xref ref-type="bibr" rid="B19">Floridi and Chiriatti 2020</xref>). By filtering out redundancy and producing concise yet fully informative summaries, generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EDCAC">AI</abbrev> models reduce the cost, time, and computing resources required for textual analysis (<xref ref-type="bibr" rid="B26">Huang et al. 2023</xref>) and outperform other NLP models in various domains, including question-answering and translation (<xref ref-type="bibr" rid="B8">Brown et al. 2020</xref>).</p>
        <p>Hence, <abbrev xlink:title="Large Language Models" id="ABBRID0ERCAC">LLMs</abbrev>’ main advantage over traditional models is their ability to perform tasks that involve human-like judgment by understanding the context surrounding each word and the relationships between sentences. These advantages of <abbrev xlink:title="Large Language Models" id="ABBRID0EVCAC">LLMs</abbrev> are expected to offer a new way of quantifying disclosure bloat in executive compensation disclosures and generate summaries that contain incrementally informative content associated with excess compensation beyond traditional textual measures.</p>
      </sec>
      <sec sec-type="2.3. Bloat and excess executive compensation" id="SECID0EZCAC">
        <title>2.3. Bloat and excess executive compensation</title>
        <p>Although the structure of performance-based incentives embedded within executive compensation contracts often serves as a mechanism to align managerial actions with shareholder interests (<xref ref-type="bibr" rid="B3">Bebchuk and Fried 2003</xref>), if executives benefit from rent extraction, they may have incentives to use disclosure bloat to mask excessive pay. Under the managerial power view, executives may use bloat in the CD&amp;A section to earn compensation that is higher than justified by firm performance. First, when financial documents are less readable, shareholders who are limited in time and processing capacity may be less willing to thoroughly evaluate and scrutinize the compensation details, even if the information is relevant (<xref ref-type="bibr" rid="B7">Bloomfield 2002</xref>; <xref ref-type="bibr" rid="B23">Hooghiemstra et al. 2017</xref>). Second, bloat may focus investors’ attention on qualitative narratives framed by the firm, rather than on relevant hard information. Consequently, we hypothesize:</p>
        <p><italic>H1: Disclosure bloat in the CD&amp;A sections of proxy statements is positively associated with excess executive compensation</italic>.</p>
        <p>This hypothesis is not without tension: not all bloat is necessarily opportunistic, as compliance with regulatory requirements and legal risk management may require firms to include lengthy and redundant text in their disclosures.</p>
      </sec>
      <sec sec-type="2.4. Governance mechanisms as bloat determinants" id="SECID0EQDAC">
        <title>2.4. Governance mechanisms as bloat determinants</title>
        <p>While high executive compensation may be concealed by bloat, executives’ power over their compensation may be influenced by the strength of the corporate governance systems in place. Although monitoring and higher shareholder power can limit executives’ opportunistic behaviour (<xref ref-type="bibr" rid="B17">Eisenhardt 1989</xref>; <xref ref-type="bibr" rid="B38">Ruiz‐Verdú 2008</xref>), CEO duality or weak board oversight can facilitate obfuscation (<xref ref-type="bibr" rid="B11">Core et al. 1999</xref>; <xref ref-type="bibr" rid="B2">Basu et al. 2007</xref>), in line with managerial power theory. Moreover, directors may support executive-friendly pay because they value their board positions, are influenced by personal relationships and loyalty to CEOs, and benefit directly from the rewards CEOs can offer (<xref ref-type="bibr" rid="B40">Van Essen et al. 2015</xref>). Consequently, a less powerful and independent board may allow executives to exert greater influence over the compensation disclosure content and increase its length, which leads to higher bloat.</p>
        <p>Thus, governance mechanisms can be possible determinants of disclosure bloat. Particularly, strong governance mechanisms, such as board gender diversity, should encourage more concise and transparent reporting, hence reducing bloat, while weaker governance, such as CEOs holding both CEO and board chair roles, can increase the opportunity for obfuscation and bloat. These arguments lead to the second hypothesis:</p>
        <p><italic>H2a: Board gender diversity is negatively associated with disclosure bloat</italic>.</p>
        <p><italic>H2b: CEO-Chair duality is positively associated with disclosure bloat</italic>.</p>
      </sec>
    </sec>
    <sec sec-type="3. Data and methodology" id="SECID0ETEAC">
      <title>3. Data and methodology</title>
      <sec sec-type="3.1. Data source and sample" id="SECID0EXEAC">
        <title>3.1. Data source and sample</title>
        <p>The dataset comprises CD&amp;A sections extracted from DEF 14A proxy filings in the SEC’s EDGAR database for S&amp;P 1500 firms between 2011 and 2018, yielding 7,786 firm-year observations. Data on executive compensation and the control variables are obtained from Compustat, ExecuComp, and BoardEx. Missing values are retrieved from Refinitiv Eikon and the companies’ annual reports.</p>
      </sec>
      <sec sec-type="3.2. Variable measurement and summaries" id="SECID0E3EAC">
        <title>3.2. Variable measurement and summaries</title>
        <p>Variable measurement consists of three main sets of variables and the procedure to construct the summaries. All continuous variables are winsorized at the 1<sup>st</sup> and 99<sup>th</sup> percentiles to mitigate the influence of outliers, except when bounded between 0 and 1.</p>
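        <p>The winsorization step can be illustrated with a minimal Python sketch that clips a variable at its 1st and 99th percentiles. The function names and the linear-interpolation percentile rule are assumptions made for this example; the percentile convention used in the study is not stated.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Percentile q (0 to 100) of pre-sorted data, by linear interpolation (assumed rule)."""
    if not sorted_values:
        raise ValueError("need at least one observation")
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)  # closest rank at or below the target position
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] + weight * (sorted_values[upper] - sorted_values[lower])


def winsorize(values: Sequence[float], low: float = 1, high: float = 99) -> List[float]:
    """Replace values outside the [low, high] percentile range with the cut-off values."""
    ordered = sorted(values)
    floor, cap = percentile(ordered, low), percentile(ordered, high)
    return [min(max(v, floor), cap) for v in values]
```
]]></preformat>
        <p>Under this sketch, variables bounded between 0 and 1 would simply not be passed through <monospace>winsorize</monospace>, in line with the exception stated above.</p>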
        <sec sec-type="3.2.1. Measurement for excess compensation" id="SECID0EGFAC">
          <title>
            <italic>3.2.1. Measurement for excess compensation</italic>
          </title>
          <p>Excess executive compensation is measured as the residual of actual compensation relative to the compensation predicted by its economic determinants, following <xref ref-type="bibr" rid="B12">Core et al. (2008)</xref>, using two different compensation measures. In a robustness check, excess executive compensation is estimated using the <xref ref-type="bibr" rid="B31">Larcker et al. (2011)</xref> methodology.</p>
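          <p>The residual logic can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the specification of Core et al. (2008): it assumes a single economic determinant (a hypothetical firm-size variable) and a log-log regression, whereas the published model uses several determinants that are not reproduced here. All function and variable names are assumptions made for this example.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import List, Sequence, Tuple


def ols_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Intercept and slope of a one-regressor OLS fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the regressor has no variation")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def excess_compensation(pay: Sequence[float], size: Sequence[float]) -> List[float]:
    """Residuals of log(pay) on log(size): log(actual pay) minus log(predicted pay)."""
    log_pay = [math.log(p) for p in pay]
    log_size = [math.log(s) for s in size]
    intercept, slope = ols_fit(log_size, log_pay)
    return [lp - (intercept + slope * ls) for lp, ls in zip(log_pay, log_size)]
```
]]></preformat>
          <p>Because the dependent variable is in logs, each residual in this sketch equals the log ratio of actual to predicted pay, so a positive value marks pay above the level implied by the assumed determinant.</p>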
        </sec>
        <sec sec-type="3.2.2. Generating the summaries" id="SECID0EXFAC">
          <title>
            <italic>3.2.2. Generating the summaries</italic>
          </title>
          <p>Recent advances in machine learning have improved natural language processing, with Transformer-based models becoming some of the most successful architectures to date. The Transformer is a type of deep learning model that uses a neural network architecture, processing sequences through multi-head self-attention to capture relationships between all parts of the input simultaneously (<xref ref-type="bibr" rid="B41">Vaswani et al. 2017</xref>). Similar to human thinking, the Transformer model, unlike previous deep learning models, pays attention to the most relevant words when understanding a sentence. For example, when reading ‘Amsterdam is the capital of ___,’ the model predicts ‘The Netherlands’ by recognizing the relationship between ‘Amsterdam’ and ‘capital’.</p>
          <p><abbrev xlink:title="Large Language Models" id="ABBRID0EGGAC">LLMs</abbrev> generate summaries by rephrasing ideas based on context and instruction (prompt), not by copying text. The model writes a shorter version using new sentences, and because the model is trained on vast amounts of data, including financial documents, it can infer what is relevant to investors, even when the language is complex. Therefore, <abbrev xlink:title="Large Language Models" id="ABBRID0EKGAC">LLMs</abbrev> can tailor summaries for investors and help them understand the CD&amp;A sections more quickly and easily.</p>
          <p>Summaries are generated using the DeepSeek-r1-distill-qwen-7b model hosted on a local machine by dividing each CD&amp;A section into 30,000-character chunks and using the following prompt: <italic>“Write an investor-friendly summary of all relevant information in this document. The summary should be highly informative and detailed. Do not skip important data or context – include all major figures, policies, and justifications.”</italic></p>
          <p>The process works as follows: the CD&amp;A files are split into chunks on a local machine. These chunks, accompanied by a standardized prompt and the model’s role, are then sent individually to the LLM. The communication and data exchange between the local and hosted machines occur using Python and an Application Programming Interface (API), which is a set of rules and protocols that allows different applications to interact in a well-documented way (<xref ref-type="bibr" rid="B10">Chen 2025</xref>). The LLM processes the inputs remotely and sends the generated summaries back to the local machine, where the outputs are merged and stored.</p>
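          <p>The chunk-and-merge procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: chunks are cut at fixed character offsets, partial summaries are merged by joining them with line breaks, and <monospace>summarize_chunk</monospace> is a hypothetical callable standing in for the API call to the model. The model’s role text is not given in the article and is therefore left out.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, List

CHUNK_CHARS = 30_000  # chunk size reported in the text

# Prompt as quoted in the article.
PROMPT = (
    "Write an investor-friendly summary of all relevant information in this document. "
    "The summary should be highly informative and detailed. "
    "Do not skip important data or context – include all major figures, "
    "policies, and justifications."
)


def split_into_chunks(text: str, size: int = CHUNK_CHARS) -> List[str]:
    """Cut the text into consecutive pieces of at most `size` characters (assumed rule)."""
    return [text[start:start + size] for start in range(0, len(text), size)]


def summarize_document(text: str, summarize_chunk: Callable[[str, str], str]) -> str:
    """Summarize each chunk separately and merge the partial summaries.

    `summarize_chunk(prompt, chunk)` is a hypothetical stand-in for the request
    that sends one chunk and the standardized prompt to the language model.
    """
    partial = [summarize_chunk(PROMPT, chunk) for chunk in split_into_chunks(text)]
    return "\n".join(partial)  # merging by line breaks is an assumption
```
]]></preformat>
          <p>Passing the model call in as a function keeps the splitting and merging logic testable without a running model; any client that accepts a prompt and a chunk and returns text could be plugged in.</p>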
        </sec>
        <sec sec-type="3.2.3. Measurement of bloat and traditional textual methods" id="SECID0EXGAC">
          <title>
            <italic>3.2.3. Measurement of bloat and traditional textual methods</italic>
          </title>
          <p><italic>Bloat</italic> is measured following the methodology of <xref ref-type="bibr" rid="B28">Kim et al. (2024)</xref> as the difference between the original text length and the summary length, scaled by the original text length. A high value of <italic>Bloat</italic> implies more redundant and irrelevant information. To assess whether the bloat measure provides incremental information beyond established linguistic features, several traditional textual controls are included: <italic>Fog Index</italic> (<xref ref-type="bibr" rid="B22">Gunning 1952</xref>; <xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>), <italic>File_Length</italic> (<xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B32">Lawrence 2013</xref>), <italic>SentimentLM</italic> analysis (<xref ref-type="bibr" rid="B37">Loughran and McDonald 2014</xref>; <xref ref-type="bibr" rid="B24">Huang et al. 2014</xref>), <italic>Redundancy</italic> (<xref ref-type="bibr" rid="B16">Dyer et al. 2017</xref>; <xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>), and <italic>Boilerplate</italic> language (<xref ref-type="bibr" rid="B16">Dyer et al. 2017</xref>; <xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>).</p>
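          <p>As a minimal sketch of the bloat ratio, the following Python function computes (original length minus summary length) divided by original length. Measuring length in whitespace-separated words is an assumption made for this example, since the unit of length is not specified here; the function name is likewise hypothetical.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def bloat(original: str, summary: str) -> float:
    """Share of the original text that is not retained in the summary.

    Length is counted in whitespace-separated words (an assumption).
    """
    original_length = len(original.split())
    if original_length == 0:
        raise ValueError("the original document is empty")
    summary_length = len(summary.split())
    return (original_length - summary_length) / original_length
```
]]></preformat>
          <p>For a ten-word document with a two-word summary the function returns 0.8, meaning that 80 percent of the original content was dropped by the summary.</p>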
        </sec>
        <sec sec-type="3.2.4. Control variables" id="SECID0E5IAC">
          <title>
            <italic>3.2.4. Control variables</italic>
          </title>
            <p>Building on prior literature, several controls are included for variables that may affect bloat and influence the relationship between the variables of interest: firm size (<xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B9">Carter et al. 2016</xref>), leverage (<xref ref-type="bibr" rid="B9">Carter et al. 2016</xref>), and file size, together with industry, firm, and year fixed effects. When the models are estimated using industry fixed effects instead of firm fixed effects, additional variables are included as controls: sales growth (<xref ref-type="bibr" rid="B29">Kramer and Matějka 2024</xref>), R&amp;D expenses (<xref ref-type="bibr" rid="B1">Albuquerque et al. 2024</xref>), and ROE (<xref ref-type="bibr" rid="B9">Carter et al. 2016</xref>) to control for cross-industry variation in performance, complexity, and growth that may influence disclosure practices and executive compensation.</p>
        </sec>
      </sec>
      <sec sec-type="3.3. Research design" id="SECID0E6JAC">
        <title>3.3. Research design</title>
        <p>To answer the research question and test H1, whether bloat is positively associated with excess compensation, the following OLS panel regression with fixed effects is employed:</p>
        <p><italic>Excess(Comp)<sub>it</sub></italic> = <italic>β<sub>0</sub></italic> + <italic>β<sub>1</sub>Bloat<sub>it</sub></italic> + <italic>β<sub>2</sub>TraditionalMethods<sub>it</sub></italic> + <italic>β<sub>3</sub>FirmControls<sub>it</sub></italic> + <italic>α<sub>i</sub></italic> + <italic>δ<sub>t</sub></italic> + <italic>θ<sub>i</sub></italic> + <italic>ɛ<sub>it</sub></italic> (1)</p>
        <p>where <italic>α<sub>i</sub></italic> is the firm fixed effect, <italic>δ<sub>t</sub></italic> is the year fixed effect, <italic>θ<sub>i</sub></italic> is the industry fixed effect based on the <xref ref-type="bibr" rid="B18">Fama and French (1997)</xref> 48-industry classification (used in the specifications estimated with industry instead of firm fixed effects), and <italic>ɛ<sub>it</sub></italic> is the error term.</p>
        <p>To address H2a and H2b and identify possible determinants of bloat, bloat is regressed on board gender diversity and CEO-chair duality, with CEO pay slice (<xref ref-type="bibr" rid="B4">Bebchuk et al. 2011</xref>) and board size (<xref ref-type="bibr" rid="B42">Yermack 1996</xref>) as additional possible determinants and growth options, volatility, and loss (<xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>) as control variables, in the following OLS regression:</p>
        <p><italic>Bloat<sub>it</sub></italic> = <italic>β<sub>0</sub></italic> + <italic>β<sub>1</sub>GovernanceMechanisms<sub>it</sub></italic> + <italic>β<sub>2</sub>OtherFactors<sub>it</sub></italic> + <italic>β<sub>3</sub>FirmFactors<sub>it</sub></italic> + <italic>δ<sub>t</sub></italic> + <italic>θ<sub>i</sub></italic> + <italic>ɛ<sub>it</sub></italic> (2)</p>
      </sec>
      <sec sec-type="3.4. Descriptive statistics" id="SECID0E2NAC">
        <title>3.4. Descriptive statistics</title>
        <p>Table <xref ref-type="table" rid="T1">1</xref> shows summary statistics for the main variables used in the analysis. Excess compensation is measured as the log ratio of actual to predicted compensation. The median of <italic>ExcessComp1</italic> (0.066) therefore indicates that median actual compensation is approximately 6.8% higher than predicted. All three proxies for excess executive compensation have a high standard deviation: actual pay is well above predicted pay for some executives and well below it for others. The mean value of <italic>Bloat</italic> indicates that, on average, 87.4% of the original document content is not retained in the summary, implying a high level of redundancy in compensation disclosures.</p>
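        <p>The conversion from a log ratio to a percentage gap can be checked with a few lines of code. This is a minimal sketch, assuming only that excess compensation equals ln(actual / predicted) as described above; the helper name <italic>log_ratio_to_percent</italic> is hypothetical, and the inputs are the quartiles of <italic>ExcessComp1</italic> reported in Table <xref ref-type="table" rid="T1">1</xref>.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def log_ratio_to_percent(log_ratio):
    """Convert ln(actual / predicted) into the percent gap between actual and predicted pay."""
    return (math.exp(log_ratio) - 1.0) * 100.0


# Q1, median, and Q3 of ExcessComp1 as reported in Table 1.
for label, value in [("Q1", -0.297), ("median", 0.066), ("Q3", 0.381)]:
    print(f"{label}: {log_ratio_to_percent(value):+.1f}% relative to predicted pay")
```
]]></preformat>
        <p>For small values the log ratio and the percentage gap are nearly equal, which is why 0.066 maps to roughly 6.8%; for larger absolute values the two diverge, so the quartiles should not be read directly as percentages.</p>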
        <table-wrap id="T1" position="float" orientation="portrait">
          <label>Table 1.</label>
          <caption>
            <p>Descriptive statistics.</p>
          </caption>
          <table>
            <tbody>
              <tr>
                <th rowspan="1" colspan="1">
                  <bold>Variable</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Number of Observations</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Mean</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Standard Deviation</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Q1</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Median</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>Q3</bold>
                </th>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>ExcessComp1</italic>
                </td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.662</td>
                <td rowspan="1" colspan="1">–0.297</td>
                <td rowspan="1" colspan="1">0.066</td>
                <td rowspan="1" colspan="1">0.381</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>ExcessComp2</italic>
                </td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.790</td>
                <td rowspan="1" colspan="1">–0.428</td>
                <td rowspan="1" colspan="1">0.029</td>
                <td rowspan="1" colspan="1">0.464</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>ExcessPay</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">–0.942</td>
                <td rowspan="1" colspan="1">1.001</td>
                <td rowspan="1" colspan="1">–0.362</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.315</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Bloat</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.874</td>
                <td rowspan="1" colspan="1">0.033</td>
                <td rowspan="1" colspan="1">0.862</td>
                <td rowspan="1" colspan="1">0.888</td>
                <td rowspan="1" colspan="1">0.893</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Fog</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">22.100</td>
                <td rowspan="1" colspan="1">1.760</td>
                <td rowspan="1" colspan="1">20.900</td>
                <td rowspan="1" colspan="1">22.000</td>
                <td rowspan="1" colspan="1">23.100</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>File_Length</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">14,017</td>
                <td rowspan="1" colspan="1">6,882</td>
                <td rowspan="1" colspan="1">9,356</td>
                <td rowspan="1" colspan="1">12,998</td>
                <td rowspan="1" colspan="1">17,628</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>File_Size_kb</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">107.699</td>
                <td rowspan="1" colspan="1">49.903</td>
                <td rowspan="1" colspan="1">72.987</td>
                <td rowspan="1" colspan="1">101.836</td>
                <td rowspan="1" colspan="1">134.690</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>SentimentLM</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">–0.000</td>
                <td rowspan="1" colspan="1">0.004</td>
                <td rowspan="1" colspan="1">–0.003</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.003</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Redundancy</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.189</td>
                <td rowspan="1" colspan="1">0.058</td>
                <td rowspan="1" colspan="1">0.150</td>
                <td rowspan="1" colspan="1">0.184</td>
                <td rowspan="1" colspan="1">0.224</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Boilerplate</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.016</td>
                <td rowspan="1" colspan="1">0.005</td>
                <td rowspan="1" colspan="1">0.013</td>
                <td rowspan="1" colspan="1">0.015</td>
                <td rowspan="1" colspan="1">0.018</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Leverage</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.200</td>
                <td rowspan="1" colspan="1">0.195</td>
                <td rowspan="1" colspan="1">0.033</td>
                <td rowspan="1" colspan="1">0.153</td>
                <td rowspan="1" colspan="1">0.306</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Firm_Size</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">7.651</td>
                <td rowspan="1" colspan="1">1.637</td>
                <td rowspan="1" colspan="1">6.538</td>
                <td rowspan="1" colspan="1">7.590</td>
                <td rowspan="1" colspan="1">8.700</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Sales_Growth</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.089</td>
                <td rowspan="1" colspan="1">0.229</td>
                <td rowspan="1" colspan="1">–0.009</td>
                <td rowspan="1" colspan="1">0.062</td>
                <td rowspan="1" colspan="1">0.147</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>R&amp;D_Assets</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.038</td>
                <td rowspan="1" colspan="1">0.078</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.039</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>ROE</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.075</td>
                <td rowspan="1" colspan="1">0.497</td>
                <td rowspan="1" colspan="1">0.032</td>
                <td rowspan="1" colspan="1">0.108</td>
                <td rowspan="1" colspan="1">0.187</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>BoardDiversity</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.099</td>
                <td rowspan="1" colspan="1">0.136</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.200</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>CEO_Duality</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.459</td>
                <td rowspan="1" colspan="1">0.498</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">1.000</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>CEO_Pay_Slice</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.397</td>
                <td rowspan="1" colspan="1">0.120</td>
                <td rowspan="1" colspan="1">0.331</td>
                <td rowspan="1" colspan="1">0.404</td>
                <td rowspan="1" colspan="1">0.466</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Board_Size</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">5.698</td>
                <td rowspan="1" colspan="1">1.066</td>
                <td rowspan="1" colspan="1">5.000</td>
                <td rowspan="1" colspan="1">5.000</td>
                <td rowspan="1" colspan="1">6.000</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Loss</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">0.182</td>
                <td rowspan="1" colspan="1">0.390</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.000</td>
                <td rowspan="1" colspan="1">0.000</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Volatility</italic>
                </td>
                <td rowspan="1" colspan="1">7,526</td>
                <td rowspan="1" colspan="1">0.335</td>
                <td rowspan="1" colspan="1">0.160</td>
                <td rowspan="1" colspan="1">0.223</td>
                <td rowspan="1" colspan="1">0.301</td>
                <td rowspan="1" colspan="1">0.402</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">
                  <italic>Growth_Options</italic>
                </td>
                <td rowspan="1" colspan="1">7,786</td>
                <td rowspan="1" colspan="1">2.735</td>
                <td rowspan="1" colspan="1">2.346</td>
                <td rowspan="1" colspan="1">1.361</td>
                <td rowspan="1" colspan="1">1.959</td>
                <td rowspan="1" colspan="1">3.108</td>
              </tr>
            </tbody>
          </table>
          <table-wrap-foot>
            <fn>
              <p><italic>Note</italic>. The sample consists of 7,786 firm-year observations of S&amp;P 1500 companies from 2011–2018 with non-missing data for the main variables of interest. It is restricted to firm-years for which the CD&amp;A section could be extracted and to firms with at least two years of observations. Because the calculation of excess compensation involves lagged values, the final number of observations used is 6,668. <italic>Volatility</italic> is missing for 260 observations.</p>
            </fn>
          </table-wrap-foot>
        </table-wrap>
      </sec>
    </sec>
    <sec sec-type="4. Results" id="SECID0ENRAE">
      <title>4. Results</title>
      <sec sec-type="4.1. Hypothesis 1 – Disclosure bloat and excess executive compensation" id="SECID0ERRAE">
        <title>4.1. Hypothesis 1 – Disclosure bloat and excess executive compensation</title>
        <p>The first analysis tests whether <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EXRAE">AI</abbrev>-generated summaries contain incrementally informative content beyond traditional textual measures in explaining excess compensation. Using the regression model in Equation (1), column (1) of Table <xref ref-type="table" rid="T2">2</xref> shows that a 0.01-point increase in net positive sentiment (<italic>SentimentLM</italic>, p &lt; 0.01) is associated with an 8.89% (= <italic>e</italic><sup>8.518×0.01</sup> – 1) increase in excess compensation, holding other variables constant. This result is consistent with prior research suggesting that companies use a more positive tone as a strategic tool to manage perceptions and divert readers’ attention from information they would rather not highlight (<xref ref-type="bibr" rid="B24">Huang et al. 2014</xref>, <xref ref-type="bibr" rid="B25">2018</xref>). However, positive sentiment could also coincide with a legitimate justification for higher pay, such as strong performance.</p>
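        <p>The percentage effects quoted for Table <xref ref-type="table" rid="T2">2</xref> follow from the log form of the dependent variable: a change of Δx in a regressor with coefficient β scales the ratio of actual to predicted pay by exp(β·Δx). The sketch below reproduces that arithmetic; the function name <italic>pct_effect</italic> is hypothetical, and the coefficients are taken from column (1) of the table.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def pct_effect(coefficient, change):
    """Percent change in actual/predicted pay implied by `change` in a regressor
    when the dependent variable is a log ratio."""
    return (math.exp(coefficient * change) - 1.0) * 100.0


# SentimentLM, Table 2, column (1): coefficient 8.518, increase of 0.01.
sentiment_effect = pct_effect(8.518, 0.01)

# File_Length, Table 2, column (1): coefficient 0.007 per additional 1,000 words.
length_effect = pct_effect(0.007, 1.0)

print(f"SentimentLM +0.01:        {sentiment_effect:.2f}%")
print(f"File_Length +1,000 words: {length_effect:.2f}%")
```
]]></preformat>
        <p>The exponential transformation matters mainly for large coefficients: for <italic>SentimentLM</italic> it lifts the effect from 8.52% (the raw product β·Δx) to 8.89%, whereas for <italic>File_Length</italic> the raw and transformed values are practically identical.</p>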
        <table-wrap id="T2" position="float" orientation="portrait">
          <label>Table 2.</label>
          <caption>
            <p>Test of H1: Excess compensation consequences of disclosure bloat.</p>
          </caption>
          <table>
            <tbody>
              <tr>
                <th rowspan="2" colspan="1">
                  <bold>Dependent Variable =</bold>
                </th>
                <th rowspan="1" colspan="4">
                  <bold>
                    <italic>ExcessComp1</italic>
                  </bold>
                </th>
              </tr>
              <tr>
                <th rowspan="1" colspan="1">
                  <bold>(1)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(2)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(3)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(4)</bold>
                </th>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Bloat</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.520***</td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.152</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.200)</td>
                <td rowspan="1" colspan="1">(0.262)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Fog_Index</italic>
                </td>
                <td rowspan="1" colspan="1">–0.000</td>
                <td rowspan="1" colspan="1">–0.000</td>
                <td rowspan="1" colspan="1">0.009</td>
                <td rowspan="1" colspan="1">0.009</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.009)</td>
                <td rowspan="1" colspan="1">(0.009)</td>
                <td rowspan="1" colspan="1">(0.007)</td>
                <td rowspan="1" colspan="1">(0.007)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Redundancy</italic>
                </td>
                <td rowspan="1" colspan="1">0.340</td>
                <td rowspan="1" colspan="1">0.326</td>
                <td rowspan="1" colspan="1">0.304</td>
                <td rowspan="1" colspan="1">0.371</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.249)</td>
                <td rowspan="1" colspan="1">(0.247)</td>
                <td rowspan="1" colspan="1">(0.372)</td>
                <td rowspan="1" colspan="1">(0.299)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>SentimentLM</italic>
                </td>
                <td rowspan="1" colspan="1">8.518***</td>
                <td rowspan="1" colspan="1">8.285***</td>
                <td rowspan="1" colspan="1">–4.184</td>
                <td rowspan="1" colspan="1">–4.233</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(3.148)</td>
                <td rowspan="1" colspan="1">(3.160)</td>
                <td rowspan="1" colspan="1">(3.530)</td>
                <td rowspan="1" colspan="1">(3.504)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>File_Length</italic>
                </td>
                <td rowspan="1" colspan="1">0.007***</td>
                <td rowspan="1" colspan="1">0.007***</td>
                <td rowspan="1" colspan="1">0.016***</td>
                <td rowspan="1" colspan="1">0.015***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.002)</td>
                <td rowspan="1" colspan="1">(0.005)</td>
                <td rowspan="1" colspan="1">(0.004)</td>
                <td rowspan="1" colspan="1">(0.004)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Boilerplate</italic>
                </td>
                <td rowspan="1" colspan="1">0.200</td>
                <td rowspan="1" colspan="1">1.047</td>
                <td rowspan="1" colspan="1">–1.233</td>
                <td rowspan="1" colspan="1">–1.075</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(3.188)</td>
                <td rowspan="1" colspan="1">(3.223)</td>
                <td rowspan="1" colspan="1">(3.311)</td>
                <td rowspan="1" colspan="1">(3.304)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Firm_Size</italic>
                </td>
                <td rowspan="1" colspan="1">0.074*</td>
                <td rowspan="1" colspan="1">0.074*</td>
                <td rowspan="1" colspan="1">0.024</td>
                <td rowspan="1" colspan="1">0.024</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.045)</td>
                <td rowspan="1" colspan="1">(0.045)</td>
                <td rowspan="1" colspan="1">(0.017)</td>
                <td rowspan="1" colspan="1">(0.017)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Leverage</italic>
                </td>
                <td rowspan="1" colspan="1">–0.257***</td>
                <td rowspan="1" colspan="1">–0.259***</td>
                <td rowspan="1" colspan="1">0.181</td>
                <td rowspan="1" colspan="1">0.181*</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.088)</td>
                <td rowspan="1" colspan="1">(0.089)</td>
                <td rowspan="1" colspan="1">(0.091)</td>
                <td rowspan="1" colspan="1">(0.091)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Res_File_Size</italic>
                </td>
                <td rowspan="1" colspan="1">0.002***</td>
                <td rowspan="1" colspan="1">0.002***</td>
                <td rowspan="1" colspan="1">0.007***</td>
                <td rowspan="1" colspan="1">0.007***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Sales_Growth</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.268***</td>
                <td rowspan="1" colspan="1">0.267***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.038)</td>
                <td rowspan="1" colspan="1">(0.038)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>R&amp;D_Assets</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.523</td>
                <td rowspan="1" colspan="1">0.523</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.542)</td>
                <td rowspan="1" colspan="1">(0.542)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>ROE</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.005</td>
                <td rowspan="1" colspan="1">0.005</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.014)</td>
                <td rowspan="1" colspan="1">(0.014)</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Firm FE</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">No</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Year FE</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Industry FE</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Adj. R<sup>2</sup></td>
                <td rowspan="1" colspan="1">0.606</td>
                <td rowspan="1" colspan="1">0.606</td>
                <td rowspan="1" colspan="1">0.094</td>
                <td rowspan="1" colspan="1">0.094</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Within R<sup>2</sup></td>
                <td rowspan="1" colspan="1">0.015</td>
                <td rowspan="1" colspan="1">0.016</td>
                <td rowspan="1" colspan="1">0.074</td>
                <td rowspan="1" colspan="1">0.074</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">N</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
              </tr>
            </tbody>
          </table>
          <table-wrap-foot>
            <fn>
              <p><italic>Note</italic>. Presented are OLS estimates of the main research model described in Equation (1), with excess compensation calculated using tdc1 as the dependent variable. ***, **, * represent significance at the 0.01, 0.05 and 0.10 levels. Standard errors are in parentheses and clustered at the Fama-French 48-industry level in Columns (1) and (2) and at the firm level in Columns (3) and (4). Year fixed effects are included in all regressions. Because the calculation of excess compensation involves lagged values, the final number of observations used is 6,668.</p>
            </fn>
          </table-wrap-foot>
        </table-wrap>
        <p>The positive coefficients on <italic>File_Length</italic> (p &lt; 0.01) and <italic>Res_File_Size</italic> (p &lt; 0.01) are not surprising, since longer and larger files suggest higher textual complexity. Every additional 1,000 words is associated with a 0.7% increase in excess compensation. Moreover, the significance of <italic>Res_File_Size</italic> suggests that firms using more complex formatting and other textual characteristics beyond word count tend to provide longer and more detailed information not for the sake of transparency but to overwhelm readers and obscure important information (<xref ref-type="bibr" rid="B36">Li 2008</xref>; <xref ref-type="bibr" rid="B32">Lawrence 2013</xref>). The resulting reduction in scrutiny can allow excess executive compensation to go unnoticed.</p>
        <p>The negative coefficient on <italic>Leverage</italic> (–0.257, p &lt; 0.01) is consistent with higher debt constraining executives’ ability to inflate pay, because creditors, particularly banks, act as additional monitors when they provide substantial financing. The <italic>Fog_Index</italic> coefficient is insignificant (p &gt; 0.1), so readability as measured by the Fog index does not explain excess compensation. This finding runs counter to the initial expectation based on <xref ref-type="bibr" rid="B36">Li (2008)</xref> and suggests that traditional readability metrics fail to capture the textual complexity relevant to excess compensation, consistent with <xref ref-type="bibr" rid="B37">Loughran and McDonald (2014)</xref>.</p>
        <p>When industry-fixed effects replace firm-fixed effects in Columns (3) and (4), <italic>SentimentLM</italic>, <italic>Leverage</italic>, and <italic>Bloat</italic> lose significance, and the adjusted R<sup>2</sup> decreases from 60.6% to 9.4%, implying that industry fixed effects provide a much weaker control structure than firm fixed effects. Firm fixed effects absorb all stable, unobserved firm-level heterogeneity, whereas industry fixed effects only control for broad sector differences. As a result, switching to industry fixed effects introduces additional noise and reduces explanatory power, making coefficient estimates less precise.</p>
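        <p>The gap between a high overall R<sup>2</sup> and a low within R<sup>2</sup> under firm fixed effects can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the study's code: it uses synthetic data with one regressor, computes unadjusted rather than adjusted R<sup>2</sup>, and the names <italic>demean_by</italic> and <italic>r_squared_pair</italic> and all parameter values are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def demean_by(values, keys):
    """Subtract each key's mean from its observations (one-way within transformation)."""
    totals, counts = {}, {}
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return [value - totals[key] / counts[key] for key, value in zip(keys, values)]


def r_squared_pair(y, x, firms):
    """Return (overall R2, within R2) for a one-regressor firm fixed-effects fit."""
    y_w, x_w = demean_by(y, firms), demean_by(x, firms)
    beta = sum(a * b for a, b in zip(x_w, y_w)) / sum(a * a for a in x_w)
    ssr = sum((b - beta * a) ** 2 for a, b in zip(x_w, y_w))
    sst_within = sum(b * b for b in y_w)           # variation left after firm means
    mean_y = sum(y) / len(y)
    sst_total = sum((v - mean_y) ** 2 for v in y)  # all variation, incl. between firms
    return 1.0 - ssr / sst_total, 1.0 - ssr / sst_within


# Synthetic panel: large stable firm effects, small effect of the regressor.
rng = random.Random(1)
firms, x, y = [], [], []
for firm in range(100):
    alpha = rng.gauss(0, 1)                        # hypothetical firm fixed effect
    for _ in range(8):
        x_it = rng.gauss(0, 1)
        firms.append(firm)
        x.append(x_it)
        y.append(0.1 * x_it + alpha + rng.gauss(0, 0.5))

overall_r2, within_r2 = r_squared_pair(y, x, firms)
print(f"overall R2 (firm effects count as explained): {overall_r2:.3f}")
print(f"within R2 (regressor only):                   {within_r2:.3f}")
```
]]></preformat>
        <p>In this setup the firm dummies account for most of the explained variation, so the overall figure is high even though the regressor explains little of the variation within firms. Replacing firm effects with coarser group effects would leave that stable firm-level heterogeneity in the residual, which is the mechanism described above.</p>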
        <p>Table <xref ref-type="table" rid="T3">3</xref> displays the results of Equation (1) with <italic>ExcessComp2</italic> as the dependent variable. The findings are consistent with Table <xref ref-type="table" rid="T2">2</xref>. <italic>Firm_Size</italic> is now significant at the 5% level (previously 10%), suggesting that firm size is positively associated with excess executive compensation because larger firms have longer and more complex disclosures, which offer executives greater opportunity for obfuscation (<xref ref-type="bibr" rid="B36">Li 2008</xref>).</p>
        <table-wrap id="T3" position="float" orientation="portrait">
          <label>Table 3.</label>
          <caption>
            <p>Test of H1: Excess compensation consequences of disclosure bloat.</p>
          </caption>
          <table>
            <tbody>
              <tr>
                <th rowspan="2" colspan="1">
                  <bold>Dependent Variable =</bold>
                </th>
                <th rowspan="1" colspan="4">
                  <bold>
                    <italic>ExcessComp2</italic>
                  </bold>
                </th>
              </tr>
              <tr>
                <th rowspan="1" colspan="1">
                  <bold>(1)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(2)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(3)</bold>
                </th>
                <th rowspan="1" colspan="1">
                  <bold>(4)</bold>
                </th>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Bloat</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.944***</td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.317</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.278)</td>
                <td rowspan="1" colspan="1">(0.370)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Fog_Index</italic>
                </td>
                <td rowspan="1" colspan="1">0.009</td>
                <td rowspan="1" colspan="1">0.010</td>
                <td rowspan="1" colspan="1">0.022**</td>
                <td rowspan="1" colspan="1">0.022**</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.012)</td>
                <td rowspan="1" colspan="1">(0.012)</td>
                <td rowspan="1" colspan="1">(0.008)</td>
                <td rowspan="1" colspan="1">(0.008)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Redundancy</italic>
                </td>
                <td rowspan="1" colspan="1">–0.040</td>
                <td rowspan="1" colspan="1">–0.066</td>
                <td rowspan="1" colspan="1">0.231</td>
                <td rowspan="1" colspan="1">0.222</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.292)</td>
                <td rowspan="1" colspan="1">(0.289)</td>
                <td rowspan="1" colspan="1">(0.497)</td>
                <td rowspan="1" colspan="1">(0.494)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>SentimentLM</italic>
                </td>
                <td rowspan="1" colspan="1">10.302**</td>
                <td rowspan="1" colspan="1">9.880**</td>
                <td rowspan="1" colspan="1">–3.990</td>
                <td rowspan="1" colspan="1">–4.093</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(4.025)</td>
                <td rowspan="1" colspan="1">(4.030)</td>
                <td rowspan="1" colspan="1">(3.909)</td>
                <td rowspan="1" colspan="1">(3.822)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>File_Length</italic>
                </td>
                <td rowspan="1" colspan="1">0.010***</td>
                <td rowspan="1" colspan="1">0.009***</td>
                <td rowspan="1" colspan="1">0.015***</td>
                <td rowspan="1" colspan="1">0.014***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.002)</td>
                <td rowspan="1" colspan="1">(0.002)</td>
                <td rowspan="1" colspan="1">(0.004)</td>
                <td rowspan="1" colspan="1">(0.004)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Boilerplate</italic>
                </td>
                <td rowspan="1" colspan="1">–1.468</td>
                <td rowspan="1" colspan="1">0.689</td>
                <td rowspan="1" colspan="1">–1.868</td>
                <td rowspan="1" colspan="1">–1.538</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(3.822)</td>
                <td rowspan="1" colspan="1">(3.852)</td>
                <td rowspan="1" colspan="1">(3.473)</td>
                <td rowspan="1" colspan="1">(3.427)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Firm_Size</italic>
                </td>
                <td rowspan="1" colspan="1">0.090**</td>
                <td rowspan="1" colspan="1">0.090**</td>
                <td rowspan="1" colspan="1">0.023</td>
                <td rowspan="1" colspan="1">0.023</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.040)</td>
                <td rowspan="1" colspan="1">(0.040)</td>
                <td rowspan="1" colspan="1">(0.017)</td>
                <td rowspan="1" colspan="1">(0.017)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Leverage</italic>
                </td>
                <td rowspan="1" colspan="1">–0.218**</td>
                <td rowspan="1" colspan="1">–0.222**</td>
                <td rowspan="1" colspan="1">0.186*</td>
                <td rowspan="1" colspan="1">0.186*</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.104)</td>
                <td rowspan="1" colspan="1">(0.104)</td>
                <td rowspan="1" colspan="1">(0.104)</td>
                <td rowspan="1" colspan="1">(0.104)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Res_File_Size</italic>
                </td>
                <td rowspan="1" colspan="1">0.002***</td>
                <td rowspan="1" colspan="1">0.003***</td>
                <td rowspan="1" colspan="1">0.007***</td>
                <td rowspan="1" colspan="1">0.007***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
                <td rowspan="1" colspan="1">(0.001)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>Sales_Growth</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.333***</td>
                <td rowspan="1" colspan="1">0.333***</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.039)</td>
                <td rowspan="1" colspan="1">(0.039)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>R&amp;D_Assets</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.200</td>
                <td rowspan="1" colspan="1">0.200</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.436)</td>
                <td rowspan="1" colspan="1">(0.436)</td>
              </tr>
              <tr>
                <td rowspan="2" colspan="1">
                  <italic>ROE</italic>
                </td>
                <td rowspan="2" colspan="1"/>
                <td rowspan="2" colspan="1"/>
                <td rowspan="1" colspan="1">0.015</td>
                <td rowspan="1" colspan="1">0.015</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">(0.020)</td>
                <td rowspan="1" colspan="1">(0.020)</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Firm FE</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">No</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Year FE</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Industry FE</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">No</td>
                <td rowspan="1" colspan="1">Yes</td>
                <td rowspan="1" colspan="1">Yes</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Adj. R<sup>2</sup></td>
                <td rowspan="1" colspan="1">0.498</td>
                <td rowspan="1" colspan="1">0.499</td>
                <td rowspan="1" colspan="1">0.069</td>
                <td rowspan="1" colspan="1">0.069</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">Within R<sup>2</sup></td>
                <td rowspan="1" colspan="1">0.012</td>
                <td rowspan="1" colspan="1">0.014</td>
                <td rowspan="1" colspan="1">0.056</td>
                <td rowspan="1" colspan="1">0.056</td>
              </tr>
              <tr>
                <td rowspan="1" colspan="1">N</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
                <td rowspan="1" colspan="1">6,668</td>
              </tr>
            </tbody>
          </table>
          <table-wrap-foot>
            <fn>
              <p><italic>Note</italic>. The table presents OLS estimates of the main research model described in Equation (1), with excess compensation calculated using tdc2 as the dependent variable. ***, **, and * represent significance at the 0.01, 0.05, and 0.10 levels, respectively. Standard errors are in parentheses and are clustered at the Fama-French 48-industry level in Columns (1) and (2) and at the firm level in Columns (3) and (4). Year-fixed effects are included in all regressions. Because the excess compensation calculation involves lagged values, the final number of observations is 6,668.</p>
            </fn>
          </table-wrap-foot>
        </table-wrap>
        <p>Similarly, Columns (3) and (4) in Table <xref ref-type="table" rid="T3">3</xref> are consistent with the results in Columns (3) and (4) in Table <xref ref-type="table" rid="T2">2</xref>, except that the <italic>Fog_Index</italic> coefficient becomes significant at the 5% level when industry-fixed effects are included. This implies that the variation in <italic>Fog_Index</italic> that matters for excess compensation occurs between firms, not within a firm over time. This finding differs from previous outcomes but aligns with <xref ref-type="bibr" rid="B36">Li (2008)</xref>, who argues that higher textual complexity should enable executives to hide details and increase their compensation.</p>
        <p>The results align with theoretical expectations. As argued before, most traditional measures rely on keywords and rules, while ignoring contextual meaning (<xref ref-type="bibr" rid="B33">Leippold 2023</xref>). Since compensation contracts are often written in verbose and complex language (<xref ref-type="bibr" rid="B1">Albuquerque et al. 2024</xref>), in line with the management obfuscation hypothesis, shareholders are less likely to thoroughly evaluate compensation structures due to the time constraints they face, leading to excess compensation. In contrast, <abbrev xlink:title="Large Language Models" id="ABBRID0ENVAG">LLMs</abbrev> are designed to grasp the contextual significance, as <xref ref-type="bibr" rid="B19">Floridi and Chiriatti (2020)</xref> argue, allowing them to filter out unnecessary and repetitive language and extract essential informational content from complex disclosures.</p>
        <p>Following this rationale, the added explanatory power of the <italic>Bloat</italic> variable is a reasonable outcome of using a technology that more accurately replicates human judgment and intent when determining relevant content (<xref ref-type="bibr" rid="B15">Dong et al. 2024</xref>). Hence, the main implication of H1 is that <abbrev xlink:title="Large Language Models" id="ABBRID0E4VAG">LLMs</abbrev> can improve the quality of corporate disclosures and help address the principal-agent problem by assisting stakeholders in better understanding the company’s financial statements and reports.</p>
        <p>To address whether summaries predict excess compensation, given that <italic>Bloat</italic> varies over time within firms, we use Column (2) of Tables <xref ref-type="table" rid="T2">2</xref> and <xref ref-type="table" rid="T3">3</xref>, which include firm-fixed effects, to test H1. The coefficient of <italic>Bloat</italic> in Column (2) of Table <xref ref-type="table" rid="T2">2</xref> is positive and significant at the 1% level; a 0.01 unit increase in <italic>Bloat</italic> is associated with a 0.52% (= <italic>e</italic><sup>0.520×0.01</sup> – 1) increase in <italic>ExcessComp1</italic> and, based on Column (2) of Table <xref ref-type="table" rid="T3">3</xref>, a 0.95% (= <italic>e</italic><sup>0.944×0.01</sup> – 1) increase in <italic>ExcessComp2</italic>, holding all other variables constant. Hence, firms with more bloated CD&amp;A disclosures are more likely to pay executives more than is justified by the firm’s performance. H1 is therefore supported.</p>
        <p>The result is consistent with expectations from prior literature. According to agency theory, executives are incentivized to exploit their managerial power and camouflage the structure of their compensation (<xref ref-type="bibr" rid="B3">Bebchuk and Fried 2003</xref>), and bloated disclosures can be a mechanism to earn excess pay without detection by investors. Since investors and shareholders face cognitive and informational constraints (<xref ref-type="bibr" rid="B1">Albuquerque et al. 2024</xref>; <xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>), they may give less scrutiny to compensation disclosures. <xref ref-type="bibr" rid="B30">Laksmana et al. (2012)</xref> also find lower CD&amp;A readability in the 2007 proxy statements when the CEO’s compensation exceeds what is justified by the firm’s performance.</p>
        <p>Similar to the results in <xref ref-type="bibr" rid="B28">Kim et al. (2024)</xref>, <italic>Redundancy</italic> and <italic>Boilerplate</italic> are not significant in any of the columns. This finding further highlights the advantage of generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0E2XAG">AI</abbrev> in capturing textual complexity. Nevertheless, the size and significance of the <italic>SentimentLM</italic>, <italic>File_Length</italic>, and <italic>Res_File_Size</italic> variables stay mostly constant, and the within R<sup>2</sup> does not substantially increase after incorporating the <italic>Bloat</italic> variable into the regression, even though the coefficient of <italic>Bloat</italic> is statistically significant. Taken together, these results suggest that <italic>Bloat</italic> does not correct for omitted variable bias; instead, it reflects a unique but very small component of disclosure complexity that is not fully explained by sentiment or document length.</p>
        <p>Untabulated tests on sentiment tone reveal that the negative sentiment coefficient is insignificant (p &gt; 0.1), while the positive sentiment coefficient is significant (p &lt; 0.001) in the main regression results. This suggests that firms use a more positive tone to obscure information, since highlighting positive events can make investors judge higher executive compensation less harshly. This is consistent with <xref ref-type="bibr" rid="B25">Huang et al. (2018)</xref>, who show that managers use a positive tone to enhance earnings management.</p>
        <p>The robustness check, where excess executive compensation is calculated using an alternative proxy of the <xref ref-type="bibr" rid="B31">Larcker et al. (2011)</xref> model, shows that the estimate of <italic>Bloat</italic> remains significant and its magnitude is similar to the one in Column (2) of Table <xref ref-type="table" rid="T2">2</xref>. Moreover, as before, <italic>Bloat</italic> is significant only with firm-fixed effects, not with industry-fixed effects.</p>
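        <p>The contrast between the firm-fixed-effects and industry-fixed-effects columns can be illustrated with a toy example. The sketch below is not the estimation code used in this study: it uses invented numbers for two hypothetical firms, omits year effects and all control variables, and the helper names slope and demean_by_group are assumptions made for illustration. It shows that demeaning by firm (the within transformation behind firm-fixed effects) identifies a coefficient only from variation within each firm over time, so the result can differ sharply from an estimate that also uses differences between firms.</p>
        <preformat>
```python
from collections import defaultdict


def slope(xs, ys):
    """OLS slope of y on x (with intercept) for two paired lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def demean_by_group(groups, values):
    """Subtract each group's mean from its members (within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


# Toy panel with invented numbers: two firms, three years each.
firm = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # stand-in for a disclosure measure
y = [10.0, 12.0, 14.0, 2.0, 4.0, 6.0]   # stand-in for excess compensation

# Pooled estimate: mixes within-firm and between-firm variation.
pooled_slope = slope(x, y)

# Within estimate: uses only deviations from each firm's own mean.
within_slope = slope(demean_by_group(firm, x), demean_by_group(firm, y))

print(f"pooled slope: {pooled_slope:.2f}")
print(f"within-firm slope: {within_slope:.2f}")
```
        </preformat>
        <p>In this toy panel the within-firm relationship is positive while the pooled relationship is negative, because the firm with higher values of the regressor has a lower average outcome. The direction is arbitrary here; the point is only that the two estimators answer different questions.</p>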
        <p>Regarding the economic magnitude of the <italic>Bloat</italic> variable, using the estimates from Column (2) of Table <xref ref-type="table" rid="T3">3</xref>, a one-standard-deviation increase in <italic>Bloat</italic> is associated with a 3.16% (= <italic>e</italic><sup>0.033 × 0.944</sup> – 1) increase in excess executive compensation. Although 3.16% might appear small, a slight change in executive pay can amount to thousands of dollars for CEOs in larger firms, which highlights the practical importance of the finding and the potential of LLMs for increasing transparency and accountability in corporate governance.</p>
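        <p>The percentage effects quoted in this section can be reproduced with a few lines of arithmetic. The sketch below assumes, as the expressions in the text imply, that excess compensation enters the regression on a log scale, so that a change of d in <italic>Bloat</italic> multiplies excess compensation by <italic>e</italic><sup>b × d</sup> for a coefficient b. The function name pct_effect is hypothetical; the coefficients (0.520 and 0.944) and the standard deviation of <italic>Bloat</italic> (0.033) are the values quoted in the text.</p>
        <preformat>
```python
import math


def pct_effect(beta, delta):
    """Percentage change in a log-scale outcome when the regressor moves by delta."""
    return (math.exp(beta * delta) - 1.0) * 100.0


# Values quoted in the text.
BETA_EXCESSCOMP1 = 0.520  # Bloat coefficient, Column (2) of Table 2
BETA_EXCESSCOMP2 = 0.944  # Bloat coefficient, Column (2) of Table 3
SD_BLOAT = 0.033          # one standard deviation of Bloat

effects = {
    "ExcessComp1, +0.01 Bloat": pct_effect(BETA_EXCESSCOMP1, 0.01),
    "ExcessComp2, +0.01 Bloat": pct_effect(BETA_EXCESSCOMP2, 0.01),
    "ExcessComp2, +1 SD Bloat": pct_effect(BETA_EXCESSCOMP2, SD_BLOAT),
}

for label, value in effects.items():
    print(f"{label}: {value:.2f}%")
```
        </preformat>
        <p>Rounded to two decimals, the three values are 0.52%, 0.95%, and 3.16%, matching the figures reported above.</p>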
      </sec>
      <sec sec-type="4.2. Hypothesis H2a and H2b – Determinants of bloat" id="SECID0EOZAG">
        <title>4.2. Hypothesis H2a and H2b – Determinants of bloat</title>
        <p>We find no significant relationship between <italic>BoardDiversity</italic> or <italic>CEO_Duality</italic> and <italic>Bloat</italic>, suggesting that formal governance structures may not effectively constrain redundancy. Hence, H2a and H2b are not supported. Overall, the model explains only 0.9% of the variation in <italic>Bloat</italic>, further indicating the heterogeneous complexity of disclosure practices. However, the findings align with managerial power theory (<xref ref-type="bibr" rid="B17">Eisenhardt 1989</xref>): because board members value their positions and personal relationships with CEOs, they are less likely to challenge the CEO’s pay structure or to reduce redundant compensation content.</p>
      </sec>
    </sec>
    <sec sec-type="5. Conclusion and discussion" id="SECID0EA1AG">
      <title>5. Conclusion and discussion</title>
      <p>This research explored whether large language models can identify redundant content in the CD&amp;A sections of proxy statements and whether this redundancy is associated with excess executive compensation. The study introduces a new way of understanding excess executive compensation using a quantifiable measure of verbose content – <italic>bloat</italic> – that goes beyond traditional textual metrics. To address the research question of how a measure of compensation disclosure bloat can be developed using generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EI1AG">AI</abbrev> and how it relates to excess executive compensation, fixed-effects panel regression models were estimated for 7,786 firm-year observations covering S&amp;P 1500 companies between 2011 and 2018.</p>
      <p>The results show that the bloat measure significantly explains variation in excess executive compensation, and it captures the part of excess compensation that traditional natural language processing models cannot explain. The regression results from Equation (1) are consistent with H1, which suggests that bloated CD&amp;A sections are associated with higher excess executive compensation. The rationale is that, according to agency theory, executives seek to maximize personal benefits. Bloat allows them to obscure key pay details by overwhelming investors with irrelevant information, thereby making excess compensation easier to justify, because investors have limited cognitive resources and limited attention for analyzing full financial data (<xref ref-type="bibr" rid="B1">Albuquerque et al. 2024</xref>; <xref ref-type="bibr" rid="B28">Kim et al. 2024</xref>).</p>
      <p>Interestingly, including the industry-fixed effects rather than the firm-fixed effects turned a significant bloat coefficient into an insignificant one. This implies that the effect of the bloat variable is driven by within-firm variation over time, and a large portion of the variation in excess compensation is explained by firm-specific factors that do not vary much over time.</p>
      <p>Contrary to H2a and H2b expectations, we did not find strong evidence that governance mechanisms are associated with less bloat. This may be due to measurement limitations or the choice of governance proxies. In addition, compensation disclosure practices may be shaped more by institutional or advisory norms than by formal governance structures, which can be a focus for future studies.</p>
      <sec sec-type="5.1. Implications" id="SECID0EY1AG">
        <title>5.1. Implications</title>
        <p>This research improves both social and scientific understanding of corporate disclosure by demonstrating the value of <abbrev xlink:title="Large Language Models" id="ABBRID0E51AG">LLMs</abbrev> in identifying complex textual relationships that traditional metrics often fail to identify (<xref ref-type="bibr" rid="B33">Leippold 2023</xref>). It is the first study to use <abbrev xlink:title="Large Language Models" id="ABBRID0EG2AG">LLMs</abbrev> to quantify disclosure bloat in corporate governance and examine its relationship with excess executive compensation. The results can offer a solution to a longstanding problem of excessive executive compensation by highlighting the potential of generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EK2AG">AI</abbrev> to address it.</p>
        <p>Our findings suggest important implications for regulators by showing that lengthy or more complex executive compensation disclosures do not automatically improve transparency. They also matter for investors, who can leverage generative <abbrev xlink:title="Artificial Intelligence" id="ABBRID0EQ2AG">AI</abbrev> to analyze the key elements of the compensation disclosure sections faster and more easily. Finally, corporate boards should emphasize the significance of clear and less complex communication in financial documents.</p>
      </sec>
      <sec sec-type="5.2. Limitations and future research" id="SECID0EU2AG">
        <title>5.2. Limitations and future research</title>
        <p>We acknowledge some limitations that offer opportunities for future research. First, the output of the LLM varies depending on the model and prompt used. As a result, the length of the summarized documents reflects the LLM’s judgment of contextual relevance and can change depending on the task assigned to a specific LLM, which may also affect the bloat measure. Future research can use different and more advanced models to assess whether the findings are generalizable.</p>
        <p>Moreover, the analysis is restricted to S&amp;P 1500 firms from 2011 to 2018, limiting generalizability to smaller or international firms with different disclosure practices. Expanding the sample and improving CD&amp;A data extraction would strengthen external validity. Future research can also examine non-financial reports, such as sustainability reports, to investigate whether management obscures relevant information when the content is less numerical.</p>
        <p>Although our empirical analysis is based on U.S. firms, the main message that quality matters more than length is highly relevant to the European context. Future research could apply the bloat measure to European remuneration reports to see whether the results hold under the Shareholder Rights Directive.</p>
        <p>Finally, as with most empirical corporate governance research, endogeneity remains a concern in the absence of an exogenous shock. We therefore do not claim that our associations imply causality. Nevertheless, the findings provide an important starting point for deepening the understanding of how the complexity of text in executive pay disclosures may relate to governance outcomes, and they remain valuable for theory development and policy discussions.</p>
        <boxed-text id="box1">
          <p><bold>L. Burduli – Lizi Burduli</bold> holds an MScBA in Accounting &amp; Financial Management from RSM, Erasmus University and is currently an auditor at Deloitte.</p>
        </boxed-text>
        <boxed-text id="box2">
          <p><bold>Dr. S. Kramer – Stephan Kramer</bold> is a Professor of Financial Decision Making and Control at RSM, Erasmus University.</p>
        </boxed-text>
        <boxed-text id="box3">
          <p>This article is based on Lizi Burduli’s master’s thesis, for which she was named one of the winners of the MAB Thesis Award 2025.</p>
        </boxed-text>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="B1">
        <mixed-citation>Albuquerque A, Carter ME, Guo ZM, Lynch LJ (2024) Complexity of CEO compensation packages. Journal of Accounting and Economics 79(1): 101709. <ext-link xlink:type="simple" xlink:href="10.1016/j.jacceco.2024.101709" ext-link-type="doi">https://doi.org/10.1016/j.jacceco.2024.101709</ext-link></mixed-citation>
      </ref>
      <ref id="B2">
        <mixed-citation>Basu S, Hwang LS, Mitsudome T, Weintrop J (2007) Corporate governance, top executive compensation and firm performance in Japan. Pacific-Basin Finance Journal 15(1): 56–79. <ext-link xlink:type="simple" xlink:href="10.1016/j.pacfin.2006.05.002" ext-link-type="doi">https://doi.org/10.1016/j.pacfin.2006.05.002</ext-link></mixed-citation>
      </ref>
      <ref id="B3">
        <mixed-citation>Bebchuk LA, Fried JM (2003) Executive compensation as an agency problem. Journal of Economic Perspectives 17(3): 71–92. <ext-link xlink:type="simple" xlink:href="10.1257/089533003769204362" ext-link-type="doi">https://doi.org/10.1257/089533003769204362</ext-link></mixed-citation>
      </ref>
      <ref id="B4">
        <mixed-citation>Bebchuk LA, Cremers KM, Peyer UC (2011) The CEO pay slice. Journal of Financial Economics 102(1): 199–221. <ext-link xlink:type="simple" xlink:href="10.1016/j.jfineco.2011.05.006" ext-link-type="doi">https://doi.org/10.1016/j.jfineco.2011.05.006</ext-link></mixed-citation>
      </ref>
      <ref id="B5">
        <mixed-citation>Blankespoor E, DeHaan E, Marinovic I (2020) Disclosure processing costs, investors’ information choice, and equity market outcomes: A review. Journal of Accounting and Economics 70(2–3): 101344. <ext-link xlink:type="simple" xlink:href="10.1016/j.jacceco.2020.101344" ext-link-type="doi">https://doi.org/10.1016/j.jacceco.2020.101344</ext-link></mixed-citation>
      </ref>
      <ref id="B6">
        <mixed-citation>Blankespoor E, DeHaan ED, Wertz J, Zhu C (2019) Why do individual investors disregard accounting information? The roles of information awareness and acquisition costs. Journal of Accounting Research 57(1): 53–84. <ext-link xlink:type="simple" xlink:href="10.1111/1475-679X.12248" ext-link-type="doi">https://doi.org/10.1111/1475-679X.12248</ext-link></mixed-citation>
      </ref>
      <ref id="B7">
        <mixed-citation>Bloomfield RJ (2002) The ‘incomplete revelation hypothesis’ and financial reporting. Accounting Horizons 16: 233–243. <ext-link xlink:type="simple" xlink:href="10.2308/acch.2002.16.3.233" ext-link-type="doi">https://doi.org/10.2308/acch.2002.16.3.233</ext-link></mixed-citation>
      </ref>
      <ref id="B8">
        <mixed-citation>Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler DM, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems 33: 1877–1901. <ext-link xlink:type="simple" xlink:href="10.48550/arXiv.2005.14165" ext-link-type="doi">https://doi.org/10.48550/arXiv.2005.14165</ext-link></mixed-citation>
      </ref>
      <ref id="B9">
        <mixed-citation>Carter ME, Li L, Marcus AJ, Tehranian H (2016) Excess pay and deficient performance. Review of Financial Economics 30: 1–10. <ext-link xlink:type="simple" xlink:href="10.1016/j.rfe.2015.08.003" ext-link-type="doi">https://doi.org/10.1016/j.rfe.2015.08.003</ext-link></mixed-citation>
      </ref>
      <ref id="B10">
        <mixed-citation>Chen M (2025) [February 24] What is an API (Application Programming Interface)? <ext-link xlink:type="simple" xlink:href="https://www.oracle.com/nl/cloud/cloud-native/api-management/what-is-api" ext-link-type="uri">https://www.oracle.com/nl/cloud/cloud-native/api-management/what-is-api</ext-link></mixed-citation>
      </ref>
      <ref id="B11">
        <mixed-citation>Core JE, Holthausen RW, Larcker DF (1999) Corporate governance, chief executive officer compensation, and firm performance. Journal of Financial Economics 51(3): 371–406. <ext-link xlink:type="simple" xlink:href="10.1016/S0304-405X(98)00058-0" ext-link-type="doi">https://doi.org/10.1016/S0304-405X(98)00058-0</ext-link></mixed-citation>
      </ref>
      <ref id="B12">
        <mixed-citation>Core J, Guay W, Larcker D (2008) The power of the pen and executive compensation. Journal of Financial Economics 88: 1–25. <ext-link xlink:type="simple" xlink:href="10.1016/j.jfineco.2007.05.001" ext-link-type="doi">https://doi.org/10.1016/j.jfineco.2007.05.001</ext-link></mixed-citation>
      </ref>
      <ref id="B13">
        <mixed-citation>De Amicis C, Falconieri S, Tastan M (2021) Sentiment analysis and gender differences in earnings conference calls. Journal of Corporate Finance 71: 101809. <ext-link xlink:type="simple" xlink:href="10.1016/j.jcorpfin.2020.101809" ext-link-type="doi">https://doi.org/10.1016/j.jcorpfin.2020.101809</ext-link></mixed-citation>
      </ref>
      <ref id="B14">
        <mixed-citation>Directive 2007/36/EC (2007) Directive 2007/36/EC of the European Parliament and of the Council of 11 July 2007 on the exercise of certain rights of shareholders in listed companies. <ext-link xlink:type="simple" xlink:href="https://eur-lex.europa.eu/eli/dir/2007/36/oj/eng" ext-link-type="uri">https://eur-lex.europa.eu/eli/dir/2007/36/oj/eng</ext-link></mixed-citation>
      </ref>
      <ref id="B15">
        <mixed-citation>Dong MM, Stratopoulos TC, Wang VX (2024) A scoping review of ChatGPT research in accounting and finance. International Journal of Accounting Information Systems 55: 100715. <ext-link xlink:type="simple" xlink:href="10.1016/j.accinf.2024.100715" ext-link-type="doi">https://doi.org/10.1016/j.accinf.2024.100715</ext-link></mixed-citation>
      </ref>
      <ref id="B16">
        <mixed-citation>Dyer T, Lang M, Stice-Lawrence L (2017) The evolution of 10-K textual disclosure: Evidence from Latent Dirichlet Allocation. Journal of Accounting and Economics 64(2–3): 221–245. <ext-link xlink:type="simple" xlink:href="10.1016/j.jacceco.2017.07.002" ext-link-type="doi">https://doi.org/10.1016/j.jacceco.2017.07.002</ext-link></mixed-citation>
      </ref>
      <ref id="B17">
        <mixed-citation>Eisenhardt KM (1989) Agency theory: An assessment and review. Academy of Management Review 14(1): 57–74. <ext-link xlink:type="simple" xlink:href="10.5465/amr.1989.4279003" ext-link-type="doi">https://doi.org/10.5465/amr.1989.4279003</ext-link></mixed-citation>
      </ref>
      <ref id="B18">
        <mixed-citation>Fama EF, French KR (1997) Industry costs of equity. Journal of Financial Economics 43(2): 153–193. <ext-link xlink:type="simple" xlink:href="10.1016/S0304-405X(96)00896-3" ext-link-type="doi">https://doi.org/10.1016/S0304-405X(96)00896-3</ext-link></mixed-citation>
      </ref>
      <ref id="B19">
        <mixed-citation>Floridi L, Chiriatti M (2020) GPT-3: Its nature, scope, limits, and consequences. Minds and Machines 30: 681–694. <ext-link xlink:type="simple" xlink:href="10.1007/s11023-020-09548-1" ext-link-type="doi">https://doi.org/10.1007/s11023-020-09548-1</ext-link></mixed-citation>
      </ref>
      <ref id="B20">
        <mixed-citation>Francis BB, Shohfi T, Xin D (2020) Gender and earnings conference calls. SSRN 25: 2020. <ext-link xlink:type="simple" xlink:href="10.2139/SSRN.3473266" ext-link-type="doi">https://doi.org/10.2139/SSRN.3473266</ext-link></mixed-citation>
      </ref>
      <ref id="B21">
        <mixed-citation>Gabaix X, Landier A (2008) Why has CEO pay increased so much? The Quarterly Journal of Economics 123(1): 49–100. <ext-link xlink:type="simple" xlink:href="10.1162/qjec.2008.123.1.49" ext-link-type="doi">https://doi.org/10.1162/qjec.2008.123.1.49</ext-link></mixed-citation>
      </ref>
      <ref id="B22">
        <mixed-citation>Gunning R (1952) The technique of clear writing. McGraw-Hill.</mixed-citation>
      </ref>
      <ref id="B23">
        <mixed-citation>Hooghiemstra R, Kuang YF, Qin B (2017) Does obfuscating excessive CEO pay work? The influence of remuneration report readability on say-on-pay votes. Accounting and Business Research 47(6): 695–729. <ext-link xlink:type="simple" xlink:href="10.1080/00014788.2017.1300516" ext-link-type="doi">https://doi.org/10.1080/00014788.2017.1300516</ext-link></mixed-citation>
      </ref>
      <ref id="B24">
        <mixed-citation>Huang X, Teoh SH, Zhang Y (2014) Tone management. The Accounting Review 89(3): 1083–1113. <ext-link xlink:type="simple" xlink:href="10.2308/accr-50684" ext-link-type="doi">https://doi.org/10.2308/accr-50684</ext-link></mixed-citation>
      </ref>
      <ref id="B25">
        <mixed-citation>Huang X, Krishnan S, Lin P (2018) Tone analysis and earnings management. Journal of Accounting and Finance 18(8): 46–61. <ext-link xlink:type="simple" xlink:href="10.33423/jaf.v18i8.110" ext-link-type="doi">https://doi.org/10.33423/jaf.v18i8.110</ext-link></mixed-citation>
      </ref>
      <ref id="B26">
        <mixed-citation>Huang AH, Wang H, Yang Y (2023) FinBERT: A large language model for extracting information from financial text. Contemporary Accounting Research 40(2): 806–841. <ext-link xlink:type="simple" xlink:href="10.1111/1911-3846.12832" ext-link-type="doi">https://doi.org/10.1111/1911-3846.12832</ext-link></mixed-citation>
      </ref>
      <ref id="B27">
        <mixed-citation>Jensen MC, Meckling WH (1976) Theory of the firm: Managerial behavior, agency costs, and ownership structure. Journal of Financial Economics 3(4): 305–360. <ext-link xlink:type="simple" xlink:href="10.1016/0304-405X(76)90026-X" ext-link-type="doi">https://doi.org/10.1016/0304-405X(76)90026-X</ext-link></mixed-citation>
      </ref>
      <ref id="B28">
        <mixed-citation>Kim A, Muhn M, Nikolaev VV (2024) Bloated disclosures: can ChatGPT help investors process information? Chicago Booth Research Paper (23-07): 2023–59. <ext-link xlink:type="simple" xlink:href="10.48550/arXiv.2306.10224" ext-link-type="doi">https://doi.org/10.48550/arXiv.2306.10224</ext-link></mixed-citation>
      </ref>
      <ref id="B29">
        <mixed-citation>Kramer S, Matějka M (2024) Disturbing the quiet life? Competition and CEO incentives. The Accounting Review 99(2): 279–305. <ext-link xlink:type="simple" xlink:href="10.2308/TAR-2022-0393" ext-link-type="doi">https://doi.org/10.2308/TAR-2022-0393</ext-link></mixed-citation>
      </ref>
      <ref id="B30">
        <mixed-citation>Laksmana I, Tietz W, Yang YW (2012) Compensation discussion and analysis (CD&amp;A): Readability and management obfuscation. Journal of Accounting and Public Policy 31(2): 185–203. <ext-link xlink:type="simple" xlink:href="10.1016/j.jaccpubpol.2011.08.003" ext-link-type="doi">https://doi.org/10.1016/j.jaccpubpol.2011.08.003</ext-link></mixed-citation>
      </ref>
      <ref id="B31">
        <mixed-citation>Larcker DF, Ormazabal G, Taylor DJ (2011) The market reaction to corporate governance regulation. Journal of Financial Economics 101(2): 431–448. <ext-link xlink:type="simple" xlink:href="10.1016/j.jfineco.2011.03.002" ext-link-type="doi">https://doi.org/10.1016/j.jfineco.2011.03.002</ext-link></mixed-citation>
      </ref>
      <ref id="B32">
        <mixed-citation>Lawrence A (2013) Individual investors and financial disclosure. Journal of Accounting and Economics 56(1): 130–147. <ext-link xlink:type="simple" xlink:href="10.1016/j.jacceco.2013.05.001" ext-link-type="doi">https://doi.org/10.1016/j.jacceco.2013.05.001</ext-link></mixed-citation>
      </ref>
      <ref id="B33">
        <mixed-citation>Leippold M (2023) Sentiment spin: Attacking financial sentiment with GPT-3. Finance Research Letters 55: 103957. <ext-link xlink:type="simple" xlink:href="10.1016/j.frl.2023.103957" ext-link-type="doi">https://doi.org/10.1016/j.frl.2023.103957</ext-link></mixed-citation>
      </ref>
      <ref id="B34">
        <mixed-citation>Lesmy D, Muchnik L, Mugerman Y (2019) Doyoureadme? Temporal trends in the language complexity of financial reporting. SSRN working paper (September 26, 2019). <ext-link xlink:type="simple" xlink:href="10.2139/ssrn.3469073" ext-link-type="doi">https://doi.org/10.2139/ssrn.3469073</ext-link></mixed-citation>
      </ref>
      <ref id="B35">
        <mixed-citation>Leuz C, Verrecchia RE (2000) The economic consequences of increased disclosure. Journal of Accounting Research 38(Supplement): 91–124. <ext-link xlink:type="simple" xlink:href="10.2307/2672910" ext-link-type="doi">https://doi.org/10.2307/2672910</ext-link></mixed-citation>
      </ref>
      <ref id="B36">
        <mixed-citation>Li F (2008) Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics 45(2–3): 221–247. <ext-link xlink:type="simple" xlink:href="10.1016/j.jacceco.2008.02.003" ext-link-type="doi">https://doi.org/10.1016/j.jacceco.2008.02.003</ext-link></mixed-citation>
      </ref>
      <ref id="B37">
        <mixed-citation>Loughran T, McDonald B (2014) Measuring readability in financial disclosures. The Journal of Finance 69(4): 1643–1671. <ext-link xlink:type="simple" xlink:href="10.1111/jofi.12162" ext-link-type="doi">https://doi.org/10.1111/jofi.12162</ext-link></mixed-citation>
      </ref>
      <ref id="B38">
        <mixed-citation>Ruiz‐Verdú P (2008) Corporate governance when managers set their own pay. European Financial Management 14(5): 921–943. <ext-link xlink:type="simple" xlink:href="10.1111/j.1468-036X.2008.00465.x" ext-link-type="doi">https://doi.org/10.1111/j.1468-036X.2008.00465.x</ext-link></mixed-citation>
      </ref>
      <ref id="B39">
        <mixed-citation>SEC [U.S. Securities and Exchange Commission] (2011) Executive compensation. <ext-link xlink:type="simple" xlink:href="https://www.sec.gov/answers/execomp.htm" ext-link-type="uri">https://www.sec.gov/answers/execomp.htm</ext-link></mixed-citation>
      </ref>
      <ref id="B40">
        <mixed-citation>Van Essen M, Otten J, Carberry EJ (2015) Assessing managerial power theory: A meta-analytic approach to understanding the determinants of CEO compensation. Journal of Management 41(1): 164–202. <ext-link xlink:type="simple" xlink:href="10.1177/0149206311429378" ext-link-type="doi">https://doi.org/10.1177/0149206311429378</ext-link></mixed-citation>
      </ref>
      <ref id="B41">
        <mixed-citation>Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Advances in Neural Information Processing Systems 30. <ext-link xlink:type="simple" xlink:href="10.48550/arXiv.1706.03762" ext-link-type="doi">https://doi.org/10.48550/arXiv.1706.03762</ext-link></mixed-citation>
      </ref>
      <ref id="B42">
        <mixed-citation>Yermack D (1996) Higher market valuation of companies with a small board of directors. Journal of Financial Economics 40(2): 185–211. <ext-link xlink:type="simple" xlink:href="10.1016/0304-405X(95)00844-5" ext-link-type="doi">https://doi.org/10.1016/0304-405X(95)00844-5</ext-link></mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>
