The Structural Topic Model

The STM is a Bayesian generative probabilistic model that, through a semi-supervised procedure, identifies the thematic patterns underlying a text (latent topics or knowledge areas). Unlike LDA, the STM accepts relevant metadata as covariates in model estimation (see Figure I; X and Y parameters) and estimates the impact of such covariates on the size and content of latent topics. In the STM, each document is a mixture of all topics, and each topic (Σi) is a distribution over the words (W) included in the documents. Given K topics, each latent topic has an assigned probability, from 0 to 1, of appearing in a document (θi, i.e., topic proportion), and the values of θ across all topics in each document sum to 1.
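
The mixture structure described above can be illustrated with a short Python sketch. It is not the STM estimation procedure: it only simulates one document under simplifying assumptions, namely topic proportions drawn from a logistic-normal distribution whose mean depends linearly on a single document covariate, with independent noise per topic. The names `gamma`, `draw_theta` and `generate_document`, the toy vocabulary, and all numeric values are hypothetical and not taken from the study.

```python
import math
import random


def softmax(values):
    """Map real-valued scores to proportions that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def draw_theta(x, gamma, sigma, rng):
    """Draw topic proportions for one document with covariate value x.

    gamma holds one hypothetical (intercept, slope) pair per topic.
    sigma is an assumed common standard deviation (independent noise
    per topic, which is a simplification).
    """
    eta = [rng.gauss(a + b * x, sigma) for a, b in gamma]
    return softmax(eta)


def generate_document(theta, topics, n_words, rng):
    """For each word, pick a topic from theta, then a word from that topic."""
    words = []
    for _ in range(n_words):
        k = rng.choices(range(len(theta)), weights=theta)[0]
        vocab = list(topics[k])
        probs = [topics[k][w] for w in vocab]
        words.append(rng.choices(vocab, weights=probs)[0])
    return words


rng = random.Random(0)
# K = 3 toy topics; the covariate raises topic 0 and lowers topic 2.
gamma = [(0.0, 1.5), (0.0, 0.0), (0.0, -1.5)]
topics = [
    {"solar": 0.5, "wind": 0.4, "price": 0.1},
    {"decree": 0.6, "audit": 0.3, "price": 0.1},
    {"employee": 0.5, "community": 0.4, "wind": 0.1},
]
theta = draw_theta(x=1.0, gamma=gamma, sigma=0.5, rng=rng)
doc = generate_document(theta, topics, n_words=8, rng=rng)
print([round(t, 3) for t in theta], doc)
```

Whatever the covariate value, the softmax step guarantees that every topic receives a proportion between 0 and 1 and that the proportions of a document sum to 1.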

Figure I: STM generative algorithm

From: Roberts et al., 2016.

Topics-Metadata Relation

To better qualify the knowledge areas, the effect of relevant corpus metadata on topic proportion was assessed via logistic regression analysis, using the estimateEffect function provided by the stm R package.
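
To make the idea of a covariate effect on topic proportion concrete, the following Python sketch fits a straight line of topic proportion against a covariate by ordinary least squares. This is a deliberately simplified stand-in, not the estimator used by the stm package, and the data are synthetic: the covariate values and the three proportion series are invented for illustration only.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic covariate (e.g., a rescaled consciousness score) for five documents.
covariate = [0.0, 0.25, 0.5, 0.75, 1.0]

# Synthetic proportions of three hypothetical topics in the same documents.
proportions = {
    "topic_a": [0.05, 0.10, 0.15, 0.20, 0.25],  # grows with the covariate
    "topic_b": [0.30, 0.25, 0.20, 0.15, 0.10],  # shrinks with the covariate
    "topic_c": [0.12, 0.12, 0.12, 0.12, 0.12],  # unaffected
}

slopes = {name: ols_fit(covariate, y)[1] for name, y in proportions.items()}
for name, slope in slopes.items():
    print(name, round(slope, 3))
```

A positive slope corresponds to a topic that becomes more prominent as the covariate increases, a negative slope to one that becomes less prominent, and a slope near zero to a topic whose proportion stays constant.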

Looking at the impact of Environmental Consciousness (EC) on knowledge areas (Figure II), the analysis shows that sustainability concerns positively affect the emergence of Topic 1, Topic 3 and Topic 4, while they are negatively associated with Topic 5, Topic 7, and Topic 8.

Figure II: Impact of Environmental Consciousness on topic proportion

Note. Environmental Consciousness on the horizontal axis; Topic proportion on the vertical axis.

Topic 2 and Topic 6, on the other hand, do not appear to be particularly affected by the level of firms' EC, showing a constant proportion in the corpus at both low and high levels of EC. The topics showing a negative relationship with EC (i.e., Topic 5, Topic 7, and Topic 8) are particularly associated with the social dimension of sustainable development (Figure III), while Topic 1 and Topic 6 mostly relate to the environmental dimension. Topic 2, Topic 3 and Topic 4, in turn, focus on the economic dimension of sustainable development.

Figure III: Impact of Sustainability Dimensions on topic proportion

Note. ENV = Environmental Dimension; ECO = Economic Dimension; SOC = Social Dimension.

Knowledge Areas Description

Based on these results, it is now possible to describe the eight knowledge areas of the energy SMEs sample. As shown in Table I, Topic 8, which reflects knowledge aspects related to business development and research activities, is the most represented knowledge area (16.38%), followed by Topic 7 (15.20%), which encompasses the customer and societal implications of business. The other topics are fairly evenly distributed, with expected proportions ranging from 9.82% for Topic 1 to 12.46% for Topic 2.

Label Theta Top Terms
Note. The table reports the expected proportion (i.e., Theta) within the research corpus and the most representative terms of each knowledge area. A label has been assigned to each area by inspecting its most relevant terms and documents.
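
As an illustration of how an expected proportion of this kind can be obtained, the Python sketch below averages document-level topic proportions over a toy corpus and ranks the topics. It assumes that the expected proportion of a topic is the mean of its θ values across documents; the three-document, three-topic matrix is invented and unrelated to the values in Table I.

```python
def expected_proportions(theta_rows):
    """Average each topic's proportion over all documents (column means)."""
    n_docs = len(theta_rows)
    n_topics = len(theta_rows[0])
    return [
        sum(row[k] for row in theta_rows) / n_docs
        for k in range(n_topics)
    ]


# Toy document-topic matrix: one row per document, each row sums to 1.
theta_rows = [
    [0.5, 0.2, 0.3],
    [0.1, 0.3, 0.6],
    [0.3, 0.1, 0.6],
]

expected = expected_proportions(theta_rows)

# Topic indices ordered from most to least represented in the toy corpus.
ranking = sorted(range(len(expected)), key=lambda k: expected[k], reverse=True)

for k in ranking:
    print(f"topic {k}: {expected[k]:.2%}")
```

Because every row sums to 1, the averaged proportions also sum to 1, which is why such values can be read as shares of the corpus.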

Further details on the description of each knowledge area can be found hereafter.


T1: Energy Alternatives

This area is associated with high levels of Environmental Consciousness and primarily focuses on environmental aspects of sustainable development. It encompasses a wide range of understandings referring both to fossil energy sources and their environmental impact (e.g., “principal driver climate change greenhouse effect causes vast quantities of carbon dioxide emit byproduct industrial process transportation electricity generation anthropogenic source modern measure reduce carbon emission atmospheric […]”; Doc. ID: 30,024) and to renewable and low-impact energy sources (e.g., “single fast efficient realistic alternative reach fossil free vehicle fleet phase fossil fuel quickly replace biofuel biofuel directly reduce emission fossil vehicle reduction lignol problem climate cost effective biofuel model operation achieve goal take year […]”; Doc. ID: 34,311).


T2: Policy & Regulations

The second area is not particularly influenced by the level of Environmental Consciousness and mostly refers to the economic sustainability implications of energy policies and regulations (e.g., “modification energy certification procedure building draft article financial incentive improvement energy efficiency obligation report prior nature energy rating sale rent property announcement mean owner party represent real estate miss modification affect explain today force june legal text derogate regulate basic procedure energy certification […]”; Doc. ID: 13,187), new responsible practices being imposed on firms (e.g., “royal decree oblige large company perform energy audits year gese conduct energy audits aim improve energy efficiency reduce energy consumption company […]”; Doc. ID: 12,898) and consumer protection policies (e.g., “accordance spanish law lopd personal datum protection energías renovable […]”; Doc. ID: 14,492).


T3: Energy Costs

The third area is positively associated with Environmental Consciousness and focuses on economic and environmental sustainability concerns emerging from energy pricing strategies and energy efficiency measures aimed at more efficient consumption (e.g., “install heat pump housing economical ecological indeed reduce cost energy renovation limit work later heat pump reduce energy consumption compare electric heating system […]”; Doc. ID: 11,166) and energy cost reduction for consumers (e.g., “good energy guarantee percentage gas electricity buy renewable source alternative fix price tariff […]”; Doc. ID: 26,557).


T4: Financial Investments

This area reflects a high Environmental Consciousness of firms and focuses on the economic sustainability aspects related to investment (e.g., “chief executive say it fantastic achievement reach financial close phase world s large offshore wind project week cop conclude glasgow today mark important early milestone delivery net zero acceleration programme […]”; Doc. ID: 21,426) and public financing initiatives (e.g., “addition exist national government grant scheme renewable energy […]”; Doc. ID: 28,592) aimed at building renewable energy infrastructure such as wind and solar farms.


T5: Green Product Design

The area includes a wide variety of content referring to B2C products and services showing an ecological (e.g., “economic form ecological cultivation increase production biological product […]”; Doc. ID: 2,218) or sustainable nuance (e.g., “neutral suitable biobase bag sustainable alternative plastic wood fiber base […]”; Doc. ID: 18,977). Although this area addresses issues of societal relevance, the STM results show that firms with high Environmental Consciousness tend not to cover this knowledge area.


T6: Energy Management

This area focuses on environmental sustainability aspects related to processes and technologies that enable efficient energy management (e.g., “solar produce distribute electricity intelligently possible depend time individual consumption situation build in energy management system decide energy need […]”; Doc. ID: 5,001) and storage (e.g., “system important prerequisite successful energy transition power storage power generation renewable energy […]”; Doc. ID: 3,486). As with T2, the proportion of T6 remains stable across the corpus, meaning that the prominence of this knowledge area does not appear to be affected by the level of Environmental Consciousness.


T7: Human Care

This area is strongly linked to issues of social relevance referring to Corporate Social Responsibility (e.g., “invest social responsibility initiative mean business involve local event community experience […]”; Doc. ID: 26,820), understood both as a focus on the well-being and growth of employees (e.g., “organisation industry discover attract retain right talent require excellent salary corporate wellbeing program quickly emerge ultimate employer differentiator invest workplace wellbeing […]”; Doc. ID: 21,966) and as support for the human communities in which organizations operate (e.g., “strategy instance claim care environment involve local sustainability drive clean community customer you re authentic credible build emotional connection base feeling […]”; Doc. ID: 20,934). However, this area is negatively associated with Environmental Consciousness, meaning that when Environmental Consciousness is high this topic is addressed less frequently.


T8: Research & Development

Negatively associated with Environmental Consciousness, this last area focuses on social and economic aspects, mainly concerning the technical innovation (e.g., “global ambition support utility world accelerate energy transition technology innovation […]”; Doc. ID: 19,919) and research & development (e.g., “engineering experience involve aspect energy research year include modelling datum analysis project management […]”; Doc. ID: 29,309) that organizations perform and bring to the market.