Stigmergic influence of simple bots on human cooperation in digital environments

Bassanetti, Thomas; Cezera, Stéphane; Delacroix, Maxime; Escobedo, Ramón; Blanchet, Adrien; Sire, Clément; Theraulaz, Guy

doi:10.1140/epjds/s13688-026-00653-2

Research
Open access
Published: 13 April 2026

Volume 15, article number 49 (2026)
EPJ Data Science

Thomas Bassanetti^1,2,
Stéphane Cezera^3,
Maxime Delacroix^1,
Ramón Escobedo^2,4,
Adrien Blanchet^3,5,
Clément Sire^1 &
Guy Theraulaz^2,5

Abstract

In the digital era, human cooperation is increasingly mediated by indirect social cues such as ratings, reviews, and other digital traces left in online environments. These traces often guide collective behavior via stigmergy, a coordination mechanism whereby individuals interact through modifications of a shared environment. In this study, we explore how simple model-driven bots can influence human cooperation or defection in a competitive rating game inspired by online marketplaces. Participants, unaware of the bots’ presence, interacted with either four human partners or four bots exhibiting predefined behaviors—cooperative, neutral, deceptive, or optimized for group performance. We show that the presence and behavior of bots significantly affect human strategies and performance. Higher levels of cooperation among bots improve human outcomes but also increase the frequency of deceptive human strategies, suggesting exploitation of reliable social information. Conversely, in less cooperative environments, participants adopt more collaborative or neutral behaviors to preserve informational value. By classifying individuals into three behavioral profiles—collaborators, neutrals, and defectors—we develop a linear regression model using three cues: the average value of rated cells, the diversity of rated cells, and the player’s rank. These cues allow accurate prediction of behavioral profile distributions across experimental conditions. An adaptive agent-based model further reproduces the empirical results. Our findings demonstrate that even simple bots can strongly influence collective dynamics in human groups. These insights have implications for the design of recommendation systems, the regulation of automated agents, and the understanding of cooperation and deception in digital societies.

1 Introduction

Human cooperation underlies the emergence of social norms, the functioning of institutions, and the production of collective goods. Yet cooperation in large and partly anonymous groups remains difficult to sustain, because individuals often lack reliable information about others’ intentions, reliability, or past behavior [1, 2]. In such contexts, cooperation depends on the flow of indirect social cues, such as public signals, behavioral traces, reputational markers, and patterns of past actions, that help individuals align expectations and coordinate strategies [3, 4]. These cues are frequently subtle or ambiguous, but collectively they shape perceptions of what is normal, desirable, or strategically advantageous within a group [5, 6].

A general mechanism that integrates these ideas is stigmergy. Originally introduced by Pierre-Paul Grassé to describe collective nest building in termites, stigmergy refers to coordination through persistent modifications of a shared environment [7]. Instead of communicating directly, individuals leave physical or chemical traces that subsequently influence the actions of others [8]. Although rooted in ethology, stigmergy has become a powerful conceptual framework for understanding how human groups self-organize in settings ranging from collaborative editing to online marketplaces and shared navigation systems [9, 10]. In these digital environments, behavioral traces such as ratings, likes, clicks, or comments act as informational cues and shape large-scale patterns of participation, reputation, and cooperation [11, 12].

The increasing digitization of social life has intensified the importance of such traces. Digital platforms algorithmically rank, amplify, and display signals that encode others’ behavior in real time [4, 11]. These traces often persist and accumulate, guiding how individuals infer quality, trustworthiness, or social norms [10]. However, a crucial transformation arises from the fact that digital traces are no longer generated solely by humans. Automated agents, the so-called social bots, now participate extensively in online environments and contribute to shaping these informational landscapes [13]. Bots act at machine timescales, with high consistency and persistence, and can therefore create, reinforce, or distort perceived behavioral patterns in ways that humans may misinterpret as genuine social signals.

Most research on bots has focused on their disruptive capacities. Bots have been shown to amplify misinformation, distort collective attention, and manipulate perceived consensus [14, 15]. Large-scale analyses reveal that bots disproportionately contribute to the early diffusion of low-credibility content, increasing human exposure and resharing of misleading information [14]. Modeling studies further demonstrate that bot-generated content competes with human information, sometimes dominating information ecosystems and altering public opinion trajectories [16]. This perspective frames bots primarily as sources of manipulation and risk in digital societies.

Yet a growing body of work suggests a more nuanced view. Under some conditions, bots can promote cooperation, enhance coordination, or stabilize prosocial behavior in hybrid human–agent populations. In networked public-goods experiments, even a single autonomous agent embedded in a human group can increase cooperation by reshaping local structural connections [17, 18]. Evolutionary game-theoretic studies show that simple “committed” or persistent agents can shift group behavior toward cooperative equilibria, even in one-shot anonymous interactions that otherwise favor defection [19, 20]. Extensions of these models reveal that bots with fixed behavioral strategies can promote cooperation across discrete, continuous, and mixed strategic frameworks [21]. Related work highlights how committed minorities, whether human or artificial, can accelerate norm formation, trigger coordination transitions, or overturn established conventions [20].

In organizational and virtual-team environments, bots acting through limited interaction rules can influence micro-level social processes. Studies of Slackbot, for example, show that automated agents can shape relational communication and affect social-emotional dynamics within professional teams [22]. In virtual-strategy contexts such as multi-agent gaming environments, bots endowed with simple heuristics or swarming algorithms can support exploration, optimize group strategy, or structure collective behavior [23]. These results collectively challenge the dominant view of bots as primarily malicious, suggesting instead that simple artificial agents can serve as stabilizers of cooperation or catalysts of collective intelligence.

Recent experimental evidence extends this perspective by demonstrating that human–bot co-play cannot be reduced to a binary opposition between prosocial preferences and confusion. In a large-scale study of one-shot prisoner’s dilemma games, humans responded positively to persistently cooperative “zealot” bots, yet cooperation declined sharply when participants learned that these bots could not derive material benefits, a pattern indicating that belief-driven expectations, perceived intentionality, and authenticity shape responses to artificial agents [24]. These findings demonstrate that even extremely simple bots with fixed strategies and incapable of reciprocity can reshape human decisions in systematic ways, not because they communicate directly, but because their consistent behavior alters how individuals interpret the social environment.

Despite these advances, little is known about how bots influence human behavior through stigmergic interactions alone. Existing experimental studies typically involve direct interactions, strategic reciprocity mechanisms, punishment or reward systems, or identity-based cues [3, 6]. Conversely, research on social bots in information ecosystems focuses on observational consequences such as manipulation, diffusion dynamics, and large-scale behavioral shifts, yet it does not examine the fine-grained decision processes that unfold within controlled groups [13, 14]. Far fewer studies examine situations in which bots exert influence purely by depositing behavioral traces in a shared environment, without communication, identity cues, or algorithmic sophistication.

Yet such situations are increasingly common. Many online platforms mediate cooperation through trace-based interfaces: rating systems, popularity signals, public histories of contributions, and algorithmically generated ranking cues. Users rarely know whether a given trace was produced by a human, a bot, or an automated system, yet these signals guide expectations, shape norms, and influence cooperation or defection. Understanding trace-mediated hybrid interactions is therefore essential for data science, platform governance, and the design of trustworthy human–AI systems.

Collective search tasks provide a clear case where stigmergic traces structure cooperative decision-making. In such tasks, individuals must explore uncertain environments while inferring useful information from the traces left by others [25]. Ratings or evaluations serve both as signals about the underlying environment and as social cues indicating how others have behaved. Prior research shows that visible contributions can increase cooperation by establishing expectations of norm compliance [6, 26], whereas misleading traces can erode trust and generate cascades of defection [10]. Bots embedded in these environments can systematically bias the informational landscape by persistently leaving cooperative, neutral, or deceptive signals. The resulting behavioral dynamics emerge from interactions between human inference processes, cumulative traces, and iterative decision-making.

In this study, we investigate how minimal bots, defined as simple artificial agents with predetermined behavioral profiles, modulate human cooperation in a collective decision-making task solely through the stigmergic traces they produce. Participants repeatedly engage in a grid-based search problem derived from public goods experiments [25]. They may leave ratings as visible traces, which serve as the only mechanism for coordination or influence. In hybrid conditions, participants unknowingly interact with bots programmed to provide informative (cooperative), misleading (deceptive), or uninformative (neutral) signals. Bots do not communicate, adapt, or engage in reciprocity. Their influence arises purely from the persistent traces they create in the shared environment. The bots are controlled by a behavioral model that was shown in [25] to faithfully reproduce the actions of human participants. In addition, this same type of model, together with the methodology used to define and analyze behavioral observables in [25], allows us to quantify, characterize, and interpret the actions and strategies of human participants interacting with bots.

This controlled setting enables systematic exploration of how artificial traces shape human expectations, behavioral strategies, and group-level outcomes. By comparing hybrid groups with fully human groups in which the proportions of cooperators, neutrals, and defectors are manipulated directly, we can assess whether human behavior responds differently to artificial versus human-generated traces and whether bots effectively behave as committed agents. Our analysis combines behavioral data, modeling of decision strategies, and statistical inference to characterize how different types of traces influence cooperation.

By situating our work at the intersection of behavioral experiments, human–AI interaction, and complex systems, this study makes three specific contributions. First, it isolates a form of human–agent influence that has received comparatively little direct experimental attention, namely, influence mediated solely by persistent traces in a shared environment, without communication, reciprocity, or identity cues. Second, it shows that minimal artificial agents do not simply improve or degrade coordination mechanically, by changing the quality of available information. They also reshape the strategic incentives faced by humans, thereby shifting the equilibrium distribution of collaborative, neutral, and deceptive behaviors. Third, by combining controlled experiments with interpretable behavioral and statistical models, our study identifies a small set of environmental cues that are sufficient to account for these shifts. In this sense, the bots are both controlled generators of stigmergic environments and experimentally tractable artificial agents whose traces can reorganize human collective dynamics.

2 Experimental setup

2.1 Description of the game

The experimental setup is based on a game developed in [25] to study how groups of individuals leave and use digital traces in a controlled environment. The game uses a 5-star rating system, similar to those used on online marketplaces and platforms, allowing users to rate products, services, or sellers and use these ratings to identify the best options.

The game is played in groups of five players, who simultaneously and independently explore a common $15 \times 15$ table containing 225 hidden values (Fig. 1A). The goal is to identify the cell with the highest value. Cells represent options, the hidden number indicating the intrinsic quality of the option. Values range from 0 to 99 and are randomly assigned to the cells (Fig. 1D) according to the distribution shown in Fig. 1E. Participants play in isolation, without access to the actions or screens of the others (Fig. 1B), and take turns interacting with the shared table through an online application specifically developed for the experiments (Fig. 1C).

The game progresses over 20 rounds, with each round requiring each of the 5 players to visit and rate 3 distinct cells. When visiting a cell, a player discovers its hidden value and must then rate it using a 5-star scale. A round ends when all group members have visited and rated their 3 cells. As the next round begins, the colors of the cells are updated based on the fraction of stars that have accumulated in each cell since the game began. This fraction is determined by dividing the number of stars a cell has received by the total number of stars across all cells. The color scale goes from white ($0\,\%$) to black ($100\,\%$) through a gradient of shades of red (Fig. 1F), highlighting cells that have accumulated the highest fraction of stars. This evolving color map acts as a long-term collective memory for the group.
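The coloring rule described above can be illustrated with a minimal Python sketch. The function name `star_fractions` and the dictionary layout are assumptions made for illustration; only the rule itself (stars in a cell divided by the total number of stars across all cells) comes from the text.

```python
def star_fractions(stars):
    """Return each cell's share of all stars awarded since the game began.

    `stars` maps a cell, given as (row, col), to its accumulated star count.
    """
    total = sum(stars.values())
    if total == 0:
        # No stars awarded yet: every cell stays at 0 % (white).
        return {cell: 0.0 for cell in stars}
    return {cell: count / total for cell, count in stars.items()}


# Hypothetical star counts for four cells of the 15 x 15 table.
stars = {(0, 0): 5, (0, 1): 3, (7, 7): 2, (14, 14): 0}
fractions = star_fractions(stars)
```

Because the fractions are normalized by the total number of stars, darkening one cell necessarily lightens the others, which is what lets the color map act as a relative, cumulative memory.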

The game includes a scoring system that determines the participants’ payouts and thus creates competition among them. Each round, a player’s score increases by the sum of the values of the 3 cells they visit, regardless of the ratings they assign to those cells, which encourages them to visit high-value cells. During a one-hour session, the five participants play 10 to 12 games, and their final score is accumulated over these games. At the end of the session, they are ranked according to their cumulative score and paid accordingly (20 €, 15 €, 10 €, 10 €, 10 € for players ranked from first to fifth place; see Materials and Methods for more details).
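As a rough illustration of the scoring and payout rule, the following Python sketch accumulates a score from the three visited cells and maps final ranks to payouts. The function names and the tie-breaking behavior are assumptions; the payout amounts are those stated in the text.

```python
# Payouts in euros for players ranked first to fifth, as stated in the text.
PAYOUTS_EUR = [20, 15, 10, 10, 10]


def update_score(score, visited_values):
    """Add the values of the 3 cells visited this round; ratings play no role."""
    if len(visited_values) != 3:
        raise ValueError("each player visits exactly 3 cells per round")
    return score + sum(visited_values)


def payouts(cumulative_scores):
    """Rank players by cumulative score and assign payouts.

    Ties are broken by insertion order here, which is an assumption:
    the text does not say how ties are handled.
    """
    ranked = sorted(cumulative_scores, key=cumulative_scores.get, reverse=True)
    return {player: PAYOUTS_EUR[i] for i, player in enumerate(ranked)}


# Hypothetical cumulative scores for one session of five players.
scores = {"A": 5210, "B": 4980, "C": 6105, "D": 3890, "E": 4400}
session_payouts = payouts(scores)
```

Since only the first two ranks earn more than the base amount, a player gains from visiting high-value cells but gains nothing directly from rating them honestly.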

2.2 Experimental conditions

In our previous study [25], three different behavioral profiles were observed among human participants: collaborators, who rated cells proportionally to their value, thus sharing informative signals to the group; defectors, who rated cells in inverse proportion to their value, thus misleading the other members of the group; and neutral players, whose ratings conveyed no useful information, because they rated all cells either with the same number of stars, randomly, or with an indiscernible pattern. A computational model was then designed that reproduces with high fidelity the behavior of human players during both exploration and rating phases, across the three strategic profiles described above. Based on this model, we constructed autonomous agents, or “bots”, that consistently behaved like human collaborators, defectors, or neutral players. These validated bots now enable the implementation of a controlled strategic environment for participants.

In the present experiment, groups of five participants were brought into the room and seated at individual workstations (Fig. 1B). To examine how human behavior is shaped by the strategic environment, we designed a series of experimental conditions in which each participant was in fact playing with four bots adopting one of the three predefined strategies: Col (collaborator), Def (defector), or Neu (neutral). The experimental setup and user interface remained unchanged across conditions, and participants were unaware of the true nature of their co-players. From their perspective, they were part of a group of five humans playing together.

Ten experimental conditions were considered, involving a total of 185 participants. In the first condition (70 participants), groups of five human participants actually played together, providing the baseline behavior for comparison. In the remaining nine conditions (115 participants), each human participant played in the same physical setup but alongside four bots. We considered the five possible mixtures of collaborator and defector bots: 4 Col – 0 Def, 3 Col – 1 Def, 2 Col – 2 Def, 1 Col – 3 Def, and 0 Col – 4 Def. Next, we considered three conditions where the four bots adopted the same neutral strategy, consistently giving the same rating irrespective of the value of the cell. Three different fixed ratings were used: one star (Neu-1), three stars (Neu-3), and five stars (Neu-5). Finally, in the tenth condition, participants played with four optimized bots (Opt) designed to maximize the group score when playing together [25].
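For readers who want to tabulate the design, the ten conditions can be enumerated in a few lines of Python. The label "5 humans" and the variable names are assumptions; the other labels and the participant counts follow the text.

```python
# Baseline condition: five humans playing together (label is illustrative).
conditions = ["5 humans"]

# Five mixtures of collaborator and defector bots, from 4 Col down to 0 Col.
conditions += [f"{n_col} Col – {4 - n_col} Def" for n_col in range(4, -1, -1)]

# Three neutral conditions with a fixed rating of 1, 3, or 5 stars.
conditions += [f"Neu-{fixed_stars}" for fixed_stars in (1, 3, 5)]

# Optimized bots.
conditions.append("Opt")

# Participant counts stated in the text.
participants = {"five-human condition": 70, "nine bot conditions": 115}
total_participants = sum(participants.values())
```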

3 Results

3.1 Impact of bots on participants’ performance

We begin by examining the mean normalized scores of both bots and human participants across each experimental condition (see Fig. 2). The results show substantial variability, highlighting how group composition, and thus bot behavior, can significantly affect human performance. Notably, there is a strong positive correlation between bot and human scores, indicating that higher bot scores are associated with higher participant scores.

A detailed analysis of the mean normalized scores reveals a clear trend: groups with more collaborator bots tend to achieve better performance. For example, the scenario with four collaborator bots (4 Col – 0 Def) resulted in $\langle S \rangle = 0.56$, while the scenario with four defector bots (0 Col – 4 Def) resulted in $\langle S \rangle = 0.31$. Experiments with neutral bots, regardless of their fixed ratings (Neu-1, Neu-3, Neu-5), yielded tightly clustered mean normalized scores around 0.43. The experiment with optimized bots (Opt) achieved a mean normalized score of 0.48, the second-highest overall. Interestingly, the condition with five human participants achieved a mean normalized score of 0.40, placing it among the lower-performing groups.

To further analyze participant behavior, Figs. 3, 11, and 12 present key observables characterizing the visits and ratings of human participants in the group: $q(t)$, $Q(t)$, $p(t)$, and $P(t)$, which represent the instantaneous (at round t) and cumulative (up to round t) visit and rating performance of the human participants. Additionally, $V_{1}(t)$, $V_{2}(t)$, and $V_{3}(t)$ indicate the average value of the first-, second-, and third-best cells visited by humans at round t, while $B_{1}(t)$, $B_{2}(t)$, and $B_{3}(t)$ quantify the probability that human participants revisit the first-, second-, and third-best cell from the previous round. Full definitions are provided in the Materials and Methods (see Sect. 5.4).

In scenarios with both collaborator and defector bots, human participants achieved higher normalized scores when more collaborator bots were present (Fig. 3A). Participants in these settings tended to open higher-value cells (Fig. 3B, 3C, 3F, 3G, 3H) and assign them higher ratings (Fig. 3D, 3E). They also showed a greater tendency to revisit previously explored high-value cells (Fig. 3I, 3J, 3K), particularly in more collaborative environments.

In contrast to the diversity of performance observed in the experiments with collaborator and defector bots, the three experiments with neutral bots showed remarkable similarity in performance metrics (Fig. 11), suggesting that the three fixed ratings used by neutral bots had a similar influence on human behavior. Note that since neutral bots tend to revisit and rate the cells with the best scores, their effective impact in the long run is somewhat similar to that of collaborators. In fact, performance in these scenarios was analogous to conditions with two collaborator bots and two defector bots. The optimized bots (Fig. 12) led to participant behavior similar to the scenario with three collaborator bots and one defector bot. Finally, the experiment with five humans mirrored the scenario with one collaborator bot and three defector bots, consistent with the findings of [25].

Figure 4 shows the probability of human participants finding the highest-value cells in the different experiments. Similar to the mean normalized scores, the probability of finding high-value cells increased with the level of cooperation in the group. Notably, the five-human condition ranked among the lowest in this metric.

Human participants consistently outperformed bots across all conditions (Fig. 5). However, their average rank varied across experimental conditions. In groups with fewer defector bots, participants ranked higher, likely due to improved quality of social information in the trace. Neutral bot scenarios again produced near-identical rank distributions. In the optimized bot condition, most participants (66%) ranked first, though a notable portion (19%) ranked last, suggesting that some participants failed to interpret the optimized bot strategy effectively.

Bots’ relatively poor performance compared to humans can be attributed to their inability to adapt to their teammates (3 bots and 1 human) and to the colored table. In contrast, human participants could observe natural cues and adapt their strategies accordingly, as discussed below.

As introduced in [25], participant behavior was classified into three profiles: collaborators (whose average ratings increase with cell value), defectors (ratings decrease with value), and neutrals (ratings independent of value). Figure 6 and Table 3 show the proportions of human participants exhibiting these behaviors across experiments. In conditions with varying proportions of collaborator and defector bots, an increase in collaborator bots correlated with more deceptive behavior among humans, suggesting that deception is advantageous in cooperative groups, as individuals can exploit shared information while misleading the others. However, in less cooperative groups, deceptive strategies were less effective, and humans tended toward more collaborative or neutral behaviors. This also suggests that when faced with exceptionally low-quality information (e.g., with many defector bots), individuals turn to collaboration or a neutral behavior to leave some high-quality information for themselves. Experiments with optimized bots had the highest proportion of collaborative behavior among participants. The experiment with five humans showed similar behavioral proportions to the scenario with four collaborating bots.
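The three-way classification described above can be sketched in Python as a test on the sign of the slope of a player's ratings against cell values. This is only an illustration: the least-squares slope and the tolerance `tol` are assumptions, and the actual classification criterion is the one defined in [25].

```python
def rating_slope(values, ratings):
    """Least-squares slope of star ratings against cell values."""
    n = len(values)
    mean_v = sum(values) / n
    mean_r = sum(ratings) / n
    var_v = sum((v - mean_v) ** 2 for v in values)
    if var_v == 0:
        return 0.0  # all visited cells have the same value: no trend measurable
    cov = sum((v - mean_v) * (r - mean_r) for v, r in zip(values, ratings))
    return cov / var_v


def classify(values, ratings, tol=0.01):
    """Label a player from the trend of their ratings.

    `tol` (stars per unit of cell value) is a hypothetical threshold,
    not a value taken from the study.
    """
    slope = rating_slope(values, ratings)
    if slope > tol:
        return "collaborator"  # ratings increase with value
    if slope < -tol:
        return "defector"      # ratings decrease with value
    return "neutral"           # ratings independent of value


profile_a = classify([10, 50, 90], [0, 3, 5])
profile_b = classify([10, 50, 90], [5, 3, 0])
profile_c = classify([10, 50, 90], [3, 3, 3])
```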

3.2 Model of the visit and rating strategies

We now examine the agent-based model, introduced in [25] and detailed in Materials and Methods, to simulate human behaviors across the ten experimental conditions. These agents mimicking human behaviors, referred to as Mimic agents, have a strategy that is divided into two parts: the visit strategy and the rating strategy.

Each participant’s behavioral profile—whether collaborator, neutral, or defector—is associated with a distinct rating strategy, which we found to be consistent across experiments (see Fig. 14). This consistency implies that a single set of rating strategies can represent human participants across all nine bot conditions, removing the need to define separate strategies for each experiment.

For simplification, the probabilities of rating cells with 1 to 4 stars were grouped together, while those for 0 and 5 stars were modeled using a sigmoid function (see Eq. (6)) for collaborators and defectors, and using a linear function (see Eq. (7)) for neutrals. Moreover, the probabilities of rating a cell of value V with 1, 2, 3, or 4 stars are all equal and given by $P_{1234}(V) = [1 - P_{0}(V) - P_{5}(V)] / 4$. These resulting probabilities are shown in Fig. 7, with parameter values detailed in Table 5.
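The construction of the six rating probabilities from $P_{0}(V)$ and $P_{5}(V)$ can be sketched in Python as follows. Only the relation $P_{1234}(V) = [1 - P_{0}(V) - P_{5}(V)] / 4$ comes from the text; the sigmoid parametrization and the numerical parameters in `collaborator_extremes` are hypothetical stand-ins for Eq. (6) and Table 5, which are not reproduced here.

```python
import math
import random


def sigmoid(v, midpoint, width):
    """Generic logistic curve (assumed form; Eq. (6) may be parametrized differently)."""
    return 1.0 / (1.0 + math.exp(-(v - midpoint) / width))


def rating_probabilities(p0, p5):
    """Return [P(0 stars), P(1), P(2), P(3), P(4), P(5 stars)].

    The four intermediate ratings share the remaining probability equally.
    """
    if p0 < 0 or p5 < 0 or p0 + p5 > 1:
        raise ValueError("p0 and p5 must be non-negative and sum to at most 1")
    p_mid = (1.0 - p0 - p5) / 4.0
    return [p0, p_mid, p_mid, p_mid, p_mid, p5]


def collaborator_extremes(value):
    """Hypothetical collaborator curves: 5 stars likely for high values, 0 for low."""
    p5 = 0.8 * sigmoid(value, midpoint=70.0, width=10.0)
    p0 = 0.8 * (1.0 - sigmoid(value, midpoint=30.0, width=10.0))
    return p0, p5


def sample_rating(value, rng):
    """Draw a rating between 0 and 5 stars for a cell of the given value."""
    probs = rating_probabilities(*collaborator_extremes(value))
    return rng.choices(range(6), weights=probs, k=1)[0]


rng = random.Random(0)
rating = sample_rating(85, rng)
```

A defector profile would be obtained under the same assumptions by swapping the roles of the two curves, so that low-value cells attract 5 stars and high-value cells attract 0 stars.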

Simulations were then run with one Mimic agent playing with four bots. In each simulation game, the Mimic agent’s behavioral profile (i.e., rating strategy) was randomly set according to the observed fractions in the corresponding experiment. The visit strategy was then derived by minimizing the error between experimental and simulated observables (see Eq. (8)). The resulting parameters are presented in Table 4. Notably, since the observables are almost identical in the three neutral bot experiments (see Fig. 11), the same visit strategy was used for these three situations.

Analyzing the threshold parameters $a_{1}$, $a_{2}$, and $a_{3}$ in experiments with collaborator and defector bots provides insights. In particular, as the number of defector bots increases, participants begin revisiting cells from the previous round at lower thresholds—indicating a tendency to settle for lower-value cells rather than continuing to explore for higher ones. In addition, across all nine bot experiments, the parameter α is positively correlated with conditions that lead to high-quality social information. A higher α corresponds to a stronger preference for visiting highly marked (i.e., dark-colored) cells, whereas a lower α leads to a more uniformly distributed selection among the marked cells (see Eq. (4)).

Hence, in the conditions where the social information is trustworthy (i.e., where dark cells correspond to higher values than light cells), human participants consistently give more weight to the cell colors on the table. This indicates that participants are well aware of the degree of collaboration of the four other members of their group and adapt their visit and rating strategies accordingly. This adaptation also explains why participants achieve, on average, better scores than the bots: humans adjust their strategies to group behavior and table properties, whereas bots follow a fixed strategy. This analysis thus lays the foundation for designing bots that are able to adapt to their environment.
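The qualitative role of α can be illustrated with a short Python sketch in which the probability of visiting a marked cell is proportional to its star fraction raised to the power α. This power-law form is an assumption chosen because it reproduces the two limits described above (uniform choice among marked cells for low α, strong preference for dark cells for high α); it is not claimed to be the paper's Eq. (4).

```python
def visit_probabilities(fractions, alpha):
    """Probability of choosing each marked cell, with weight = fraction ** alpha.

    `fractions` maps cells to their share of all stars; unmarked cells
    (fraction 0) are excluded. The functional form is an assumption.
    """
    marked = {cell: f for cell, f in fractions.items() if f > 0}
    if not marked:
        return {}
    weights = {cell: f ** alpha for cell, f in marked.items()}
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


# Hypothetical star fractions for four cells, one of them unmarked.
fractions = {"a": 0.6, "b": 0.3, "c": 0.1, "d": 0.0}
uniform_choice = visit_probabilities(fractions, alpha=0)
sharp_choice = visit_probabilities(fractions, alpha=2)
```

With α = 0 the three marked cells are equally likely, whereas with α = 2 the darkest cell receives most of the probability mass.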

Overall, the simulation results shown in Figs. 2–5 and 11–13 indicate that the model reproduces experimental behavior accurately and offers a faithful representation of participant strategies.

3.3 Predicting the behavioral profiles of human participants

3.3.1 Cues available to human participants

In the previous section, we developed a model for understanding human behavior across experimental conditions. However, this model does not explicitly predict or explain the distribution of observed behavioral profiles within each condition. The purpose of the following analysis is therefore to identify which cues available in the stigmergic environment are sufficient to explain how human participants adjust their strategic profile across conditions. This step is essential because it links the informational structure generated by bots to the observed redistribution of human behaviors. We thus examine more closely the factors that influence an individual’s decision to engage in cooperative, neutral, or deceptive behavior in response to the actions of others, which shape a specific social information context (the colored table).

Figure 6A illustrates that as the number of collaborating bots in the group increases, human participants tend to engage in deceptive behavior more frequently. This suggests that a highly collaborative environment may paradoxically encourage deception. The α parameter discussed earlier strongly suggests that participants have a discernible perception of the trustworthiness of the colored table and, by extension, the qualitative degree of cooperation among other group members. But beyond this perception, what other factors influence individual behavioral choices?

We now introduce several natural qualitative cues available to human participants to evaluate their environment and the properties of social information. First, individuals can judge whether highly colored cells correspond to high- or low-value cells. This evaluation is encapsulated in the observable $P(t)$ (formally defined in Sect. 5.4), which represents the average value of colored cells weighted by their respective evaluations since the start of the game. Figures 8A–D show significant variation in $P(t)$ across different experimental conditions. In scenarios with collaborator bots, ratings are predominantly aligned with high-value cells, while in cases with defector bots, ratings tend to favor low-value cells. In the experiment with optimized bots (Fig. 8C), ratings focus on cells with very high values. The experiments with neutral bots (Fig. 8B) and with five humans (Fig. 8D) are similar to the mixed condition with two collaborator and two defector bots. Therefore, $P(t)$, which can be qualitatively evaluated by players (especially in the later rounds) by visiting cells that have been visited and rated by others, provides a reliable indication of the level of cooperation of other group members. However, the experiments with optimized bots and with four collaborating bots both have high $P(t)$ values yet show different fractions of collaborators. Therefore, $P(t)$ may not be the only cue that determines behavior. In the following sections, we use regression models that incorporate $P(20)$ as a quantifier of collaboration.
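As a rough sketch of this first cue, the Python function below computes the average cell value weighted by the stars each cell has received. It follows the verbal description only; the formal definition in Sect. 5.4 is not reproduced here and may include a normalization (for example by the maximum value 99), so the exact form is an assumption.

```python
def weighted_mean_value(stars, values):
    """Average value of rated cells, weighted by their accumulated stars.

    `stars` and `values` map the same cells to star counts and hidden values.
    Returns None when no stars have been awarded yet.
    """
    total = sum(stars.values())
    if total == 0:
        return None
    return sum(stars[cell] * values[cell] for cell in stars) / total


# Hypothetical example: most stars sit on the high-value cell.
values = {"a": 90, "b": 10}
informative_trace = weighted_mean_value({"a": 3, "b": 1}, values)
misleading_trace = weighted_mean_value({"a": 1, "b": 3}, values)
```

In this toy example the informative trace gives 70 and the misleading trace gives 30, mirroring the contrast between collaborator-rich and defector-rich environments.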

Another accessible cue is the effective number of distinct cells that have been rated. While a large number of colored cells may be useful when they correspond to high-value cells, excessive dispersion can obscure social information and may also signal the presence of defectors. This dispersion is quantified by the Inverse Participation Ratio of $\mathbf{P}(t)$, $\mathrm{IPR}(\mathbf{P}(t))$ (see Sect. 5.4 for a formal definition), which effectively provides a measure of the apparent complexity of the colored map. In the experiments with collaborator and defector bots (see Fig. 8E), $\mathrm{IPR}(\mathbf{P}(t))$ varies widely: the more defector bots present, the higher the IPR. When many collaborators are present, the IPR is low, reflecting more focused evaluations. Figure 8G shows that the IPR is especially low with optimized bots because, unlike the collaborator bots, these bots rate only the cells with very high values. In subsequent sections, we define regression models that include $\mathrm{IPR}(\mathbf{P}(20))$, the IPR at the end of a game, as a measure of the effective number of rated cells.
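The notion of an effective number of rated cells can be made concrete with a short Python sketch. It assumes that the IPR is computed as $1 / \sum_i f_i^2$, where $f_i$ is the fraction of stars in cell $i$; this form is consistent with the description above (higher values for more dispersed ratings), but the formal definition is the one given in Sect. 5.4.

```python
def inverse_participation_ratio(fractions):
    """Effective number of rated cells: 1 / sum of squared star fractions.

    `fractions` is an iterable of non-negative shares that sum to 1.
    The formula is an assumed reading of the definition in Sect. 5.4.
    """
    sum_of_squares = sum(f * f for f in fractions)
    if sum_of_squares == 0:
        raise ValueError("no stars have been awarded yet")
    return 1.0 / sum_of_squares


# All stars on a single cell: one effective cell.
focused = inverse_participation_ratio([1.0, 0.0, 0.0, 0.0])

# Stars spread evenly over four cells: four effective cells.
dispersed = inverse_participation_ratio([0.25, 0.25, 0.25, 0.25])
```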

Finally, another natural cue available to human participants is their rank among the five players in the group at the end of each game (see Fig. 5; recall that each participant played around a dozen games during a one-hour session). This rank was explicitly displayed by the user interface at the end of each game. A low (i.e., good) rank signals an effective strategy, while a high (i.e., bad) rank suggests room for improvement and may prompt players to reconsider and adjust their behavior.

3.3.2 Linear model for predicting individual behavioral profiles

We now introduce linear regression models to predict the distribution of behavioral profiles across conditions from three cues that are directly available to participants during the game: the average value of rated cells at the end of the game, $P(20)$, the effective diversity of rated cells, $\mathrm{IPR}(\mathbf{P}(20))$, and the player’s rank (see Sect. 5.4 for a detailed definition of these observables). $P(20)$ captures the average informational value of the public trace and the degree of collaboration of the players, as it effectively measures a mean correlation between the value of the cells and the total number of stars received by each of them. $\mathrm{IPR}(\mathbf{P}(20))$ captures the spatial dispersion of the distribution of stars, and hence the apparent complexity of the colored table. Finally, the rank captures the participant’s recent success, which can motivate them to keep or change their strategy. These observables are not introduced as purely technical quantities but as operational proxies for the cues that participants can plausibly extract from the shared environment during repeated play.

The linear regression models presented below are not intended as cognitive models of decision-making in a strong sense. Their role is to test whether a small number of observable stigmergic cues is sufficient to account for the systematic variation in strategic behavior across bot environments and thereby to connect the informational landscape generated by bots to the systematic shifts in the distribution of behavioral profiles measured in the experiments. In this sense, the regression models act as an explanatory bridge between stigmergic cues and strategic adaptation. The full methodology is described in the Materials and Methods section.

We find that using only a single feature ($P(20)$, or $\mathrm{IPR}(\mathbf{P}(20))$, or rank) in a regression model does not provide strong predictive power for the fractions of the three behavioral profiles. Nevertheless, even these simple one-feature models reveal that $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$ are the best individual predictors, with similar correlations to the data, while rank is less predictive on its own.

We then consider a linear model incorporating the two main features, $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$, hereafter referred to as the “PI model” (see Fig. 9A). Fitting this model to the data resulted in a prediction error of $E = 0.204$ (where the error is defined in Materials and Methods) and a coefficient of determination $R^{2} = 0.45$. As shown in Table 6, the regression parameters for collaborators indicate that both features contribute almost equally to the prediction. For defectors, however, $\mathrm{IPR}(\mathbf{P}(20))$ has a stronger correlation with the observed data than $P(20)$. This is confirmed by the partial $R^{2}$ obtained by dropping each cue, $R^{2}_{\mathrm{P}} = 0.32$ and $R^{2}_{\mathrm{I}} = 0.39$, showing that $\mathrm{IPR}(\mathbf{P}(20))$ removes slightly more variance than $P(20)$.

Finally, we have also examined a linear model that includes all three features: $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and rank, hereafter referred to as the “PIR model” (see Fig. 9B). This model reduces the prediction error to $E = 0.159$ and leads to a significantly higher $R^{2} = 0.76$. This is at the cost of introducing two additional regression coefficients, for a total of six parameters fitted to twenty independent data points. Analysis of the regression coefficients (see Table 6) confirms that the rank contributes less than the other two features, as already suggested by the single-feature regressions. This is also confirmed by the respective partial $R^{2}$, $R^{2}_{\mathrm{P}} = 0.71$, $R^{2}_{\mathrm{I}} = 0.74$, and $R^{2}_{\mathrm{R}} = 0.57$, showing that $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$ have a similar impact in reducing the residual variance, higher than that of the rank.

It is worth noting that in both models described above, the quantities $P(t)$ and $\mathrm{IPR}(\mathbf{P}(t))$ are evaluated at the end of the game (i.e., at the final round, $t=20$). Interestingly, considering the midpoint of the game ($t=10$) yields similar, though slightly higher, prediction errors ($E=0.211$ for the model with $P(10)$ and $\mathrm{IPR}(\mathbf{P}(10))$ and $E = 0.186$ for the model including rank as well).

In addition, we tested alternative linear models that included other features (e.g., fidelity F or other qualitative markers of cooperation). However, the three cues/features ($P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and rank) consistently demonstrated the highest explanatory power.

3.3.3 Interpretation of the model

Let us now examine the actual equations used to predict the fractions of collaborators, neutrals, and defectors in the PI and PIR models.

For the PI model, the fractions are given by

$$ \textstyle\begin{cases} C_{\text{pred}} = \mu _{C} + 0.13 \, \hat{P} + 0.13 \, \hat{I}, \\ N_{\text{pred}} = \mu _{N} - 0.04 \, \hat{P} + 0.00 \, \hat{I}, \\ D_{\text{pred}} = \mu _{D} - 0.09 \, \hat{P} - 0.13 \, \hat{I}, \end{cases} $$

(1)

where P̂ and Î are the standardized values of $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$, respectively, and $\mu _{C} = 0.25$, $\mu _{N} = 0.45$, and $\mu _{D} = 0.30$ are the mean fractions of collaborators, neutrals, and defectors observed in all experiments.

According to Eq. (1) and Fig. 15, in the PI model, an increase in $P(20)$ is associated with a higher fraction of collaborators and lower fractions of both neutrals and defectors. The parameter $\mathrm{IPR}(\mathbf{P}(20))$ has no effect on the neutral profile but contributes positively to the collaborator fraction and negatively to the defector fraction.

For the PIR model, the predicted fractions take the form

$$ \textstyle\begin{cases} C_{\text{pred}} = \mu _{C} + 0.28 \, \hat{P} + 0.25 \, \hat{I} + 0.09 \, \hat{R}, \\ N_{\text{pred}} = \mu _{N} - 0.11 \, \hat{P} - 0.05 \, \hat{I} - 0.04 \, \hat{R}, \\ D_{\text{pred}} = \mu _{D} - 0.17 \, \hat{P} - 0.20 \, \hat{I} - 0.05 \, \hat{R}, \end{cases} $$

(2)

where P̂, Î, and R̂ are the standardized values of $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and the rank, respectively. The values of $\mu _{C}$, $\mu _{N}$, and $\mu _{D}$ remain the same as in the PI model.

The effects of $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$ remain qualitatively similar to those in the PI model: increases in either parameter are associated with more collaborators and fewer defectors, while the effect on neutrals is comparatively weaker. Finally, higher ranks (i.e., worse performance) are associated with an increase in the proportion of collaborators and a decrease in both neutrals and defectors, as indicated by the signs of the $\hat{R}$ coefficients in Eq. (2).
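As a numerical illustration of Eqs. (1) and (2), the sketch below evaluates the predicted fractions for given standardized cues. The coefficients and means are copied from the equations above; the dictionary and function names are illustrative and not taken from the original analysis code.

```python
# Mean fractions and regression coefficients as printed in Eqs. (1) and (2).
MEANS = {"C": 0.25, "N": 0.45, "D": 0.30}
PI_COEFFS = {"C": (0.13, 0.13), "N": (-0.04, 0.00), "D": (-0.09, -0.13)}
PIR_COEFFS = {
    "C": (0.28, 0.25, 0.09),
    "N": (-0.11, -0.05, -0.04),
    "D": (-0.17, -0.20, -0.05),
}

def predict_fractions(cues, coeffs):
    """Return predicted fractions of collaborators, neutrals, and defectors.

    cues   -- standardized cue values, ordered (P_hat, I_hat[, R_hat])
    coeffs -- PI_COEFFS (two cues) or PIR_COEFFS (three cues)
    """
    if len(cues) != len(coeffs["C"]):
        raise ValueError("number of cues does not match the chosen model")
    return {
        k: MEANS[k] + sum(w * x for w, x in zip(coeffs[k], cues))
        for k in MEANS
    }

# Example: a rank one standard deviation above average (i.e., worse),
# with the other two cues at their mean values.
example = predict_fractions((0.0, 0.0, 1.0), PIR_COEFFS)
```

Because the coefficients of each cue sum to zero across the three profiles, the predicted fractions always sum to one. In the example, a worse-than-average rank raises the predicted collaborator fraction and lowers the other two.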

3.3.4 Adaptive model

We then ran simulations with agents whose visit and rating strategies followed the previously described behavioral model (see Sect. 5.2), but with an additional mechanism allowing their behavioral profiles to evolve between games based on the linear model predicting the fractions of each behavioral profile. Our objective is to test whether the experimentally observed dynamics can be reproduced by a closed loop in which agents first read a small set of stigmergic cues from the environment and then update their strategic profile accordingly.

In this approach, after each game, Mimic agents assess the values of $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$ from the previous game and randomly adjust their behavior according to the probabilities derived from the PI model (see Eq. (1)). Specifically, in experiments where a single human participant plays alongside four bots, only the Mimic agent’s profile is updated. In contrast, in experiments involving five human participants, all five Mimic agents adapt their profiles.
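The update step can be sketched as follows. This is only an illustration under stated assumptions: the cues passed in are assumed to be already standardized, and because the text does not specify how predicted fractions outside $[0, 1]$ are handled, the sketch clips negative values to zero and renormalizes. The function names are hypothetical.

```python
import random

def pi_probabilities(p_hat, i_hat):
    """Profile probabilities from Eq. (1), clipped and renormalized (assumption)."""
    raw = {
        "C": 0.25 + 0.13 * p_hat + 0.13 * i_hat,
        "N": 0.45 - 0.04 * p_hat + 0.00 * i_hat,
        "D": 0.30 - 0.09 * p_hat - 0.13 * i_hat,
    }
    # The raw values sum to one, so after clipping the total is at least one.
    clipped = {k: max(v, 0.0) for k, v in raw.items()}
    total = sum(clipped.values())
    return {k: v / total for k, v in clipped.items()}

def update_profile(p_hat, i_hat, rng):
    """Draw the agent's profile for the next game from the PI probabilities."""
    probs = pi_probabilities(p_hat, i_hat)
    profiles = list(probs)
    weights = [probs[k] for k in profiles]
    return rng.choices(profiles, weights=weights, k=1)[0]

# Example with a seeded generator for reproducibility.
rng = random.Random(0)
next_profile = update_profile(0.5, -0.2, rng)
```

Repeating this draw after every game, with the cues recomputed from the game just played, yields the closed loop described above.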

This adaptive process leads to stable equilibrium fractions of each behavioral profile, regardless of the initial conditions of the simulation. Notably, the resulting fractions (see Fig. 6) closely align with the predictions of the PI model, with the exception of the experiment involving optimized agents, where the model’s predictive accuracy is lower (see Fig. 9A).

Figures 2, 4, 5, and 3, 11, 12, 13 show that the main observables capturing individual and collective dynamics, although not perfect, follow the same trends as the experimental data. For instance, as shown in Fig. 2, the mean score increases with the level of cooperation within the group.

4 Discussion

In this work, we have investigated how simple autonomous bots can influence human cooperation and defection in a group setting through digital traces. Human groups frequently coordinate by reacting not only to each other’s actions but also to digital traces left in shared environments, such as scores, votes, or reputational signals. This stigmergic process, whereby individuals respond to persistent environmental markers rather than direct interactions, has been widely observed in online settings [9, 10]. Digital traces, ranging from user reviews to real-time indicators like trending tags or scores, serve as public signals that guide group behavior and facilitate large-scale coordination without centralized control [4, 27, 28].

Through controlled experiments, we have examined whether bots using only such indirect signals can effectively promote either cooperative or defector behavior in human groups. The bots are controlled by a behavioral model introduced in [25], which was shown to provide a faithful representation of the visit and rating strategies of human participants, with interpretable parameters. In addition, the model and the methodology of [25] are used to quantify, characterize, and interpret how participants, interacting with bots and unaware of their artificial nature, changed their own behaviors depending on whether the bots followed cooperative or deceptive strategies. This setup allows us to investigate how social information, conveyed through deposited traces, influences participants to adopt cooperative or defective behaviors in these competitive scenarios. In that sense, the study isolates a minimal yet pervasive mechanism in which artificial agents influence behavior solely through stigmergic traces, without direct interaction or explicit signaling.

One of the key findings is a shift in behavioral dynamics depending on the proportion of cooperative bots. When more bots displayed cooperative behavior, human participants tended to adopt more defector-like strategies. This suggests that some players exploited the cooperative environment by manipulating shared information, i.e., the traces left in the system, thereby increasing their individual performance.

We found that clear behavioral patterns emerge depending on the cooperative level of the group. In highly cooperative conditions, participants tended to rate high-value cells and left fewer ratings overall, resulting in clearer, more informative traces. In contrast, in less cooperative groups, more cells were rated, but these tended to have lower or intermediate values, degrading the quality and usefulness of the shared information and therefore weakening the coordination.

Our results also reveal that optimized bots can successfully shift participants’ behavior toward more cooperation. Bots that consistently contributed to the public good nudged human participants to increase their own contributions. The stigmergic signals left by those bots created a normative pattern that humans seemed to follow. These results support earlier studies on social influence showing that individuals tend to match the observable behavior of others, even in the absence of direct interaction [29–31].

Our findings also highlight the consistent superiority of human participants over bots using fixed strategies. Regardless of overall group cooperation, humans performed better—particularly as the bots became more cooperative—by identifying high-value cells more effectively. This highlights the strong influence of group composition on both collective dynamics and individual outcomes. Humans outperformed bots by flexibly adapting their visiting and rating strategies in response to bot behavior, which in turn shaped the social information shared via colored cells on the table. This adaptive flexibility gave human players a distinct advantage over the bots’ fixed strategies, which could not respond to the changing game environment.

Our analysis suggests that participants’ behavior was primarily shaped by three key cues: the perceived benefits of cooperation, the clarity and distinctiveness of social information, and the personal performance relative to others. Cooperation was inferred from whether colored cells corresponded to high-value targets; social information was evaluated based on the diversity and coverage of rated cells; and personal performance was gauged through participants’ relative ranking within the group.

To better understand how participants used these cues in their decisions, we developed a linear model incorporating the three primary factors. The model not only accurately predicted the observed proportions of each behavioral profile but also provided a quantitative framework for understanding how social cues are integrated under varying conditions. Notably, the model shows that the value and number of colored cells have a stronger influence on participant behavior than their ranking. Higher average values of colored cells and broader coverage across the table are associated with increased cooperation and reduced defection. This suggests that the richness and reliability of shared information may carry more weight than competitive motives. Finally, the model revealed that players’ rankings also shaped their strategic decisions. Those with higher (worse) ranks were more likely to adopt cooperative behaviors, while those with lower (better) ranks tended to shift toward neutral or defector profiles. This suggests that players adjusted their strategies in an attempt to optimize their relative position, potentially moving away from cooperation when their performance was already strong.

We then embedded this model into our behavioral framework, enabling agents to dynamically adapt their behavior in subsequent games based on the cues identified. This integration allowed Mimic agents to update their strategy probabilistically, creating simulations that closely mirrored the dynamics observed in human groups.

Despite its insights, our study has some limitations. The experiments were conducted in a highly controlled digital environment in which bots followed fixed strategies and participants were unaware of their presence. This design was necessary to isolate the stigmergic mechanism of interest, but it does not capture several important features of real platforms, including persistent identities, explicit knowledge that some agents may be artificial, network structure, or adaptive bots capable of strategically distorting credibility. Participants were anonymous and had no long-term identity, which differs from how people behave in persistent digital communities. Prior studies have shown that persistent identities and reputational feedback are crucial for sustaining cooperation and social accountability online [32–34]. Moreover, the bots employed in this study followed static strategies and did not adapt to human behavior. In real platforms, bots may learn, evolve, or coordinate in more complex ways [35, 36]. Future research should investigate how bots interact with network structures, identity cues, or mixed strategies. Another open question is whether bot influence changes when users are explicitly aware that some agents are artificial. Recent findings suggest that it may not: even when participants are explicitly informed that they are interacting with bots, strong herding behavior persists in an online minority game [37]. Finally, while our study highlights the power of stigmergic cues, it does not disentangle how much influence comes from social comparison, norm internalization, or risk aversion. Understanding these mechanisms would improve the ethical deployment of bots and help design digital spaces that promote sustainable cooperation. Our results should thus be understood as establishing a proof-of-principle: even simple, behaviorally consistent artificial traces can be sufficient to reshape human strategic adaptation. 
Extending this framework to more realistic settings remains an important direction for future research.

In conclusion, our study reveals that even simple bots can shift human behavior, showing that human decision-making is highly sensitive to observed regularities. This has implications for online environments such as social media, recommendation systems, or collaborative tools where algorithmic decisions act as digital traces that influence user behavior. By reinforcing certain patterns, bots can steer human groups toward new equilibria. These results are consistent with theoretical findings showing that even a small number of bots can nudge human groups toward new behavioral equilibria through visible and consistent actions in online environments [19, 20]. When embedded in stigmergic environments, i.e., spaces where all actions leave public traces, bots can shape the future path of collective behavior. This provides a useful framework for designing bots in online communities, but it also calls for strong oversight, transparency, and ethical guidelines, especially when bots are introduced in sensitive domains like politics, education, or finance. Overall, our study opens an exciting path for interdisciplinary research on how human groups can be shaped, for better or worse, by artificial agents embedded in stigmergic systems.

5 Materials and methods

5.1 Experimental procedure

In total, the study involved 185 participants, 70 in the all-human baseline condition and 115 in the hybrid human–bot conditions. Upon entering the experimental room, participants first signed a consent form. They were then informed about the rules of the experiment, the payment conditions, and the guarantee of anonymity. The instructions were delivered both orally and through a short sequence of slides. Screenshots of these slides along with the exact oral script are provided in the Supplementary Information. Participants were also instructed to turn off their cell phones. Each participant was then seated in a randomly assigned cubicle, linked to an ID in our database, which prevented any interaction between participants during the experiment (see Figure S1).

Experiments were conducted using a custom-designed interactive web application introduced in [25]. On their computer screens, participants were shown an identical $15 \times 15$ table of 225 cells, with each cell linked to a hidden value ranging from 0 to 99. During the instruction phase, examples of these tables were provided. The tables used in the experiments were generated by randomly shuffling the same set of values (see Fig. 1E), ensuring that each table contained the same values but arranged differently (see Fig. 1D).

Each game consisted of 20 consecutive rounds. In each round, participants had to visit and rate 3 distinct cells within a recommended time of 20 seconds. If participants exceeded this time, a warning appeared on their screens. A round ended when all participants in the group had visited and rated 3 cells. The colors of the cells in the table were then updated according to a palette of red hues, reflecting the fraction of stars allocated to each cell since the start of the experiment (see Fig. 1F). Participants then moved on to the next round. Each game typically lasted three to four minutes, and each session consisted of 10 successive games.

Participants were told that the goal of the game was to find the cells with the highest values in the table. However, their payout depended on their rank, which was determined by their score, i.e., the sum of the values of the cells they visited. This meant that players had an incentive not only to find high-value cells but also to accumulate the highest possible score. The payout scheme thus fostered competition among participants, motivating them to achieve the highest payout at the end of the session.

The experiments were conducted in two different settings: one in which five humans played together and another in which humans played with four bots. For each experimental condition, there were multiple sessions in which participants played 10 to 12 repetitions of the game.

For the experimental condition in which five humans played together, we conducted a total of 7 sessions, each involving ten participants, at the Toulouse School of Economics Experimental Laboratory in January 2023. A total of 70 participants (33 females; 37 males) were recruited, with an average age of 20. At the beginning of each session, each participant played two consecutive games alone. Then, the participants were randomly divided into two groups of five and played at least 10 five-player games. During each experiment, the two groups explored different tables that changed in each game. At the end of the session, the score over the whole session of the five participants of each group was calculated, and the player ranked first was paid 20 €, the second was paid 15 €, and the three remaining players (ranked 3–5) received 10 € each. This incentivized participants to achieve the highest score in their group, creating competition among them.

For the experiments in which humans played with four bots, sessions consisted of five players. Those experiments were carried out in the Laboratoire de Physique Théorique of the University of Toulouse in May 2022. A total of 115 participants (63 females, 52 males) were recruited, with an average age of 26. Each participant could take part in a maximum of two different sessions. The participants were mostly students and researchers from the University of Toulouse. Each participant played 10 five-player games with four model-controlled bots. To prevent bias in participant behavior resulting from playing with bots instead of other humans, participants were unaware that they were playing with bots and believed they were playing with one another. To achieve this, participants were instructed not to communicate with each other and were unable to view each other’s screens. In addition, despite the independence of each participant’s game, participants were required to wait for each other at the end of each round before moving on to the next. This feature prevented desynchronization of the games, which could have caused participants to realize that others were still playing after their own game had finished. Thanks to these measures, only very few participants suspected that something “dodgy” was happening, and none of them expressed a belief that they were playing with bots. At the end of the session, the score over the whole session of the five human participants was calculated, and the player ranked first was paid 20 €, the second was paid 15 €, and the three remaining players (ranked 3–5) received 10 € each. To ensure fairness in the payment of participants, who were ranked together but did not play in the same group, all participants in the same experimental session played on the same table, with the same shuffling of values, and against identical types of bots.

The number of participants in each experimental condition is summarized below:

5 Humans (70 participants divided in 14 groups),
1 Human vs. 4 Col – 0 Def bots (10 participants),
1 Human vs. 3 Col – 1 Def bots (15 participants),
1 Human vs. 2 Col – 2 Def bots (15 participants),
1 Human vs. 1 Col – 3 Def bots (15 participants),
1 Human vs. 0 Col – 4 Def bots (10 participants),
1 Human vs. 4 Neu-1 bots (10 participants),
1 Human vs. 4 Neu-3 bots (10 participants),
1 Human vs. 4 Neu-5 bots (10 participants),
1 Human vs. 4 Opt bots (20 participants).

5.2 Model of the visit and rating strategies

The stochastic agent-based model for modeling human behavior is divided into two parts: the agent’s strategy for visiting cells and the strategy for rating the visited cells.

5.2.1 Visit strategy

In the first round ($t=1$), agents have no prior information, so they randomly select 3 cells to visit. For subsequent rounds ($t > 1$), agents use a different approach. For each of the three cells $i = 1, 2, 3$ to visit, they can either choose the ith best cell from the previous round, with value $V_{i}(t-1)$, based on probability $P^{\text{R}}_{i}(V_{i}(t-1))$, or choose to explore new cells with probability $1 - P^{\text{R}}_{i}(V_{i}(t-1))$.

The probability $P^{\text{R}}_{i}(V_{i}(t-1))$ is defined as

$$ P^{\text{R}}_{i}(V_{i}(t-1)) = \left \{ \textstyle\begin{array}{l@{\quad}l} 0 & \text{if } V_{i}(t-1) < a_{i} \\ \displaystyle \frac{V_{i}(t-1) - a_{i}}{99} b_{i} & \text{if } \displaystyle a_{i} \leq V_{i}(t-1) < a_{i} + \frac{99}{b_{i}} \\ 1 & \text{otherwise} \end{array}\displaystyle \right . , $$

(3)

where $a_{i}$ and $b_{i}> 0$ are parameters. This implies that an agent will never revisit a cell with a value of $V_{i}(t-1) < a_{i}$ and will always revisit a cell with a value of $V_{i}(t-1) > a_{i} + \frac{99}{b_{i}}$ (if this threshold is below the maximum value of 99). Between these thresholds, the probability of revisiting the ith best cell increases linearly from 0 to 1.
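Eq. (3) can be written compactly as a clipped linear function. The sketch below is a direct transcription; the parameter values used in the example are hypothetical and are not those of Tables 1 or 4.

```python
import random

def revisit_probability(v_prev, a, b):
    """Probability of revisiting a cell of value v_prev, following Eq. (3)."""
    if b <= 0:
        raise ValueError("b must be positive")
    if v_prev < a:
        return 0.0
    # Linear ramp from 0 at v_prev = a to 1 at v_prev = a + 99 / b, then capped.
    return min(1.0, (v_prev - a) / 99.0 * b)

def decides_to_revisit(v_prev, a, b, rng):
    """Bernoulli draw: True means revisit, False means explore a new cell."""
    return rng.random() < revisit_probability(v_prev, a, b)

# Hypothetical parameters: never revisit below 40, always revisit above 73.
a_demo, b_demo = 40.0, 3.0
half_way = revisit_probability(56.5, a_demo, b_demo)
```

With these illustrative values, a cell of value 56.5 lies halfway between the two thresholds and is revisited with a probability of about one half.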

If agents do not revisit a previously visited cell, they explore other cells. Each cell c is assigned a probability $P^{\text{E}}(c, t)$ of being selected in round t:

$$ P^{\text{E}}(c, t) = \varepsilon \frac{1}{N} + (1-\varepsilon ) \frac{P^{\alpha}_{c}(t-1)}{\sum _{c'} P_{c'}^{\alpha}(t-1)} , $$

(4)

where $N$ is the number of cells in the table ($N = 225$ here), $P_{c}(t-1)$ is the cumulative fraction of stars assigned to cell c up to time $t-1$, and $\varepsilon \in (0, 1)$ and $\alpha > 0$ are parameters. To avoid selecting the same cell multiple times in the same round or revisiting cells from the previous round, a new cell is randomly selected if the first selected cell is unsuitable. In this equation, ε controls the balance between exploring unmarked versus marked cells: a higher ε leads to more random selection, while α determines the preference for highly marked cells. A larger α value results in a stronger preference for highly marked cells, while a smaller α value distributes the selection more evenly among marked cells.
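The exploration distribution of Eq. (4) can be sketched with NumPy as below. Two points are assumptions rather than statements from the text: when no cell has been rated yet, the sketch falls back to a uniform distribution, and the rule that redraws unsuitable cells is left out for brevity. The function name is illustrative.

```python
import numpy as np

def exploration_probabilities(star_fraction, epsilon, alpha):
    """Selection probability of each cell during exploration, following Eq. (4).

    star_fraction -- cumulative fraction of stars per cell, P_c(t-1)
    epsilon       -- weight of the uniform (random) component
    alpha         -- exponent sharpening the preference for highly rated cells
    """
    p = np.asarray(star_fraction, dtype=float)
    n = p.size
    weights = p ** alpha
    total = weights.sum()
    if total == 0:
        # Assumption: with no ratings at all, every cell is equally likely.
        return np.full(n, 1.0 / n)
    return epsilon / n + (1.0 - epsilon) * weights / total

# Toy table of four cells, two of which share all the stars.
probs = exploration_probabilities([0.5, 0.5, 0.0, 0.0], epsilon=0.2, alpha=2.0)
chosen = np.random.default_rng(0).choice(probs.size, p=probs)
```

In this toy example the two rated cells are each selected with probability 0.45 and the two unrated cells with probability 0.05, so unrated cells remain reachable only through the ε term.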

The functional forms of Eqs. (3) and (4) are versatile enough to capture a wide range of behavior while being defined by only 8 parameters.

5.2.2 Rating strategy

Ratings are assigned by a stochastic process governed by a discrete probability distribution that depends on the value of the cell. This distribution specifies a probability, $P_{s}(V)$, of assigning an s-star rating ($s = 0, 1, \ldots, 5$) to a cell with value V. The rating strategy does not depend on the round, the number of cells already opened in the round, or the color of the cell.

As observed in [25], individuals predominantly rate cells with 0 or 5 stars, while ratings of 1, 2, 3, or 4 stars are less frequent and have similar probabilities. Consequently, in our model, the probabilities for ratings of 1 to 4 stars are equal and determined by imposing the probabilistic normalization condition $\sum _{s=0}^{5} P_{s}(V)= 1$ for each value of V. Thus, for $s = 1, 2, 3, 4$ we have:

$$ P_{s}(V) = P_{1234}(V) = \frac{1}{4} (1 - P_{0}(V) - P_{5}(V)) . $$

(5)

For $s = 0$ and $s = 5$, the probability $P_{s}(V)$ can either be modeled by sigmoid-like functions:

$$ P_{s}(V) = c_{s} + d_{s} \tanh \left (\frac{V - e_{s}}{99} f_{s} \right ) , $$

(6)

where $c_{s}$, $d_{s} > 0$, $e_{s}$, and $f_{s}$ are parameters; or by linear functions:

$$ P_{s}(V) = c'_{s} + f'_{s} \frac{V}{99} , $$

(7)

where $c'_{s}$ and $f'_{s}$ are parameters.

The functional forms of Eqs. (6) and (7) are designed to accurately reflect observed rating probabilities while being flexible enough to accommodate a variety of behaviors.
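Eqs. (5) to (7) combine into a full six-point rating distribution as sketched below. The parameter values in the example are hypothetical and do not correspond to Tables 2 or 5; the validity check on $P_{0}$ and $P_{5}$ is an added safeguard, not part of the described model.

```python
import math
import random

def p_tanh(v, c, d, e, f):
    """Sigmoid-like form of Eq. (6) for the 0-star or 5-star probability."""
    return c + d * math.tanh((v - e) / 99.0 * f)

def p_linear(v, c, f):
    """Linear form of Eq. (7) for the 0-star or 5-star probability."""
    return c + f * v / 99.0

def rating_distribution(p0, p5):
    """Probabilities of 0 to 5 stars; 1 to 4 stars share the remainder (Eq. (5))."""
    if p0 < 0 or p5 < 0 or p0 + p5 > 1:
        raise ValueError("p0 and p5 must be non-negative and sum to at most 1")
    mid = (1.0 - p0 - p5) / 4.0
    return [p0, mid, mid, mid, mid, p5]

def draw_rating(dist, rng):
    """Sample a number of stars from the six-point distribution."""
    return rng.choices(range(6), weights=dist, k=1)[0]

# Hypothetical collaborator-like profile evaluated at a cell of value 80.
v = 80
p0 = p_tanh(v, 0.45, 0.40, 50.0, -6.0)   # decreasing in V
p5 = p_tanh(v, 0.45, 0.40, 50.0, 6.0)    # increasing in V
dist = rating_distribution(p0, p5)
stars = draw_rating(dist, random.Random(0))
```

Swapping the signs of the two slope parameters in this illustration would give a defector-like profile that assigns many stars to low-value cells.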

5.2.3 Strategies of the Mimic agents

The Mimic agents, which are designed to replicate human behaviors, are controlled using the model framework described above. The parameters that dictate the agents’ visit strategy are detailed in Table 4, while those governing their rating strategy are listed in Table 5. A visual representation of the bots’ rating strategy is provided in Fig. 7.

The visit strategy for these agents is defined by eight parameters. These parameters were optimized by minimizing the discrepancy between a set of n round-dependent observables, $O_{1}(t), \ldots , O_{n}(t)$, as measured in the experiments (averaged over all experiments) and the corresponding observables, $\hat{O}_{1}(t), \ldots , \hat{O}_{n}(t)$, obtained from extensive model simulations (averaging over 1,000,000 numerical experiments for each experimental condition). The error is quantified as follows:

$$ \Delta = \sum _{i=1}^{n} \frac{\sum _{t=1}^{20} (\hat{O}_{i}(t) - O_{i}(t))^{2}}{\sum _{t=1}^{20}{O}_{i}^{2}(t)}. $$

(8)

The round-dependent observables used in this error calculation include (as defined later in Materials and Methods, see Sect. 5.4): $q(t)$, $Q(t)$, $p(t)$, $P(t)$, $\mathrm{IPR}(\mathbf{q}(t))$, $\mathrm{IPR}(\mathbf{Q}(t))$, $\mathrm{IPR}(\mathbf{p}(t))$, $\mathrm{IPR}(\mathbf{P}(t))$, $V_{1}(t)$, $V_{2}(t)$, $V_{3}(t)$, $B_{1}(t)$, $B_{2}(t)$, and $B_{3}(t)$. These observables are computed exclusively for the human participants. To illustrate this, in the experiment with five human participants, $V_{1}(t)$ represents the average value of the highest-valued cell opened by any player in round t, averaged across the five participants. In contrast, in the experiments with one human and four bots, $V_{1}(t)$ represents the average value of the highest-valued cell opened by the human participant in round t.
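The error of Eq. (8) is a sum of normalized squared deviations, one term per observable. A minimal NumPy sketch, assuming the observables are stored as arrays of shape (number of observables, 20 rounds), could read as follows; the function name and array layout are illustrative.

```python
import numpy as np

def relative_error(simulated, observed):
    """Error Delta of Eq. (8) between simulated and experimental observables.

    Both arguments have shape (n_observables, n_rounds).
    """
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    if sim.shape != obs.shape:
        raise ValueError("simulated and observed arrays must have the same shape")
    numerator = ((sim - obs) ** 2).sum(axis=1)    # sum over rounds
    denominator = (obs ** 2).sum(axis=1)          # normalization per observable
    return float((numerator / denominator).sum())  # sum over observables
```

Normalizing each observable by its own squared magnitude keeps quantities measured on different scales, such as cell values and IPRs, from dominating the fit.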

To minimize the error Δ, a zero-temperature Monte Carlo method was employed. At each Monte Carlo step, a small random adjustment was made to a randomly chosen parameter. If this adjustment resulted in a decrease in the error Δ, the new parameter value was accepted; otherwise, the previous parameter value was retained. The optimization process continued until the error ceased to decrease. To avoid being trapped in local minima, the Monte Carlo simulations were initiated from several starting points. The parameters selected were those yielding the smallest error. It is worth noting that the final parameters obtained from different low-error Monte Carlo runs produced similar functions characterizing the visit strategy (see Eqs. (3) and (4)).
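The procedure described above amounts to a greedy random search restarted from several initial points. The sketch below illustrates it on a toy error function. The step size, the stopping rule (a fixed number of consecutive rejected moves standing in for "the error ceased to decrease"), and all names are assumptions made for the example.

```python
import random

def zero_temperature_mc(error_fn, start, step=0.05, max_stall=2000, rng=None):
    """Greedy Monte Carlo: accept a random single-parameter move only if it lowers the error."""
    rng = rng or random.Random()
    params = list(start)
    best = error_fn(params)
    stall = 0
    while stall < max_stall:
        k = rng.randrange(len(params))          # pick one parameter at random
        trial = params.copy()
        trial[k] += rng.uniform(-step, step)    # small random adjustment
        err = error_fn(trial)
        if err < best:                          # zero temperature: no uphill moves
            params, best, stall = trial, err, 0
        else:
            stall += 1
    return params, best

def multi_start(error_fn, starts, **kwargs):
    """Run the search from several starting points and keep the lowest error."""
    runs = [zero_temperature_mc(error_fn, s, **kwargs) for s in starts]
    return min(runs, key=lambda run: run[1])

# Toy error with a single minimum at (1, -2), standing in for Delta.
def toy_error(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_params, best_error = multi_start(
    toy_error, starts=[[0.0, 0.0], [3.0, 3.0]], rng=random.Random(0)
)
```

In the actual fit, each evaluation of the error would require a batch of model simulations, so the number of Monte Carlo steps is the main computational cost.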

5.2.4 Strategies of the model-controlled bots

The bots used in the experiments are controlled by the model described above. The specific parameters that define their visit strategy are listed in Table 1, while the parameters for their rating strategy are provided in Table 2. Additionally, the rating strategy of the bots is visually represented in Fig. 10.

The collaborator and defector bots emulate the behavior of humans in games involving five human participants. These bots were derived from preliminary experiments conducted in 2017. The three neutral bots, Neu-1, Neu-3, and Neu-5, employ a visit strategy identical to that of the collaborator and defector bots, making their visit behavior comparable to that of humans. Their rating strategy offers three variations of a neutral rating, always assigning 1, 3, or 5 stars to a visited cell. Finally, the optimized bots have been designed to maximize their scores while playing in groups of five identical agents (see Opt-1 agents in [25]). They explore the table until they identify high-value cells, at which point they cease further exploration and repeatedly revisit these identified high-value cells. The rating strategy employed by these bots involves rating only cells with values greater than 50. Consequently, only a very limited number of cells are rated during the game.

5.3 Linear model for predicting individual behavioral profiles

The linear regression model employed to predict the behavioral profile of each individual in the game across various experimental conditions utilizes three quantifiers: $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and rank. These quantifiers are used to represent the three potential cues influencing participants’ behavior.

To ensure consistency in the model, standardized data are employed. A standardized quantity is indicated with a hat: $\hat{X} = (X - \mu ) / \sigma $, where μ is the mean and σ is the standard deviation of X over the experimental data. This standardization results in $\hat{X}$ having a zero mean and a unit standard deviation.

Let $C_{\text{exp}}$, $N_{\text{exp}}$, and $D_{\text{exp}}$ denote the fractions of humans exhibiting collaborator, neutral, and defector behaviors observed in a given experimental condition, respectively. Similarly, $C_{\text{pred}}$, $N_{\text{pred}}$, and $D_{\text{pred}}$ represent the predicted fractions. A feature vector $\hat{\mathbf{x}}$ with components $\hat{x}_{i}$, where $i = 1, 2, \ldots, f$, contains f standardized features or quantifiers that are expected to explain the data. In this context, the features are $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and rank, with f ranging from one to three.

The linear regression model is defined as follows:

$$ \textstyle\begin{cases} \hat{C}_{\text{pred}} = \displaystyle \sum _{i=1}^{f} c_{i} \hat{x}_{i}, \\ \hat{D}_{\text{pred}} = \displaystyle \sum _{i=1}^{f} d_{i} \hat{x}_{i}, \\ N_{\text{pred}} = 1 - C_{\text{pred}} - D_{\text{pred}}, \end{cases} $$

(9)

where $c_{i}$ and $d_{i}$ are regression parameters for $i = 1, 2, \ldots, f$.

These parameters are obtained by fitting the model predictions to the data by minimizing the error E defined as:

$$ E = \sqrt{ \frac{\displaystyle \sum _{s} \left ( (C_{\text{exp}} - C_{\text{pred}})^{2} + (N_{\text{exp}} - N_{\text{pred}})^{2} + (D_{\text{exp}} - D_{\text{pred}})^{2} \right )}{\displaystyle \sum _{s} \left ( C_{\text{exp}}^{2} + N_{\text{exp}}^{2} + D_{\text{exp}}^{2} \right )}}, $$

(10)

where

$$ \textstyle\begin{cases} C_{\text{pred}} = \mu _{C} + \sigma _{C} \displaystyle \sum _{i=1}^{f} c_{i} \hat{x}_{i}, \\ D_{\text{pred}} = \mu _{D} + \sigma _{D} \displaystyle \sum _{i=1}^{f} d_{i} \hat{x}_{i}, \\ N_{\text{pred}} = 1 - C_{\text{pred}} - D_{\text{pred}}. \end{cases} $$

(11)

Due to the symmetric nature of the error in C, D, and N, linear regressions on any two of these variables (C and D, or C and N) would yield the same predictor.
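This symmetry can be checked with a short derivation. Writing $\Delta C = C_{\text{exp}} - C_{\text{pred}}$, and similarly $\Delta D$ and $\Delta N$ (a shorthand introduced here for convenience), the fact that the three fractions sum to one for both the observed and the predicted values gives, for each condition:

$$ \Delta C + \Delta D + \Delta N = 0 \quad \Longrightarrow \quad \Delta C^{2} + \Delta D^{2} + \Delta N^{2} = 2 \left ( \Delta C^{2} + \Delta D^{2} + \Delta C \, \Delta D \right ). $$

The sum on the left is unchanged under any permutation of C, D, and N, so eliminating any one of the three variables leads to the same expression in the remaining two, and hence to the same minimum of E.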

In this study, ten distinct and independent experimental conditions are considered, with two independent variables C and D (since $N = 1 - C - D$), resulting in twenty independent measurements to be explained by the linear regression model. The number of unknown parameters is equal to two times the number of features, which ranges from one to three, depending on the number of cues used as features among $P(20)$, $\mathrm{IPR}(\mathbf{P}(20))$, and rank.
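As an illustration of Eqs. (10) and (11), the following sketch evaluates the prediction and the error E for a given set of regression parameters. It assumes the population standard deviation for the standardization (the text does not specify which estimator is used), and the function names and the three-condition data are made up for the example; they are not the experimental values.

```python
import math
from statistics import mean, pstdev

def standardize(values):
    """Return (standardized values, mean, standard deviation)."""
    mu, sigma = mean(values), pstdev(values)  # population std is an assumption
    return [(v - mu) / sigma for v in values], mu, sigma

def predict(x_hat, c, d, mu_c, sigma_c, mu_d, sigma_d):
    """Eq. (11): predicted (C, N, D) fractions for one condition."""
    c_pred = mu_c + sigma_c * sum(ci * x for ci, x in zip(c, x_hat))
    d_pred = mu_d + sigma_d * sum(di * x for di, x in zip(d, x_hat))
    return c_pred, 1.0 - c_pred - d_pred, d_pred

def relative_error(observed, predicted):
    """Eq. (10): root of summed squared deviations, normalized by the observed fractions."""
    num = sum((o - p) ** 2 for obs, pred in zip(observed, predicted)
              for o, p in zip(obs, pred))
    den = sum(o ** 2 for obs in observed for o in obs)
    return math.sqrt(num / den)

# Made-up numbers for three conditions and a single feature (f = 1)
feature = [0.2, 0.5, 0.8]
c_exp = [0.2, 0.4, 0.6]
d_exp = [0.5, 0.3, 0.1]

x_hat, _, _ = standardize(feature)
_, mu_c, sigma_c = standardize(c_exp)
_, mu_d, sigma_d = standardize(d_exp)

c, d = [1.0], [-1.0]  # hand-picked coefficients that fit this toy data exactly
predicted = [predict([x], c, d, mu_c, sigma_c, mu_d, sigma_d) for x in x_hat]
observed = [(ce, 1.0 - ce - de, de) for ce, de in zip(c_exp, d_exp)]
E = relative_error(observed, predicted)

# Counting argument from the text: 10 conditions x 2 independent fractions,
# against 2 * f unknown parameters for f = 1, 2, or 3 features
n_measurements = 10 * 2
n_parameters = {f: 2 * f for f in (1, 2, 3)}
```

With three features, the model therefore has six free parameters to account for twenty independent measurements.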

5.4 Definition of the observables

All observables discussed here are derived from four fundamental vectors (vectors are shown in boldface): $\mathbf{q}(t)$, $\mathbf{Q}(t)$, $\mathbf{p}(t)$, and $\mathbf{P}(t)$. We define $q_{c}(t)$ as the fraction of visits a cell c receives at round t. The collection of $q_{c}(t)$ for all cells c forms a vector $\mathbf{q}(t)$ of size 225. Another relevant vector is $\mathbf{Q}(t)$, which represents the cumulative fraction of visits $Q_{c}(t)$ attributed to each cell from the start to round t. Similarly, $\mathbf{p}(t)$ and $\mathbf{P}(t)$ are vectors whose components $p_{c}(t)$ and $P_{c}(t)$ denote the fraction of stars given to each cell in round t and up to round t, respectively.

In experiments with five humans playing together, these vectors represent the fractions of visits and stars attributed to each cell by the five group members. However, in experiments with humans and bots, these vectors represent the fractions of visits and stars attributed to each cell by the human participant only (and not by the bots). This allows us to specifically characterize the human behavior. For the linear regression model, the cues $P(20)$ and $\mathrm{IPR}(\mathbf{P}(20))$ are calculated for all players (humans and bots), since the information displayed in the game is the collective one.

We define the normalized average of the visited cells at round t as $q(t) = \sum _{c} q_{c}(t) V_{c} \times 3 / (V_{\text{max}_{1}} + V_{ \text{max}_{2}} + V_{\text{max}_{3}})$, where $\mathbf{V}$ is the vector of cell values $V_{c}$, and $V_{\text{max}_{1}}$, $V_{\text{max}_{2}}$, and $V_{\text{max}_{3}}$ are the three largest cell values. This normalization ensures that $q(t) = 1$ represents optimal performance, where each individual visits the three best cells at round t. Similarly, we define $Q(t)$, which cumulates all visits up to round t, using the same formula with $q_{c}(t)$ replaced by $Q_{c}(t)$. Thus, $q(t)$ and $Q(t)$ measure the instantaneous and cumulative exploration behavior with respect to cell values. A high $Q(t)$ value indicates effective exploration focused on high-value cells, while a low $Q(t)$ indicates broader or less efficient exploration.
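A small sketch may help make the normalization concrete. It assumes a toy table of six cells with made-up values (the actual table has 225 cells), and the helper names are hypothetical.

```python
def to_fractions(counts):
    """Turn per-cell visit counts into fractions that sum to one."""
    total = sum(counts)
    return [c / total for c in counts]

def normalized_visit_value(fractions, values):
    """q(t) or Q(t): visit-weighted cell value, scaled by the mean of the three largest values."""
    top3 = sorted(values, reverse=True)[:3]
    weighted = sum(f * v for f, v in zip(fractions, values))
    return weighted * 3 / sum(top3)

# Toy table of six cells; values are made up for the example
values = [99, 86, 85, 40, 10, 0]
visits_by_round = [
    [0, 0, 0, 1, 1, 1],  # round 1: three visits to low-value cells
    [1, 1, 1, 0, 0, 0],  # round 2: visits to the three best cells
]

# Instantaneous quantity, one value per round
q = [normalized_visit_value(to_fractions(r), values) for r in visits_by_round]

# Cumulative quantity: pool the visits of all rounds before normalizing
cumulative = [sum(col) for col in zip(*visits_by_round)]
Q = normalized_visit_value(to_fractions(cumulative), values)
```

In this toy case the second round reaches the optimum, while the cumulative value stays below one because it retains the poor first round.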

Likewise, based on the definitions of $\mathbf{p}(t)$ and $\mathbf{P}(t)$, we define the average value of the visited cells, weighted by their ratings (fraction of stars) at round t: $p(t) = \sum _{c} p_{c}(t) V_{c} / V_{\text{max}_{1}}$, where $V_{\text{max}_{1}}=99$ is the highest cell value. In general, $p(t) \leq 1$, and $p(t) = 1$ means that the only cell rated at round t is the one with a value of 99. Similarly, we define the cumulative quantity $P(t) = \sum _{c} P_{c}(t) V_{c} / V_{\text{max}_{1}}$, which represents the average value of the cells visited by the participants up to round t, weighted by their ratings. Thus, $p(t)$ and $P(t)$ measure the instantaneous and cumulative distribution of stars with respect to cell values. A high $P(t)$ value (especially in the final round, $t=20$) indicates that participants concentrated their ratings on high-value cells, while a low $P(t)$ indicates deceptive behavior, with high ratings given to low-value cells, as seen in defectors.
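The contrast between honest and deceptive ratings can be illustrated with a toy computation. The three cell values and the star patterns below are made up for the example, and the function names are hypothetical.

```python
def star_fractions(stars):
    """Fraction of all stars given to each cell."""
    total = sum(stars)
    return [s / total for s in stars]

def rating_weighted_value(fractions, values, v_max=99):
    """p(t) or P(t): star-weighted cell value divided by the highest cell value."""
    return sum(p * v for p, v in zip(fractions, values)) / v_max

values = [99, 50, 10]  # toy cell values; 99 is the highest value in the game

# Stars increase with the cell value (collaborator-like pattern)
P_collaborator = rating_weighted_value(star_fractions([5, 3, 1]), values)
# Stars decrease with the cell value (defector-like pattern)
P_defector = rating_weighted_value(star_fractions([1, 3, 5]), values)
# All stars on the best cell: the upper bound of 1
P_best_only = rating_weighted_value(star_fractions([5, 0, 0]), values)
```

The collaborator-like pattern yields a markedly higher value than the defector-like one, which is why this quantity can serve as a cue about the reliability of the displayed ratings.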

To measure the exploration behavior, we introduce the inverse participation ratio (IPR) of the vectors $\mathbf{q}(t)$, $\mathbf{Q}(t)$, $\mathbf{p}(t)$, and $\mathbf{P}(t)$. For a given probability distribution $\mathbf{X} = \{X_{c}\}$, the IPR is defined as $\mathrm{IPR}(\mathbf{X}) = 1 / \sum _{c} X_{c}^{2}$ and characterizes the spread of the distribution $\mathbf{X}$. Thus, for the four vectors considered, the IPR quantifies the effective number of cells on which visits or ratings are concentrated at round t or up to round t. If a probability vector $\mathbf{X}$ is evenly distributed over n cells among N, then $X_{c} = 1/n$ for those cells (and 0 otherwise), and $\mathrm{IPR}(\mathbf{X}) = 1/[n \times (1/n)^{2}] = n$, indicating that the IPR measures the effective number of cells over which a probability distribution is spread.
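The limiting cases of the IPR are easy to verify numerically. The sketch below uses a table of 225 cells, as in the game; the function name and the example distributions are illustrative.

```python
def ipr(distribution):
    """Inverse participation ratio: 1 / sum of squared probabilities."""
    return 1.0 / sum(x * x for x in distribution)

n_cells = 225

# All weight on a single cell: effective number of cells is 1
single = [1.0] + [0.0] * (n_cells - 1)

# Weight spread evenly over n = 5 cells: effective number of cells is 5
n = 5
five = [1.0 / n] * n + [0.0] * (n_cells - n)

# Weight spread evenly over the whole table: effective number of cells is 225
uniform = [1.0 / n_cells] * n_cells

ipr_single, ipr_five, ipr_uniform = ipr(single), ipr(five), ipr(uniform)
```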

5.5 Behavioral profiles

Human participants and bots are both classified into three behavioral profiles based on their cell rating patterns. To achieve this classification, the mean rating given by each individual to cells of a specific value V is fitted with a linear function of the cell value V: $u_{0} + u_{1} \times 5V / 99$. In this context, $u_{0}$ represents the intercept, and $u_{1}$ represents the slope. A strict linear rating of cells from value 0 to 99, with corresponding ratings from 0 to 5 stars, would have $u_{0}=0$ and $u_{1}=1$. Individuals are then classified into three behavioral profiles: collaborator, neutral, and defector, using two thresholds: $u_{\text{def-neu}} = -0.5$ and $u_{\text{neu-col}} = 0.5$ (see [25]).

The three behavioral profiles are defined as follows:

Collaborator: Individuals with $u_{1} \geq u_{\text{neu-col}}$ rate cells with a rating that increases with the cell values. They assign low ratings to low-value cells and high ratings to high-value cells, thus helping their group members in identifying the best cells.
Neutral: Individuals with $u_{\text{def-neu}} \leq u_{1} < u_{\text{neu-col}}$ give a very similar rating to every cell regardless of their values. While they do not provide distinctive ratings, most neutral individuals contribute to group success by revisiting high-value cells, thereby making them darker and easier to identify for others.
Defector: Individuals with $u_{1} < u_{\text{def-neu}}$ rate the cells in the opposite way to collaborators. They give low ratings to the high-value cells and high ratings to the low-value ones. This behavior is interpreted as an attempt to mislead other group members by obscuring the best cells with low ratings and highlighting poor cells with high ratings.
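The classification rule can be sketched as follows. The text does not state how the linear fit is performed, so this example assumes an ordinary unweighted least-squares fit; the function names and the toy ratings are hypothetical.

```python
def fit_rating_line(values, mean_ratings):
    """Least-squares fit of mean rating = u0 + u1 * (5 * V / 99); returns (u0, u1)."""
    xs = [5.0 * v / 99.0 for v in values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(mean_ratings) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, mean_ratings))
    u1 = sxy / sxx
    return my - u1 * mx, u1

def classify(u1, u_def_neu=-0.5, u_neu_col=0.5):
    """Map the fitted slope to a behavioral profile using the two thresholds."""
    if u1 >= u_neu_col:
        return "collaborator"
    if u1 >= u_def_neu:
        return "neutral"
    return "defector"

values = [0, 33, 66, 99]  # toy cell values

# Mean ratings for three stylized individuals
linear = [5.0 * v / 99.0 for v in values]  # strict linear rating: u0 = 0, u1 = 1
flat = [3.0, 3.0, 3.0, 3.0]                # same rating for every cell: u1 = 0
inverted = list(reversed(linear))          # high stars on low-value cells: u1 = -1

profiles = [classify(fit_rating_line(values, r)[1]) for r in (linear, flat, inverted)]
```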

5.6 Computation of error bars

Error bars for the experimentally measured observables, corresponding to a confidence level of 68%, were determined using the bootstrap method. The bootstrap is a Monte Carlo technique that assesses the properties of statistical parameters from an unknown probability distribution by performing repeated random sampling with replacement from a dataset [38]. The process begins by generating M artificial sets of N experiments, each obtained by drawing N samples with replacement from the original dataset. Consequently, some experiments may appear multiple times within an artificial set, while others may not appear at all. The observable of interest is then computed on each artificial set, which yields a distribution from which confidence intervals can be derived.

In our case, in the experimental condition in which five humans play together, each independent experiment consists of the ten games played by one group of five individuals, and we have $N = 14$ such experiments. In the experimental conditions in which humans play with bots, each independent experiment consists of the ten games played by one human with four bots, and we have between $N = 10$ and $N = 20$ experiments depending on the condition. In every condition, we used $M = 10{,}000$ artificial sets to generate bootstrap distributions.
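A minimal sketch of this resampling scheme is given below. It assumes that the 68% interval is read off as the central percentiles of the bootstrap distribution, which the text does not specify; the function names and the fourteen per-experiment values are made up for the example.

```python
import random

def average(xs):
    """Plain arithmetic mean, used here as the observable."""
    return sum(xs) / len(xs)

def bootstrap_interval(samples, statistic=average, m=10_000, level=0.68, rng=None):
    """Percentile bootstrap: resample N items with replacement, M times."""
    rng = rng or random.Random()
    n = len(samples)
    # One value of the observable per artificial set of N experiments
    stats = sorted(statistic(rng.choices(samples, k=n)) for _ in range(m))
    low = stats[int((1 - level) / 2 * m)]
    high = stats[min(m - 1, int((1 + level) / 2 * m))]
    return low, high

# Made-up observable values for N = 14 independent experiments
group_scores = [0.61, 0.55, 0.70, 0.48, 0.66, 0.59, 0.73,
                0.52, 0.64, 0.57, 0.69, 0.50, 0.62, 0.58]

low, high = bootstrap_interval(group_scores, rng=random.Random(42))
```

Because each artificial set is drawn with replacement, some experiments are repeated and others omitted, which is what produces the spread of the bootstrap distribution.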

To obtain reliable results from the numerical simulations of the model, the data are averaged over 1,000,000 runs. This process ensures that the error bars are negligible on the scale of the presented graphs.

Data availability

The datasets collected during the experiment analyzed in this study, along with the data analysis and simulation codes, are available in the following Zenodo repository: https://doi.org/10.5281/zenodo.15870630.

Abbreviations

Col: collaborator
Def: defector
Neu: neutral
Neu-1: bot adopting a neutral strategy consistently giving the same rating (one star) irrespective of the value of the cell
Neu-3: bot adopting a neutral strategy consistently giving the same rating (three stars) irrespective of the value of the cell
Neu-5: bot adopting a neutral strategy consistently giving the same rating (five stars) irrespective of the value of the cell
Opt: optimized bot designed to maximize the group score when playing with other Opt bots

References

1. Nowak MA (2006) Five rules for the evolution of cooperation. Science 314(5805):1560–1563. https://doi.org/10.1126/science.1133755
2. Melis AP, Raihani NJ (2023) The cognitive challenges of cooperation in humans and animals. Nat Rev Psychol. https://doi.org/10.1038/s44159-023-00207-7
3. Fehr E, Gächter S (2002) Altruistic punishment in humans. Nature 415:137–140. https://doi.org/10.1038/415137a
4. Pentland A (2014) Social physics: how good ideas spread-the lessons from a new science. Penguin, New York
5. Goldstone RL, Janssen MA (2005) Computational models of collective behavior. Trends Cogn Sci 9(9):424–430. https://doi.org/10.1016/j.tics.2005.07.009
6. Gintis H, Smith EA, Bowles S (2001) Costly signaling and cooperation. J Theor Biol 213(1):103–119. https://doi.org/10.1006/jtbi.2001.2406
7. Grassé P-P (1959) La reconstruction du nid et les coordinations interindividuelles chez Bellicositermes natalensis et Cubitermes SP. la théorie de la stigmergie: Essai d’interprétation du comportement des termites constructeurs. Insectes Soc 6(1):41–80. https://doi.org/10.1007/BF02223791
8. Theraulaz G, Bonabeau E (1999) A brief history of stigmergy. Artif Life 5(2):97–116. https://doi.org/10.1162/106454699568700
9. Heylighen F (2016) Stigmergy as a universal coordination mechanism I: definition and components. Cogn Syst Res 38:4–13. https://doi.org/10.1016/j.cogsys.2015.12.002
10. Dellarocas C (2003) The digitization of word of mouth: promise and challenges of online feedback mechanisms. Manag Sci 49(10):1407–1424. https://doi.org/10.2139/ssrn.393042
11. Salganik MJ, Dodds PS, Watts DJ (2006) Experimental study of inequality and unpredictability in an artificial cultural market. Science 311(5762):854–856. https://doi.org/10.1126/science.1121066
12. Hennig-Thurau T, Marchand A, Marx P (2012) Can automated group recommender systems help consumers make better choices? J Mark 76(5):89–109. https://doi.org/10.1509/jm.10.0537
13. Ferrara E, Varol O, Davis C, Menczer F, Flammini A (2016) The rise of social bots. Commun ACM 59(7):96–104. https://doi.org/10.1145/2818717
14. Shao C, Ciampaglia GL, Varol O, Yang KC, Flammini A, Menczer F (2018) The spread of low-credibility content by social bots. Nat Commun 9(1). arXiv:1707.07592. https://doi.org/10.1038/s41467-018-06930-7
15. Wagner C, Mitter S, Körner C, Strohmaier M (2012) When social bots attack: modeling susceptibility of users in online social networks. In: Proceedings of the 2nd workshop on making sense of microposts held in conjunction with the 21st World Wide Web Conference 2012, pp 41–48
16. Zhang Y, Song W, Koura YH, Su Y (2023) Social bots and information propagation in social networks: simulating cooperative and competitive interaction dynamics. Systems 11(4):210. https://doi.org/10.3390/systems11040210
17. Shirado H, Christakis N (2017) Locally noisy autonomous agents improve global human coordination. Nature 545:370–374. https://doi.org/10.1038/nature22332
18. Shirado H, Christakis N (2020) Network engineering using autonomous agents increases cooperation. iScience 23, Article ID 101438. https://doi.org/10.1016/j.isci.2020.101438
19. Sharma G, Guo H, Shen C, Tanimoto J (2023) Small bots, big impact: solving the conundrum of cooperation in optional prisoner’s dilemma game through simple strategies. J R Soc Interface 20, Article ID 20230301. https://doi.org/10.1098/rsif.2023.0301
20. Shen C, Guo H, Hu S, Shi L, Wang Z, Tanimoto J (2023) How committed individuals shape social dynamics: a survey on coordination games and social dilemma games. Europhys Lett 144:11002. https://doi.org/10.1209/0295-5075/acfb34
21. Si Z, He Z, Shen C, Tanimoto J (2025) Cooperative bots exhibit nuanced effects on cooperation across strategic frameworks. J R Soc Interface 22(222), Article ID 20240427. https://doi.org/10.1098/rsif.2024.0427
22. Laitinen K, Laaksonen S-M, Koivula M (2021) Slacking with the bot. J Comput-Mediat Commun 26:343–361. https://doi.org/10.1093/jcmc/zmab012
23. Bedia M, Castillo L, Lopez C, Seron F, Isaza G (2016) Designing virtual bots for optimizing strategy-game groups. Neurocomputing 172:453–458. https://doi.org/10.1016/j.neucom.2015.05.118
24. Shen C, He Z, Guo H, Hu S, Tanimoto J, Shi L, Holme P (2024) Beyond a binary theorizing of prosociality. Proc Natl Acad Sci USA 121(49), Article ID 2412195121. https://doi.org/10.1073/pnas.2412195121
25. Bassanetti T, Cezera S, Delacroix M, Escobedo R, Blanchet A, Sire C, Theraulaz G (2023) Cooperation and deception through stigmergic interactions in human groups. Proc Natl Acad Sci USA 120(42). https://doi.org/10.1073/pnas.2307880120
26. Niella T, Stier-Moses N, Sigman M (2016) Nudging cooperation in a crowd experiment. PLoS ONE 11(1):1–20. https://doi.org/10.1371/journal.pone.0147125
27. Van Dyke Parunak H (2006) A survey of environments and mechanisms for human-human stigmergy. In: Weyns D, Van Dyke Parunak H, Michel F (eds) Environments for multi-agent systems II, vol 3830. Springer, Berlin, pp 163–186. https://doi.org/10.1007/11678809_10
28. Elliott M (2016) Stigmergic collaboration: a framework for understanding and designing mass collaboration. In: Cress U, Moskaliuk J, Jeong H (eds) Mass collaboration and education. Springer, Cham, pp 65–84. https://doi.org/10.1007/978-3-319-13536-6_4
29. Asch SE (1956) Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychol Monogr Gen Appl 70(9):1–70. https://doi.org/10.1037/h0093718
30. Cialdini RB (2007) Influence: the psychology of persuasion. Harper Collins, New York
31. Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489(7415):295–298. https://doi.org/10.1038/nature11421
32. Resnick P, Kuwabara K, Zeckhauser R, Friedman E (2000) Reputation systems. Commun ACM 43(12):45–48. https://doi.org/10.1145/355112.355122
33. Cheshire C (2007) Selective incentives and generalized information exchange. Soc Psychol Q 70(1):82–100. https://doi.org/10.1177/019027250707000109
34. Donath JS (1999) Identity and deception in the virtual community. In: Smith MA, Kollock P (eds) Communities in cyberspace. Routledge, New York, pp 29–59. https://doi.org/10.4324/9780203194959
35. Pozzana I, Ferrara E (2020) Measuring Bot and Human Behavioral Dynamics. Front Phys 8. https://doi.org/10.3389/fphy.2020.00125
36. Tsvetkova M, García-Gavilanes R, Floridi L, Yasseri T (2017) Even good bots fight: the case of Wikipedia. PLoS ONE 12(2), Article ID 0171774. https://doi.org/10.1371/journal.pone.0171774
37. Verginer L, Vaccario G, Ronzani P (2025) Irrational herding persists in human-bot interactions. Sci Rep 15(1). https://doi.org/10.1038/s41598-025-05534-8
38. Davison AC, Hinkley DV (1997) Bootstrap methods and their application. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511802843

Acknowledgements

Not applicable.

Funding

This work was supported by grants from the CNRS Mission for Interdisciplinarity (project SmartCrowd, AMI S2C3), the CNRS Project 80Prime ALTHEA, and by the Agence Nationale de la Recherche under grant ANR-17-EURE-0010 (Investissements d’Avenir program). T.B. was supported by a doctoral fellowship from the CNRS. R.E. was supported by Marie Curie Core Grant Funding (grant no. 655235–SmartMass).

Author information

Authors and Affiliations

Laboratoire de Physique Théorique, CNRS, Université Toulouse III – Paul Sabatier, Toulouse, 31062, France
Thomas Bassanetti, Maxime Delacroix & Clément Sire
Centre de Recherches sur la Cognition Animale, Centre de Biologie Intégrative, CNRS, Université Toulouse III – Paul Sabatier, Toulouse, 31062, France
Thomas Bassanetti, Ramón Escobedo & Guy Theraulaz
Toulouse School of Economics, CNRS, Toulouse, 31080, France
Stéphane Cezera & Adrien Blanchet
Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911, Leganés, Madrid, Spain
Ramón Escobedo
Institute for Advanced Study in Toulouse, Toulouse, 31080, France
Adrien Blanchet & Guy Theraulaz

Contributions

The conception and design of the work were carried out by TB, CS, and GT. Data acquisition was performed by TB, SC, RE, AB, CS, and GT. TB, RE, CS, and GT contributed to the data analysis, while data interpretation was conducted by TB, CS, and GT. TB and MD were responsible for the creation of new software used in the work. The manuscript was drafted by TB, CS, and GT. All authors reviewed and approved the final version of the manuscript.

Corresponding author

Correspondence to Guy Theraulaz.

Ethics declarations

Ethics approval and consent to participate

All experimental procedures were approved by the Ethics Committee of the University Paul Sabatier (Comité d’éthique de l’Université Toulouse III – Paul Sabatier) and conducted in full compliance with institutional and European regulations for human-subject research. Participants provided written informed consent before taking part. They were informed that the study investigated decision-making and cooperation in digital environments, but in some sessions they were not told that the other four co-players were algorithmic agents. This partial concealment was necessary to avoid biasing behavior and to reproduce realistic online situations in which human users often interact unknowingly with automated bots.

After completion of the experiment, all participants were fully debriefed. They were informed about the true nature of their co-players, the purpose of the study, and the use of their anonymized data. They were also offered the possibility to withdraw their data; none chose to do so.

All data were anonymized at the time of collection. Each participant was assigned a random identification number, and no correspondence between identifiers and personal information was ever created. The only recorded variables were age range, gender, and in-game decisions. Anonymized datasets are stored on encrypted, password-protected servers at the Laboratoire de Physique Théorique de Toulouse and will be retained for fifteen years. Because of complete anonymization, subsequent data deletion is technically impossible, but participants could withdraw at any time during the experiment.

These procedures ensure that the study meets the ethical requirements for the justified use of deception, post-experimental debriefing, and compliance with the General Data Protection Regulation (EU 2016/679).

Consent for publication

All authors have reviewed and approved the manuscript for publication.

Competing interests

The authors declare no competing interests.

Additional information

Handling Editor: Iain Couzin

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

13688_2026_653_MOESM1_ESM.pdf

The article comes with two appendices, one with the supplementary figures and the other one with the supplementary tables. A Supplementary Information file provides the screenshots of the slides shown to participants at the beginning of the session, along with the oral instructions that accompanied them. (PDF 2.8 MB)

Appendices

Appendix A: Figures

Appendix B: Tables

Table 1 Visit strategy parameters of the bots. Values of the parameters used for the visit strategy of the bots (see Eqs. (3) and (4) in Sect. 5.2.1, which defines the model)

Table 2 Rating strategy parameters of the bots. Values of the parameters used for the rating strategy of the bots (see Eq. (6) in Sect. 5.2.2, which defines the model)

Table 3 Behavioral profile of human participants. Percentage of human collaborators, neutrals, and defectors observed among human participants in each condition

Table 4 Visit strategy parameters of Mimic agents. Parameter values used for the visit strategy (see Eqs. (3) and (4) in Sect. 5.2.1, which defines the model) for Mimic agents (collaborator, neutral, and defector) in the conditions with bots and in the condition with five humans. Note that the value of the parameter α correlates positively with the conditions leading to high-quality social information

Table 5 Rating strategy parameters of Mimic agents. Parameter values used for the rating strategy (see Eqs. (6) and (7) in Sect. 5.2.2, which defines the model) for Mimic agents (collaborator, neutral, and defector) in the conditions with bots and in the condition with five humans

Table 6 Parameter values of the model for predicting individual behavioral profiles. Values of the regression parameters for the fraction of collaborators (c) and defectors (d) and error E after optimization of the PI and the PIR models (see Eqs. (10) and (11) in Sect. 5.3, which defines the models)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Bassanetti, T., Cezera, S., Delacroix, M. et al. Stigmergic influence of simple bots on human cooperation in digital environments. EPJ Data Sci. 15, 49 (2026). https://doi.org/10.1140/epjds/s13688-026-00653-2

Received: 09 December 2025
Accepted: 30 March 2026
Published: 13 April 2026
Version of record: 19 May 2026
DOI: https://doi.org/10.1140/epjds/s13688-026-00653-2
