Learning while Trading: Experimentation and Coasean Dynamics (new version coming soon)
Best Graduate Paper Award at the Lisbon Meetings in Game Theory and Applications 2018; Shortlisted for the LAGV Prize at ASSET 2018
Abstract. I study a dynamic bilateral bargaining problem with incomplete information where better outside opportunities may arrive during negotiations. Gains from trade are uncertain. In a good-match market environment, outside opportunities are not available. In a bad-match market environment, superior outside opportunities stochastically arrive for either or both parties. The two parties begin their negotiations with the same belief on the type of the market environment. As arrivals are public information, learning about the market environment is common. One party, the seller, makes price offers at every instant to the other party, the buyer. The seller has no commitment power and the buyer is privately informed about his own valuation. This gives rise to rich bargaining dynamics. In equilibrium, there is either an initial period with no trade or trade starts with a burst. Afterward, the seller screens out buyers one by one as uncertainty about the market environment unravels. Delay is always present, but it is inefficient only if valuations are interdependent. Whether prices increase or decrease over time depends on which party has a higher option value of learning. The seller exercises market power. In particular, when the seller can clear the market in finite time at a positive price, prices are higher than the competitive price. However, market power need not be at odds with efficiency. Applications include durable-good monopoly without commitment, wage bargaining in markets for skilled workers, and takeover negotiations.
Abstract. I study social learning in networks where rational agents act in sequence, observe the choices of their connections, and acquire private information via costly sequential search. I characterize perfect Bayesian equilibria of the model by linking individual search policies to the probability that agents select the best action. The information structure of the model precludes information aggregation via martingale convergence arguments. If (and only if) search costs are not bounded away from zero, an improvement principle holds even though the informational environment significantly differs from that of the standard model with exogenous private signals. I leverage the improvement principle to show that asymptotic learning obtains in sufficiently connected networks where information paths are identifiable. When search costs are bounded away from zero, even a weaker notion of long-run learning fails, except in ad hoc network topologies. Networks where agents observe the choices of random numbers of immediate predecessors share many equilibrium properties with the complete network, including the rate of convergence and the probability of wrong herds. Transparency of past histories has short-run, but not long-run, implications for welfare and efficiency. The simple policy intervention of letting agents observe the relative fraction of previous choices reduces inefficiencies and welfare losses.
Work in Progress
Abstract. We propose a simple estimation strategy when data on strategic interaction are interpreted as the long-run result of a history of game plays. Players interact repeatedly in an incomplete information game, possibly while learning how to play in such a game. We remain agnostic on the details of the learning process and only impose a minimal behavioral assumption describing an optimality condition for the long-term outcome of players’ interaction. In particular, we assume that play satisfies a property of “asymptotic no regret” (ANR). This property requires that the time average of the counterfactual increase in past payoffs, had different actions been played, becomes approximately zero in the long run. A large class of well-known algorithms for the repeated play of the incomplete information game satisfies the ANR property. We show that, under the ANR assumption, it is possible to partially identify the structural parameters of players’ payoff functions. We establish our result in two steps. First, we prove that the time average of play that satisfies ANR converges to the set of Bayes correlated equilibria of the underlying static game. To do so, we extend to incomplete information environments prior results on dynamic foundations for equilibrium play in static games of complete information. Second, we show how to use the limiting model to obtain consistent estimates of the parameters of interest.
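As an illustration of the ANR property described in the abstract above, the following minimal sketch computes the time-averaged counterfactual payoff gain for one player from a recorded history of play. It rests on assumptions that are not taken from the paper: regret is conditioned on the player's type and on the action actually played, the history is a list of `(own_type, own_action, opponent_action)` records, and the names `average_regrets`, `max_average_regret`, and `payoff` are hypothetical. The paper's exact definition may differ.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Sequence, Tuple

# One round of play: (own type, own action, opponent action). Hypothetical layout.
Record = Tuple[Hashable, Hashable, Hashable]
Key = Tuple[Hashable, Hashable, Hashable]  # (type, action played, alternative action)


def average_regrets(
    history: Sequence[Record],
    actions: Sequence[Hashable],
    payoff: Callable[[Hashable, Hashable, Hashable], float],
) -> Dict[Key, float]:
    """Time-averaged payoff gain from replacing `played` with `alt`
    in every past round where the player had that type and played `played`."""
    n_rounds = len(history)
    if n_rounds == 0:
        raise ValueError("history must contain at least one round")
    totals: Dict[Key, float] = defaultdict(float)
    for own_type, played, opp in history:
        realized = payoff(own_type, played, opp)
        for alt in actions:
            if alt != played:
                totals[(own_type, played, alt)] += payoff(own_type, alt, opp) - realized
    # Divide by the full horizon, so rarely played actions carry little weight.
    return {key: total / n_rounds for key, total in totals.items()}


def max_average_regret(
    history: Sequence[Record],
    actions: Sequence[Hashable],
    payoff: Callable[[Hashable, Hashable, Hashable], float],
) -> float:
    """Largest averaged regret; under the assumed definition, ANR asks that
    this be approximately non-positive as the number of rounds grows."""
    return max(average_regrets(history, actions, payoff).values(), default=0.0)


if __name__ == "__main__":
    # Toy coordination payoff (an assumption): 1 if actions match, else 0.
    def coordination(own_type, own_action, opp_action):
        return 1.0 if own_action == opp_action else 0.0

    toy_history = [("t", "A", "A"), ("t", "A", "B"), ("t", "B", "B"), ("t", "B", "B")]
    print(average_regrets(toy_history, ["A", "B"], coordination))
    print(max_average_regret(toy_history, ["A", "B"], coordination))
```

In this toy history, switching from A to B would have lost one unit in the first round and gained one in the second, so that averaged regret is zero; switching from B to A would have lost in both rounds where B was played.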
Abstract. We study how self-enforcing repeated relationships between a principal and multiple competing agents form and evolve over time in the presence of learning. The principal combines transfers (compensation) and non-monetary tools (replacement threat) to motivate the agents to engage in costly experimentation with innovative activities. Agents know whether innovation opportunities are available in each period, whereas the principal does not. Monitoring is imperfect (and public), but agents’ strategies are private. Successful experimentation reduces the information asymmetry and parties become better able to specify the details of their cooperation in a relational contract. We characterize optimal incentive provision in two distinct market configurations: a deep market, with many agents, and a narrow market, where only two agents are available. Applications include managerial compensation and CEO rotation in innovative industries, and moral hazard problems in political economy.
Abstract. We study the role of diverse learning skills in games of competitive experimentation (e.g., an R&D race). We find that heterogeneous innovation abilities across competitors affect both the type and the magnitude of the inefficiencies that arise in equilibrium. As a consequence, industrial policies targeting inefficiencies in R&D should account for asymmetries in firms’ learning processes; otherwise, they risk being detrimental.