Learning in structured MDPs with convex cost functions : improved regret bounds for inventory management
Year of publication: | 2022 |
---|---|
Authors: | Agrawal, Shipra ; Jia, Randy |
Published in: | Operations research. - Linthicum, Md. : INFORMS, ISSN 1526-5463, ZDB-ID 2019440-7. - Vol. 70.2022, 3, p. 1646-1664 |
Subject: | censored demand | inventory control problem | Market Analytics and Revenue Management | online convex optimization | regret bounds | reinforcement learning | Theory | Inventory model | Revenue management | Warehouse management | Cost function | Learning process | Mathematical programming | Operations research |
-
Ordering, pricing, and lead-time quotation under lead-time and demand uncertainty
Wu, Zhengping, (2012)
-
Gupta, Vishal Kumar, (2019)
-
Dynamic pricing and replenishment : optimality, bounds, and asymptotics
Xiao, Yongbo, (2018)
-
Optimistic posterior sampling for reinforcement learning : worst-case regret bounds
Agrawal, Shipra, (2023)
-
Parimutuel betting on permutations
Agrawal, Shipra, (2008)
-
Equilibrium in prediction markets with buyers and sellers
Agrawal, Shipra, (2010)