We propose a minimax concave penalized multi-armed bandit algorithm under a generalized linear model (G-MCP-Bandit) for a decision-maker facing high-dimensional data in an online learning and decision-making process. We demonstrate that the G-MCP-Bandit algorithm asymptotically achieves the...
Persistent link: https://www.econbiz.de/10012897096
Persistent link: https://www.econbiz.de/10011975803
Persistent link: https://www.econbiz.de/10012039968
Persistent link: https://www.econbiz.de/10003839141