{"creation_time": 1728912450, "last_updated": 1728912450, "name": "contextual bandit", "definition_text": "In a contextual bandit task, a decision maker chooses between multiple alternatives, where each choice's probability of reward is influenced by the current context or situation. The context consists of a set of features or conditions that provide information about the environment at each trial. The participant's goal is to maximize total rewards across trials by learning which actions lead to the best outcomes for different contexts. This involves adapting decision-making strategies to match the specific circumstances of each trial, balancing exploration of new actions with exploitation of known, rewarding choices.", "id": "tsk_3ICXMd8R3ekcb", "type": "task", "conditions": [], "concepts": [], "indicators": [], "external_datasets": [], "implementations": [], "citation": [], "contrasts": [], "batteries": [], "disorders": []}