nevo.core.basal_ganglia
Basal Ganglia Operator Selection
This module implements neuromorphic operator selection using basal ganglia circuits in Nengo, with pluggable Temporal Difference learning for adaptive value estimation.
- class UtilityFunction(name, function, initial_weight=1.0)
Bases: object
Defines a utility function for an operator.
Maps state features to a scalar utility value. Higher utility means the operator is better suited to the current state.
- utility_levy_flight(x)
LevyFlight utility: high when stuck and not converged.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_differential_evolution(x)
DifferentialEvolution utility: high when diversity exists.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_particle_swarm(x)
ParticleSwarm utility: high when improving and converging.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_spiral(x)
SpiralOptimisation utility: high when highly converged.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_random_search(x)
RandomSearch utility: high when stuck, baseline exploration.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_local_random_walk(x)
LocalRandomWalk utility: high when converging, need local refinement.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_gravitational_search(x)
GravitationalSearch utility: high when diversity exists, need directed exploration.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_firefly(x)
FireflyAlgorithm utility: high when moderate convergence, need attraction.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_central_force(x)
CentralForce utility: high when need strong directional bias.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_genetic_crossover(x)
GeneticCrossover utility: high when diversity exists, want recombination.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_genetic_mutation(x)
GeneticMutation utility: high when converging too fast, need diversity.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_simulated_annealing(x)
SimulatedAnnealing utility: high when need controlled exploitation.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_harmony_search(x)
HarmonySearch utility: high when need memory-guided search.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_tabu_search(x)
TabuSearch utility: high when stuck in local optima.
Input: [diversity, improvement_rate, convergence]
- Return type:
- utility_neuromorphic_exploration(x)
Neuromorphic exploration utility: prefer when progress is low or diversity drops.
- Return type:
- utility_neuromorphic_exploitation(x)
Neuromorphic exploitation utility: prefer when converged and still improving.
- Return type:
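To make the shared input convention concrete, the sketch below defines two hypothetical utility functions over the feature vector [diversity, improvement_rate, convergence] and ranks them for two states. The function names, the formulas, and the assumption that every feature lies in [0, 1] are illustrative only; they are not the formulas nevo uses for its built-in utilities.

```python
from typing import Callable, Dict, List, Sequence

# Feature order follows the documented input: [diversity, improvement_rate, convergence].
# Assumption for this sketch: every feature is normalised to [0, 1].
State = Sequence[float]


def stuck_and_unconverged(x: State) -> float:
    """Hypothetical Levy-flight-style utility: high when progress is low and the search has not converged."""
    _, improvement_rate, convergence = x
    return (1.0 - improvement_rate) * (1.0 - convergence)


def highly_converged(x: State) -> float:
    """Hypothetical spiral-style utility: grows with convergence."""
    return x[2]


# A mapping of this shape matches the documented Dict[str, Callable] type;
# the keys used here are made up for the example.
utilities: Dict[str, Callable[[State], float]] = {
    "stuck_and_unconverged": stuck_and_unconverged,
    "highly_converged": highly_converged,
}


def rank_operators(x: State) -> List[str]:
    """Return utility names ordered from most to least suitable for state x."""
    return sorted(utilities, key=lambda name: utilities[name](x), reverse=True)


stuck_state = [0.6, 0.05, 0.1]       # diverse, barely improving, not converged
converged_state = [0.1, 0.2, 0.95]   # little diversity, almost converged

ranking_when_stuck = rank_operators(stuck_state)
ranking_when_converged = rank_operators(converged_state)
```

Under these made-up formulas the exploratory utility ranks first for the stuck state and the convergence-driven one ranks first for the converged state, which is the qualitative behaviour the descriptions above suggest.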
- class BasalGangliaSelector(operators, utility_functions=None, neurons_per_ensemble=100, epsilon=0.1, learning_rate=0.1, gamma=0.99, lambda_coeff=0.0, learning_rule=None, value_model=None, td_enabled=True)
Bases: object
Basal ganglia-based operator selection network with modular TD learning.
Implements Winner-Take-All (WTA) selection of operators based on state-dependent utility functions and learned value estimates via Temporal Difference learning (TD(0) or TD(λ)).
- __init__(operators, utility_functions=None, neurons_per_ensemble=100, epsilon=0.1, learning_rate=0.1, gamma=0.99, lambda_coeff=0.0, learning_rule=None, value_model=None, td_enabled=True)
- Parameters:
operators (List[Operator]) – List of available operators
utility_functions (Dict[str, Callable], optional) – Custom utility functions (uses defaults if None)
neurons_per_ensemble (int) – Neurons per ensemble in basal ganglia
epsilon (float) – Epsilon-greedy exploration rate (0.0-1.0)
learning_rate (float) – TD learning rate α
gamma (float) – Discount factor γ applied to future rewards
lambda_coeff (float) – TD(λ) parameter λ (0.0 = TD(0), 1.0 = Monte Carlo)
learning_rule (LearningRule, optional) – Pluggable learning rule (default: SimpleTDRule)
value_model (ValueModel, optional) – Pluggable value model (default: LinearValueModel)
td_enabled (bool) – Enable TD learning (vs. basic utility weight adaptation)
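The roles of learning_rate, gamma and lambda_coeff can be illustrated with a tabular TD(λ) update that keeps one value and one eligibility trace per operator. This is a minimal sketch of the textbook rule with accumulating traces, not the code behind SimpleTDRule; the function name td_lambda_update and the choice to pass the bootstrap value in as an argument are assumptions made for the example.

```python
from typing import List


def td_lambda_update(
    values: List[float],
    traces: List[float],
    last_op: int,
    reward: float,
    bootstrap: float,
    learning_rate: float = 0.1,
    gamma: float = 0.99,
    lambda_coeff: float = 0.0,
) -> float:
    """Apply one TD(lambda) step in place and return the TD error.

    values[i] is the value estimate of operator i and traces[i] its
    eligibility trace. bootstrap is the value estimate used for the next
    step; how nevo chooses it is not documented here, so the caller supplies it.
    """
    td_error = reward + gamma * bootstrap - values[last_op]
    # Decay every trace, then mark the operator that was just executed.
    for i in range(len(traces)):
        traces[i] *= gamma * lambda_coeff
    traces[last_op] += 1.0
    # Each operator is credited in proportion to its trace.
    for i in range(len(values)):
        values[i] += learning_rate * td_error * traces[i]
    return td_error


# lambda_coeff = 0.0: only the last operator is updated (TD(0)).
values_td0 = [0.0, 0.0, 0.0]
traces_td0 = [0.0, 0.0, 0.0]
td_lambda_update(values_td0, traces_td0, last_op=0, reward=0.0, bootstrap=0.0)
td_lambda_update(values_td0, traces_td0, last_op=1, reward=1.0, bootstrap=0.0)

# lambda_coeff = 0.5: the earlier operator still receives part of the credit.
values_tdl = [0.0, 0.0, 0.0]
traces_tdl = [0.0, 0.0, 0.0]
td_lambda_update(values_tdl, traces_tdl, last_op=0, reward=0.0, bootstrap=0.0, lambda_coeff=0.5)
td_lambda_update(values_tdl, traces_tdl, last_op=1, reward=1.0, bootstrap=0.0, lambda_coeff=0.5)
```

With λ = 0 the reward changes only the value of the operator that just ran, while a larger λ spreads part of the same TD error to operators that ran earlier.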
- build_network(model, state_ensemble)
Build basal ganglia selection network.
- Parameters:
model (nengo.Network) – Parent Nengo network
state_ensemble (nengo.Ensemble) – State feature ensemble (3D: diversity, improvement, convergence)
- Returns:
selected_operator_ens – One-hot encoding of selected operator
- Return type:
- select_operator(operator_selection, current_best_fitness)
Select operator using Nengo basal ganglia output + epsilon-greedy policy.
When td_enabled=True, TD(0)/TD(λ) value estimates bias the utility weights that feed the Nengo BG network, so learned knowledge flows back through the neuromorphic circuit rather than bypassing it.
Decision flow
1. Compute reward from fitness improvement (after last operator executed).
2. Update TD values for the last operator using reward + bootstrap.
3. Add TD value bias to utility weights (scales Nengo BG input).
4. Read Nengo thalamus output (operator_selection) as action scores.
5. Epsilon-greedy: random with prob ε, else argmax of BG output.
- Parameters:
operator_selection (np.ndarray) – Thalamus output from Nengo BG network (n_operators,)
current_best_fitness (float) – Current best fitness value
- Returns:
operator – Selected operator
- Return type:
Operator
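The decision flow of select_operator can be sketched in plain Python as follows. The class SelectorSketch is a hypothetical stand-in, not the nevo implementation. It assumes a minimisation problem (reward is the previous best fitness minus the current one), a TD(0) update that bootstraps from the largest current value, and an additive bias of the form 1 + value; none of these choices is confirmed by the documentation above.

```python
import random
from typing import List, Optional, Sequence


class SelectorSketch:
    """Hypothetical illustration of the documented decision flow."""

    def __init__(self, n_operators: int, epsilon: float = 0.1,
                 learning_rate: float = 0.1, gamma: float = 0.99,
                 seed: Optional[int] = None) -> None:
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.gamma = gamma
        self.values: List[float] = [0.0] * n_operators
        self.utility_weights: List[float] = [1.0] * n_operators
        self.last_op: Optional[int] = None
        self.last_fitness: Optional[float] = None
        self.rng = random.Random(seed)

    def select(self, operator_selection: Sequence[float],
               current_best_fitness: float) -> int:
        if self.last_op is not None and self.last_fitness is not None:
            # 1. Reward from fitness improvement (assumes lower fitness is better).
            reward = self.last_fitness - current_best_fitness
            # 2. TD(0) update for the last operator; bootstrapping from the
            #    best current value is an assumption of this sketch.
            target = reward + self.gamma * max(self.values)
            self.values[self.last_op] += self.learning_rate * (
                target - self.values[self.last_op])
            # 3. Bias the utility weights with the learned values. In a real
            #    network these would scale the BG input; here they are only stored.
            self.utility_weights = [1.0 + v for v in self.values]
        # 4-5. Treat the thalamus output as action scores, then epsilon-greedy.
        if self.rng.random() < self.epsilon:
            choice = self.rng.randrange(len(operator_selection))
        else:
            choice = max(range(len(operator_selection)),
                         key=lambda i: operator_selection[i])
        self.last_op = choice
        self.last_fitness = current_best_fitness
        return choice


selector = SelectorSketch(n_operators=3, epsilon=0.0)
first_choice = selector.select([0.1, 0.9, 0.2], current_best_fitness=10.0)
second_choice = selector.select([0.8, 0.1, 0.1], current_best_fitness=8.0)
```

With epsilon set to 0.0 the sketch is deterministic: the first call picks the operator with the largest score, and the second call first credits that operator for the fitness drop from 10.0 to 8.0 before choosing again.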
- begin_episode()
Reset for new episode.
Call this at the start of optimisation to initialise TD learning.
- set_td_lambda(lambda_coeff)
Adjust TD(λ) parameter dynamically.
- Parameters:
lambda_coeff (float) – New λ value (0.0 = TD(0), 1.0 = Monte Carlo)
- set_learning_rule(learning_rule)
Replace learning rule on the fly.
- Parameters:
learning_rule (LearningRule) – New learning rule instance
- set_value_model(value_model)
Replace value model on the fly.
- Parameters:
value_model (ValueModel) – New value model instance
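One way to picture a pluggable value model is an object that predicts a value from state features and adjusts itself from a TD error. The interface below (ValueModelLike with predict and update) and the class LinearSketch are assumptions made for illustration; the actual ValueModel and LinearValueModel interfaces are not documented in this section and may differ.

```python
from typing import List, Protocol, Sequence


class ValueModelLike(Protocol):
    """Hypothetical interface; the real ValueModel API may differ."""

    def predict(self, features: Sequence[float]) -> float: ...

    def update(self, features: Sequence[float], td_error: float,
               learning_rate: float) -> None: ...


class LinearSketch:
    """Linear value estimate V(x) = w . x, trained by a gradient step on the TD error."""

    def __init__(self, n_features: int) -> None:
        self.weights: List[float] = [0.0] * n_features

    def predict(self, features: Sequence[float]) -> float:
        return sum(w * f for w, f in zip(self.weights, features))

    def update(self, features: Sequence[float], td_error: float,
               learning_rate: float) -> None:
        # For a linear model the gradient of V with respect to w is the feature vector.
        for i, f in enumerate(features):
            self.weights[i] += learning_rate * td_error * f


def train_step(model: ValueModelLike, features: Sequence[float],
               target: float, learning_rate: float = 0.1) -> float:
    """Move the model's prediction for features towards target; return the TD error."""
    td_error = target - model.predict(features)
    model.update(features, td_error, learning_rate)
    return td_error


model = LinearSketch(n_features=3)
features = [0.5, 0.2, 0.1]  # [diversity, improvement_rate, convergence]
error = train_step(model, features, target=1.0)
prediction_after = model.predict(features)
```

Because train_step depends only on the two methods of the assumed interface, any object providing them could be substituted, which is the kind of swap that replacing the value model at run time implies.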
- get_td_values()
Get current TD-learned values for all operators.
- Returns:
values – Value estimates (only if TD learning enabled)
- Return type:
np.ndarray