Publication: Q-Learning for Continuous State and Action MDPs under Average Cost Criteria.