The objective of this work is to create an optimal schedule for HVAC operation that reduces cost while satisfying the constraints of both the homeowner and the equipment, using model-free Reinforcement Learning (RL)-based optimization. We address this optimization by developing an initial learning testbed and implementing RL techniques in a real home. Our preliminary results show a 17% reduction in total cost and a 15% reduction in power utilization using our RL-based HVAC model, RL-HEMS.