The idea of learning heuristics for combinatorial optimization problems using Neural Networks is promising, and we push this idea further towards practical implementation. We propose a new model and effective way of training by a Reinforcement Learning algorithm, which significantly improves over recent learned heuristics for the Travelling Salesman Problem (TSP), getting close to optimal results for problems up to 100 nodes, and we learn strong heuristics for two variants of the Vehicle Routing Problem (VRP), the Orienteering Problem (OP) and (a stochastic variant of) the Prize Collecting TSP (PCTSP), outperforming a wide range of baselines and getting results close to highly optimized and specialized algorithms.