Attention Solves Your TSP, Approximately

Abstract

We propose a framework for solving combinatorial optimization problems of which the output can be represented as a sequence of input elements. As an alternative to the Pointer Network, we parameterize a policy by a model based entirely on (graph) attention layers, and train it efficiently using REINFORCE with a simple and robust baseline based on a deterministic (greedy) rollout of the best policy found during training. We significantly improve over state-of-the-art results for learning algorithms for the 2D Euclidean TSP, reducing the optimality gap for a single tour construction by more than 75% (to 0.33%) and 50% (to 2.28%) for instances with 20 and 50 nodes respectively.

Date

Nov 19, 2018

13:30 — 15:00

Event

BeNeRL workshop on Reinforcement Learning

Location

Delft University of Technology

Wouter Kool

Machine Learning & Optimization

Scientist & engineer with a PhD in machine learning and a passion for (combinatorial) optimization.