Gem #153: Multicore Maze Solving, Part 1
by Pat Rogers —AdaCore
Let’s get started...
This Gem series introduces the "amazing" project included with the GNAT Pro compiler examples. The project is so named because it involves maze solving (as in "Joe and Julie go a-mazing"). But these aren’t typical mazes that have only one solution. These mazes can have many solutions, tens of thousands, for example. The point is to find all of them as quickly as possible. Therefore, we solve the mazes concurrently, applying multiple CPUs in a divide-and-conquer design. In this first Gem we introduce the program and explain the approach.
We actually have two programs for maze solving: one sequential and one concurrent. Based on the notion of a mouse solving a maze, the sequential program is named mouse and the concurrent version is named – you guessed it – mice. Both are invoked from the command line with required and optional switches. The available switches vary somewhat between the two, but in both cases you can either generate and solve a new maze or re-solve a maze previously generated.
When generating a new maze you have the option to make it "perfect", that is, to have only one exit. Otherwise the maze will have an unknown number of solutions. For our purposes we use mazes that are not perfect, and in fact the number of solutions depends solely on the size of the mazes and the random way in which they are generated.
Switches "-h" and "-w" allow you to specify the height and width of a new maze, but other than that their layout is randomly determined. In addition, the concurrent program allows you to specify the total number of tasks available for solving the maze using the "-t" switch. This switch is useful for experimentation, for example in determining the effect of having a great many tasks, or in determining the optimal number of tasks relative to the number of processors available. There are four tasks available by default. The concurrent program will run on as many or as few processors as are available.
Finally, you can control whether a maze and its solutions are displayed. At first glance this might seem a strange option, but displaying them makes the program heavily I/O-bound and serialized, hiding the benefits of parallelism and making it difficult to determine the effects of design changes. Disabling the display is achieved via the "-q" switch.
After either program solves a new maze, you are asked whether you want to keep it. If so, you specify the file name and the program then writes it out as a stream. To re-solve an existing maze you specify both the "-f" switch and the file’s name.
As the programs execute they display the maze, the unique solutions through the maze, and the running total of the number of solutions discovered (unless the "-q" switch is applied). Currently there are two kinds of "console" supported for depicting this information. The selection is determined when building the executables, under the control of a scenario variable having possible values "Win32" and "ANSI".( Terminals supporting ANSI escape sequences are common on Linux systems, so there is a wide range of supported machines.
Now that you know what the programs can do, let’s see how they do it.
The sequential mouse program illustrates the fundamental approach. As it traverses the maze, it detects junctions in the path where more than one way forward is possible. There may be three ways forward, in fact, but not four because that would involve going back over the location just visited. Hence, at any junction the mouse saves all but one of the other alternative locations, along with the potential solution as it is currently known, and then pursues that one remaining lead. Whenever the mouse can go no further – either because it has encountered a dead end or because it has found the maze exit – it restores one of the previously saved alternative location/solution pairs and proceeds from there. The program is finished when the mouse can go no further and no previous alternatives are stored.
The mice program uses the same basic approach, except it does it concurrently. A "searcher" task type implements the sequential mouse behavior, but instead of storing the alternatives at junctions, it assigns a new searcher task to each of the alternatives. These new searchers continue concurrently (or in parallel) with the searcher that assigned them, themselves assigning new searchers at any junctions they encounter. Only when no additional searcher task is available does any given searcher store alternative leads for later pursuit. If it does restore a lead, it uses the same approach at any further junctions encountered.
A new searcher task may be unavailable when requested, because we use a pool of searcher instances, with a capacity controlled by the command-line parameter. When no further progress is possible, a searcher task puts itself back into this pool for later assignment, so additional searchers may be available when restored leads are pursued. The main program waits for all the searchers to be quiescent, waiting in the pool for assignments, before finishing.
The body for the Searcher task type (declared in package Search_Team) implementing this behavior follows:
task body Searcher is Path : Traversal.Trail; The_Maze : constant Maze.Reference := Maze.Instance; Current_Position : Maze.Position; Myself : Volunteer; Unsearched : Search_Leads.Repository; begin loop select accept Start (My_ID : Volunteer; Start : Maze.Position; Track : Traversal.Trail) do Myself := My_ID; Current_Position := Start; Path := Track; end Start; or terminate; end select; Searching : loop Pursue_Lead (Current_Position, Path, Unsearched); if The_Maze.At_Exit (Current_Position) then Traversal.Threaded_Display.Show (Path, On => The_Maze); end if; exit Searching when Unsearched.Empty; -- Go back to a position encountered earlier that -- could not be delegated at the time. Unsearched.Restore (Current_Position, Path); end loop Searching; Pool.Return_Member (Myself); end loop; end Searcher;
The Searcher task first suspends, awaiting either initiation, to start pursuing a lead, or termination. The rendezvous thus provides the initial location and the currently known solution path. The parameter My_Id is a reference to that same task and is used by the task to return itself back into the pool, when searching is finished. The accept body simply copies these parameters to the local variables. The other local variables include a reference to the maze itself (we use a singleton for that) and the repository of unsearched leads, used to store position/solution pairs for future pursuit.
As the task searches for the exit, procedure Pursue_Lead delegates new searcher tasks to alternatives when junctions are encountered. The procedure returns when no further progress can be made on a given lead. In effect we "flood" the maze with searcher tasks, so this is a divide-and-conquer design typical of classical concurrent programming.
In the next Gem in this series, we will describe a fundamental implementation change made very recently (September 2013) to the original concurrent program. This change solved a critical performance bottleneck that was not present when the original program was first deployed in the 1980s, illustrating one of the fundamental differences between traditional multiprocessing and modern multicore programming.
As mentioned, the "amazing" project is supplied with the GNAT Pro native compiler. Look for it in the share/examples/gnat/amazing directory located under your compiler’s root installation. Note that the design change will appear in future releases of the compiler.