## Homework 5 - A* Maze Solving with pthread

Computer Architecture I   ShanghaiTech University
HW4   HW5   HW6

### A* Path Searching Algorithm

The A* (A-star) path searching algorithm is a widely-used heuristic algorithm in path-finding problems. By enhancing the traditional Dijkstra with a heuristic estimation h, it performs incredibly well for solving mazes with given start point and goal. Heuristic h(v) quantifies our estimation of how far should vertex v be from goal. Since it is heuristic, we cannot guarantee to obtain an optimal solution (i.e. the shortest path), unless h satisfies both the following two conditions:

1. It is Consistent: h(goal) = 0, and for each neighbor n of v, h(v) ≤ d(v, n) + h(n);
2. It is Admissible: For any vertex v, h(v) ≤ actual length of the shortest path from v to goal.
Since we are trying to be perfect, we require an optimal solution in this homework.

### Minecraft Maze Setup

Hope you love the great game Minecraft (我的世界, )! We will be dealing with some lovely Minecraft-like block mazes in this homework. A block maze indicates that the maze is a rectangle made up of square blocks, just like those you can build in Minecraft. Either a wall or a blank occupies a square block.

#### Input

An example of the input file `example.txt` contains:

``` 21 21 ##################### @   #   # #         # ### # # # ####### ### # #   # #   #   #   # # # ####### # # # # # #   #         # # # # ### # # ######### # # # #   #           # # # # ####### ####### # #       #   #     # # ##### # ##### ####### #   # # #     #     # ### # # # ##### ### # # #   # # #   #   # # # # ##### ### ##### # # # #     #   # #   # # # # ##### # # ### # # # #   # # #   #   # # # # ### ### # # # # #             #   # % ##################### ```

Explanation of the input file format:

• The first line contains two numbers `r` and `c` separated by a space, indicating the number of rows and columns of the maze (including the surrounding wall).
• Then there are `r` lines, each containing `c` chars that describe the block maze:
• `#` indicates a wall, which you cannot step on or go through. The maze will always be surrounded by walls as in the example, leaving only a start point and a goal point.
• `@` stands for the start point. For simplicity, it will always be at position (1, 0).
• `%` stands for the goal. For simplicity, it will always be at position (rows - 2, cols - 1).
• ` ` stands for a blank block, which you can step on.
• The maze in guaranteed to have at least one solution, so there is always a shortest path.

#### Your Mission

Imagine you are playing in a maze game server! Your journey begins at the start point `@`, and you MUST find a shortest path all the way to the end point `%`. There are several extra limitations on the game rule:

• The game server limits a player's movement only to four directions (-_-|). You are only allowed to move left, right, up, or down (to a blank block, of course) at each move. NO diagonal movement.
:
```   *       *       ***   ```
×:
```   *       *        **   ```
• Each step towards a neighbor takes exactly the same cost 1.
• Your sixth sense is not that strong, so you cannot give a complicated heuristic estimation. In fact you only hold a compass () in your hand, which tells you your current coordinate and the coordinate of the goal point. Therefore the heuristic function MUST be the Manhattan Distance: h(v) = |vx - goalx| + |vy - goaly|.
• You MUST find an optimal shortest path. If there are multiple possible shortest paths, any of them will be accepted.

We have implemented a simple (not well-written) sequential A* algorithm. Please click here to download the framework. Based on this simple framework, you should conduct optimizations and parallelizations with pthread to accelerate the algorithm. Detailed requirements have been narrated in the comments, but I will emphasize them here once again:

1. You CANNOT modify `compass.h` and `Makefile`. We will replace them with a clean version at your submission. That means you must use the Manhattan Distance as the heuristic, and can only introduce pthread for parallelization. Other parallelization techniques will be allowed in projects, but not in this homework.
2. Feel free to modify any other files to serve your optimization.
3. Build the target with `make`; Run the A* algorithm by `./astar <input-file.txt>`; Clean the build with `make clean`.
4. We give you two sample mazes. `maze-231.txt` is a small maze, whose shortest path should contain `231` steps. `maze-4821.txt` is a relatively huge maze, whose shortest path should contain `4821` steps. Test your implementation on the smaller maze first, then tune its performance with the larger one.

#### Output

Your program will not generate anything to the standard output. Instead, it should write your calculated steps along the shortest path back to the input file. Each step should be printed as an asteroid `*`.

For example, after running the algorithm on the above-mentioned `example.txt`, this file now contains:

``` 21 21 ##################### @***#   # #         # ###*# # # ####### ### # #*  # #   #   #   # # #*####### # # # # # #  *#         # # # # ###*# # ######### # # # #*  #           # # # #*####### ####### # #  ***  #   #     # # #####*# ##### ####### #   #*# #     #     # ### #*# # ##### ### # # #***# # #   #   # # # #*##### ### ##### # # #*#     #   # #   # # #*# ##### # # ### # # #*#   # # #***#***# # #*# ### ###*#*#*#*# #  ***********#***#*% ##################### ```

If the maze has multiple possible shortest paths (which is very likely), any of them will be accepted. But be sure you ONLY print one path to the file. We will test the validity of you path by tracking through it, making sure it is a single path, and counting the number of `*`s.

### Submission and Grading

Submit your work to Autolab with an archive `hw5.tar`, which contains exactly the following files:

``` hw5.tar   |- main.c   |- heap.c   |- heap.h   |- maze.c   |- maze.h   |- node.c   |- node.h ```

We will then put in the clean version `compass.h` and `Makefile`. Due to the poor performance of Autolab server (服 xiǎo 务 shuǐ 器 guǎn), we will only autograde your submission on `100 ~ 2000` sized mazes and give you a score according to correctness. The execution time on Autolab will also be given, but that is only a reference.

Step 1: If you submitted the correct algorithm onto Autolab, meanwhile it makes meaningful uses of multithreading, you get 60% of the total score. (Memory leak check with valgrind is also included)

Step 2: After the autograding deadline, we will run your algorithms on a much stronger and stabler server (configuration given below). We will run it on a huuuuuge maze for several times, and take the average execution time tavg. If you have earned the 60%, the rest 40% will be given based on the performance of your algorithm on the strong server. Fastest 30 students get full 40%, top 31~70 get 30%, top 71~105 get 20%, and the rest get 10%. NOTE: If your program crashes for this huge maze then you get zero in this phase.

### Hints

• The most straight-forward and simple optimization is using two threads and conduct a Bidirectional Search.
• Whether or not you can take advantage of more threads depends on your algorithm. Pay attention to synchronization issues.
• You are allowed to modify everything in the 7 files you submit. So you may consider adopting extra optimizations, such as parallel I/O with the `txt` file. However for this problem scenario, these optimizations may not bring significant speedup, compared to a good and scalable algorithm. It's all up to you.
If you came up with a really incredible algorithm which is accurate and highly parallelizable, please come to TA Guanzhou Hu. We warmly welcome your sparkling ideas and may consider giving you a bonus.

### Server Configurations

Autolab Server (Just for your reference. Tune your performance to the Star Lab Server):

• CPU: Intel Core i5-4590 3.3 GHz, 4 cores (4 threads) Details here
• Memory: 4GB
Star Lab Server (Used for final speed test):
• CPU: Intel Xeon E5-2650 v3 2.3 GHz, 10 cores (20 threads) Details here
• L1 cache 640KB per core
• L2 cache 2560KB per core
• L3 cache 25600KB
• Memory: 32GB