



Material Type: Paper; Professor: Rolfe; Class: Algorithms; Subject: Computer Science; University: Eastern Washington University; Term: Unknown 1989;
Positioning queens on a chessboard is one of the classic problems in mathematics and computer science. This long-standing problem goes back even before Carl Gauss (1777–1855). It is the problem of finding all of the ways to position eight queens on the chessboard so that none of them is under attack by any other. Remember, the queen can move horizontally, vertically, and in the two diagonal directions; for convenience I’ll call the direction down and to the right (and its reverse) the diagonal direction, and the direction up and to the right the antidiagonal direction.

You could approach this problem by looking at all possible ways of placing eight queens in the 64 available cells. There are 64!/56! ways to do that (the permutations of 64 things taken eight at a time), but you don’t have to look at all 1.78E+14 permutations. (If you could figure out a way of generating just the combinations of 64 things taken eight at a time, the number goes down to 4.43E+09.) You can use some natural intelligence because you know that the queens can attack horizontally. That means that you can have only one queen to a row. Because there are eight positions on each of the eight rows, the number of candidate configurations is reduced to 8^8 (1.68E+07), a nasty number but not nearly as nasty as the earlier one. You can, however, toss out massive numbers of these.

As a programming exercise, this is a classical problem solved by backtracking. You start by positioning a queen in the top row of the board. As she sits in her cell, you move to the next row and try positioning a queen there. If you find that a queen above is attacking the queen below, you don’t have to proceed any farther: All board configurations that include this start will be illegal. So you simply move that queen to the next available cell. You back up to the earlier row when you have tried a queen on all available cells of the current row, and then try the next cell in that earlier row for its queen.

Eight queens is a bit challenging to start with, so you can rephrase this as the N-Queens problem: positioning N queens on a grid made up of N rows of N squares each. We know that there is no solution for the two-queens problem: Necessarily, each queen is attacking the other either vertically or diagonally. If you play around with paper and pencil, you’ll see that there is no possible solution for the three-queens problem either; there is necessarily a diagonal attack. So the four-queens problem is the first one beyond the trivial single-queen board that has a solution.

For pseudocode purposes I’m going to use the C and Java convention of numbering rows from zero on up. This means that when you hit row N, you’ve already positioned N queens. To position a queen in row j:
Listing One implements this algorithm in C. The big question is how to perform the check for attack from above, and this is going to be the source of one of the optimizations. In Fundamentals of Computer Algorithms (Computer Science Press, 1978), Ellis Horowitz and Sartaj Sahni gave a simple test based on the assumption that the board above the current position is valid. They chose to represent the board as a one-dimensional array: Each element represents a row on the board, and the number in that element is the column position that the queen occupies. That means that you can easily check for vertical attack: An earlier row has the same column position as your current row. They also noticed that you can easily check for diagonal attack: Two queens share a diagonal or an antidiagonal exactly when the absolute difference of their column positions equals the difference of their row numbers. To check for a valid board[ ] filled from row 0 to row j:
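The row-by-row search and the order-N check described above can be sketched together as one small solution counter. This is a minimal sketch under stated assumptions, not the article's benchmark code: the names valid, place, and count_solutions and the bound MAX_N are hypothetical, and the sketch counts solutions rather than processing each one.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_N 16   /* hypothetical upper bound chosen for this sketch */

/* Order-N check: is the queen in `row` safe from every queen above it?
   board[r] holds the column of the queen in row r. */
static int valid(const int board[], int row)
{
    for (int r = 0; r < row; r++) {
        if (board[r] == board[row])                  /* vertical attack */
            return 0;
        if (abs(board[row] - board[r]) == row - r)   /* diagonal or antidiagonal */
            return 0;
    }
    return 1;
}

/* Try every column in `row`; recurse only while the partial board is valid. */
static long place(int board[], int size, int row)
{
    if (row == size)
        return 1;                    /* all rows filled: one complete solution */
    long found = 0;
    for (int col = 0; col < size; col++) {
        board[row] = col;
        if (valid(board, row))
            found += place(board, size, row + 1);
        /* an invalid cell prunes every configuration built on it */
    }
    return found;
}

/* Count all solutions on an n-by-n board. */
long count_solutions(int n)
{
    int board[MAX_N];
    assert(n >= 1 && n <= MAX_N);
    return place(board, n, 0);
}
```

Calling count_solutions(4) returns 2 and count_solutions(8) returns 92, while the two-queens and three-queens boards both yield 0, in line with the discussion above.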
Timothy is a professor of computer science at Eastern Washington University. He can be contacted at trolfe@ewu.edu.
32 Dr. Dobb’s Journal, May 2005 http://www.ddj.com
Listing Two implements this algorithm in C. You can easily see that the time required for the check increases with the size of the board. In computer jargon, it is an “order-N” algorithm.

In Algorithms and Data Structures (1986), Niklaus Wirth showed that you can perform the check in constant time, provided that you use some additional arrays to hold information. (His first proposal of this technique, however, was in the April 1971 issue of the Communications of the ACM.) For instance, you can have an array indicating which columns have already been filled. To check whether you can position a queen in a particular column, you just look at that cell of the array to see if it is in use. Similarly, you can have additional arrays to hold information about the diagonals and antidiagonals. Along the diagonals, the difference of the row and column subscripts is a constant, while along the antidiagonals the sum of the row and column subscripts is a constant. Listing Three implements Wirth’s algorithm in C.

Wirth’s algorithm dramatically speeds up the processing, requiring only about half the time as compared with the code using Horowitz and Sahni’s order-N validity check.

There is, however, another optimization possible. You’re positioning only one queen on a row because of the horizontal attack. You know exactly the same thing about the columns: For each column there can be only one queen. This means that all successful solutions are just going to be permutations of the column subscripts: Each successive row has one fewer candidate position than the previous row. Thus, the problem space (before backtracking) has come down from 8^8 to 8!, that is, from 1.68E+07 down to 4.03E+04. All you have to do is generate candidate permutations. As a bonus, the validity check becomes a little easier, since you only need to check for diagonal and antidiagonal attacks.
(In Fundamentals of Algorithmics, Gilles Brassard and Paul Bratley take note of this method, but they approach it as examining all 8! possible permutations without combining the method with backtracking during permutation generation.)

You first fill the entire array with all of the column positions (a massively illegal configuration with all the queens lined up along the diagonal), but then you march the available column positions through each row cell. You can do this with swaps: After you have evaluated the initial partial permutation, you swap the value in the front cell with the value in each of the remaining cells in turn, evaluating the resulting partial permutation after each swap. In the end, you have all the values in the same order except that the value from the last cell is now at the front. So you can regenerate the original configuration by doing a circular leftward rotation in the array. (This was discussed in my article “Backtracking Algorithms,” DDJ, May 2004.) To position a queen in row j:
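The swap-and-rotate scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the article's own listing: the names diag_ok, permute, and count_solutions_perm and the bound MAX_N are hypothetical. Because every row holds a distinct column by construction, the check only looks at diagonals and antidiagonals.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_N 16   /* hypothetical upper bound chosen for this sketch */

/* Columns are distinct by construction, so only diagonals need checking. */
static int diag_ok(const int board[], int row)
{
    for (int r = 0; r < row; r++)
        if (abs(board[row] - board[r]) == row - r)
            return 0;
    return 1;
}

/* board[row..size-1] holds the columns not yet used by rows above. */
static long permute(int board[], int size, int row)
{
    if (row == size - 1)              /* last row: only one column is left */
        return diag_ok(board, row);

    long found = 0;
    if (diag_ok(board, row))          /* initial partial permutation */
        found += permute(board, size, row + 1);

    for (int i = row + 1; i < size; i++) {
        /* swap the front cell with each remaining cell in turn */
        int tmp = board[i];
        board[i] = board[row];
        board[row] = tmp;
        if (diag_ok(board, row))
            found += permute(board, size, row + 1);
    }

    /* The last value now sits at the front: rotate left to restore order. */
    int front = board[row];
    for (int i = row + 1; i < size; i++)
        board[i - 1] = board[i];
    board[size - 1] = front;
    return found;
}

/* Count all solutions on an n-by-n board using the permutation vector. */
long count_solutions_perm(int n)
{
    int board[MAX_N];
    assert(n >= 1 && n <= MAX_N);
    for (int i = 0; i < n; i++)
        board[i] = i;                 /* all queens on the main diagonal */
    return permute(board, n, 0);
}
```

The rotation matters for correctness as well as tidiness: each level hands its caller back the same ordering it received, so the caller's own sequence of swaps still visits every remaining column exactly once.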
Listing Four implements this algorithm in C.

This optimization speeds up the processing even more dramatically, requiring only a fifth of the time as compared with the code using the row-filling implementation. The comparison code uses Horowitz and Sahni’s order-N validity check so that the speed-up comes only from the permutation vector optimization. Combining both optimizations, of course, really speeds up the processing.

The final optimizations are to remove the benchmarking superstructure, and then to use inline code for the most frequently performed operations: marking the Boolean arrays for diagonal attack and performing the validity checks based on those Boolean arrays. This roughly doubles the speed.

Figure 1 shows the time required in a benchmarking run on a Dell desktop computer with a 2-GHz Pentium 4 processor for boards of sizes 12 up to 18. The Excel workbook and code are available electronically (see “Resource Center,” page 5). While the figure shows the C language results, the ZIP file also includes the Java implementations of the benchmarking code and of the final optimization code. Note that Figure 1 uses logarithmic scaling for its y-axis, and that the x-axis runs from 12 to 18.
Rejecting Equivalent Solutions

You may want to restrict the acceptable solutions to the unique solutions. Some solutions possess what is called “rotational symmetry”: If you rotate the board through 180 degrees, or perhaps even 90 degrees, you end up with exactly the same configuration. Figure 2 shows two solutions, one with the 180-degree rotational symmetry, the other with the 90-degree rotational symmetry. The numbering in the figure shows the queens that are equivalent upon the rotation.

If a solution does not possess such a rotational symmetry, then successive rotations through 90 degrees generate other solutions that will be discovered as you process the solutions. In addition to the rotations, there is another symmetry operation: reflection in a mirror. By the very nature of the N-Queens problem, a valid solution cannot have mirror symmetry. That means that each valid solution also has mirror images that turn up in the processing. Figure 3 shows the solution for the Five-Queens problem that lacks rotational symmetry (Figure 2 shows the
Figure 1: N-Queens optimization results. [Chart “N-Queens Optimization Results”: x-axis, Number of Queens (12 to 18); y-axis, Seconds; series: No Optimization, Wirth's Validity Check, Permutation Vector, Both Optimizations, Inline Code Optimization.]
Listing Six shows the class Board, which controls the initial board positions that the threads work from and also accumulates the sums for total solutions and unique solutions. The synchronized method nextJob( ) receives the partial results from a thread (receiving zeroes on the first invocation) and sends the column position from which the thread should begin the next job, returning a negative number as the end-of-job message.

Coarser synchronization (waiting for thread completion) is handled by the Java method Thread.join( ). To ease the waiting game, you can daisy-chain thread creation: Each thread generates its own child thread until all required threads are active. The main program can then simply execute child.join( ) and be certain that all threads have completed before it resumes execution, because every thread with a child will itself execute child.join( ) before terminating.

Listing Seven shows the constructor for the class WorkEngine that extends Thread. The child thread creation and start is part of the constructor itself. This allows the run( ) method to contain just the logic to converse with the Board object to work subproblems and then terminate after receiving the end-of-job message and executing child.join( ), if appropriate. Listing Eight shows that method.

Table 2 shows the statistics from a number of runs on a quad-processor Xeon computer under Linux. Since the Xeon processor is itself a dual processor, Linux sees eight available 1.5-GHz processors. From that table, you can easily see some
Description                  Time Required   Ratio With No Optimization
No Optimization              20:59:51         1.
Wirth's Validity Check        8:51:50         2.
Permutation Vector            3:52:36         5.
Both Optimizations            1:54:54        10.
Inline Code Optimization      1:03:28        19.
Without Symmetry Checks       0:58:18        21.

Table 1: Benchmarking results.
. 1 . . . .           . 1 . . .
. . . 2 . .           . . . . 1
. . . . . 3           . . 2 . .
3 . . . . .           1 . . . .
. . 2 . . .           . . . 1 .
. . . . 1 .

Symmetric on          Symmetric on
180-degree rotation   90-degree rotation

Figure 2: Examples of rotational symmetry in solutions.
Figure 3: Set of solutions for N = 5 equivalent by symmetry operations.

Original              Vertical mirror
1 . . . .             . . . . 1
. . 2 . .             . . 2 . .
. . . . 3             3 . . . .
. 4 . . .             . . . 4 .
. . . 5 .             . 5 . . .

90 degree rotation    Antidiagonal mirror
. . . . 1             . . 3 . .
. 4 . . .             5 . . . .
. . . 2 .             . . . 2 .
5 . . . .             . 4 . . .
. . 3 . .             . . . . 1

180 degree rotation   Horizontal mirror
. 5 . . .             . . . 5 .
. . . 4 .             . 4 . . .
3 . . . .             . . . . 3
. . 2 . .             . . 2 . .
. . . . 1             1 . . . .

270 degree rotation   Diagonal mirror
. . 3 . .             1 . . . .
. . . . 5             . . . 4 .
. 2 . . .             . 2 . . .
. . . 4 .             . . . . 5
1 . . . .             . . 3 . .
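The symmetry operations shown in the figures can be sketched on the one-dimensional board representation. This is a hedged illustration, not the article's symmetry-rejection code: the names rotate90, mirror, and rotational_symmetry and the bound MAX_N are hypothetical. A clockwise quarter turn moves the queen at (row, column) to (column, N-1-row), and a vertical mirror replaces each column c with N-1-c.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 16   /* hypothetical upper bound chosen for this sketch */

/* Clockwise quarter turn: the queen at (r, in[r]) moves to (in[r], n-1-r). */
void rotate90(const int in[], int out[], int n)
{
    for (int r = 0; r < n; r++)
        out[in[r]] = n - 1 - r;
}

/* Vertical mirror: reverse the column order within every row. */
void mirror(const int in[], int out[], int n)
{
    for (int r = 0; r < n; r++)
        out[r] = n - 1 - in[r];
}

/* Returns 90 if the board maps onto itself under a quarter turn,
   180 if only under a half turn, and 0 if it has no rotational symmetry. */
int rotational_symmetry(const int board[], int n)
{
    int quarter[MAX_N], half[MAX_N];
    assert(n >= 1 && n <= MAX_N);

    rotate90(board, quarter, n);
    if (memcmp(board, quarter, n * sizeof board[0]) == 0)
        return 90;

    rotate90(quarter, half, n);       /* two quarter turns make a half turn */
    if (memcmp(board, half, n * sizeof board[0]) == 0)
        return 180;
    return 0;
}
```

As an example, the six-queens board with columns {1, 3, 5, 0, 2, 4} reports 180, the five-queens board {1, 4, 2, 0, 3} reports 90, and the five-queens board {0, 2, 4, 1, 3} reports 0, so its rotations and mirror images are all distinct boards.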
Listing One
void Nqueens (int Board[], int Trial[], int Size, int Row)
{
   if (Row == Size)
      Process(Board, Size);
   else
      for (int Col = 0; Col < Size; Col++)
      {
         Board[Row] = Col;
         if ( Valid (Board, Size, Row) )
            Nqueens (Board, Trial, Size, Row+1);
      }
}
Listing Two
int Valid (int Board[], int Size, int Row)
{
   for (int Idx = 0; Idx < Row; Idx++)
      if ( Board[Idx] == Board[Row] ||
           abs(Board[Row]-Board[Idx]) == (Row-Idx) )
         return 0;   // boolean false
   return 1;         // boolean true
}
Listing Three
int Valid (int Board[], int Size, int Row,
           int Col[], int Diag[], int AntiD[] )
{
   int Idx;   /* Index into Diag[] / AntiD[] */
   int Chk;   /* Occupied flag */

   Chk = Col[Board[Row]];
   /* Diagonal:  Row-Col == constant */
   Idx = Row - Board[Row] + Size-1;
   Chk = Chk || Diag[Idx];
   /* AntiDiagonal:  Row+Col == constant */
   Idx = Row + Board[Row];
   Chk = Chk || AntiD[Idx];
   return !Chk;   /* Valid if NOT any occupied */
}
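Listing Three shows only the check itself. How the three flag arrays might be marked and cleared as the search advances and backtracks can be sketched as below. This is an illustration under stated assumptions, not the article's benchmark code: the Flags structure, the functions place_flagged and count_solutions_flagged, and the bound MAX_N are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 16   /* hypothetical upper bound chosen for this sketch */

typedef struct {
    int size;
    int col[MAX_N];            /* col[c] != 0: column c is occupied       */
    int diag[2 * MAX_N - 1];   /* indexed by row - col + size - 1         */
    int anti[2 * MAX_N - 1];   /* indexed by row + col                    */
} Flags;

static long place_flagged(Flags *f, int row)
{
    if (row == f->size)
        return 1;                          /* one complete solution */
    long found = 0;
    for (int c = 0; c < f->size; c++) {
        int d = row - c + f->size - 1;     /* constant along a diagonal      */
        int a = row + c;                   /* constant along an antidiagonal */
        if (f->col[c] || f->diag[d] || f->anti[a])
            continue;                      /* constant-time rejection */
        f->col[c] = f->diag[d] = f->anti[a] = 1;   /* mark the queen  */
        found += place_flagged(f, row + 1);
        f->col[c] = f->diag[d] = f->anti[a] = 0;   /* unmark on backtrack */
    }
    return found;
}

/* Count all solutions on an n-by-n board using constant-time checks. */
long count_solutions_flagged(int n)
{
    Flags f;
    assert(n >= 1 && n <= MAX_N);
    memset(&f, 0, sizeof f);
    f.size = n;
    return place_flagged(&f, 0);
}
```

The extra memory is three small arrays, and the trade is that every candidate cell is accepted or rejected with three array lookups regardless of how many rows are already filled.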
Listing Four
void Nqueens (int Board[], int Trial[], int Size, int Row)
{  /* Trial[] is not used in this fragment */
   int Idx, Lim, Vtemp;

   /* Check for a partial board. */
   if (Row < Size-1)
   {
      if ( Valid (Board, Size, Row) )
         Nqueens (Board, Trial, Size, Row+1);
      for (Idx = Row+1; Idx < Size; Idx++)
      {  /* Swap the next candidate column into position Row */
         Vtemp = Board[Idx];
         Board[Idx] = Board[Row];
         Board[Row] = Vtemp;
         if ( Valid (Board, Size, Row) )
            Nqueens (Board, Trial, Size, Row+1);
      }
      /* Regenerate original vector from Row to Size-1: */
      Vtemp = Board[Row];
      for (Idx = Row+1; Idx < Size; Idx++)
         Board[Idx-1] = Board[Idx];
      Board[Idx-1] = Vtemp;
   }
   /* This is a complete board.  Final validity check */
   else if ( Valid (Board, Size, Row) )
      Process(Board, Size);
}
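Listing Four pairs the permutation vector with the order-N check. A sketch of how the permutation vector might instead be combined with constant-time diagonal flags is given below. It is an assumption-laden illustration, not the code behind the "Both Optimizations" timings: the State structure, the functions try_front, search, and count_solutions_combined, and the bound MAX_N are hypothetical. No column array is needed, because the permutation already guarantees distinct columns.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 16   /* hypothetical upper bound chosen for this sketch */

typedef struct {
    int size;
    int board[MAX_N];          /* permutation of the column subscripts */
    int diag[2 * MAX_N - 1];   /* indexed by row - col + size - 1      */
    int anti[2 * MAX_N - 1];   /* indexed by row + col                 */
} State;

static long search(State *s, int row);

/* Evaluate the column currently at the front of the unused segment. */
static long try_front(State *s, int row)
{
    int c = s->board[row];
    int d = row - c + s->size - 1;
    int a = row + c;
    if (s->diag[d] || s->anti[a])
        return 0;                          /* constant-time rejection */
    if (row == s->size - 1)
        return 1;                          /* complete, valid board   */
    s->diag[d] = s->anti[a] = 1;           /* mark the queen          */
    long found = search(s, row + 1);
    s->diag[d] = s->anti[a] = 0;           /* unmark on backtrack     */
    return found;
}

/* March every unused column through position `row`, then restore order. */
static long search(State *s, int row)
{
    long found = try_front(s, row);
    for (int i = row + 1; i < s->size; i++) {
        int tmp = s->board[i];
        s->board[i] = s->board[row];
        s->board[row] = tmp;
        found += try_front(s, row);
    }
    int front = s->board[row];             /* circular leftward rotation */
    for (int i = row + 1; i < s->size; i++)
        s->board[i - 1] = s->board[i];
    s->board[s->size - 1] = front;
    return found;
}

/* Count all solutions on an n-by-n board with both techniques combined. */
long count_solutions_combined(int n)
{
    State s;
    assert(n >= 1 && n <= MAX_N);
    memset(&s, 0, sizeof s);
    s.size = n;
    for (int i = 0; i < n; i++)
        s.board[i] = i;
    return search(&s, 0);
}
```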
performance degradation if you split the problem into too many parallel threads. By the nature of the problem, some starting board configurations take less time to calculate than others (due to early backtracking). The main program, however, must wait for the slowest thread to complete before it can continue execution. This same waiting means that if the threads end up computing different numbers of starting board configurations, the main program cannot continue until the slowest thread is finished.

The Java code and accompanying Excel workbook are also available (in the “Thread” folder) in the ZIP file accessible through the “Resource Center” (see page 5).
Acknowledgments

The benchmarking runs reported here were performed during an academic vacation period on equipment owned by the State of Washington and located within the Computer Science Department at Eastern Washington University. This article contains material first presented at the Small College Computing Symposium at Augustana College (Sioux Falls, South Dakota), 21–22 April 1995. It was published in SCCS: Proceedings of the 28th Annual Small College Computing Symposium (1995), 201–10. The article is available online through http://penguin.ewu.edu/~trolfe/SCCS-95/index.html.

For even more optimizations, see http://www.jsomers.com/nqueen_demo/nqueens.html. While working on this article I have felt it appropriate not to work through his solution myself, but he reports a 10-fold speed-up compared with the code available through the SCCS-95 web page referenced earlier.
Number of  Number of  UniProc   Threaded  Speed-Up*  Individual Thread Elapsed Times
Queens     Threads    Time      Time
14         1            2.887     2.918   0.99       2.
14         2            2.865     1.662   1.73       1.261 1.
14         3            2.874     1.294   2.22       1.283 1.274 1.
14         4            2.848     1.271   2.26       0.678 1.105 1.264 1.
14         5            2.909     1.008   2.85       0.637 0.840 0.637 0.632 1.
14         6            2.883     1.039   2.77       0.670 0.650 0.640 0.671 0.623 1.
14         7            2.855     0.690   4.17       0.660 0.459 0.654 0.638 0.678 0.641 0.
15         1           19.559    19.332   1.02       19.
15         2           19.642     9.814   2.00       9.807 9.
15         3           19.627     9.028   2.18       8.372 9.021 7.
15         4           19.573     7.734   2.54       5.335 7.690 7.586 4.
15         5           19.640     6.495   3.02       4.005 5.047 3.858 6.479 5.
15         6           19.614     6.302   3.12       3.874 3.888 3.912 6.289 3.881 6.
15         7           19.627     4.584   4.28       4.052 3.895 3.956 4.095 3.673 3.509 4.
15         8           19.817     3.965   4.95       3.617 3.883 3.949 3.774 3.548 3.894 3.918 3.
16         1          122.277   122.852   1.00       122.
16         2          122.381    62.527   1.96       62.520 61.
16         3          122.623    51.567   2.38       36.502 51.560 44.
16         4          122.578    44.849   2.74       44.819 30.552 36.919 37.
16         5          122.757    32.535   3.77       31.549 32.496 32.517 21.761 19.
16         6          122.718    33.373   3.68       32.387 21.243 23.716 33.357 21.909 19.
16         7          122.606    32.415   3.78       24.999 23.875 22.509 24.968 32.361 21.947 19.
16         8          123.360    28.887   4.25       24.026 24.395 27.755 28.855 25.141 22.993 22.309 19.

Table 2: Thread Timing Results. *Speed-up is the average uniprocessor time divided by the threaded time.