INVERSE BOGGLE

Boggle is a fairly well known word game distributed by Parker Brothers. There are also numerous web implementations and there is public domain code available on the web. In the most common form, 25 random letters are arranged in a 5x5 grid and the player then finds words by connecting letters within the grid in sequence. Longer words mean more points. The game is FAR too easy for the POTM gang.

So. This time around we play "Inverse Boggle" (or 1/BOGGLE). I'll give your program a bunch of words, and you construct a grid that contains as many words as possible according to "boggle rules".


    ==========  THE DETAILS UNLEASHED  ===================

I.  The task

	You will write a program that takes a list of words as input.
	Your program will construct a 5x5 grid of letters.  Your goal
	will be to create a square that "contains" as many of these
	words as possible.  Your program will print out the letter
	square and then print out the words from the input that may
	be found in your square.  More words means more score.  Longer
	words mean more score.  High score wins.

II.  Finding words in a square of letters - modified Boggle rules

	If you're unfamiliar with the game, check out some web sites
	like http://boggle.stanford.edu/ ... there are many variations
	possible - I have chosen a single rule set for this POTM.

	A.  The letter square will be a 5x5             NPDNN
	    grid of letters.  Each grid position        AVADS
GRID:	    is occupied by one of 26 letters (A-Z).     ITTEB
	    Q and U are individual letters.  All        TPLFA
	    letters will be upper case.  A sample       AUEMD
	    letter square is shown at the right.

	B.  A word may be found in the letter square by connecting 
	    letters with their neighbors.  You can move from a letter 
	    to any other letter that touches it, even diagonally, 
WORDS:	    but you can't use the same individual letter more than 
	    once in a word. 

		Some of the words that can be found in the grid above are: 
		PIT PITTED BEADS DATES BELT MELTED FAME SETTLE

		Some words that can't be legally found are: 

			FLEET         (the E's don't touch)
			FELL          (can't use the L twice)
			BELATED       (the L and the A don't touch)
			ADDED         (not enough D's)
			BELTED        (uses the same E twice)

	C.  ONLY words that are in the file I provide to your program
	    are valid words ... they may be in any language, of length 
WORDSLIST   1-25 letters, or in no language whatever.  For example, I may
	    present a wordlist that contains the following "words":

	    GESUNDHEIT, NPDNNN, POTM, GEORGE, XX, YOURNAME, MYNAME

	    In this case ONLY these seven words would be worth score
	    if they are found in the letter grid.  Other sequences,
	    like NAME or POT or CAT would not be considered valid.
	    The wordlist may contain words as short as one letter or
	    as long as 25 letters.  The wordlist may contain words
	    which are subsets of one another (like A,AT,EAT,SEAT,SEA).
	    All these words can be found in a four letter sequence and
	    each would be a valid word eligible for scoring if it was
	    in the provided word list.  All words in the wordlist will
	    be unique - there will be no duplicates.

	    
	D.  Any word from the wordlist that can be found in your
	    letter square will add to your score.  A specific word
	    can score only once even though the word may be found in 
SCORING:    multiple places in your square.  Each word will score
	    points according to the number of letters in the word.

		A one letter word will score one point.
		A two letter word will score two points.
		A three letter word will score three points.

			< ... there is no mystery here ... >

		A ten letter word will score ten points.
		
		An N letter word will score N points for N = 1 to 25

	    Your total score will be the sum of the scores of all valid
	    words that you find in your final letter grid.  In order to
	    score, the word must be present in your grid AND it must
	    be present in your output word list.  Your program will print
	    the square you find followed by the words you claim are present.

>	The following rule was added after the initial distribution
>	of the rules:  If your output contains a word which does
>	not result in a score (because it is not in the square, or
>	not in the wordlist, or a duplicate, or whatever) then 10
>	points will be subtracted from your score for each such word.

	    In the event of a tie, there will be three tiebreakers (each
	    is applied in sequence ONLY if necessary to resolve ties
	    of the earlier scores.

	    2) The FEWEST number of words found.  (given that two programs
	       have the same score, this rewards the one that uses longer
	       words)

-->	    3) The FEWEST vowels (AEIOUY) in the square.  (very unlikely 
	       I need to use this tiebreaker)

-->	    4) If two or more entries are still tied, I will apply the
	       same set of scoring criteria to a second wordlist in order
	       to break the tie.

III.  Your program

SHOULD	     a) should accept a single argument that provides the file
ACCEPT	        name of a wordlist I will provide.  The wordlist will
SINGLE	        contain between 5 and 1000 words, one per line.  All words
ARGUMENT        will be in upper case letters using only (A-Z) and will
THAT	        contain no white space, hyphens, apostrophes, etc.  There
PROVIDES        will be no blank lines in the file.  The wordlist for the
THE	        system test is shown in the left margin (25 words).
FILE
NAME	     b) your output MUST be exactly as follows:
OF		* the letter square you generate must appear on the
WORDLIST	  first five lines, five upper case letters per line.
OUTPUT		* starting with line six, you must list the words that
LETTER		  can be found in your square - all in upper case.
SQUARE		* there should be no blanks, tabs, or anything other
AND		  than upper case letters in your output.
LIST
WORDS		Note that only the words you CLAIM are in your square
YOU		will be evaluated for possible scoring.  In order to
CLAIM		score they must be present in the square AND they must
PRESENT		be present in the source list I provide.
A
KEWL	     c) I will run the program as follows:
POTM			nohup timex a.out INPUTFILENAME >output 2>timexoutput &
EXTRAORDINAIRE
THANKS

=============================================================================
    The following items are standard stuff for ALL the POTMs ....
    (but they occasionally will change ... so READ 'EM!)
=============================================================================
I.  About your programming:

    a) I compile on one machine (usually) and execute on 
	another as follows:

        compilation machine: SunOS 5.3 Generic_101318-59 sun4m sparc
        execution machine  : SunOS 5.3 Generic           sun4c sparc

    b) The compilers I have available are (at least):

        SPARCompiler C++ 4.0
        SPARCompiler C   ANSI C compiler
         		 SVID compliant under SunOS 5.x
        gcc/g++ version 2.7.2
	PERL version 5.002
	Java, version 1.0 of the Java Developer Toolkit from SUN
		(as downloaded to a SPARC workstation)
	Scheme48 - a Lex variant

    All compilation will be done WITHOUT ANY OPTIMIZATION, and the
    standard math library will be loaded (even if not used).  While
    this might not reflect the real world, it is at least consistent.
    No makefiles are permitted, if there are special instructions
    please so indicate in your program header comments.

>>>      IMPORTANT:  submit early so we can resolve any
>>>      portability problems!!!  (Particularly if you 
>>>      are working in a PC environment.

       NOTE: assembly code submissions are NOT acceptable.  I must be
       the one to compile your code so I can check for cheating!

    c) if you wish to submit a shell program, Bourne, Korn, and csh
         are available ... along with any bin or /usr/bin tools that
         are available as generic Unix tools - my judgement!!!
	 Also nawk, awk, dc, or whatever.
         (again - submit early in case there are version differences)

    d) Temporary files may be created in /tmp, but MUST be removed
         when you are done ... creation of files anywhere else is
         strictly prohibited.

    e) Maximum size of the source code is roughly 25K characters - 
         different rules may apply for specific POTMS, and 
         comments don't count against you.

    f) Maximum size of executable is 1 Megabyte ... please!!!!

    g) Maximum malloc'able space is a bit over 50 Meg ....

    h) sizeof(short)=2; sizeof(int)=4; sizeof(long)=4; sizeof(char)=1

    i)	 a    = 0x80000000  = -2147483648
         a - 1 		    =  2147483647
         a + 1 		    = -2147483647
         {a} is true.	{a == 0} is false.

II.  The system testing ....
    
    a) mail me an entry as soon as possible - you can always submit
         another entry if you improve your solution .. but try and
         keep it down to one a week once the porting issues are 
	 resolved .... please!

    b) one entry per person (or team) please.  I associate each entry
         name with an email address and a person for communication
         purposes.  Communication is fine - and encouraged - but
         everyone's code should be their own unless there is a
         stated collaboration and a single team entry.  Honor system!

    c) on receipt of your entry, I'll run a system test to make sure
         your program works ... you'll receive the results and
         a weekly standing of how you fared against other entries.
         (I usually will get to new mail once a night but perhaps not!)

    d) please make sure your program works on the system test problem.

    e) your program must perform the task specified within 10 minutes
       sys+user time as measured by timex on MY execution system.
       Your time will be provided along with your system test run
       so you can see the differences in speed between yours and mine.
       All execution time measurements used for tiebreakers (if any)
       will be measured using  time via timex (similar to
       time command).  
       
       ===>	NOTE: A code fragment to measure elapsed time
       ===>	      is available from the POTM-master for the asking.

III.  SENDING ME YOUR ENTRY - PLEASE USE EMAIL!!!

   Please email (not uuto, no attachments) your source code to me.

    IMPORTANT:  Please use the following (or equivalent) lines at the
    front of the program you mail to me (this will help immeasurably!):

    /*  POTM ENTRY:  entryname (something clever please!)	*/
    /*  Your Name:   First Last					*/
    /*  Your email:  log@machine.att.com (or whatever)		*/
    /*  compile instructions (if other than "make entryname")	*/

    NOTE:  If you submit a shell program, please include these lines
    with a leading "#" and indicate which shell to run it under!  

    IMPORTANT:  ENTER EARLY - you will receive weekly standings and
    you will resolve any portability issues early.  You may improve
    your entry at any time by simply sending me another entry - so it
    pays to enter earlier!  (I process most everything in a day or so)

Thanks!  If you have any questions, mail 'em to me - if I answer them
I'll include them in the Frequently Asked Questions (FAQ) list I 
circulate with the weekly standings!!!  Don't call me ... please!

WATCH THE FAQ - ESPECIALLY IN THE FIRST FEW WEEKS AS ALL THE STUPID
  ERRORS I MAKE IN THE PROBLEM STATEMENT TURN UP!!!!

=Fred


  


The POTM is unsponsored and just for fun.       Friday 07:32:13 AM      Copyright © 2004-6 - Fred Hicinbothem