You’ve been given a list of words to study and memorize. Being a diligent student of language and the arts, you’ve decided to not study them at all and instead make up pointless games based on them. One game you’ve come up with is to see how you can concatenate the words to generate the lexicographically lowest possible string.

Input

As input for playing this game you will receive a text file containing an integerN, the number of word sets you need to play your game against. This will be followed byNword sets, each starting with an integerM, the number of words in the set, followed byMwords. All tokens in the input will be separated by some whitespace and, aside fromNandM, will consist entirely of lowercase letters.

Output

Your submission should contain the lexicographically shortest strings for each corresponding word set, one per line and in order.

Constraints

1 <=N<= 100

1 <=M<= 9

1 <= all word lengths <= 10

Here is my problem analysis for Facebook Hacker Cup Qualification Round: Studious Student.

**Studious Student Problem Analysis:**

As I mentioned, this problem is not as straight forward as you think it might be. The first most natural way to approach this problem is sorting. Most people will reason that you can sort and concatenate each individual word together to form the lexicographically smallest string. This is incorrect, as illustrated in one of the sample inputs:

**jibw ji jp bw jibw**

By sorting and concatenate, the answer is:

**bwjijibwjibwjp**,

while the correct answer should be:

**bwjibwjibwjijp**.

Notice in the above words, “**ji**” is the prefix of “**jibw**“. The “*sort and concatenate*” method definitely does not work when there is a case where a word is a prefix of one or more other words.

Well, one might try a naive way of doing a brute force. Although it is highly inefficient, it works for this problem. This is because each input would be at most **9 **words, and it turns out there are only a total of **9!** = **362880** possible permutations of words being concatenated together. Therefore, one can generate all possible permutations and find the answer.

We make an easy observation that if all words in the list are of equal length, then *sort + concatenate* must yield the smallest dictionary order. In fact, a better argument would be:

*sort + concatenate*must yield the smallest dictionary order string.

To solve this problem correctly, we must also handle the special case where a word appears as a prefix of other words. One efficient and easy (non-trivial to prove but easy to reason) solution for this problem is to re-define the order relation of two words, *s1* and *s2*, as:

*s1*is less than

*s2*

**iff (if and only if)**

*s1*+

*s2*<

*s2*+

*s1*.

Then, by sorting and concatenating the words using the above ordering, it must yield the lexicographically smallest string. Why?

Here is a concrete example why this works. We use an example where the list of words are:

**ji**

**jibw**

**jijibw**

By the definition of *s1* is less than* s2* iff *s1*+*s2* < *s2*+*s1*, we found that the lowest ordered word in the list is “**jibw**“. This is because “**jibwji**” < "**jijibw**” and “**jibwjijibw**” < "**jijibwjibw**“.

Now, the key to understand why the order relation s1+s2 < s2+s1 yields the smallest dictionary order is:

- We have found the smallest-ordered word such that s1+x < x+s1. Therefore, it is impossible to swap the words to yield a smaller dictionary order.
- For a case with more words, then this order relation holds: s1+x < x+s1, and x+y < y+x. As swapping at any point could not possibly yield a smaller dictionary order, therefore s1+x+y must yield the smallest dictionary order.
- This result can be generalized to all
*M*words by induction, due to the transitive property mentioned above.

I do not claim that this is a rigid or even a correctly constructed proof. However, it does help me in convincing myself this is a valid solution.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
bool compareSort(const string &s1, const string &s2) { return s1 + s2 < s2 + s1; } int main() { string words[10]; int N, M; cin >> N; for (int i = 0; i < N; i++) { cin >> M; for (int j = 0; j < M; j++) cin >> words[j]; sort(words, words+M, compareSort); for (int j = 0; j < M; j++) cout << words[j]; cout << endl; } } |

There is a solution O(2^M M^2) to this problem. http://ideone.com/Qkudc

It can be reduced to TSP. The "distance" is the lowest permutation.

0The way you have solved definitely works and all that we need to do is to provide the STL sort a new function :).

0I was surprised by how challenging this problem was. How did you arrive at your compareSort solution? That is very clever.

I ended up writing a findLower() method that would find words that were lower than a given word using two facts: if words were the same length you can trust a traditional sort, and if they aren't, you can trust the sort if you add the first letter of the shorter word to the end of it. Like you said, its hard to prove why that works, but it does!

Hopefully a problem this tricky wouldn't come up in an interview!

+1@Anonymous:

Just like you, my first intuition was to append a smaller letter (by finding the smallest first letter from the word list not including the word itself) to each word and sort it, progressively refining at each step.

However, this does not work for the case:

ji jijib

Then, I thought that appending just one letter would not be enough. Then later it just lead me naturally to the order relation of s1 + s2 < s2 + s1.

+1there is a simple and effective way to solve this, please go see wiki

http://en.wikipedia.org/wiki/Trie

and find the sorting section.

Cheers!

Pop

+1The comparer part is the tricky one. My solution was written in C#. It looks a bit clunky compared with your C++ solution.

using System;

using System.Collections;

class Student

{

public class MyStringComparer : IComparer

{

int IComparer.Compare(object X, object Y)

{

string x = (string) X;

string y = (string) Y;

return (x+y).CompareTo(y+x);

}

}

public void MySort(string[] words, IComparer comparer)

{

}

static void Main()

{

int N = int.Parse(Console.ReadLine());

for(int i=0; i < N; i++)

{

string line = Console.ReadLine();

string[] splits = line.Trim().Split();

int M = int.Parse(splits[0]);

string[] words = new string[M];

for(int j=0; j< M; j++)

words[j] = splits[j+1];

MyStringComparer comparer = new MyStringComparer();

Array.Sort(words, comparer);

for(int j=0; j< M; j++)

Console.Write(words[j]);

Console.WriteLine();

}

}

}

0good solution…i use files for compare.

0My post was wrong. This solution is brilliant. You are a genius.

Here is my proof of your solution.

Before we jump to the formal proof. Let us play with some simple cases.

suppose we have a series of characters:a,b,c,d

satisfied the criteria like in the solution. Then we have:

abcab (fix c, exchange a and b)

cab>acb (fix b, exchange a and c)

acb>abc (fix a, exchange b and c)

summarize above, we have cba>abc

2. more exercise for abcd, dbca [here you will see the pattern, I simply exchange the first and last element]

it is almost the same:

dbca>dbac>dabc>adbc>abdc>abcd

— — — — —

the trick is exchange one pair at a time: the element 'a' moves from last potion to first and then move the element 'd' down from first position to last.

Then we can conclude a general format

x Z y < y Z x if xx[j] and ij.

Obviously, we can construct the new solution by switching x[i] and x[j].

the new solution will look like

X' = x[0] x[1] x[2] ….x[i-1] x[j] x[i+1]…x[j-1] x[i] x[j+1]…x[n]

Comparing X and X', they have the same prefix (x[0..i-1] and suffix(x[j+1..n]). We do not need to consider them. The most interesting part is

in X', x[j]x[i+1]…x[j-1]x[i]

in X, x[i]x[i+1]…x[j-1]x[j]

Notice that we already prove X'j. This prove the solution.

0Did not notice that the "<" and ">" are special characters. Post it again.

Here is my proof of your solution.

Before we jump to the formal proof. Let us play with some simple cases.

suppose we have a series of characters:a,b,c,d

satisfied the criteria like in the solution. Then we have:

cab > cab (fix c, exchange a and b)

cab > acb (fix b, exchange a and c)

acb > abc (fix a, exchange b and c)

summarize above, we have cba > abc

2. more exercise for abcd, dbca [here you will see the pattern, I simply exchange the first and last element]

it is almost the same:

dbca > dbac (fix db, exchange a,c)

> dabc (fix d and c, exchange a,b)

> adbc (fix bc, exchange a,d)

> abdc (fix a and c, exchange b,d)

> abcd (fix ab, exchange c,d)

the trick is exchange one pair at a time: the element 'a' moves from last potion to first and then move the element 'd' down from first position to last.

Then we can conclude a general format

x {Z} y < y {Z} x if x < y and {Z} is in order. — A

Obviously, we can construct the new solution by switching x[i] and x[j].

the new solution will look like

X' = x[0]..x[i-1] x[j] x[i+1]..x[j-1] x[i] x[j+1]…x[n]

Comparing X and X', they have the same prefix (x[0..i-1] and suffix(x[j+1..n]). We do not need to consider them. The most interesting part is

in X', x[j]x[i+1]…x[j-1]x[i]

in X, x[i]x[i+1]…x[j-1]x[j]

Notice that we already prove X' < X (see the general format A). This prove the solution.

0Darn. The blog editor sucks. It ate up one of the important sentences.

We use proof of contradiction.

Given the minimum solution X, we prove that there is no such pair of x[i], x[j] where x[i] > x[j] and i < j.

Proof by contradiction, suppose we have one of such a pair of x[i], x[j], if we can construct a new solution X' based on X and X' < X, then we lead to a contradiction : X is not the minimum.

Obviously, we can construct X' by switching x[i], x[j] in X.

0I ended up with the same solution in Python. Add I had to do from my initial wrong solution has add the custom comparison function.

import sys

def lex_compare(x, y):

if x+y < y+x:

return -1

return 1

f = open(sys.argv[1])

m = int(f.readline())

for r in range(0,m):

l = f.readline().rstrip()

arr = l.split(" ")

l = arr[1:]

l.sort(cmp=lex_compare)

print ''.join(l)

0@Pop:

Thanks for your insightful comments. After reading your proof, I think I come up with a better argument (I mean better than my previous argument, not saying that my argument is better than yours), which is updated in my post above. I believe the key to the proof is the swapping and the transitive property.

0You are welcome. I also saw your reply to my post in MITBBS. I think this problem is well done. 🙂

Cheers,

Pop

0@Pop:

You're very welcomed too 🙂

I'm glad that you've figured it out!

0i used PHP for this game…

[php]

// THis is the main code to sort the words

if($_POST['generate']){

for($ii=0;$ii<$_POST[n];$ii++){

$list = $_POST['idx'.$ii];

$getIdx = explode(" ",$list);

asort ($getIdx);

reset ($getIdx);

while(list($key,$val)=each($getIdx)){

echo "$val";

}

echo "

";

}

}

[/php]

0Wrote mine using nodeJS. I used a sort algorithm that finds equal character sequences in both strings and then compares their possible combinations. I didn't have much time, it's certainly incomplete but seems to have worked for the input.

https://gist.github.com/794605

0Firstly, you have a nice BLOG, keep it going. Take a bow.

Comparing s1 + s2 with s2 + s1 is not the best solution. That means your each compare is O(m+n). It can be made O(max(m,n)) by recursive call.

for($i = 0; ($i < strlen($str1)) && ($i < strlen($str2)); $i++) {

//echo $i . ': comparing ' . $str1[$i] . ' ' . $str2[$i];

//echo "\n";

if(ord($str1[$i]) < ord($str2[$i])) {

return -1;

} else if(ord($str1[$i]) > ord($str2[$i])) {

return 1;

}

}

if(($i == strlen($str1)) && ($i == strlen($str2))) {

return 0;

}

if($i == strlen($str1)) {

$str2 = substr($str2, $i);

return self::cmp($str1, $str2);

}

if($i == strlen($str2)) {

$str1 = substr($str1, $i);

return self::cmp($str1, $str2);

}

0the proof would be easier and clearer if you can explicitly prove the transitive relation

a < b && b a < c it can be proved by enumerating the 2 cases of a < b ( length(a) == length(b) && a < b lexicographically, or a = bx where the x < a lexicographically ) but it's not that clean

0I implemented a brute force string compare function:

int string_compare(const char *str1, const char *str2)

{

const char *p1=str1, *p2=str2;

for (; *p1 != ” && *p2 != ” && *p1 == *p2; p1++, p2++);

if (*p1==” && *p2==”)

return 0;

if (*p1 == ”)

return string_compare(str1, p2);

else if (*p2 == ”)

return string_compare(p1, str2);

else

return (int)(*p1) > (int)(*p2);

}

0Why would ‘ji’ appear later than ‘jibw’ in the dictionary.

Since ‘ji’ is a prefix of ‘jibw’ and len(‘ji’) check out ‘cat’ & ‘cataract’ in the dictionary.

0they want to connect these words. in ur example, the solution is that jibwji appears before jijibw in dictionary.

0@1337c0d3r

I think the easiest way to compare is to redefine the comparing as following:

one string is smaller if the character at the same index is smaller(which is the original definition), or the longer string’s next character is smaller than the first character.

eg. jibx ji

this just depends on the third character comparing with first character.

this is easy to implement

0I mean jibw is smaller than ji

and jiwb is larger than ji

0Here is a ruby program. Since this wa a quick 10 minute hack, I didn’t compare performance.

biginput = [

[‘facebook’, ‘hacker’, ‘cup’, ‘for’, ‘studious’, ‘students’],

[‘k’, ‘duz’, ‘q’, ‘rc’, ‘lvraw’],

[‘mybea’, ‘zdr’, ‘yubx’, ‘xe’, ‘dyroiy’],

[‘jibw’, ‘ji’, ‘jp’, ‘bw’, ‘jibw’],

[‘uiuy’, ‘hopji’, ‘li’, ‘j’, ‘dcyi’],

[‘ba’, ‘b’]

]

biginput.each do |input|

out = “”

(‘a’..’z’).each do |x|

temp = input.select{|str| str.start_with? x}

temp.each {|str| str = str<<x}

# puts temp.inspect

temp.sort!

# puts temp.inspect

temp.each do |res|

out << res[0..-2]

end

end

puts out

end

0@1337c0d3r

Unbelievable! Thanks for that!!

My personal justification of the correctness of the new sorting scheme is :

Given any ordering of any set of words, find the first pair of word (ith and i+1 th) such that i+1 appear before i in the new sorting rule.

Swap ith and i+1 th word, notice the entire word gets smaller in dictionary order.

Repeat the behavior until no such pair exists.

Claim optimum is achieved.

0@1337c0d3r

You are a genius! Thank you very much for your post! I have learnt a lot from it!

I have a question that needs your help. How can we prove transitivity? I mean, if we know

s1 + s2 < s2 + s1 and s2 + s3 < s3 + s2

how to prove s1 + s3 < s3 + s1? Thank you!

0Pingback: LeetCode | techinterviewsolutions