[LeetCode] 472. Concatenated Words

 

Given an array of strings `words` (without duplicates), return  _all the concatenated words in the given list of_  `words`.

A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.

 

Example 1:
    
    
    Input: words = ["cat","cats","catsdogcats","dog","dogcatsdog","hippopotamuses","rat","ratcatdogcat"]
    Output: ["catsdogcats","dogcatsdog","ratcatdogcat"]
    Explanation: "catsdogcats" can be concatenated by "cats", "dog" and "cats"; 
    "dogcatsdog" can be concatenated by "dog", "cats" and "dog"; 
    "ratcatdogcat" can be concatenated by "rat", "cat", "dog" and "cat".

Example 2:
    
    
    Input: words = ["cat","dog","catdog"]
    Output: ["catdog"]
    

 

Constraints:

  * `1 <= words.length <= 104`
  * `0 <= words[i].length <= 1000`
  * `words[i]` consists of only lowercase English letters.
  * `0 <= sum(words[i].length) <= 105`



 

这道题给了一个由单词组成的数组，某些单词是可能由其他的单词组成的，让我们找出所有这样的单词。这道题跟之前那道 [Word Break](http://www.cnblogs.com/grandyang/p/4257740.html) 十分类似，可以对每一个单词都调用之前那题的方法，首先把所有单词都放到一个 HashSet 中，这样可以快速找到某个单词是否在数组中存在。对于当前要判断的单词，先将其从 HashSet 中删去，然后调用之前的 [Word Break](http://www.cnblogs.com/grandyang/p/4257740.html) 的解法，具体讲解可以参见之前的帖子。如果是可以拆分，那么就存入结果 res 中，参见代码如下：

 

解法一：
    
    
    class Solution {
    public:
        vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
            if (words.size() <= 2) return {};
            vector<string> res;
            unordered_set<string> dict(words.begin(), words.end());
            for (string word : words) {
                dict.erase(word);
                int len = word.size();
                if (len == 0) continue;
                vector<bool> v(len + 1, false);
                v[0] = true;
                for (int i = 0; i < len + 1; ++i) {
                    for (int j = 0; j < i; ++j) {
                        if (v[j] && dict.count(word.substr(j, i - j))) {
                            v[i] = true;
                            break;
                        }
                    }
                }
                if (v.back()) res.push_back(word);
                dict.insert(word);
            }
            return res;
        }
    };

 

下面这种方法跟上面的方法很类似，不同的是判断每个单词的时候不用将其移除 HashSet，而是在判断的过程中加了判断，使其不会判断单词本身是否在 HashSet 中存在，而且由于对单词中子字符串的遍历顺序不同，加了一些优化在里面，使得其运算速度更快一些，参见代码如下：

 

解法二：
    
    
    class Solution {
    public:
        vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
            vector<string> res;
            unordered_set<string> dict(words.begin(), words.end());
            for (string word : words) {
                int n = word.size();
                if (n == 0) continue;
                vector<bool> dp(n + 1, false);
                dp[0] = true;
                for (int i = 0; i < n; ++i) {
                    if (!dp[i]) continue;
                    for (int j = i + 1; j <= n; ++j) {
                        if (j - i < n && dict.count(word.substr(i, j - i))) {
                            dp[j] = true;
                        }
                    }
                    if (dp[n]) {res.push_back(word); break;}
                }
            }
            return res;
        }
    };

 

下面这种方法是递归的写法，其中递归函数中的 cnt 表示有其他单词组成的个数，至少得由其他两个单词组成才符合题意，参见代码如下：

 

解法三：
    
    
    class Solution {
    public:
        vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
            vector<string> res;
            unordered_set<string> dict(words.begin(), words.end());
            for (string word : words) {
                if (word.empty()) continue;
                if (helper(word, dict, 0, 0)) {
                    res.push_back(word);
                }
            }
            return res;
        }
        bool helper(string& word, unordered_set<string>& dict, int pos, int cnt) {
            if (pos >= word.size() && cnt >= 2) return true;
            for (int i = 1; i <= (int)word.size() - pos; ++i) {
                string t = word.substr(pos, i);
                if (dict.count(t) && helper(word, dict, pos + i, cnt + 1)) {
                    return true;
                }
            }
            return false;
        }
    };

 

Github 同步地址：

<https://github.com/grandyang/leetcode/issues/472>

 

类似题目：

[Word Break](http://www.cnblogs.com/grandyang/p/4257740.html)

 

参考资料：

<https://leetcode.com/problems/concatenated-words/>

<https://leetcode.com/problems/concatenated-words/discuss/95652/Java-DP-Solution>

<https://leetcode.com/problems/concatenated-words/discuss/95677/c-772-ms-dp-solution>

<https://leetcode.com/problems/concatenated-words/discuss/95717/c-600ms-20-lines-of-code-dfs-solution-is-there-any-way-to-optimize>

 

[LeetCode All in One 题目讲解汇总(持续更新中...)](http://www.cnblogs.com/grandyang/p/4606334.html)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LeetCode] 472. Concatenated Words #472

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development