1 Problem
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
Example 1:
Input:
s = "barfoothefoobarman",
words = ["foo","bar"]
Output: [0,9]
Explanation: Substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.
The output order does not matter, returning [9,0] is fine too.
Example 2:
Input:
s = "wordgoodgoodgoodbestword",
words = ["word","good","best","word"]
Output: []
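2 Analysis
The task: given a string and a list of words, find every position in the string where some ordering of all the words in the list appears as one contiguous run. Note that each entry in the list must be used exactly once.
This problem is rated Hard. Brute-force loops on Hard problems usually hit the time limit (TLE).
The word list may contain duplicates:
"wordgoodgoodgoodbestword"
["word","good","best","good"]
My first idea: since every word has the same length, match in steps of that length. With len = 3 the positions checked would be 0, 3, 6, 9, ...
A match may also start at a position that is not aligned to that grid:
"lingmindraboofooowingdingbarrwingmonkeypoundcake"
["fooo","barr","wing","ding","wing"]
So one pass over the aligned positions is not enough; the starting offset also has to be shifted one character at a time. Shift by one character: 1, 4, 7, 10, ... Shift by one more: 2, 5, 8, 11, ... With offsets 0 through len - 1 every case is covered.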
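The offset idea above can be sketched as follows: for a word length of 3, each offset 0, 1, 2 defines its own grid of word-aligned start positions, and together the grids cover every index. The names OffsetGrid and alignedStarts are hypothetical and used only for illustration; they do not appear in the solutions in this post.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetGrid {
    // Start positions of whole words of length wordLen in a string of length n,
    // beginning at the given offset: offset, offset + wordLen, offset + 2 * wordLen, ...
    static List<Integer> alignedStarts(int n, int wordLen, int offset) {
        List<Integer> starts = new ArrayList<>();
        for (int j = offset; j + wordLen <= n; j += wordLen) {
            starts.add(j);
        }
        return starts;
    }

    public static void main(String[] args) {
        int n = 12, wordLen = 3;
        // prints:
        // offset 0: [0, 3, 6, 9]
        // offset 1: [1, 4, 7]
        // offset 2: [2, 5, 8]
        for (int offset = 0; offset < wordLen; offset++) {
            System.out.println("offset " + offset + ": " + alignedStarts(n, wordLen, offset));
        }
    }
}
```

Every index from 0 to n - wordLen shows up in exactly one of the three lists, which is why wordLen offsets are enough.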
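Because of duplicates, a HashMap (map) records every word in words together with the number of times it occurs. We then walk the string one word at a time. If the current word t exists in map, we add it to a second hash table, curMap. If its count in curMap is less than or equal to its count in map, count is incremented by 1. If it is greater, the word has been used too many times, so the run is broken and we break. A word that is not in map at all also ends the attempt.
If count equals the number of words, the substring consists only of words from words and uses each of them the right number of times, so the current position i is added to the result res.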
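The counting check described above, for a single start index i, can be sketched in isolation like this. It is a minimal sketch rather than the post's solution: WindowCheck, countWords and matchesAt are illustrative names, and Map.merge is used as a shorthand for the containsKey / put pattern.

```java
import java.util.HashMap;
import java.util.Map;

class WindowCheck {
    // Count how often each word occurs in words.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> need = new HashMap<>();
        for (String w : words) {
            need.merge(w, 1, Integer::sum);
        }
        return need;
    }

    // True if the words.length words starting at index i of s are exactly
    // the entries of words, in any order. Assumes words is non-empty.
    static boolean matchesAt(String s, int i, String[] words) {
        int wordLen = words[0].length();
        if (i + words.length * wordLen > s.length()) {
            return false; // not enough characters left
        }
        Map<String, Integer> need = countWords(words);
        Map<String, Integer> seen = new HashMap<>();
        for (int j = 0; j < words.length; j++) {
            String t = s.substring(i + j * wordLen, i + (j + 1) * wordLen);
            if (!need.containsKey(t)) {
                return false; // t is not one of the words
            }
            if (seen.merge(t, 1, Integer::sum) > need.get(t)) {
                return false; // t has been used more often than it appears in words
            }
        }
        return true;
    }
}
```

For example, matchesAt("barfoothefoobarman", 9, new String[] {"foo", "bar"}) is true, while index 3 fails because "the" is not in the list.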
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class Main {
    public static void main(String[] args) {
        String[] words4 = {"bar", "foo", "the"};
        List<Integer> res4 = findSubstring("barfoofoobarthefoobarman", words4);
        System.out.println(res4);
        String[] words6 = {"fooo", "barr", "wing", "ding", "wing"};
        List<Integer> res6 = findSubstring("lingmindraboofooowingdingbarrwingmonkeypoundcake", words6);
        System.out.println(res6);
        String[] words5 = {"word", "good", "best", "good"};
        List<Integer> res5 = findSubstring("wordgoodgoodgoodbestword", words5);
        System.out.println(res5);
        String[] words = {"foo", "bar"};
        List<Integer> res = findSubstring("barfoothefoobarman", words);
        System.out.println(res);
        String[] words1 = {"aa", "aa"};
        List<Integer> res1 = findSubstring("aaa", words1);
        System.out.println(res1);
        String[] words3 = {"ba", "ab", "ab"};
        List<Integer> res3 = findSubstring("abaababbaba", words3);
        System.out.println(res3);
        String[] words2 = {"word", "good", "best", "word"};
        List<Integer> res2 = findSubstring("wordgoodgoodgoodbestword", words2);
        System.out.println(res2);
    }

    // Brute force: try every start index i and check the w words that follow it
    public static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> res = new ArrayList<>();
        HashMap<String, Integer> map = new HashMap<>();
        // corner case
        if (s.equals("") || words.length == 0) {
            return res;
        }
        // count how many times each word occurs in words
        for (String word : words) {
            map.merge(word, 1, Integer::sum);
        }
        int l = words[0].length(); // length of one word
        int w = words.length;      // number of words
        // iterate over start positions, one character at a time
        for (int i = 0; i <= s.length() - w * l; i++) {
            int count = 0;
            HashMap<String, Integer> curMap = new HashMap<>();
            // iterate over the w words that start at i
            for (int j = 0; j < w; j++) {
                String tmp = s.substring(i + j * l, i + (j + 1) * l);
                if (map.containsKey(tmp)) {
                    curMap.merge(tmp, 1, Integer::sum);
                    if (curMap.get(tmp) <= map.get(tmp)) {
                        count++;
                    } else {
                        // tmp is used more often than it appears in words
                        break;
                    }
                } else {
                    // tmp is not one of the words
                    count = 0;
                    break;
                }
            }
            if (count == w) {
                res.add(i);
            }
        }
        return res;
    }
}
Runtime: 96 ms, faster than 31.44% of Java online submissions for Substring with Concatenation of All Words.
Memory Usage: 41 MB, less than 45.24% of Java online submissions for Substring with Concatenation of All Words.
Still rather slow. I also came across a sliding-window solution online.
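2.2 Sliding window
Added today (9.12). Coming up with this algorithm is beyond my ability for now, so I read an expert's article.
The setup is the same as above. What changes is that left records the position of the window's left boundary, and count is the number of words matched so far.
A special case, for example: wordgoodgoodgoodbestword with words = {"word","good","best","good"}.
When good appears for the third time, we loop from the left edge of the window: each time a word is cut off the left, its entry in curMap is decremented, and if that entry drops below the value in map, count is decremented as well. Why does this need a loop?
At first I thought: wouldn't it be faster to just keep one good and carry on? That was until I hit the case abababab with {"a","b","a"}.
After aba is matched, the next window is bab. When b appears for the second time, it exceeds the single occurrence allowed by map. This is hard to handle directly, because we do not know where the earlier copy of the repeated word sits in the window. So a loop is needed: drop words starting from left and adjust the counts as we go. I think this is the crux of the problem.
If at some point count equals w, we have matched a position, so the current left boundary left is stored in res. We then remove the leftmost word, decrement count by 1, move the left boundary right by l, and continue matching. If we hit a word that is not in map, the run is broken: we reset curMap, set count to 0, and move left to j + l.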
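To make the left-shrinking loop concrete, here is a minimal sketch of that one step, pulled out of the full solution. ShrinkDemo and shrinkLeft are hypothetical names, and the one-letter word list {"a","b","c","d"} is an assumed example chosen so that the earlier copy of the repeated word is not at the left edge.

```java
import java.util.HashMap;
import java.util.Map;

class ShrinkDemo {
    // Drops words from the left edge until `word` no longer exceeds its quota.
    // Returns the new left boundary; count[0] is the number of matched words.
    // Assumes every word currently in the window is a key of `need`.
    static int shrinkLeft(String s, int left, int wordLen, String word,
                          Map<String, Integer> need, Map<String, Integer> seen, int[] count) {
        while (seen.get(word) > need.get(word)) {
            String dropped = s.substring(left, left + wordLen);
            seen.merge(dropped, -1, Integer::sum);
            if (seen.get(dropped) < need.get(dropped)) {
                count[0]--; // a word that had been counted as matched left the window
            }
            left += wordLen;
        }
        return left;
    }

    public static void main(String[] args) {
        Map<String, Integer> need = new HashMap<>();
        for (String w : new String[] {"a", "b", "c", "d"}) {
            need.put(w, 1);
        }
        // State after reading "abcb": window "abc" matched, then a second "b" arrived.
        Map<String, Integer> seen = new HashMap<>();
        seen.put("a", 1);
        seen.put("b", 2);
        seen.put("c", 1);
        int[] count = {3};
        int left = shrinkLeft("abcb", 0, 1, "b", need, seen, count);
        System.out.println("left=" + left + " count=" + count[0]); // left=2 count=2
    }
}
```

Here "a" has to be dropped as well, even though it was not the repeated word, because it sits to the left of the earlier "b". That is the reason a loop from left is used instead of a single adjustment.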
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> res = new ArrayList<>();
        HashMap<String, Integer> map = new HashMap<>();
        // corner case
        if (s.equals("") || words.length == 0) {
            return res;
        }
        // count how many times each word occurs in words
        for (String word : words) {
            map.merge(word, 1, Integer::sum);
        }
        int l = words[0].length(); // length of one word
        int w = words.length;      // number of words
        // iterate over the starting offsets 0 .. l-1
        for (int i = 0; i < l; i++) {
            int left = i, count = 0;
            HashMap<String, Integer> curMap = new HashMap<>();
            // iterate word by word from this offset
            for (int j = i; j <= s.length() - l; j = j + l) {
                String tmp = s.substring(j, j + l);
                if (map.containsKey(tmp)) {
                    curMap.merge(tmp, 1, Integer::sum);
                    if (curMap.get(tmp) <= map.get(tmp)) {
                        count++;
                    } else {
                        // tmp occurs too often: drop words from the left until it fits
                        while (curMap.get(tmp) > map.get(tmp)) {
                            String t1 = s.substring(left, left + l);
                            if (curMap.containsKey(t1)) {
                                curMap.put(t1, curMap.get(t1) - 1);
                                if (curMap.get(t1) < map.get(t1)) {
                                    count--;
                                }
                            }
                            left = left + l;
                        }
                    }
                    if (count == w) {
                        res.add(left);
                        count--;
                        String t2 = s.substring(left, left + l);
                        if (curMap.containsKey(t2)) {
                            curMap.put(t2, curMap.get(t2) - 1);
                        }
                        // move the left boundary right by one word
                        left = left + l;
                    }
                } else { // run is broken by a word that is not in map
                    curMap.clear();
                    count = 0;
                    left = j + l;
                }
            }
        }
        return res;
    }
}
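Runtime: 12 ms, faster than 78.90% of Java online submissions for Substring with Concatenation of All Words.
Memory Usage: 39.4 MB, less than 92.86% of Java online submissions for Substring with Concatenation of All Words.
The time complexity is O(N) if the word length l is treated as a constant: each offset pass reads about N / l words, and left only ever moves to the right.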
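As a rough back-of-the-envelope comparison of the two approaches, the sketch below counts substring extractions rather than measuring time. CostEstimate and its methods are hypothetical helpers, and the sliding-window figure assumes the offsets run over 0 to l - 1, so that every index is read in exactly one pass.

```java
class CostEstimate {
    // Worst-case word reads in the brute-force version:
    // each of the n - w*l + 1 start indices may read up to w words.
    static long bruteForceReads(int n, int l, int w) {
        long starts = (long) n - (long) w * l + 1;
        return starts <= 0 ? 0 : starts * w;
    }

    // Word reads at the right edge of the sliding window:
    // every index j in [0, n - l] is read once, in exactly one offset pass.
    static long slidingWindowReads(int n, int l) {
        return Math.max((long) n - l + 1, 0);
    }

    public static void main(String[] args) {
        int n = 10000, l = 5, w = 100;
        System.out.println(bruteForceReads(n, l, w));  // 950100
        System.out.println(slidingWindowReads(n, l));  // 9996
    }
}
```

The sliding window also re-reads a word whenever it drops one from the left, but a word can only be dropped after it has been read, so the total stays within about twice the second figure. Each read copies l characters, which is where the constant-l assumption comes in.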
Reference:
https://blog.csdn.net/linhuanmars/article/details/20342851
I still need to keep practicing.
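This article looked at the problem of finding every possible concatenation of a given word list inside a string, and used a sliding-window algorithm to make the search more efficient.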
