Hi~
Hi~
文章目录
  1. 前言
  2. 3. Longest Substring Without Repeating Characters
    1. 审题&思路
    2. 思路误区
    3. 解题
      1. 低性能的遍历
      2. 滑动窗口
      3. 滑动窗口优化
    4. 解题总结
  3. 总结

划水 leetcode:3

前言

第二篇算法总结,总结 leetcode 第 3 题。

3. Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters.

审题&思路

本题大意:给定一个字符串,找到最长无重复自负的子字符串的长度。

思路误区

这道题一开始以为可以找到两个相同字符并计算之间的长度,但相同字符之间可能存在不同字符开头所构成的子字符串比相同字符之间子字符串长度更长。

解题

本题难易:中等。

低性能的遍历

本题的解题直觉是遍历字符串找出不同子字符串,为了优化性能,我一遍循环实现遍历,但由于设置了标志位,所以如果遇到相同字符回到该字符的上一个位置的下一个字符,重新开始查找。性能较差,如果遇到最坏情况可能会达到 O(n^2)的情况。

class Solution {
public:
int lengthOfLongestSubstring(string s) {

int maxLength = 0;
std::map<char, int> stringMap;
int sameIndex = 0;
cout << "NY String:" << s << endl;
for (int i = 0; i < s.length(); i++) {
cout << "\nCurrent: s[i]=" << s[i] << endl;

if (stringMap[s[i]])
{
sameIndex = stringMap[s[i]];
i = sameIndex;
stringMap = map<char, int>();
} else {
stringMap[s[i]] = i;
}
maxLength = stringMap.size() > maxLength ? stringMap.size() : maxLength;
printMap(stringMap);
}

return maxLength;
}
};
  • 控制台输出
NY String:abcdefggadvopesgacbesmqrs

Current: s[i]=a
a
Current: s[i]=b
ab
Current: s[i]=c
abc
Current: s[i]=d
abcd
Current: s[i]=e
abcde
Current: s[i]=f
abcdef
Current: s[i]=g
abcdefg
Current: s[i]=g

Current: s[i]=g
g
Current: s[i]=a
ag
Current: s[i]=d
adg
Current: s[i]=v
adgv
Current: s[i]=o
adgov
Current: s[i]=p
adgopv
Current: s[i]=e
adegopv
Current: s[i]=s
adegopsv
Current: s[i]=g

Current: s[i]=a
a
Current: s[i]=d
ad
Current: s[i]=v
adv
Current: s[i]=o
adov
Current: s[i]=p
adopv
Current: s[i]=e
adeopv
Current: s[i]=s
adeopsv
Current: s[i]=g
adegopsv
Current: s[i]=a

Current: s[i]=d
d
Current: s[i]=v
dv
Current: s[i]=o
dov
Current: s[i]=p
dopv
Current: s[i]=e
deopv
Current: s[i]=s
deopsv
Current: s[i]=g
degopsv
Current: s[i]=a
adegopsv
Current: s[i]=c
acdegopsv
Current: s[i]=b
abcdegopsv
Current: s[i]=e

Current: s[i]=s
s
Current: s[i]=g
gs
Current: s[i]=a
ags
Current: s[i]=c
acgs
Current: s[i]=b
abcgs
Current: s[i]=e
abcegs
Current: s[i]=s

Current: s[i]=g
g
Current: s[i]=a
ag
Current: s[i]=c
acg
Current: s[i]=b
abcg
Current: s[i]=e
abceg
Current: s[i]=s
abcegs
Current: s[i]=m
abcegms
Current: s[i]=q
abcegmqs
Current: s[i]=r
abcegmqrs
Current: s[i]=s

Current: s[i]=m
m
Current: s[i]=q
mq
Current: s[i]=r
mqr
Current: s[i]=s
mqrs

滑动窗口

为了优化遍历的性能,可以使用 HashSet 作为滑动窗口,那么检查当前字符的时间复杂度就可以是 O(1)。

A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in the array/string which usually defined by the start and end indices, i.e. [i, j)[i,j) (left-closed, right-open). A sliding window is a window “slides” its two boundaries to the certain direction. For example, if we slide [i, j)[i,j) to the right by 11 element, then it becomes [i+1, j+1)[i+1,j+1) (left-closed, right-open).

滑动窗口方法主要是假定一个前闭后开的区间,在这个区间内如果不存在当前遍历的字符则加入该字符,如果已存在相同字符,则将该字符之前的所有字符移除,直到当前重复字符不存在集合中。

考察的数据类型为 HashSet ,集合是一种关联容器。集合的特点:1. 元素是唯一的,没有重复的;2. 集合中的元素是排好序的。

class Solution {
public:
int lengthOfLongestSubstring(string s) {

int maxLength = 0, i = 0, j = 0;
std::set<char> characterSet;
cout << "String: " << s << endl;
while (i < s.length() && j < s.length()) {
cout << "Current: s[i]=" << s[i] << " Current s[j]=" << s[j] << endl;
if (!(characterSet.find(s[j]) != characterSet.end()))
{
cout << "Not Contain Character" << endl;
characterSet.insert(s[j++]);
maxLength = max(maxLength, j-i);
} else {
characterSet.erase(s[i++]);
}
set<char>::iterator it;
for(it = characterSet.begin(); it != characterSet.end(); it++)
cout << (*it) << " ";
cout << " " << endl;
}
return maxLength;
}
};
  • 控制台输出
String: abcdefggadvopesgacbesmqrs
Current: s[i]=a Current s[j]=a
Not Contain Character
a
Current: s[i]=a Current s[j]=b
Not Contain Character
ab
Current: s[i]=a Current s[j]=c
Not Contain Character
abc
Current: s[i]=a Current s[j]=d
Not Contain Character
abcd
Current: s[i]=a Current s[j]=e
Not Contain Character
abcde
Current: s[i]=a Current s[j]=f
Not Contain Character
abcdef
Current: s[i]=a Current s[j]=g
Not Contain Character
abcdefg
Current: s[i]=a Current s[j]=g
bcdefg
Current: s[i]=b Current s[j]=g
cdefg
Current: s[i]=c Current s[j]=g
defg
Current: s[i]=d Current s[j]=g
efg
Current: s[i]=e Current s[j]=g
fg
Current: s[i]=f Current s[j]=g
g
Current: s[i]=g Current s[j]=g

Current: s[i]=g Current s[j]=g
Not Contain Character
g
Current: s[i]=g Current s[j]=a
Not Contain Character
ag
Current: s[i]=g Current s[j]=d
Not Contain Character
adg
Current: s[i]=g Current s[j]=v
Not Contain Character
adgv
Current: s[i]=g Current s[j]=o
Not Contain Character
adgov
Current: s[i]=g Current s[j]=p
Not Contain Character
adgopv
Current: s[i]=g Current s[j]=e
Not Contain Character
adegopv
Current: s[i]=g Current s[j]=s
Not Contain Character
adegopsv
Current: s[i]=g Current s[j]=g
adeopsv
Current: s[i]=a Current s[j]=g
Not Contain Character
adegopsv
Current: s[i]=a Current s[j]=a
degopsv
Current: s[i]=d Current s[j]=a
Not Contain Character
adegopsv
Current: s[i]=d Current s[j]=c
Not Contain Character
acdegopsv
Current: s[i]=d Current s[j]=b
Not Contain Character
abcdegopsv
Current: s[i]=d Current s[j]=e
abcegopsv
Current: s[i]=v Current s[j]=e
abcegops
Current: s[i]=o Current s[j]=e
abcegps
Current: s[i]=p Current s[j]=e
abcegs
Current: s[i]=e Current s[j]=e
abcgs
Current: s[i]=s Current s[j]=e
Not Contain Character
abcegs
Current: s[i]=s Current s[j]=s
abceg
Current: s[i]=g Current s[j]=s
Not Contain Character
abcegs
Current: s[i]=g Current s[j]=m
Not Contain Character
abcegms
Current: s[i]=g Current s[j]=q
Not Contain Character
abcegmqs
Current: s[i]=g Current s[j]=r
Not Contain Character
abcegmqrs
Current: s[i]=g Current s[j]=s
abcemqrs
Current: s[i]=a Current s[j]=s
bcemqrs
Current: s[i]=c Current s[j]=s
bemqrs
Current: s[i]=b Current s[j]=s
emqrs
Current: s[i]=e Current s[j]=s
mqrs
Current: s[i]=s Current s[j]=s
mqr
Current: s[i]=m Current s[j]=s
Not Contain Character
mqrs

比较两个算法的控制台输出结果,相比我的算法,滑动窗口的优势在于,不需要重新在重复的元素开始遍历一遍,减少了时间损耗,时间复杂度 O(2n)。

滑动窗口优化

上述滑动窗口的算法,仍然会使 i 进行一遍遍历,如果想优化 2n,则需要使用 Map 对字符的下标索引进行记录,那么我们就可以在找到重复字符时直接跳过那些字符。

class Solution {
public:
int lengthOfLongestSubstring(string s) {
int maxLength = 0;
std::map<char, int> charMap;
cout << "\nString: " << s << endl;
int n = s.length();
for (int i = 0, j = 0; j < n; j++)
{
cout << "\nCurrent: s[i("<<i<<")]=" << s[i] << " Current s[j("<<j<<")]=" << s[j] << endl;
if (charMap.find(s[j]) != charMap.end())
{
cout << "Contain:" << s[j] << " charMap["<< s[j] <<"]=" << charMap[s[j]] << " i=" << i << endl;
i = max(charMap[s[j]], i);
}
maxLength = max(maxLength, j - i + 1);
cout << "Max Length:" << maxLength << endl;
charMap[s[j]] = j + 1;
printMap(charMap);
}
return maxLength;
}
};
  • 控制台输出
String: abcdefggadvopesgacbesmqrs

Current: s[i(0)]=a Current s[j(0)]=a
Max Length:1
a
1


Current: s[i(0)]=a Current s[j(1)]=b
Max Length:2
ab
1 2


Current: s[i(0)]=a Current s[j(2)]=c
Max Length:3
abc
1 2 3


Current: s[i(0)]=a Current s[j(3)]=d
Max Length:4
abcd
1 2 3 4


Current: s[i(0)]=a Current s[j(4)]=e
Max Length:5
abcde
1 2 3 4 5


Current: s[i(0)]=a Current s[j(5)]=f
Max Length:6
abcdef
1 2 3 4 5 6


Current: s[i(0)]=a Current s[j(6)]=g
Max Length:7
abcdefg
1 2 3 4 5 6 7


Current: s[i(0)]=a Current s[j(7)]=g
Contain:g charMap[g]=7 i=0
Max Length:7
abcdefg
1 2 3 4 5 6 8


Current: s[i(7)]=g Current s[j(8)]=a
Contain:a charMap[a]=1 i=7
Max Length:7
abcdefg
9 2 3 4 5 6 8


Current: s[i(7)]=g Current s[j(9)]=d
Contain:d charMap[d]=4 i=7
Max Length:7
abcdefg
9 2 3 10 5 6 8


Current: s[i(7)]=g Current s[j(10)]=v
Max Length:7
abcdefgv
9 2 3 10 5 6 8 11


Current: s[i(7)]=g Current s[j(11)]=o
Max Length:7
abcdefgov
9 2 3 10 5 6 8 12 11


Current: s[i(7)]=g Current s[j(12)]=p
Max Length:7
abcdefgopv
9 2 3 10 5 6 8 12 13 11


Current: s[i(7)]=g Current s[j(13)]=e
Contain:e charMap[e]=5 i=7
Max Length:7
abcdefgopv
9 2 3 10 14 6 8 12 13 11


Current: s[i(7)]=g Current s[j(14)]=s
Max Length:8
abcdefgopsv
9 2 3 10 14 6 8 12 13 15 11


Current: s[i(7)]=g Current s[j(15)]=g
Contain:g charMap[g]=8 i=7
Max Length:8
abcdefgopsv
9 2 3 10 14 6 16 12 13 15 11


Current: s[i(8)]=a Current s[j(16)]=a
Contain:a charMap[a]=9 i=8
Max Length:8
abcdefgopsv
17 2 3 10 14 6 16 12 13 15 11


Current: s[i(9)]=d Current s[j(17)]=c
Contain:c charMap[c]=3 i=9
Max Length:9
abcdefgopsv
17 2 18 10 14 6 16 12 13 15 11


Current: s[i(9)]=d Current s[j(18)]=b
Contain:b charMap[b]=2 i=9
Max Length:10
abcdefgopsv
17 19 18 10 14 6 16 12 13 15 11


Current: s[i(9)]=d Current s[j(19)]=e
Contain:e charMap[e]=14 i=9
Max Length:10
abcdefgopsv
17 19 18 10 20 6 16 12 13 15 11


Current: s[i(14)]=s Current s[j(20)]=s
Contain:s charMap[s]=15 i=14
Max Length:10
abcdefgopsv
17 19 18 10 20 6 16 12 13 21 11


Current: s[i(15)]=g Current s[j(21)]=m
Max Length:10
abcdefgmopsv
17 19 18 10 20 6 16 22 12 13 21 11


Current: s[i(15)]=g Current s[j(22)]=q
Max Length:10
abcdefgmopqsv
17 19 18 10 20 6 16 22 12 13 23 21 11


Current: s[i(15)]=g Current s[j(23)]=r
Max Length:10
abcdefgmopqrsv
17 19 18 10 20 6 16 22 12 13 23 24 21 11


Current: s[i(15)]=g Current s[j(24)]=s
Contain:s charMap[s]=21 i=15
Max Length:10
abcdefgmopqrsv
17 19 18 10 20 6 16 22 12 13 23 24 25 11

这个算法是有点难理解的,算法并没有实际去查找最长不重复字符字串具体是什么,而是只求当前遍历的字符距离上一个重复的字符的最大距离。

解题总结

  • set 集合是一种关联容器。
  • set 集合的特点:1. 元素是唯一的,没有重复的;2. 集合中的元素是排好序的。
  • 提交代码时须注释掉 console 输出,否则即便是优化后的算法也会超时。 非常重要!!!

总结

现在还是欠缺算法时间复杂度的分析,还是不能用数学公式明确地表示出算法时间复杂度,算法分析仍然需要多练习,虽然有点挫败,慢慢来。买书学习是一定的了。

支持一下
扫一扫,支持forsigner
  • 微信扫一扫
  • 支付宝扫一扫