[LeetCode] HTML Entity Parser

1410. HTML Entity Parser

An HTML entity parser is a parser that takes HTML code as input and replaces all the entities of the special characters with the characters themselves.

The special characters and their entities for HTML are:

  • Quotation Mark: the entity is &quot; and symbol character is ".
  • Single Quote Mark: the entity is &apos; and symbol character is '.
  • Ampersand: the entity is &amp; and symbol character is &.
  • Greater Than Sign: the entity is &gt; and symbol character is >.
  • Less Than Sign: the entity is &lt; and symbol character is <.
  • Slash: the entity is &frasl; and symbol character is /.

Given the input text string to the HTML parser, you have to implement the entity parser.

Return the text after replacing the entities with the special characters.
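The entity list above can be written down as a small lookup table. The following is only a minimal sketch: the names `Entity`, `kEntities` and `lookupEntity` are hypothetical and not part of the problem statement, and the function only answers whether a string is exactly one of the six entities.

```cpp
#include <string>

// Hypothetical record type: one entity and the character it stands for.
struct Entity {
    const char* name;
    char symbol;
};

// The six entities listed in the problem statement.
static const Entity kEntities[] = {
    {"&quot;", '"'},
    {"&apos;", '\''},
    {"&amp;", '&'},
    {"&gt;", '>'},
    {"&lt;", '<'},
    {"&frasl;", '/'},
};

// Returns the symbol for an exact entity name, or '\0' if the name is unknown.
char lookupEntity(const std::string& entity) {
    for (const Entity& e : kEntities) {
        if (entity == e.name) return e.symbol;
    }
    return '\0';
}
```

Anything that merely looks like an entity, such as `&ambassador;`, is not in the table and must be left untouched by the parser.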

#include <string>
#include <unordered_map>
using namespace std;

class Solution {
    // Returns true if key k occurs in s starting at position p.
    bool helper(const string& k, const string& s, int p) {
        for (int i = 0; i < (int)k.length(); i++) {
            if (p + i >= (int)s.length()) return false; // ran past the end of s
            if (s[p + i] != k[i]) return false;
        }
        return true;
    }
public:
    string entityParser(string text) {
        string res;
        int i = 0, n = text.length();
        // Each entity and the character it decodes to.
        unordered_map<string, string> mp;
        mp["&quot;"] = "\"";
        mp["&apos;"] = "'";
        mp["&amp;"] = "&";
        mp["&gt;"] = ">";
        mp["&lt;"] = "<";
        mp["&frasl;"] = "/";
        while (i < n) {
            bool eq = false;
            for (auto& [k, v] : mp) {
                if (helper(k, text, i)) {
                    res += v;
                    i += k.length();
                    eq = true;
                    break; // at most one entity can start at a given position
                }
            }
            // No entity starts here: copy the character through unchanged.
            if (!eq) {
                res.push_back(text[i++]);
            }
        }
        return res;
    }
};
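As a variation on the solution above, the table only needs to be probed at positions that hold an `&`, and `std::string::compare` can do the bounded comparison in place of a hand-written helper. This is a sketch under those assumptions, not the author's code; the function name `decodeEntities` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical free function: decodes the six entities in one left-to-right scan.
std::string decodeEntities(const std::string& text) {
    static const std::vector<std::pair<std::string, char>> entities = {
        {"&quot;", '"'}, {"&apos;", '\''}, {"&amp;", '&'},
        {"&gt;", '>'},   {"&lt;", '<'},    {"&frasl;", '/'},
    };

    std::string out;
    out.reserve(text.size());

    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {  // every entity starts with '&'
            for (const auto& e : entities) {
                // compare(pos, len, str) clamps len to the characters left in
                // text, so a truncated entity at the end simply fails to match.
                if (text.compare(i, e.first.size(), e.first) == 0) {
                    out.push_back(e.second);
                    i += e.first.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out.push_back(text[i++]);  // ordinary character
    }
    return out;
}
```

Because decoded characters are appended to `out` and never rescanned, an input such as `&amp;gt;` becomes `&gt;` rather than `>`. Each position is compared against at most six short keys, so the running time is linear in the length of the text.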
Author: Song Hayoung
Link: https://songhayoung.github.io/2022/07/21/PS/LeetCode/html-entity-parser/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.