Tomasz Sowa
|
5253963c84
|
fix: put a whitespace character before an opening tag in tree mode if one was present in the source HTML
|
2022-02-08 16:34:54 +01:00 |
Tomasz Sowa
|
0100c7e453
|
fix: correctly check for newlines when filtering HTML
|
2022-02-08 14:52:50 +01:00 |
Tomasz Sowa
|
fd1a8270cd
|
read CDATA as ordinary text
|
2022-01-18 19:36:40 +01:00 |
Tomasz Sowa
|
b781948f21
|
HTMLParser now correctly parses these entities: &amp;amp; &amp;lt; &amp;gt; &amp;quot; &amp;apos; (that is, the characters & < > " ')
|
2021-12-02 17:44:41 +01:00 |
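The entity handling named in commit b781948f21 can be illustrated with a small standalone function. This is only a sketch of the general technique, not the code HTMLParser uses: the name `decode_basic_entities` and the table layout are hypothetical, and it assumes the commit refers to the five predefined entities (amp, lt, gt, quot, apos).

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not part of HTMLParser: replaces the five predefined
// entities with the characters they stand for. Everything else, including
// unknown entities and a lone '&', is copied unchanged.
std::string decode_basic_entities(const std::string & in)
{
    struct Entity { const char * name; char ch; };

    static const Entity table[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'}, {"&quot;", '"'}, {"&apos;", '\''}
    };

    std::string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); )
    {
        bool matched = false;

        if (in[i] == '&')
        {
            for (const Entity & e : table)
            {
                std::size_t len = std::strlen(e.name);

                // compare() only reports equality if the whole entity name fits
                if (in.compare(i, len, e.name) == 0)
                {
                    out += e.ch;
                    i += len;
                    matched = true;
                    break;
                }
            }
        }

        if (!matched)
            out += in[i++];
    }

    return out;
}
```

The input is scanned once, so already-decoded output is never decoded again: `&amp;lt;` becomes the text `&lt;`, not `<`.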
Tomasz Sowa
|
2dadfc0809
|
added: HTMLParser::ItemParsedListener, a listener with an item_parsed(...) method that is called whenever the parser has parsed a tag
|
2021-11-30 16:27:27 +01:00 |
Tomasz Sowa
|
c54c398828
|
fixed in HTMLParser: </nofilter> tag was printed
|
2021-10-13 00:40:55 +02:00 |
Tomasz Sowa
|
17d2c0fb25
|
- added conversion methods: esc_to_json(...), esc_to_xml(...), esc_to_csv() (convert/misc.h)
- BaseParser: added the ability to read from TextStream and WTextStream
- HTMLParser: added a filter(const WTextStream & in, Stream & out, ...) method
- added utf8_stream.h with one method:
  template<typename StreamIteratorType>
  size_t utf8_to_int(
    StreamIteratorType & iterator_in,
    StreamIteratorType & iterator_end,
    int & res,
    bool & correct)
|
2021-10-12 19:53:11 +02:00 |
Tomasz Sowa
|
4902eb6037
|
fixed: in HTMLParser::CheckClosingTags() don't return immediately if stack_len is equal to 2
|
2021-10-03 13:22:49 +02:00 |
Tomasz Sowa
|
abe349be34
|
small refactoring in HTMLParser
|
2021-10-02 21:01:09 +02:00 |
Tomasz Sowa
|
f23cabfb2f
|
added to HTMLParser: filter_file(...) methods for filtering from a file
|
2021-10-02 20:34:19 +02:00 |
Tomasz Sowa
|
5b2583b566
|
fixed in HTMLParser: sometimes a closing item was left on the stack; for stack_len < 3, PopStack() was not called
|
2021-10-02 18:45:02 +02:00 |
Tomasz Sowa
|
2576eb12d1
|
HTMLParser: started working on XML mode
added methods:
Status parse_xml_file(const char * file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const std::string & file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const wchar_t * file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const std::wstring & file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
|
2021-08-10 21:56:04 +02:00 |
Tomasz Sowa
|
b1cc64a29b
|
added a compact_mode option when creating a space output
|
2021-08-10 01:45:10 +02:00 |
Tomasz Sowa
|
b8a03bf852
|
HTMLParser: added the ability to parse HTML into a Space object
added method: HTMLParser::parse_html(const wchar_t * in, Space & space)
|
2021-08-07 21:21:16 +02:00 |
Tomasz Sowa
|
8c5ede5cf3
|
HTMLParser: the content of <script> and <!-- (comments) is copied without parsing
|
2021-08-07 02:13:13 +02:00 |
Tomasz Sowa
|
fdfd0b1385
|
renamed: HTMLFilter -> HTMLParser
|
2021-08-06 17:10:19 +02:00 |