Tomasz Sowa
379adf6a69
allow parsing a decimal fraction of the time in the ParseTime() method
while here:
- let ParseDate() parse formats without a separator, e.g. "20081012",
and without the month or day, e.g. "2008" or "200810"
- let ParseTime() parse a time without separators, e.g.
"141030", "1410" or just "14"
- let the Parse(...) method use ParseDate() and ParseTime();
this will parse a format similar to ISO 8601
2022-12-23 02:15:11 +01:00
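The separator-free formats listed in the commit above can be illustrated with a small standalone sketch. It is not the pikotools implementation: the `CompactDate` and `CompactTime` types and the `parse_compact_date()` and `parse_compact_time()` functions are hypothetical names introduced here, and defaulting the omitted month and day to 1 (and omitted minutes and seconds to 0) is an assumption. It needs C++17 for `std::optional`.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result types for this sketch; they are not pikotools types.
struct CompactDate { int year = 0; int month = 1; int day = 1; };
struct CompactTime { int hour = 0; int min = 0; int sec = 0; };

// True when the text is non-empty and consists of ASCII digits only.
static bool only_digits(const std::string & text)
{
    if( text.empty() )
        return false;

    for(unsigned char c : text)
    {
        if( !std::isdigit(c) )
            return false;
    }

    return true;
}

// Reads `len` digits starting at `pos` as a decimal number.
static int read_field(const std::string & text, std::size_t pos, std::size_t len)
{
    return std::stoi(text.substr(pos, len));
}

// Accepts "YYYY", "YYYYMM" or "YYYYMMDD"; a missing month or day stays at 1 (assumption).
// The day is only checked against 1..31, not against the month length.
std::optional<CompactDate> parse_compact_date(const std::string & text)
{
    const std::size_t n = text.size();

    if( !only_digits(text) || (n != 4 && n != 6 && n != 8) )
        return std::nullopt;

    CompactDate date;
    date.year = read_field(text, 0, 4);

    if( n >= 6 )
        date.month = read_field(text, 4, 2);

    if( n == 8 )
        date.day = read_field(text, 6, 2);

    if( date.month < 1 || date.month > 12 || date.day < 1 || date.day > 31 )
        return std::nullopt;

    return date;
}

// Accepts "HH", "HHMM" or "HHMMSS"; missing minutes or seconds stay at 0 (assumption).
std::optional<CompactTime> parse_compact_time(const std::string & text)
{
    const std::size_t n = text.size();

    if( !only_digits(text) || (n != 2 && n != 4 && n != 6) )
        return std::nullopt;

    CompactTime time;
    time.hour = read_field(text, 0, 2);

    if( n >= 4 )
        time.min = read_field(text, 2, 2);

    if( n == 6 )
        time.sec = read_field(text, 4, 2);

    if( time.hour > 23 || time.min > 59 || time.sec > 59 )
        return std::nullopt;

    return time;
}
```

With this sketch, `parse_compact_date("200810")` yields year 2008 and month 10 with the day left at its default, while input containing a separator such as "2008-10" is rejected; decimal fractions and time zone offsets are deliberately left out.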
Tomasz Sowa
74230d667b
change headerfile_picotools_* macros to headerfile_pikotools_*
2022-06-30 12:45:08 +02:00
Tomasz Sowa
cadba907b2
change licence from 3-Clause BSD to 2-Clause BSD
2022-06-30 12:09:22 +02:00
Tomasz Sowa
44bda888b5
fix: do not unescape xml sequences in filter mode
2022-06-01 05:17:30 +02:00
Tomasz Sowa
5253963c84
fix: put a whitespace character before an opening tag in tree mode if there was one in the source html
2022-02-08 16:34:54 +01:00
Tomasz Sowa
fd1a8270cd
read CDATA as ordinary text
2022-01-18 19:36:40 +01:00
Tomasz Sowa
b781948f21
HTMLParser now correctly parses these entities: &amp;amp; &amp;lt; &amp;gt; &amp;quot; &amp;apos;
2021-12-02 17:44:41 +01:00
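As an illustration of what decoding the five predefined entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`) involves, here is a minimal standalone sketch. It is not HTMLParser's code: `decode_basic_entities()` and the `Entity` struct are hypothetical names, and the choice to copy unknown entities through unchanged is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for this sketch; not part of HTMLParser.
struct Entity
{
    const char * name;
    char value;
};

// Replaces the five predefined entities with their characters in a single pass.
// Anything else that starts with '&' is copied through unchanged.
std::string decode_basic_entities(const std::string & in)
{
    static const Entity entities[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'}, {"&quot;", '"'}, {"&apos;", '\''}
    };

    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;

    while( i < in.size() )
    {
        bool matched = false;

        if( in[i] == '&' )
        {
            for(const Entity & e : entities)
            {
                const std::size_t len = std::char_traits<char>::length(e.name);

                // compare() clamps the range, so a truncated entity never matches
                if( in.compare(i, len, e.name) == 0 )
                {
                    out += e.value;
                    i += len;
                    matched = true;
                    break;
                }
            }
        }

        if( !matched )
        {
            out += in[i];
            ++i;
        }
    }

    return out;
}
```

Because the input is scanned once and decoded output is never rescanned, `&amp;lt;` becomes the text `&lt;` rather than `<`.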
Tomasz Sowa
2dadfc0809
added: HTMLParser::ItemParsedListener, a listener whose item_parsed(...) method is called when the parser has parsed a tag
2021-11-30 16:27:27 +01:00
Tomasz Sowa
17d2c0fb25
- added some conversion methods: esc_to_json(...), esc_to_xml(...), esc_to_csv() (convert/misc.h)
- BaseParser: added the ability to read from TextStream and WTextStream
- HTMLParser: added filter(const WTextStream & in, Stream & out, ...) method
- added utf8_stream.h with one method:
  template<typename StreamIteratorType>
  size_t utf8_to_int(
      StreamIteratorType & iterator_in,
      StreamIteratorType & iterator_end,
      int & res,
      bool & correct)
2021-10-12 19:53:11 +02:00
Tomasz Sowa
f23cabfb2f
added to HTMLParser: filter_file(...) methods for filtering from a file
2021-10-02 20:34:19 +02:00
Tomasz Sowa
2576eb12d1
HTMLParser: start working on xml mode
added methods:
Status parse_xml_file(const char * file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const std::string & file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const wchar_t * file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
Status parse_xml_file(const std::wstring & file_name, Space & out_space, bool compact_mode = false, bool clear_space = true);
2021-08-10 21:56:04 +02:00
Tomasz Sowa
b1cc64a29b
added a compact_mode option when creating a space output
2021-08-10 01:45:10 +02:00
Tomasz Sowa
b8a03bf852
HTMLParser: added the ability to parse html into a Space object
added method: HTMLParser::parse_html(const wchar_t * in, Space & space)
2021-08-07 21:21:16 +02:00
Tomasz Sowa
8c5ede5cf3
HTMLParser: for <script> and <!-- (comments) we copy the content without parsing
2021-08-07 02:13:13 +02:00
Tomasz Sowa
fdfd0b1385
renamed: HTMLFilter -> HTMLParser
2021-08-06 17:10:19 +02:00