Read the whole character from a multibyte string (as int/char32_t) and
then check whether it needs to be escaped. Also don't use a temporary
stream object when serializing between wide/char strings.
while here:
- add try_esc_to_space(...) global function
- add wide_to_output_function(const wchar_t * str, size_t len, OutputFunction output_function, int mode)
- add wide_to_output_function(const wchar_t * str, OutputFunction output_function, int mode)
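A minimal sketch of the "decode the whole character, then decide about escaping" idea described above. All names here (sketch_utf8_to_int, sketch_try_esc_to_html, sketch_escape_html_utf8) are hypothetical and use std::string instead of the library's stream types; the real functions may differ in signatures and behaviour.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the library's API): decodes one code point from
// the UTF-8 string `in` starting at index i. Returns the number of bytes
// consumed (always >= 1); `ok` is false for malformed or truncated input.
// Overlong forms and surrogates are not rejected in this sketch.
std::size_t sketch_utf8_to_int(const std::string & in, std::size_t i, char32_t & res, bool & ok)
{
    unsigned char lead = static_cast<unsigned char>(in[i]);
    std::size_t len;

    ok = false;

    if (lead < 0x80)                { res = lead;        len = 1; }
    else if ((lead & 0xE0) == 0xC0) { res = lead & 0x1F; len = 2; }
    else if ((lead & 0xF0) == 0xE0) { res = lead & 0x0F; len = 3; }
    else if ((lead & 0xF8) == 0xF0) { res = lead & 0x07; len = 4; }
    else { res = 0xFFFD; return 1; } // invalid lead byte

    if (i + len > in.size()) { res = 0xFFFD; return 1; } // truncated sequence

    for (std::size_t k = 1; k < len; ++k) {
        unsigned char next = static_cast<unsigned char>(in[i + k]);
        if ((next & 0xC0) != 0x80) { res = 0xFFFD; return 1; } // not a continuation byte
        res = (res << 6) | (next & 0x3F);
    }

    ok = true;
    return len;
}

// Hypothetical escaping rule: appends an entity and returns true
// only when the whole character c has to be escaped.
bool sketch_try_esc_to_html(char32_t c, std::string & out)
{
    switch (c) {
    case U'<': out += "&lt;";   return true;
    case U'>': out += "&gt;";   return true;
    case U'&': out += "&amp;";  return true;
    case U'"': out += "&quot;"; return true;
    }
    return false;
}

// Decodes each character first, then either escapes it or copies
// its original bytes unchanged (malformed bytes are copied as they are).
std::string sketch_escape_html_utf8(const std::string & in)
{
    std::string out;
    std::size_t i = 0;

    while (i < in.size()) {
        char32_t c = 0;
        bool ok = false;
        std::size_t len = sketch_utf8_to_int(in, i, c, ok);

        if (!ok || !sketch_try_esc_to_html(c, out))
            out.append(in, i, len);

        i += len;
    }
    return out;
}
```

The output is appended directly to the destination string, so no temporary stream object is needed between decoding and escaping.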
Add an operator<<(char32_t) to the Stream class. char32_t will be used
as the main character type instead of wchar_t (this is needed on systems
where sizeof(wchar_t) is equal to 2, where one wchar_t cannot hold every
code point).
while here:
- add to utf8:
  size_t wide_to_int(const Stream & stream, size_t stream_index, int & res, bool & correct)
  template<typename StreamType, typename OutputFunction> bool wide_to_output_function(StreamType & buffer, OutputFunction output_function, int mode = 1)
  template<typename OutputFunction> bool wide_to_output_function_by_index(const Stream & stream, OutputFunction output_function, int mode)
- add to convert/misc:
  bool try_esc_to_tex(char32_t c, pt::Stream & out)
  bool try_esc_to_html(char32_t c, pt::Stream & out)
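Why char32_t is preferred over a 2-byte wchar_t can be illustrated with a short sketch: with 16-bit units a character outside the Basic Multilingual Plane is stored as a surrogate pair, so two units have to be combined before the character can be examined. The function below is hypothetical (it uses std::u16string to stand in for a 2-byte wchar_t string) and only loosely mirrors the shape of wide_to_int.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: reads one code point from 16-bit units starting at
// index i. Returns the number of units consumed (1 or 2). `correct` is set
// to false for an unpaired surrogate, in which case res is U+FFFD.
std::size_t sketch_utf16_to_int(const std::u16string & s, std::size_t i, char32_t & res, bool & correct)
{
    char16_t first = s[i];
    correct = true;

    if (first >= 0xD800 && first <= 0xDBFF) {
        // high surrogate: a low surrogate must follow
        if (i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            res = 0x10000
                + ((static_cast<char32_t>(first) - 0xD800) << 10)
                + (static_cast<char32_t>(s[i + 1]) - 0xDC00);
            return 2;
        }
        correct = false;
        res = 0xFFFD;
        return 1;
    }

    if (first >= 0xDC00 && first <= 0xDFFF) {
        // low surrogate without a preceding high surrogate
        correct = false;
        res = 0xFFFD;
        return 1;
    }

    res = first;
    return 1;
}
```

Once the value is held in a char32_t, escaping functions such as try_esc_to_html(char32_t, ...) can work on a complete character regardless of the platform's wchar_t size.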
while here:
- remove utf8/utf8_stream.h, now only utf8/utf8.h needs to be included
- add some new methods for converting from a utf8 stream to wide stream/string
- do some improvements in TextStream:
  - don't use temporary objects to convert utf8/wide
  - add put_stream() which takes TextStreamBase<> as its argument
    (uses an iterator instead of get_char() for reading)
  - let operator<<(const Space & space) serialize to json and not to the Space format
A wide string was first converted to utf-8 and then escaped to json/xml/csv,
which is incorrect. It should be escaped first and then converted to utf-8.
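The corrected order can be sketched as follows: each wide character is offered to the escaper first, and only characters that need no escaping are encoded as utf-8. The names and the small set of JSON escapes below are assumptions made for illustration, not the library's code.

```cpp
#include <string>

// Hypothetical helper: appends the utf-8 encoding of c to out.
// Assumes c is a valid code point (no range or surrogate checks).
void sketch_append_utf8(char32_t c, std::string & out)
{
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical escaper with an illustrative subset of JSON escapes.
// U+2028 shows a case that is only recognisable on whole characters:
// after utf-8 encoding it would be three separate bytes.
bool sketch_try_esc_to_json(char32_t c, std::string & out)
{
    switch (c) {
    case U'"':   out += "\\\"";    return true;
    case U'\\':  out += "\\\\";    return true;
    case U'\n':  out += "\\n";     return true;
    case 0x2028: out += "\\u2028"; return true;
    }
    return false;
}

// Escape first, then convert to utf-8.
std::string sketch_wide_to_json_utf8(const std::u32string & in)
{
    std::string out;

    for (char32_t c : in) {
        if (!sketch_try_esc_to_json(c, out))
            sketch_append_utf8(c, out);
    }
    return out;
}
```

Escaping on whole characters means the escaper never sees a fragment of a multibyte sequence, which is the problem with converting to utf-8 first.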
Add TextStreamBase<>::iterator and TextStreamBase<>::const_iterator classes
with a method wchar_t get_unicode_and_advance(const iterator & end)
which returns one character from either a utf-8 stream or a wide stream.
Let TextStreamBase<>::operator<<(wchar_t v) correctly use utf-8.
there are methods for reading from string/files there
those methods were moved from SpaceParser and CSVParser
fixed: CSVParser didn't set input_as_utf8 flag