23 Commits

Author SHA1 Message Date
Tomasz Sowa 987d9c845c declare esc_to_csv() method with a wstring 2023-05-27 18:18:00 +02:00
Tomasz Sowa 96a3a564cf add a virtual destructor to BaseParser 2022-12-23 04:38:03 +01:00
Tomasz Sowa 379adf6a69 allow parsing a time with a decimal fraction in ParseTime() method
while here:
- let ParseDate() parse formats without a separator, e.g. "20081012",
  and without the month or day, e.g. "2008" or "200810"
- let ParseTime() parse a time without separators, e.g.
  "141030", "1410" or just "14"
- let the Parse(...) method use ParseDate() and ParseTime();
  this will parse a format similar to ISO 8601
2022-12-23 02:15:11 +01:00
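The compact date formats listed in this commit can be illustrated with a minimal sketch. This is not the library's ParseDate(); the type CompactDate and the function parse_compact_date() are hypothetical names, and defaulting a missing month or day to 1 is an assumption.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type, not the library's date class.
struct CompactDate
{
    int year = 0;
    int month = 1; // assumed default when the month is absent
    int day = 1;   // assumed default when the day is absent
};

// Parses "YYYY", "YYYYMM" or "YYYYMMDD" (digits only, no separators).
// Returns false for any other length, for non-digit characters,
// or when the month/day fall outside the basic 1..12 / 1..31 ranges.
bool parse_compact_date(const std::string & str, CompactDate & out)
{
    if( str.size() != 4 && str.size() != 6 && str.size() != 8 )
        return false;

    for(char c : str)
    {
        if( !std::isdigit(static_cast<unsigned char>(c)) )
            return false;
    }

    CompactDate d;
    d.year = std::stoi(str.substr(0, 4));

    if( str.size() >= 6 )
        d.month = std::stoi(str.substr(4, 2));

    if( str.size() == 8 )
        d.day = std::stoi(str.substr(6, 2));

    if( d.month < 1 || d.month > 12 || d.day < 1 || d.day > 31 )
        return false;

    out = d;
    return true;
}
```

The sketch does not validate the day against the month length; a real parser would likely do so.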
Tomasz Sowa 3b3c04b85d fix: rename Toul -> to_ul in PatternReplacer 2022-11-16 16:14:16 +01:00
Tomasz Sowa b3137a7607 rename functions for converting strings to integers to snake case
while here:
- add some functions taking std::string/std::wstring
2022-11-14 03:20:17 +01:00
Tomasz Sowa 663233fe2a make all utf8/wide functions available just by including utf8/utf8.h
while here:
- remove utf8/utf8_stream.h; now only utf8/utf8.h needs to be included
- add some new methods for converting from a utf8 stream to wide stream/string
- make some improvements in TextStream:
  - don't use temporary objects to convert utf8/wide
  - add put_stream() which takes TextStreamBase<> as its argument
    (uses an iterator instead of get_char() for reading)
  - let operator<<(const Space & space) serialize to json and not to Space
2022-07-30 03:31:18 +02:00
Tomasz Sowa aa97fe2811 add methods for trimming \r\n from the end of a string
add:
void trim_last_new_lines(std::string & str, bool check_carriage_return_too = true);
void trim_last_new_lines(std::wstring & str, bool check_carriage_return_too = true);
2022-07-30 02:43:29 +02:00
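A minimal sketch of what such a function might look like, assuming it removes every trailing '\n' and, when the flag is set, every trailing '\r' as well. The name trim_last_new_lines_sketch() is hypothetical and the body is not taken from the library.

```cpp
#include <cstddef>
#include <string>

// Sketch only: strips trailing '\n' characters and, if
// check_carriage_return_too is true, trailing '\r' characters too.
void trim_last_new_lines_sketch(std::string & str, bool check_carriage_return_too = true)
{
    std::size_t len = str.size();

    while( len > 0 )
    {
        char c = str[len - 1];

        if( c == '\n' || (check_carriage_return_too && c == '\r') )
            len -= 1;
        else
            break;
    }

    // erase(len) removes everything from index len to the end
    str.erase(len);
}
```

With these assumptions, "abc\r\n\n" becomes "abc", while calling the function with the flag set to false on "abc\r\n" leaves "abc\r".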
Tomasz Sowa d13c10c604 add methods for converting from hex string to bytes
add to convert/text.h:
template<typename HexStringPointerType, typename BytesStringType>
bool hex_string_pointer_to_bytes(const HexStringPointerType * hex_string, BytesStringType & bytes, bool clear_bytes = true);

template<typename HexStringType, typename BytesStringType>
bool hex_string_to_bytes(const HexStringType & hex_string, BytesStringType & bytes, bool clear_bytes = true);
2022-07-26 05:14:35 +02:00
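The following sketch shows one way a hex-to-bytes conversion with this kind of signature could work. The names hex_digit_value_sketch() and hex_string_to_bytes_sketch() are hypothetical, and the error handling (returning false on an odd length or a non-hex character) is an assumption rather than the library's documented behavior.

```cpp
#include <cstddef>
#include <string>

// Returns 0..15 for a hex digit, or -1 for any other character.
inline int hex_digit_value_sketch(wchar_t c)
{
    if( c >= '0' && c <= '9' ) return c - '0';
    if( c >= 'a' && c <= 'f' ) return c - 'a' + 10;
    if( c >= 'A' && c <= 'F' ) return c - 'A' + 10;
    return -1;
}

// Sketch only: converts pairs of hex digits to bytes appended to 'bytes'.
// On failure 'bytes' may hold the bytes converted before the error.
template<typename HexStringType, typename BytesStringType>
bool hex_string_to_bytes_sketch(const HexStringType & hex_string, BytesStringType & bytes, bool clear_bytes = true)
{
    if( clear_bytes )
        bytes.clear();

    if( hex_string.size() % 2 != 0 )
        return false;

    for(std::size_t i = 0 ; i < hex_string.size() ; i += 2)
    {
        int high = hex_digit_value_sketch(hex_string[i]);
        int low = hex_digit_value_sketch(hex_string[i + 1]);

        if( high < 0 || low < 0 )
            return false;

        bytes.push_back(static_cast<typename BytesStringType::value_type>(high * 16 + low));
    }

    return true;
}
```

Because both string types are template parameters, the same sketch accepts std::string or std::wstring input.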
Tomasz Sowa b81daf9fb6 set 2-Clause BSD licence in *.cpp files 2022-06-30 13:44:21 +02:00
Tomasz Sowa 74230d667b change headerfile_picotools_* macros to headerfile_pikotools_* 2022-06-30 12:45:08 +02:00
Tomasz Sowa cadba907b2 change licence from 3-Clause BSD to 2-Clause BSD 2022-06-30 12:09:22 +02:00
Tomasz Sowa 68fe25c8bf add limits when parsing a json/space format
while here:
- add column index error
- add parsing methods with pt::TextStream and pt::WTextStream arguments
2022-05-30 01:01:14 +02:00
Tomasz Sowa c3b7ab5793 add min_width parameter to methods converting int to string 2022-05-28 06:06:32 +02:00
Tomasz Sowa ac3c59323b add methods: try_esc_to_json(wchar_t val, stream), try_esc_to_xml(...), try_esc_to_csv(...)
These methods return true if the character val was escaped and put
to the out stream. If the character is invalid for such a stream,
they return true without putting it to the stream.
2022-02-04 14:19:54 +01:00
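The return-value contract described in this commit can be sketched as follows. The function names are hypothetical, a std::wstring stands in for the output stream, and treating the NUL character as "invalid" is purely an assumption made for illustration.

```cpp
#include <string>

// Sketch only. Returns true when the caller must not write 'val' itself:
// either an escaped form was appended to 'out', or the character was
// considered invalid and silently dropped. Returns false when the
// character needs no escaping and the caller should write it as-is.
bool try_esc_to_csv_sketch(wchar_t val, std::wstring & out)
{
    if( val == L'"' )
    {
        out += L"\"\""; // CSV escapes a quote by doubling it
        return true;
    }

    if( val == 0 )
    {
        return true; // assumed invalid: consumed but not written
    }

    return false;
}

// A caller built on the helper above: write the character only
// when the helper reports that it did not handle it.
std::wstring esc_to_csv_sketch(const std::wstring & in)
{
    std::wstring out;

    for(wchar_t c : in)
    {
        if( !try_esc_to_csv_sketch(c, out) )
            out += c;
    }

    return out;
}
```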
Tomasz Sowa 6b97b1b74a fix: correctly escape json/xml/csv wide strings
A wide string was first converted to utf-8 and then escaped to json/xml/csv,
which is incorrect. It should be escaped first and then converted to utf-8.

Add TextStreamBase<>::iterator and TextStreamBase<>::const_iterator as classes
with a method wchar_t get_unicode_and_advance(const iterator & end)
to return one character either from utf-8 stream or from wide stream.

Let TextStreamBase<>::operator<<(wchar_t v) correctly use utf-8.
2022-02-03 19:08:21 +01:00
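The order of operations described in this fix (escape each wide character first, then encode to utf-8) can be sketched for the json case. The functions append_utf8_sketch() and esc_json_to_utf8_sketch() are hypothetical names, not the library's code, and the sketch assumes each wchar_t holds a whole code point (no surrogate pairs).

```cpp
#include <string>

// Appends the utf-8 encoding of code point cp (assumed <= 0x10FFFF).
void append_utf8_sketch(unsigned long cp, std::string & out)
{
    if( cp < 0x80 )
    {
        out += static_cast<char>(cp);
    }
    else if( cp < 0x800 )
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if( cp < 0x10000 )
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Escapes per wide character, then encodes whatever is left to utf-8.
std::string esc_json_to_utf8_sketch(const std::wstring & in)
{
    static const char hex[] = "0123456789abcdef";
    std::string out;

    for(wchar_t c : in)
    {
        unsigned long cp = static_cast<unsigned long>(c);

        if( c == L'"' )       out += "\\\"";
        else if( c == L'\\' ) out += "\\\\";
        else if( c == L'\n' ) out += "\\n";
        else if( c == L'\r' ) out += "\\r";
        else if( c == L'\t' ) out += "\\t";
        else if( cp < 0x20 )
        {
            // remaining control characters use the \u00XX form
            out += "\\u00";
            out += hex[(cp >> 4) & 0xF];
            out += hex[cp & 0xF];
        }
        else
        {
            append_utf8_sketch(cp, out);
        }
    }

    return out;
}
```

Escaping at the wide-character level means the escaping step always sees whole characters and never the individual bytes of a multi-byte utf-8 sequence.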
Tomasz Sowa 17d2c0fb25 - added some converting methods: esc_to_json(...), esc_to_xml(...), esc_to_csv() (convert/misc.h)
- BaseParser: added possibility to read from TextStream and WTextStream
- HTMLParser: added filter(const WTextStream & in, Stream & out, ...) method
- added utf8_stream.h with one method:
  template<typename StreamIteratorType>
  size_t utf8_to_int(
    StreamIteratorType & iterator_in,
    StreamIteratorType & iterator_end,
    int & res,
    bool & correct)
2021-10-12 19:53:11 +02:00
Tomasz Sowa 7ce07c57f5 added a base class for parsers: BaseParser (convert/baseparser.h|cpp)
it contains methods for reading from strings/files;
  those methods were moved from SpaceParser and CSVParser
fixed: CSVParser didn't set input_as_utf8 flag
2021-07-17 14:38:22 +02:00
Tomasz Sowa 198945c97b PatternReplacerBase: to_string() changed to to_str() 2021-07-06 21:42:42 +02:00
Tomasz Sowa 8997284b16 added trim(...) functions to convert/text.h
void trim_first_white(std::string & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);
void trim_first_white(std::wstring & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);

void trim_last_white(std::string & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);
void trim_last_white(std::wstring & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);

void trim_white(std::string & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);
void trim_white(std::wstring & str, bool check_additional_chars = true, bool treat_new_line_as_white = true);

void trim_first(std::string & str, wchar_t c);
void trim_first(std::wstring & str, wchar_t c);

void trim_last(std::string & str, wchar_t c);
void trim_last(std::wstring & str, wchar_t c);

void trim(std::string & str, wchar_t c);
void trim(std::wstring & str, wchar_t c);
2021-06-29 23:23:35 +02:00
Tomasz Sowa 3c0b59e115 added to Space: long double to Space::Value and methods for converting from/to long double
added global methods for converting float/string, double/string and long double/string (convert/double.h|cpp):
      float to_float(const char * str, const char ** after = nullptr);
      float to_float(const wchar_t * str, const wchar_t ** after = nullptr);
      double to_double(const char * str, const char ** after = nullptr);
      double to_double(const wchar_t * str, const wchar_t ** after = nullptr);
      long double to_long_double(const char * str, const char ** after = nullptr);
      long double to_long_double(const wchar_t * str, const wchar_t ** after = nullptr);
      float to_float(const std::string & str, const char ** after = nullptr);
      float to_float(const std::wstring & str, const wchar_t ** after = nullptr);
      double to_double(const std::string & str, const char ** after = nullptr);
      double to_double(const std::wstring & str, const wchar_t ** after = nullptr);
      long double to_long_double(const std::string & str, const char ** after = nullptr);
      long double to_long_double(const std::wstring & str, const wchar_t ** after = nullptr);
      std::string to_str(float val);
      std::wstring to_wstr(float val);
      std::string to_str(double val);
      std::wstring to_wstr(double val);
      std::string to_str(long double val);
      std::wstring to_wstr(long double val);
2021-06-23 17:01:43 +02:00
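The optional after parameter in the declarations above resembles the end pointer of the C function std::strtod(). A minimal sketch under that assumption (the name to_double_sketch() is hypothetical, and the library may implement the conversion differently):

```cpp
#include <cstdlib>

// Sketch only: converts the leading part of str to a double.
// If 'after' is not null, it receives a pointer to the first
// character that was not consumed by the conversion.
double to_double_sketch(const char * str, const char ** after = nullptr)
{
    char * end = nullptr;
    double val = std::strtod(str, &end);

    if( after )
        *after = end;

    return val;
}
```

A caller can compare the returned pointer with the start of the input to detect that nothing was parsed. Note that std::strtod() depends on the current C locale for the decimal separator.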
Tomasz Sowa b574289054 namespace PT renamed to pt 2021-05-20 16:11:12 +02:00
Tomasz Sowa 7abe4b340a changes in convert/text functions
- changed function names: PascalCase to snake_case
- template functions moved to a separate file (text_private.h)
- only functions taking char/wchar_t/std::string/std::wstring are available as the public api
- ToLower(...) changed to to_lower_emplace(...), similarly ToUpper(...) to to_upper_emplace(...)
- added functions:
  std::string to_lower(const std::string & str);
  std::string to_upper(const std::string & str);
  and with std::wstring too
- the 'NoCase' suffix in function names changed to 'nc'
2021-05-10 20:04:12 +02:00
Tomasz Sowa 3984c29fbf moved all directories to src subdirectory 2021-05-09 20:11:37 +02:00