Introduction to RapidXml


Source: http://rapidxml.sourceforge.net/manual.html

RapidXml is an attempt to create the fastest XML DOM parser possible, while retaining usability, portability and reasonable W3C compatibility. It is an in-situ parser written in C++, with parsing speed approaching that of the strlen() function executed on the same data.
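"In-situ" means the parser works directly on the source buffer instead of copying strings out of it. The following is a minimal sketch of that idea for a toy single-element input. `InSituElement` and `parse_in_situ` are hypothetical names used for illustration only; they are not part of RapidXml, and the real parser handles far more than this.

```cpp
#include <cstring>

// Result of the toy parse. Both pointers point INTO the caller's buffer;
// nothing is copied or allocated.
struct InSituElement {
    char* name;   // element name, zero-terminated in place
    char* value;  // text content, zero-terminated in place
};

// Hypothetical helper (not a RapidXml function): parses one
// "<name>text</name>" element in place. Returns false on malformed input.
// On success the buffer is modified: the '>' after the name and the '<'
// of the closing tag are overwritten with '\0'.
// Simplification: the closing tag's name is not checked.
bool parse_in_situ(char* text, InSituElement& out) {
    if (text == nullptr || *text != '<') return false;

    char* name = text + 1;
    char* name_end = std::strchr(name, '>');
    if (name_end == nullptr || name_end == name) return false;

    char* value = name_end + 1;
    char* value_end = std::strchr(value, '<');
    if (value_end == nullptr) return false;

    // Terminate the strings inside the source buffer itself.
    *name_end = '\0';
    *value_end = '\0';

    out.name = name;
    out.value = value;
    return true;
}
```

Because the results are pointers into the original buffer, that buffer must be writable and must outlive every use of the parsed strings.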

The entire parser is contained in a single header file, so no building or linking is necessary. To use it you just need to copy the rapidxml.hpp file to a convenient place (such as your project directory), and include it where needed. You may also want to use the printing functions contained in the header rapidxml_print.hpp.

1.1 Dependencies And Compatibility

RapidXml has no dependencies other than a very small subset of the standard C++ library (<cassert>, <cstdlib>, <new> and <exception>, unless exceptions are disabled). It should compile on any reasonably conformant compiler, and was tested on Visual C++ 2003, Visual C++ 2005, Visual C++ 2008, gcc 3, gcc 4, and Comeau 4.3.3. Care was taken that no warnings are produced on these compilers, even with the highest warning levels enabled.

1.2 Character Types And Encodings

RapidXml is character type agnostic, and can work with both narrow and wide characters. The current version does not fully support UTF-16 or UTF-32, so use of wide characters is somewhat limited. However, it should successfully parse wchar_t strings containing UTF-16 or UTF-32 if the endianness of the data matches that of the machine. UTF-8 is fully supported, including all numeric character references, which are expanded into the appropriate UTF-8 byte sequences (unless you enable the parse_no_utf8 flag).
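To make "expanded into appropriate UTF-8 byte sequences" concrete, here is a minimal sketch of encoding a code point (for example the value from a reference such as &#8364;) as UTF-8. `append_utf8` is a hypothetical helper written for this illustration, not RapidXml's own code; rejecting surrogates and out-of-range values is a choice made in this sketch.

```cpp
#include <string>

// Hypothetical helper (not a RapidXml function): appends the UTF-8
// encoding of a Unicode code point to `out`. Returns false, leaving `out`
// untouched, for surrogates and for values above U+10FFFF.
bool append_utf8(unsigned long code, std::string& out) {
    if (code < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(code);
    } else if (code < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (code >> 6));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else if (code < 0x10000) {
        // Surrogate code points are not valid scalar values.
        if (code >= 0xD800 && code <= 0xDFFF) return false;
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (code >> 12));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else if (code <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (code >> 18));
        out += static_cast<char>(0x80 | ((code >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code & 0x3F));
    } else {
        return false;
    }
    return true;
}
```

The encoded form is at most four bytes, which is never longer than the shortest reference that can name the same code point, so this kind of expansion can be done inside the source buffer.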

Note that RapidXml performs no decoding: strings returned by the name() and value() functions will contain text encoded using the same encoding as the source file. RapidXml understands and expands the following character references: &apos; &amp; &quot; &lt; &gt; &#...; Other character references are not expanded.
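The expansion of the five named references can be sketched as an in-place rewrite, since each replacement is shorter than the reference it replaces. `expand_entities_in_place` is a hypothetical function for illustration only and does not reproduce RapidXml's implementation; numeric references (&#...;) are deliberately left out of this sketch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a RapidXml function): expands &apos; &amp;
// &quot; &lt; &gt; inside a writable, zero-terminated buffer and returns
// the new length. Unknown references are copied through unchanged.
std::size_t expand_entities_in_place(char* text) {
    struct Entity { const char* name; char ch; };
    static const Entity table[] = {
        {"&apos;", '\''}, {"&amp;", '&'}, {"&quot;", '"'},
        {"&lt;", '<'},    {"&gt;", '>'},
    };

    char* src = text;  // read position
    char* dst = text;  // write position, never ahead of src
    while (*src != '\0') {
        bool matched = false;
        if (*src == '&') {
            for (const Entity& e : table) {
                const std::size_t len = std::strlen(e.name);
                if (std::strncmp(src, e.name, len) == 0) {
                    *dst++ = e.ch;  // write the single replacement character
                    src += len;     // skip the whole reference
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) *dst++ = *src++;  // ordinary character: copy as-is
    }
    *dst = '\0';
    return static_cast<std::size_t>(dst - text);
}
```

Each reference is expanded exactly once, so the input `&amp;lt;` becomes the literal text `&lt;` rather than `<`.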

1.3 Error Handling

By default, RapidXml uses C++ exceptions to report errors. If this behaviour is undesirable, RAPIDXML_NO_EXCEPTIONS can be defined to suppress exception code. See the parse_error class and the parse_error_handler() function for more information.

1.4 Memory Allocation

RapidXml uses a special memory pool object to allocate nodes and attributes, because direct allocation using the new operator would be far too slow. Underlying memory allocations performed by the pool can be customized by use of the memory_pool::set_allocator() function. See class memory_pool for more information.
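The reason a pool beats per-object new can be seen in a minimal bump-allocator sketch: most allocations are just a pointer increment inside a large block, and everything is released at once. `BumpPool` below is a hypothetical class written for illustration; it is not RapidXml's memory_pool and does not mirror its interface.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration of pool allocation (not RapidXml's memory_pool).
// Memory is carved out of large blocks; individual allocations are never
// freed, and all blocks are released together when the pool is destroyed.
class BumpPool {
public:
    explicit BumpPool(std::size_t block_size = 4096)
        : block_size_(block_size), used_(0), capacity_(0) {}

    // Returns `size` bytes aligned for any fundamental type.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        // Round the current offset up to the next aligned position.
        std::size_t offset = (used_ + align - 1) / align * align;
        if (blocks_.empty() || offset + size > capacity_) {
            // Current block is exhausted: start a new one, large enough
            // for this request even if it exceeds the default block size.
            capacity_ = size > block_size_ ? size : block_size_;
            blocks_.emplace_back(new unsigned char[capacity_]);
            offset = 0;
        }
        used_ = offset + size;
        return blocks_.back().get() + offset;
    }

    // Number of underlying blocks requested from the system so far.
    std::size_t block_count() const { return blocks_.size(); }

private:
    std::size_t block_size_;
    std::size_t used_;      // bytes consumed in the current block
    std::size_t capacity_;  // size of the current block
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

Objects placed in such a pool should be trivially destructible, or the caller has to run their destructors, because the pool only returns raw memory.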

1.5 W3C Compliance

RapidXml is not a W3C compliant parser, primarily because it ignores DOCTYPE declarations. There are a number of other, minor incompatibilities as well. Still, it can successfully parse and produce complete trees of all valid XML files in the W3C conformance suite (over 1000 files specially designed to find flaws in XML processors). In destructive mode it performs whitespace normalization and character entity substitution for a small set of built-in entities.

1.6 API Design

The RapidXml API is minimalistic, to reduce code size as much as possible and to facilitate use in embedded environments. Additional convenience functions are provided in separate headers: rapidxml_utils.hpp and rapidxml_print.hpp. The contents of these headers are not an essential part of the library, and are currently not documented (other than through comments in the code).

1.7 Reliability

RapidXml is very robust and comes with a large harness of unit tests. Special care has been taken to ensure stability of the parser no matter what source text is thrown at it. One of the unit tests produces 100,000 randomly corrupted variants of an XML document, which (when uncorrupted) contains all constructs recognized by RapidXml. RapidXml passes this test when it correctly recognizes that errors have been introduced, and does not crash or loop indefinitely.
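The corruption test described above can be sketched roughly as follows. This is not RapidXml's test harness: `corrupt` and `toy_check` are hypothetical names, and `toy_check` is a trivial stand-in for a real parser so that the sketch stays self-contained.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: returns a copy of `doc` in which `count` randomly
// chosen positions are overwritten with random non-zero bytes. The same
// seed always produces the same variant, which makes failures reproducible.
std::string corrupt(const std::string& doc, std::size_t count, unsigned seed) {
    std::string out = doc;
    if (out.empty()) return out;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pos(0, out.size() - 1);
    std::uniform_int_distribution<int> byte(1, 255);
    for (std::size_t i = 0; i < count; ++i) {
        out[pos(rng)] = static_cast<char>(byte(rng));
    }
    return out;
}

// Toy stand-in for a parser: accepts text whose angle brackets are
// balanced and never nested. A real harness would call the parser under
// test here and treat a reported parse error as an acceptable outcome.
bool toy_check(const std::string& text) {
    bool open = false;
    for (char c : text) {
        if (c == '<') {
            if (open) return false;
            open = true;
        } else if (c == '>') {
            if (!open) return false;
            open = false;
        }
    }
    return !open;
}
```

A driver would loop over many seeds, pass each corrupted variant to the checker, and fail only if the checker crashes or never returns; whether a given variant is accepted or rejected is not itself the pass criterion.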

Another unit test puts RapidXml head-to-head with another, well established XML parser, and verifies that their outputs match across a wide variety of small and large documents.

Yet another test feeds RapidXml with over 1000 test files from W3C compliance suite, and verifies that correct results are obtained. There are also additional tests that verify each API function separately, and test that various parsing modes work as expected.

1.8 Acknowledgements

I would like to thank Arseny Kapoulkine for his work on pugixml, which was an inspiration for this project. Additional thanks go to Kristen Wegner for creating pugxml, from which pugixml was derived. Janusz Wohlfeil kindly ran RapidXml speed tests on hardware that I did not have access to, allowing me to expand the performance comparison table.



Reposted from: https://my.oschina.net/zhmsong/blog/5230


Published by: 全栈程序员-用户IM. Please credit the source when reposting: https://javaforall.cn/160911.html Original link: https://javaforall.cn
