Java javax.swing.text.html.parser Class DocumentParser (JDK5)

javax.swing.text.html.parser
Class DocumentParser

java.lang.Object
  javax.swing.text.html.parser.Parser
      javax.swing.text.html.parser.DocumentParser

All Implemented Interfaces: DTDConstants

public class DocumentParser
extends Parser

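A Parser for HTML documents (you can specify a DTD, but in practice this class should only be used with the html dtd in swing). It reads an InputStream of HTML and invokes the appropriate methods in the ParserCallback class. This is the default parser used by HTMLEditorKit to parse HTML URLs.

This class notifies the callback of all valid tags, as well as tags that are implied but not explicitly specified. For example, the html string (<p>blah) only defines a p tag. The callback will see the following methods: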

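handleStartTag(html, ...) (implied)
handleStartTag(head, ...) (implied)
handleEndTag(head) (implied)
handleStartTag(body, ...) (implied)
handleStartTag(p, ...)
handleText(...)
handleEndTag(p) (implied)
handleEndTag(body) (implied)
handleEndTag(html) (implied)

The items marked (implied) were not specified in the input, although correct html should have included them (head is not required, but it is still generated). For implied tags, the AttributeSet argument has a value of Boolean.TRUE for the key HTMLEditorKit.ParserCallback.IMPLIED.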
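The callback sequence described above can be observed with a small recording callback. The following is a minimal sketch, not an official example: it assumes the parse is driven through ParserDelegator, which creates a DocumentParser with the default Swing DTD, and the class name ImpliedTagTrace and method trace are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class ImpliedTagTrace {

    /** Parses the HTML and returns one entry per callback invocation, in order. */
    static List<String> trace(String html) {
        final List<String> events = new ArrayList<String>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                // Implied start tags carry IMPLIED -> Boolean.TRUE in their attributes.
                boolean implied = Boolean.TRUE.equals(a.getAttribute(IMPLIED));
                events.add("start " + t + (implied ? " (implied)" : ""));
            }

            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                events.add("end " + t);
            }

            @Override
            public void handleText(char[] data, int pos) {
                events.add("text " + new String(data));
            }
        };
        try {
            // ParserDelegator delegates to a DocumentParser built on the default DTD.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<p>blah")) {
            System.out.println(event);
        }
    }
}
```

End tags receive no AttributeSet, so this sketch can only flag implied start tags.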

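HTML.Attribute defines a type-safe enumeration of html attributes. If an attribute key of a tag is defined in HTML.Attribute, the HTML.Attribute is used as the key; otherwise a String is used as the key. For example, <p foo=bar class=neat> has two attributes. foo is not defined in HTML.Attribute, whereas class is, so the AttributeSet will contain two values: the key HTML.Attribute.CLASS with the String value "neat", and the String key "foo" with the String value "bar".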
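To see which kind of key each attribute ends up under, the attribute names can be enumerated inside the callback. This is an illustrative sketch under the same assumption as before (ParserDelegator driving a DocumentParser with the default DTD); AttributeKeyDemo and describe are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class AttributeKeyDemo {

    /** Returns "<key type> <key> = <value>" for every attribute of each p start tag. */
    static List<String> describe(String html) {
        final List<String> result = new ArrayList<String>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t != HTML.Tag.P) {
                    return;
                }
                // Read the attributes now: the set is only valid during the callback.
                Enumeration<?> keys = a.getAttributeNames();
                while (keys.hasMoreElements()) {
                    Object key = keys.nextElement();
                    String keyType = (key instanceof HTML.Attribute) ? "HTML.Attribute" : "String";
                    result.add(keyType + " " + key + " = " + a.getAttribute(key));
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describe("<p foo=bar class=neat>text")) {
            System.out.println(line);
        }
    }
}
```

Code that looks attributes up by name therefore has to try the HTML.Attribute constant for known attributes and the plain String for everything else.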

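The position argument indicates the start of the tag, comment or text. As with arrays, the first character in the stream has position 0. For implied tags, the position indicates the location of the next encountered tag. In the first example, the implied start body and html tags have the same position as the p tag, and the implied end p, html and body tags all share the same position.

Because html skips whitespace, the position of text is the position of the first valid character. For example, in the string "\n\n\nblah" the text "blah" has position 3; the newlines are skipped.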
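The positions passed to the callback can be printed and compared with the rules above. The sketch below only records what the parser reports and makes no claim about the exact numbers; PositionDemo and positions are hypothetical names, and ParserDelegator is again assumed as the entry point.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class PositionDemo {

    /** Returns "<pos> <event>" for each start tag, end tag and text run. */
    static List<String> positions(String html) {
        final List<String> events = new ArrayList<String>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                events.add(pos + " start " + t);
            }

            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                events.add(pos + " end " + t);
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Trim so the entry is readable even if whitespace is delivered.
                events.add(pos + " text " + new String(data).trim());
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : positions("\n\n\nblah")) {
            System.out.println(event);
        }
    }
}
```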

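For attributes that do not have a value (for example, in the html string <foo blah> the attribute blah has no value), there are two possible values that will be placed in the AttributeSet:

If the DTD does not contain a definition for the element, or the definition does not have an explicit value, the value in the AttributeSet will be HTML.NULL_ATTRIBUTE_VALUE.
If the DTD contains an explicit value, as in <!ATTLIST OPTION selected (selected) #IMPLIED>, the value from the dtd (in this case selected) will be used.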
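The two cases can be compared by looking up the stored value for a valueless attribute. This is a sketch under assumptions: ValuelessAttributeDemo and valueOf are hypothetical names, the input strings are made up for illustration, and the result for the option case depends on what the default Swing DTD declares.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class ValuelessAttributeDemo {

    /** Returns the value stored under key on the first matching tag, or null if absent. */
    static Object valueOf(String html, final HTML.Tag tag, final Object key) {
        final Object[] found = new Object[1];
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                record(t, a);
            }

            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                record(t, a);
            }

            private void record(HTML.Tag t, MutableAttributeSet a) {
                if (t == tag && found[0] == null && a.isDefined(key)) {
                    found[0] = a.getAttribute(key);
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return found[0];
    }

    public static void main(String[] args) {
        // Attribute unknown to the DTD: stored under a String key.
        System.out.println(valueOf("<p blah>text", HTML.Tag.P, "blah"));
        // Attribute declared in the DTD: stored under an HTML.Attribute key.
        System.out.println(valueOf("<select><option selected>one</option></select>",
                HTML.Tag.OPTION, HTML.Attribute.SELECTED));
    }
}
```

Comparing a result against HTML.NULL_ATTRIBUTE_VALUE is one way to tell "attribute present without a value" apart from a value supplied by the document or the DTD.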

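Once the stream has been parsed, the callback is notified of the most likely end-of-line string. The end-of-line string will be one of \n, \r or \r\n, whichever was encountered most often while parsing the stream.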
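The end-of-line notification arrives through HTMLEditorKit.ParserCallback.handleEndOfLineString. A minimal sketch of capturing it follows; EndOfLineDemo and detect are hypothetical names, and the sample input is an assumption chosen so that \r\n is the only line terminator present.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class EndOfLineDemo {

    /** Returns the end-of-line string reported after parsing, or null if none was reported. */
    static String detect(String html) {
        final String[] eol = new String[1];
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleEndOfLineString(String eolString) {
                eol[0] = eolString;
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return eol[0];
    }

    public static void main(String[] args) {
        String eol = detect("<p>a\r\nb\r\nc\r\nd</p>");
        // Escape the control characters so the result is visible when printed.
        System.out.println(eol.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```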

Field Summary

Fields inherited from class javax.swing.text.html.parser.Parser
`dtd, strict`

Fields inherited from interface javax.swing.text.html.parser.DTDConstants
`ANY, CDATA, CONREF, CURRENT, DEFAULT, EMPTY, ENDTAG, ENTITIES, ENTITY, FIXED, GENERAL, ID, IDREF, IDREFS, IMPLIED, MD, MODEL, MS, NAME, NAMES, NMTOKEN, NMTOKENS, NOTATION, NUMBER, NUMBERS, NUTOKEN, NUTOKENS, PARAMETER, PI, PUBLIC, RCDATA, REQUIRED, SDATA, STARTTAG, SYSTEM`

Constructor Summary
`DocumentParser(DTD dtd)`

Method Summary
`protected void`	`handleComment(char[] text)` Called when an HTML comment is encountered.
`protected void`	`handleEmptyTag(TagElement tag)` Handles an empty tag.
`protected void`	`handleEndTag(TagElement tag)` Handles an end tag.
`protected void`	`handleError(int ln, String errorMsg)` An error has occurred.
`protected void`	`handleStartTag(TagElement tag)` Handles a start tag.
`protected void`	`handleText(char[] data)` Handles text.
`void`	`parse(Reader in, HTMLEditorKit.ParserCallback callback, boolean ignoreCharSet)`

Methods inherited from class javax.swing.text.html.parser.Parser
`endTag, error, error, error, error, flushAttributes, getAttributes, getCurrentLine, getCurrentPos, handleEOFInComment, handleTitle, makeTag, makeTag, markFirstTime, parse, parseDTDMarkup, parseMarkupDeclarations, startTag`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

DocumentParser

public DocumentParser(DTD dtd)

Method Detail

parse

public void parse(Reader in,
                  HTMLEditorKit.ParserCallback callback,
                  boolean ignoreCharSet)
           throws IOException

Throws: IOException

handleStartTag

protected void handleStartTag(TagElement tag)

Handles a start tag.

Overrides: handleStartTag in class Parser

handleComment

protected void handleComment(char[] text)

Description copied from class: Parser

Called when an HTML comment is encountered.

Overrides: handleComment in class Parser

handleEmptyTag

protected void handleEmptyTag(TagElement tag)
                       throws ChangedCharSetException

Handles an empty tag.

Overrides: handleEmptyTag in class Parser

Throws: ChangedCharSetException
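The page does not say when ChangedCharSetException is thrown. As far as I understand the JDK implementation, it is raised when a meta tag declares a content type with a charset and parse was called with ignoreCharSet set to false; treat that as an assumption rather than documented behavior. The sketch below shows one way a caller might catch it. CharsetDirectiveDemo, charSetSpec and the sample document are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of the JDK.
class CharsetDirectiveDemo {

    /** Returns the charset spec carried by the exception, or null if parsing completed. */
    static String charSetSpec(String html, boolean ignoreCharSet) {
        try {
            new ParserDelegator().parse(new StringReader(html),
                    new HTMLEditorKit.ParserCallback(), ignoreCharSet);
            return null;
        } catch (ChangedCharSetException e) {
            // ChangedCharSetException is an IOException, so it must be caught first.
            return e.getCharSetSpec();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head><body>x</body></html>";
        System.out.println("ignoreCharSet=false: " + charSetSpec(html, false));
        System.out.println("ignoreCharSet=true:  " + charSetSpec(html, true));
    }
}
```

A caller that already knows the encoding of its Reader would typically pass true for ignoreCharSet; a caller that passes false can reopen the stream with the reported charset and parse again.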

handleEndTag

protected void handleEndTag(TagElement tag)

Handles an end tag.

Overrides: handleEndTag in class Parser

handleText

protected void handleText(char[] data)

Handles text.

Overrides: handleText in class Parser

handleError

protected void handleError(int ln,
                           String errorMsg)

Description copied from class: Parser

An error has occurred.

Overrides: handleError in class Parser