edu.stanford.nlp.web
Class USPDIParser

java.lang.Object
  |
  +--javax.swing.text.html.HTMLEditorKit.ParserCallback
        |
        +--edu.stanford.nlp.web.HTMLParser
              |
              +--edu.stanford.nlp.web.USPDIParser

public class USPDIParser
extends HTMLParser

A parser whose parse method returns the relevant contents of a USP DI Drug Database query for the drug given by drugName. The title() method returns the drug name, and the text() method returns the relevant text scraped from the database.


Field Summary
 
Fields inherited from class edu.stanford.nlp.web.HTMLParser
textBuffer, title
 
Fields inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback
IMPLIED
 
Constructor Summary
USPDIParser()
           
 
Method Summary
 void handleEndTag(HTML.Tag tag, int pos)
          Starts recording again after the end of a header, link, or bold tag.
 void handleStartTag(HTML.Tag tag, MutableAttributeSet attrSet, int pos)
          Opens and parses the URL that includes /mdxcgi/htmldisp.exe.
 void handleText(char[] data, int pos)
          Finds the URL of the link labeled USP DI(R) Drug Information for the Health Care Professional.
 
Methods inherited from class edu.stanford.nlp.web.HTMLParser
parse, parse, parse, title
 
Methods inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback
flush, handleComment, handleEndOfLineString, handleError, handleSimpleTag
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

USPDIParser

public USPDIParser()
            throws IOException

Method Detail

handleText

public void handleText(char[] data,
                       int pos)
Finds the URL of the link labeled USP DI(R) Drug Information for the Health Care Professional.

Overrides:
handleText in class HTMLParser
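
As an illustration of how a callback can locate a link by its visible text, here is a minimal sketch built only on the JDK's Swing HTML parser. It is not the USPDIParser source: the class name LinkTextFinder, the method findHref, and the fields are hypothetical, and it assumes the link label arrives in a single handleText call.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of edu.stanford.nlp.web.
final class LinkTextFinder extends HTMLEditorKit.ParserCallback {
    private final String wantedText;
    private String currentHref; // href of the <a> element being read, or null
    private String foundHref;   // href of the first link whose text matched

    LinkTextFinder(String wantedText) {
        this.wantedText = wantedText;
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.A) {
            // Remember the target of the link we just entered.
            Object href = attrs.getAttribute(HTML.Attribute.HREF);
            currentHref = (href == null) ? null : href.toString();
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // Only text inside a link can match; keep the first match.
        if (currentHref != null && foundHref == null
                && new String(data).contains(wantedText)) {
            foundHref = currentHref;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.A) {
            currentHref = null;
        }
    }

    // Parses the given HTML and returns the matching href, or null if none.
    static String findHref(String html, String wantedText) {
        LinkTextFinder finder = new LinkTextFinder(wantedText);
        try {
            new ParserDelegator().parse(new StringReader(html), finder, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return finder.foundHref;
    }
}
```

Calling LinkTextFinder.findHref(html, "Drug Information for the Health Care Professional") on a results page would yield the href of the first matching link, which a caller could then resolve against the page URL before opening it.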

handleStartTag

public void handleStartTag(HTML.Tag tag,
                           MutableAttributeSet attrSet,
                           int pos)
Opens and parses the URL that includes /mdxcgi/htmldisp.exe. Stops reading at "Precautions to Consider". Skips headers, links, and bold words. NOTE: Queries that return two or more matches (cgidict.exe) are not yet handled.

Overrides:
handleStartTag in class HTMLParser

handleEndTag

public void handleEndTag(HTML.Tag tag,
                         int pos)
Starts recording again after the end of a header, link, or bold tag.

Overrides:
handleEndTag in class HTMLParser
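
The skip-and-resume behavior described for handleStartTag and handleEndTag can be sketched with the JDK's Swing HTML parser as follows. This is a minimal sketch, not the actual implementation: the class SectionTextRecorder, its fields, and the record method are hypothetical, and the choice of H1 through H6, A, and B as the skipped tags is an assumption about what "headers, links, and bold" covers.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of edu.stanford.nlp.web.
final class SectionTextRecorder extends HTMLEditorKit.ParserCallback {
    private final String stopMarker;
    private final StringBuilder text = new StringBuilder();
    private int skipDepth = 0;       // > 0 while inside a skipped element
    private boolean stopped = false; // true once the stop marker was seen

    SectionTextRecorder(String stopMarker) {
        this.stopMarker = stopMarker;
    }

    // Assumed set of tags whose text is not recorded.
    private static boolean isSkipped(HTML.Tag tag) {
        return tag == HTML.Tag.A || tag == HTML.Tag.B
            || tag == HTML.Tag.H1 || tag == HTML.Tag.H2 || tag == HTML.Tag.H3
            || tag == HTML.Tag.H4 || tag == HTML.Tag.H5 || tag == HTML.Tag.H6;
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (isSkipped(tag)) {
            skipDepth++; // pause recording
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (isSkipped(tag) && skipDepth > 0) {
            skipDepth--; // resume recording once all skipped elements close
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (stopped) {
            return;
        }
        String chunk = new String(data);
        if (chunk.contains(stopMarker)) {
            stopped = true; // ignore everything from the marker onward
            return;
        }
        if (skipDepth == 0) {
            text.append(chunk).append(' ');
        }
    }

    // Parses the given HTML and returns the recorded text.
    static String record(String html, String stopMarker) {
        SectionTextRecorder recorder = new SectionTextRecorder(stopMarker);
        try {
            new ParserDelegator().parse(new StringReader(html), recorder, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return recorder.text.toString().trim();
    }
}
```

A depth counter is used instead of a boolean so that nested skipped elements, such as a link inside a header, do not resume recording too early. The stop marker is checked before the skip test because a section title like "Precautions to Consider" is likely to sit inside a header.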


Stanford NLP Group