 Method and system for bootstrapping statistical processing into a rule-based natural language parser

Details
Inventors: Richardson, Stephen Darrow; Heidorn, George E.
Assignee: Microsoft Corporation (Redmond, WA)
Primary Examiner: Trammell; James P.
Assistant Examiner: Nguyen; Cuong H.
Attorney, Agent or Firm: Seed and Berry LLP

A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided. In a preferred embodiment, a statistical bootstrapping software facility optimizes the operation of a robust natural language parser that uses a set of lexicon entries to determine possible parts of speech of words from an input string and a set of rules to combine words from the input string into syntactic structures. The facility first operates the parser in a statistics compilation mode, in which, for each of many sample input strings, the parser attempts to apply all applicable rules and lexicon entries. While the parser is operating in the statistics compilation mode, the facility compiles statistics indicating the likelihood of success of each rule and lexicon entry, based on the success of each rule and lexicon entry when applied in the statistics compilation mode. After a sufficient body of likelihood-of-success statistics has been compiled, the facility operates the parser in an efficient parsing mode, in which the facility uses the compiled statistics to optimize the operation of the parser. In order to parse an input string in the efficient parsing mode, the facility causes the parser to apply applicable rules and lexicon entries in descending order of their likelihood of success, as indicated by the statistics compiled in the statistics compilation mode.
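The two modes described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `SuccessStatistics`, its methods, and the sample rule names are hypothetical, and what counts as a "success" is left to the caller as a boolean, since this summary does not define it. Likelihood is assumed here to be the simple ratio of successes to attempts.

```python
from collections import defaultdict


class SuccessStatistics:
    """Tracks how often each rule or lexicon entry succeeded when applied.

    Hypothetical helper for illustration; not taken from the patent text.
    """

    def __init__(self):
        self.attempts = defaultdict(int)
        self.successes = defaultdict(int)

    def record(self, name, succeeded):
        # Statistics compilation mode: log one application and its outcome.
        self.attempts[name] += 1
        if succeeded:
            self.successes[name] += 1

    def likelihood(self, name):
        # Assumed estimate: successes / attempts, or 0.0 if never applied.
        tried = self.attempts.get(name, 0)
        return self.successes.get(name, 0) / tried if tried else 0.0

    def in_descending_order(self, applicable):
        # Efficient parsing mode: try the most promising items first.
        return sorted(applicable, key=self.likelihood, reverse=True)


# Compile statistics from a few made-up rule applications.
stats = SuccessStatistics()
for name, succeeded in [
    ("NP -> Det Noun", True),
    ("NP -> Det Noun", True),
    ("VP -> Verb NP", True),
    ("VP -> Verb NP", False),
    ("NP -> NP NP", False),
]:
    stats.record(name, succeeded)

# Order the applicable rules by their compiled likelihood of success.
ordering = stats.in_descending_order(
    ["NP -> NP NP", "VP -> Verb NP", "NP -> Det Noun"]
)
print(ordering)
```

Keeping the counts separate from the ordering step mirrors the split between the two modes: counts are only written while compiling statistics and only read while parsing efficiently.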

DETAILED DESCRIPTION OF THE INVENTION

I. INTRODUCTION

A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided.
In a preferred embodiment, the invention comprises a statistical bootstrapping software facility ("the facility"), shown as element 208 in FIG. 2, for automatically compiling and using statistics to improve the performance of a rule-based natural language parser, which generates syntax trees to represent the organization of plain-text sentences.
Such a parser uses a set of lexicon entries to identify the part of speech of words, and a set of rules to combine words from an input string into syntactic structures, or "records," eventually combining the records into a syntactic tree representing the entire input string.
A parser is said to "apply" lexicon entries and rules in order to produce new records.
A parser may apply a lexicon entry when the word to which it corresponds appears in the input string, and does so by creating a new record, then copying lexical information such as part of speech, person, and number from the lexicon entry to the created record.
A parser may apply a rule that combines existing records by first evaluating conditions associated with the rule.
If the conditions of the applied rule are satisfied, then the facility creates a new record and adds information to the created record, such as record type and information about the combined records, as specified by the rule.
The facility functions as a parser control program for a conventional rule-based parser.
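As a rough illustration of what "applying" a lexicon entry or a rule means here, the following Python sketch creates records in the two ways just described. All names (`Record`, `LexiconEntry`, `Rule`, `apply_lexicon_entry`, `apply_rule`) and the toy noun-phrase rule are assumptions made for this example; the text does not specify the parser's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Record:
    # A syntactic structure produced by the parser (hypothetical layout).
    record_type: str
    features: dict = field(default_factory=dict)
    children: tuple = ()


@dataclass
class LexiconEntry:
    word: str
    part_of_speech: str
    person: Optional[int] = None
    number: Optional[str] = None


@dataclass
class Rule:
    name: str
    result_type: str
    condition: Callable[[tuple], bool]


def apply_lexicon_entry(entry):
    # Create a new record and copy the lexical information into it.
    return Record(
        record_type=entry.part_of_speech,
        features={"word": entry.word, "person": entry.person,
                  "number": entry.number},
    )


def apply_rule(rule, records):
    # Evaluate the rule's conditions first; no record is created on failure.
    if not rule.condition(records):
        return None
    # On success, record the type and the records that were combined.
    return Record(rule.result_type, {"rule": rule.name}, tuple(records))


# Toy example: combine a determiner and a noun into a noun phrase.
det = apply_lexicon_entry(LexiconEntry("the", "Det"))
noun = apply_lexicon_entry(LexiconEntry("dog", "Noun", person=3, number="sing"))
np_rule = Rule(
    name="NP -> Det Noun",
    result_type="NP",
    condition=lambda recs: [r.record_type for r in recs] == ["Det", "Noun"],
)

noun_phrase = apply_rule(np_rule, (det, noun))   # conditions satisfied
rejected = apply_rule(np_rule, (noun, det))      # conditions not satisfied
print(noun_phrase)
print(rejected)
```

Returning `None` when the conditions fail keeps a failed application distinguishable from a successful one, which is the kind of outcome a control program could count when compiling statistics.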
FIG. 1 is a flow diagram showing the overall operation of the facility.
In steps 101-103, the facility operates the parser in a statistics compilation mode, during which the facility compiles statistics indicating the success rate of the parser when it applies each lexicon entry and each rule while parsing a "corpus," or large sample of representative text.
In this mode, the facility in steps 101-102 causes the parser to apply every rule and lexicon entry which may be applied ("applicable" rules and lexicon entries) to create "records," or prospective parse tree nodes



