 Word/phrase classification processing method and apparatus

Details
Inventors: Ushioda, Akira;
Assignee: Fujitsu Limited (Kawasaki, JP)
Primary Examiner: Isen; Forester W.
Assistant Examiner: Edouard; Patrick N.
Attorney, Agent or Firm: Staas & Halsey, LLP

A token is attached to each word class sequence whose probability of appearance in the text data is equal to or greater than a predetermined value. The set of words and tokens included in the word/token sequence of the text data is separated into classes so that the probability of generating that word/token sequence becomes the highest. Each token is then replaced with a phrase included in the text data.

DETAILED DESCRIPTION

A first object of the present invention is to provide a word/phrase classification processing apparatus and method that can automatically classify words and phrases together as one block.
A second object of the present invention is to provide a phrase extraction apparatus which can extract a phrase from a large amount of text data at a high speed.
A third object of the present invention is to provide a speech recognition apparatus which can perform accurate speech recognition using the correspondence or similarity between words and phrases or between phrases.
A fourth object of the present invention is to provide a machine translation apparatus which can perform accurate machine translation using the correspondence or similarity between words and phrases or between phrases.
To attain the above-described first object, according to the present invention, the words and phrases included in text data are classified together to generate classes in which words and phrases exist together.
With such a class, not only words but also words and phrases, or phrases alone, can be classified as one block, which makes it easy to identify the correspondence or similarity between a word and a phrase or between phrases.
Furthermore, according to an embodiment of the present invention, a one-dimensional sequence of word classes is generated by mapping each word in the one-dimensional sequence of words included in the text data to the word class into which that word is classified.
Then, each word class sequence in which every degree of stickiness between contiguous word classes is equal to or greater than a predetermined value is extracted from the one-dimensional sequence of word classes of the text data, and a token is attached to it.
After the words and tokens are classified together, the word class sequence corresponding to each token is replaced with a phrase belonging to that word class sequence.
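The mapping, extraction, and token-attachment steps described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the text does not define "degree of stickiness", so the sketch assumes pointwise mutual information between adjacent classes as a stand-in, and the function names, the token format `<T0>`, and the input structures are all hypothetical.

```python
import math
from collections import Counter

def class_sequence(words, word_to_class):
    """Map a word sequence to its word class sequence."""
    return [word_to_class[w] for w in words]

def make_pmi(classes):
    """Build a stickiness function from a class sequence.
    ASSUMPTION: stickiness is modeled as pointwise mutual information."""
    unigrams = Counter(classes)
    bigrams = Counter(zip(classes, classes[1:]))
    n_uni = len(classes)
    n_bi = max(len(classes) - 1, 1)

    def pmi(a, b):
        p_ab = bigrams[(a, b)] / n_bi
        if p_ab == 0:
            return float("-inf")
        return math.log(p_ab / ((unigrams[a] / n_uni) * (unigrams[b] / n_uni)))

    return pmi

def extract_sticky_runs(classes, stickiness, threshold):
    """Return (start, end) spans, end exclusive, of maximal runs of length >= 2
    in which every adjacent class pair has stickiness >= threshold."""
    spans = []
    start = 0
    for i in range(1, len(classes) + 1):
        if i == len(classes) or stickiness(classes[i - 1], classes[i]) < threshold:
            if i - start >= 2:
                spans.append((start, i))
            start = i
    return spans

def attach_tokens(words, classes, spans):
    """Replace each extracted span with a token; identical class sequences
    share one token. Returns the word/token sequence and the token table."""
    token_ids = {}
    out = []
    pos = 0
    for s, e in spans:
        out.extend(words[pos:s])
        key = tuple(classes[s:e])
        out.append(token_ids.setdefault(key, f"<T{len(token_ids)}>"))
        pos = e
    out.extend(words[pos:])
    return out, token_ids
```

The stickiness measure is passed in as a function so that a different measure can be substituted without changing the extraction logic.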
As described above, a token is attached to a word class sequence so that the sequence is regarded as one word.
As a result, a word included in the text data and a word class sequence with a token attached are handled equally, which allows classification processing to be performed without distinguishing between words and phrases.
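Because tokens are handled like ordinary words, a class produced by the classification step may contain both words and tokens, and each token can afterwards be swapped for the phrases it stands for. The following sketch illustrates only that replacement step; the classification itself is taken as given, and the data structures (`unit_classes`, `token_to_class_seq`, the phrase table) and function names are assumptions made for illustration.

```python
from collections import defaultdict

def phrases_by_class_sequence(words, classes, spans):
    """Collect, for each extracted class sequence, the phrases in the text
    that realize it. Spans are (start, end) pairs with end exclusive."""
    table = defaultdict(set)
    for s, e in spans:
        table[tuple(classes[s:e])].add(" ".join(words[s:e]))
    return table

def expand_tokens(unit_classes, token_to_class_seq, phrase_table):
    """Replace every token in every class with the phrases that belong to
    its word class sequence; plain words are kept unchanged."""
    expanded = {}
    for label, units in unit_classes.items():
        members = set()
        for unit in units:
            if unit in token_to_class_seq:
                members |= phrase_table[token_to_class_seq[unit]]
            else:
                members.add(unit)
        expanded[label] = members
    return expanded

# Hypothetical example: one class holding a word and a token.
words = ["new", "york", "is", "big", "los", "angeles"]
classes = ["C1", "C2", "C3", "C4", "C1", "C2"]
table = phrases_by_class_sequence(words, classes, [(0, 2), (4, 6)])
result = expand_tokens(
    {"place": {"tokyo", "<T0>"}},   # classification output (assumed)
    {"<T0>": ("C1", "C2")},         # token -> word class sequence
    table,
)
```

After expansion, the class contains the word and the phrases side by side, which is the mixed word/phrase class the description aims for.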



Related patents
  Optical waveguide device, optical and electrical elements combined device, method of driving the same, and electronic equipment using the same
It is an object of the present invention to provide an optical waveguide device having a configuration adapted to selectively receive a desired signal of optical signals ...
  Apparatus and method for extracting data
The present invention is a method, system and apparatus for extracting data from another location and saving it in a local environment. According to one embodiment, the ...
  Method and apparatus for representing multidimensional data
The invention is directed towards method and apparatus for representing multidimensional data. Some embodiments of the invention provide a two-layered data structure to ...
  Call traffic based exception generating system
It is an object of the present invention to provide a new and improved call traffic based monitoring system of central office switch. The invention, therefore, according ...
  Database system with improved methods for asynchronous logging of transactions
The asynchronous logging system of the present invention provides improved methods for storing log records in a manner that reduces contention for logging resources of a ...
  Format conversion of storage data using an efficient division of data
OF PREFERRED EMBODIMENTS The database apparatus according to a preferred embodiment of the present invention will be described using an example of the apparatus applied ...
  Device and method for automatically classifying documents using vector analysis
The invention has been conceived to solve the drawbacks of the related art and aims at realizing self-organizing classification of an aggregation of documents through ...
  Data processing apparatus, data processing method, and computer readable medium having data processing program recorded thereon
To achieve the above-noted objects, the present invention is a data processing apparatus having a host system having a plurality of different databases and a terminal ...
  Utilizing information redundancy to improve text searches
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive ...
  Compiling glyphs into instructions for imaging for execution on a general purpose computer
The present invention is a method and apparatus for receiving glyph data, specifying glyphs according to a pixel map, and for automatically compiling the glyph data to ...

Copyright (c)2006 Eipa-patents.org - All rights reserved