 Binary tree for complex supervised learning

Details
Inventors: Huang, Jing; Olshen, Richard A.;
Assignee: The Board of Trustees of the Leland Stanford Junior University (Palo Alto, CA)
Primary Examiner: Vincent; David
Assistant Examiner: Tran; Mai T.
Attorney, Agent or Firm: Lumen Intellectual Property Services, Inc.

The present invention provides a powerful and robust classification and prediction tool, methodology, and architecture for supervised learning, particularly applicable to complex datasets where multiple factors determine an outcome and yet many other factors are irrelevant to prediction. The features that are relevant to the outcome have complicated and influential interactions, though their individual contributions are insignificant. For example, polygenic diseases may be associated with genetic and environmental risk factors. This new approach allows us to consider all risk factors simultaneously, including interactions and combined effects. Our approach has the strengths of both binary classification trees and regression. A simple rooted binary tree model is created with each split defined by a linear combination of selected variables. The linear combination is achieved by regression with optimal scoring. The variables are selected using backward shaving. Cross-validation is used to find the level of shrinkage that minimizes errors. Using a selected variable subset to define each split not only increases interpretability, but also enhances the model's predictive power and robustness. The final model deals with cumulative effects and interactions simultaneously.
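As a rough illustration of a split defined by regression on a selected variable subset, the following Python sketch fits a least-squares regression of a 0/1-scored outcome on the predictors and repeatedly drops the variable with the smallest absolute coefficient. This is a simplified stand-in, not the backward shaving or optimal scoring procedure of the invention: the importance measure, the toy data, the cut point of 0.5, and all function names (`solve`, `least_squares`, `backward_shave`, `goes_left`) are assumptions made for the example, and the cross-validation step is omitted.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Fit y ~ intercept + rows via the normal equations; returns [intercept, w1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def backward_shave(rows, y, names):
    """Return nested variable subsets, dropping the smallest |coefficient| each round.

    Assumption: |coefficient| is used as the importance measure for illustration only.
    """
    active = list(range(len(names)))
    path = []
    while active:
        sub = [[r[j] for j in active] for r in rows]
        coef = least_squares(sub, y)
        path.append(([names[j] for j in active], coef))
        weakest = min(range(len(active)), key=lambda k: abs(coef[k + 1]))
        active.pop(weakest)
    return path


def goes_left(row, coef, cut):
    """Send a case to the left child when its fitted score falls below the cut point."""
    score = coef[0] + sum(w * v for w, v in zip(coef[1:], row))
    return score < cut


# Toy data (made up): the outcome depends on x1 and x2 jointly; x3 is irrelevant.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [float(a and b) for a, b, c in rows]
path = backward_shave(rows, y, ["x1", "x2", "x3"])
for names, coef in path:
    print(names, [round(w, 3) for w in coef])
```

In this toy run the irrelevant variable is the first to be removed, and the fitted linear combination of the two remaining variables can then define the split, for example with `goes_left((1, 1), path[1][1], 0.5)`.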

DETAILED DESCRIPTION
As interest in complex human disease increases, there is a growing need for methodologies that address issues such as gene-gene and gene-environment interactions in a robust fashion.
The present invention provides a binary tree-structured classification tool that addresses such needs, particularly in predicting a complex human disease (hypertension) from single nucleotide polymorphisms (SNPs) and other variables.
With a superior ability to handle combinations of predictors, the methodology disclosed herein extends beyond the traditional approach known as CART®, while retaining CART®'s simple binary tree structure.
CART refers to Classification And Regression Trees where typically each split is defined by a single variable.
CART® is a software package from Salford Systems.
According to an aspect of the invention, the methodology includes transforming categorical predictors to indicator variables, suitably scoring outcomes, backward selecting predictors from the least to the most "important," and constructing models while respecting family structures in the data.
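The first step named above, transforming categorical predictors to indicator variables, can be sketched in Python as follows. The function name `to_indicators`, the genotype labels, and the choice to drop the first level as a reference category are assumptions made for illustration; the source does not specify the coding scheme.

```python
def to_indicators(values, drop_first=True):
    """Map a categorical column to 0/1 indicator columns, one per retained level."""
    levels = sorted(set(values))
    # Assumption: the first level serves as the reference category and is dropped.
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Hypothetical SNP genotypes for five subjects.
genotypes = ["AA", "Aa", "aa", "Aa", "AA"]
levels, X = to_indicators(genotypes)
for g, row in zip(genotypes, X):
    print(g, dict(zip(levels, row)))
```

Each categorical predictor becomes a small set of numeric columns, which is what allows a regression to form linear combinations of them.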
We successfully convert a problem of classification to one of regression without losing sight of classification.
More specifically, with our methodology, each vector of predictor values can be assigned to one of several disjoint subgroups, that is, to a terminal node of a binary tree.
Finding groups with high risk of disease may lend understanding to etiology and biological mechanism, as will be explained in a later section.
In traditional classification trees, each split is on one feature alone.
What is more, the set of features is not reduced in size before any splitting is attempted.
Neither seems appropriate for polygenic disease, where no single gene is decisive, the "main effect" may be a gene-by-environment interaction, and most genes are irrelevant to the signal.
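The limitation of single-feature splits can be seen in a small Python sketch. The data are made up: two hypothetical loci with risk-allele counts 0, 1, or 2, and disease whenever the total count is at least 2. The equal weights in the combined score are chosen by hand here rather than obtained by regression, and the helper names are assumptions for the example.

```python
def split_errors(scores, labels, threshold):
    """Count misclassifications when predicting 1 for score >= threshold."""
    return sum((s >= threshold) != bool(y) for s, y in zip(scores, labels))


def best_threshold_error(scores, labels):
    """Lowest error over all candidate thresholds, allowing either split direction."""
    best = len(labels)
    for t in sorted(set(scores)) + [max(scores) + 1]:
        e = split_errors(scores, labels, t)
        best = min(best, e, len(labels) - e)
    return best


# Hypothetical cumulative effect: disease when the two allele counts sum to 2 or more.
cases = [(g1, g2) for g1 in (0, 1, 2) for g2 in (0, 1, 2)]
labels = [int(g1 + g2 >= 2) for g1, g2 in cases]

single_g1 = best_threshold_error([g1 for g1, _ in cases], labels)
single_g2 = best_threshold_error([g2 for _, g2 in cases], labels)
combined = best_threshold_error([g1 + g2 for g1, g2 in cases], labels)
print(single_g1, single_g2, combined)
```

On these nine cases, no split on either locus alone classifies every case correctly, while a single split on the sum of the two does.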
Algorithm
Our approach has the strengths of both classification trees and regression.
Although some underlying techniques are known, it is believed that our approach as a whole is utterly novel.



