Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system
| Details |
Inventors: McDonald, Jarl Wendell;
Assignee: International Business Machines Corporation (Armonk, NY)
Primary Examiner: Bullock, Jr.; Lewis A.
Assistant Examiner: Tang; Kenneth
Attorney, Agent or Firm: Truelson, Roy W.; Cockburn, Joscelyn G.
A dispatcher for a non-uniform memory access computer system dispatches threads from a common ready queue not associated with any CPU, but favors the dispatching of a thread to a CPU having a shorter memory access time. Preferably, the system comprises multiple discrete nodes, each having a local memory and one or more CPUs. System main memory is a distributed memory comprising the union of the local memories. A respective preferred CPU and preferred node may be associated with each thread. When a CPU becomes available, the dispatcher gives at least some relative priority to a thread having a preferred CPU in the same node as the available CPU over a thread having a preferred CPU in a different node. This preference is relative, and does not prevent the dispatcher from overriding the preference to avoid starvation or other problems. |
|
DETAILED DESCRIPTION

In accordance with the present invention, a dispatcher for a non-uniform memory access computer system dispatches all threads from a single, common ready queue (also known as a run queue), which is not preferentially associated with any CPU or group of CPUs. The dispatcher considers the physical placement of CPUs when dispatching threads. Specifically, it favors dispatching a thread to a CPU having a shorter memory access time for the memory subset likely to contain a relatively larger share of the data required by the thread.

In the preferred embodiment, the NUMA system is designed as a system of multiple discrete nodes, each having a local memory, one or more CPUs, an internal node bus, and an interface for communicating with other nodes. System main memory is a distributed memory comprising the union of the local memories in each node. Memory access to a location within the processor's own node is faster than memory access across a node boundary.

In the preferred embodiment, a respective preferred CPU may be associated with each thread. When a CPU becomes available, the dispatcher gives at least some relative priority to a thread having a preferred CPU in the same node as the available CPU over a thread having a preferred CPU in a different node. This is a relative priority, not an absolute constraint. It is still possible to select a thread for dispatch to a CPU that is not in the same node as the thread's preferred CPU, and thus avoid starvation or other problems that may arise from constraining the thread dispatching choice too rigidly.

In the preferred embodiment, a preferred node, called an "ideal node", is generally assigned to user processes. When a process spawns a thread, the thread inherits the ideal node of the process. Additionally, a CPU in the ideal node is selected as the "ideal processor" for the thread. The selection of ideal processors for threads spawned by a single process is generally rotated on a round-robin basis. Other selection criteria being equal, threads are preferentially dispatched to ideal processors first, and to ideal nodes second.
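The dispatching preference described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Cpu`, `Thread`, `Process`, `affinity`, and `dispatch` are hypothetical, the convention that a larger `priority` value runs first is an assumption, and the `waited` counter with its `max_waits` threshold is just one assumed way to keep the node preference relative so that a thread is not starved. The sketch shows round-robin assignment of ideal processors within a process's ideal node, and selection from a single common ready queue that prefers the ideal processor first and the ideal node second when other criteria are equal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cpu:
    cpu_id: int
    node_id: int  # node whose local memory this CPU reaches fastest


@dataclass(eq=False)  # compare threads by identity, not by field values
class Thread:
    name: str
    priority: int    # assumption: larger value is dispatched first
    ideal_cpu: Cpu   # "ideal processor"; its node is the "ideal node"
    waited: int = 0  # hypothetical counter: dispatches this thread sat through

    @property
    def ideal_node(self) -> int:
        return self.ideal_cpu.node_id


class Process:
    """Has an ideal node; spawned threads inherit it and receive an
    ideal processor chosen round-robin among that node's CPUs."""

    def __init__(self, ideal_node: int, cpus: List[Cpu]) -> None:
        self._node_cpus = [c for c in cpus if c.node_id == ideal_node]
        if not self._node_cpus:
            raise ValueError("ideal node has no CPUs")
        self._next = 0

    def spawn(self, name: str, priority: int = 0) -> Thread:
        cpu = self._node_cpus[self._next % len(self._node_cpus)]
        self._next += 1
        return Thread(name, priority, cpu)


def affinity(thread: Thread, cpu: Cpu) -> int:
    """2 = cpu is the ideal processor, 1 = same node, 0 = other node."""
    if thread.ideal_cpu == cpu:
        return 2
    if thread.ideal_node == cpu.node_id:
        return 1
    return 0


def dispatch(ready: List[Thread], cpu: Cpu, max_waits: int = 3) -> Optional[Thread]:
    """Remove and return the thread to run on the newly available cpu.

    `ready` is the single common ready queue, kept in FIFO order.
    """
    if not ready:
        return None
    starved = [t for t in ready if t.waited >= max_waits]
    if starved:
        # Override the node preference: run the longest-queued starved thread.
        chosen = starved[0]
    else:
        # Priority decides first; affinity only breaks ties.
        # max() returns the first maximal item, so FIFO order breaks the rest.
        chosen = max(ready, key=lambda t: (t.priority, affinity(t, cpu)))
    ready.remove(chosen)
    for t in ready:
        t.waited += 1
    return chosen


if __name__ == "__main__":
    cpus = [Cpu(0, 0), Cpu(1, 0), Cpu(2, 1), Cpu(3, 1)]
    proc_a, proc_b = Process(0, cpus), Process(1, cpus)
    queue = [proc_b.spawn("b1"), proc_a.spawn("a1"), proc_a.spawn("a2")]
    while queue:
        print(dispatch(queue, cpus[1]).name)
```

In this sketch the affinity score only breaks ties among threads of equal priority, which mirrors the phrase "other selection criteria being equal". A real dispatcher could weight these factors differently; the source states only that the preference is relative rather than absolute.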
|
|