Windows support for more than 64 logical processors is based on a new concept: the group. A group is a static set of up to 64 logical processors that is treated as a single scheduling entity. Groups have the following characteristics:
- The Windows kernel determines at boot time which processor belongs to which group.
- Each logical processor is assigned to a single group (up to 64 CPUs per group).
- All the logical processors in a core, and all the cores in a physical processor, are assigned to the same group if possible.
- Physical processors that are physically close to one another are assigned to the same group.
- A process can have affinity for more than one group at a time. A thread, however, belongs to a single group at any point in time, and that group is always in the affinity of the thread's process.
- An interrupt may be directed to any single CPU; if directed to a set of CPUs, they must all be in the same group.
- In NUMA architectures, a group may span multiple NUMA nodes, but whenever possible all the processors in a node are assigned to the same group, so that each node is fully contained in a single group.
The group architecture assumes that related code runs on processors in the same group, and that best performance results when the processors in a group are physically close to one another.
This architecture has several benefits:
- Many existing drivers and applications can run without modification on systems that have fewer than 64 logical processors.
- Groups provide locality for the hardware used by related software components, avoiding an adverse effect on performance.
- Software can determine the relationships among processors and groups by using exposed interfaces.
- The group architecture is easily extensible to support additional processors in the future.
We will drill into this further from an application developer's perspective. We also have an online technical document you may download; the link is provided on the resources slide: http://www.microsoft.com/whdc/system/Sysinternals/MoreThan64proc.mspx
This is a good place to make that point, namely: what happens when the application thread that is waiting on this NUMA-localized I/O is scheduled onto a different node? (Not pretty.)
Whole Ecosystem Needs to Scale
Scenario with Windows Server 2008 R2, optimized for topology. How does the OS assist with NUMA-aware workloads? Storport NUMA I/O: improvements to Windows Server 2008 that increase the data locality of storage transactions. These include:
- Per-I/O interrupt redirection (with appropriate hardware support)
- NUMA-aware control structure allocation
- I/O concurrency improvements through lock breakup
- Target: ~10% reduction in CPU utilization on I/O-intensive workloads
Storport is used as an example here; a pointer to the implementation details is on the resources slide. Different I/O models have different data flows; this case study is designed to demonstrate the analysis and thought process:
- The application issues a file read or write.
- The request ends up as a storage adapter DMA transaction, potentially into the application buffer, but more likely into an intermediate buffer.
- An interrupt is generated after the I/O completes: the interrupt source is cleared in the device, the control structures associated with the request are updated, and a DPC is queued to process the data.
- The DPC performs the bulk of the processing: it flushes DMA adapters (potentially copying data into the user buffer) and potentially performs data transformation.
- Each component (adapter, file system, etc.) processes its control structures.
- The application thread is notified of completion.