SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Spectrum Scale Memory
Usage
Tomer Perry
Spectrum Scale Development
<tomp@il.ibm.com>
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?
●
Others
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?
●
Others
Some relevant Scale Basics
”The Cluster”
• Cluster – A group of operating system
instances, or nodes, on which an
instance of Spectrum Scale is deployed
• Network Shared Disk - Any block
storage (disk, partition, or LUN) given to
Spectrum Scale for its use
• NSD server – A node making an NSD
available to other nodes over the network
• GPFS filesystem – Combination of data
and metadata which is deployed over
NSDs and managed by the cluster
• Client – A node running Scale Software
accessing a filesystem
nsd
nsd
srv
nsd nsd
nsd
srv
nsd
srv
nsd
srv
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
CCCCC
Some relevant Scale Basics
MultiCluster
• A Cluster can share some or all of its
filesystems with other clusters
• Such cluster can be referred to as
“Storage Cluster”
• A cluster that don’t have any local
filesystems can be referred to as “Client
Cluster”
• A Client cluster can connect to 31
clusters ( outbound)
• A Storage cluster can be mounted by
unlimited number of client clusters
( Inbound) – 16383 really.
• Client cluster nodes “joins” the storage
cluster upon mount
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCC
Some relevant Scale Basics
Node Roles
• While the general concept in Scale is “all nodes were made equal” some has special roles
Config Servers

Holds cluster
configuration files

4.1 onward supports
CCR - quorum

Not in the data path
Cluster Manager

One of the quorum
nodes

Manage leases,
failures and recoveries

Selects filesystem
managers

mm*mgr -c
Filesystem Manager

One of the “manager”
nodes

One per filesystem

Manage filesystem
configurations ( disks)

Space allocation

Quota management
Token Manager/s

Multiple per filesystem

Each manage portion
of tokens for each
filesystem based on
inode number
Some relevant Scale Basics
Node Internal Structure
User Mode Daemon:
I/O Optimization, Communications, Clustering Kernel Module
Pagepool (Buffer Pool)
Pinned Memory
Disk Descriptor
File System Descriptor
File metadata
File Data
Application
Shared Storage
Communication
TCP/IP
RDMA
Sockets
GPFS Commands
/usr/lpp/mmfs/bin
Configuration
CCR
Other Memory
Non-pinned Memory
Other Nodes
Shell
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?
●
Others
Scale Memory Pools
• At a high level, Scale is using several
memory pools:
• Pagepool
data
• Shared Segment
“memory metadata”, shared ( U+K)
• External/heap
Used by “external” parts of Scale ( AFM
queue, Ganesha, SAMBA etc.)
External/Heap
SharedSegment
Pagepool
Scale Memory Pools
• At a high level, Scale is using several
memory pools:
• Pagepool
data
• Shared Segment
“memory metadata”, shared ( U+K)
• External/heap
Used by “external” parts of Scale ( AFM
queue, Ganesha, SAMBA etc.)
External/Heap
SharedSegment
Pagepool
Scale Memory Pools
Pagepool
• Pagepool size is statically defined using
the pagepool config option. It is allocated
on daemon startup and can be changed
using mmchconfig with the -i option
( NUMA, fragmentation)
• Pagepool stores data only
• All pagepool content is referenced by the
sharedSegment ( data is worthless without
metadata…)
• Not all the sharedSegment points to the
pagepool
SharedSegment
Pagepool
Scale Memory Pools
SharedSegment
• The shared segment has two major pools
• Pool2 - “Shared Segment”
-
All references to data stored in
the pagepool ( bufferDesc,
Openfiles, IndBlocks,
CliTokens) etc.
-
Statcache ( compactOpenfile)
• Pool3 - “Token Memory”
-
Tokens on token servers
-
Some client tokens ( BR)
Pool2
“UNPINNED”
Pool3
TokenMemory
Scale Memory Pools
SharedSegment
• What’s the difference between
maxFilesToCache and maxStatCache?
– maxFilesToCache:
How many files data and metadata
can we cache in the pagepool.
Implies how many concurrent open
files we can have
– maxStatCache:
If a file needs to be evicted from the
pagepool due to MFTC above, we
can still take its stat structure,
compact it and store it here. So we
won’t need an IOP for those stat
calls
maxStatCache
maxFilesToCache
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?
●
Others
Memory Calculations
”So how much memory do I need?”
• Pagepool - ”it depends...”:
How much DATA do you need to cache?
How much RAM can you spend?
• Shared Segment - “it depends...”
How many files do you need to cache?
How many inodes do you need to cache?
What is your access pattern ( BR)
SharedSegment
Pagepool
Memory Calculations
SharedSegment
0 10000 20000 30000 40000 50000 60000
100
200
300
400
500
Memory Use
per cached files
No Of Files
MB
• Pool 2 - “Shared Segment”
– Scope: Almost completely local
( not much dependency on other
nodes)
– Relevant parameters:
●
maxFilesToCache
●
maxStatCache
– Cost:
●
Each MFTC =~ 10k
●
Each MSC =~ 480b
For example, using MFTC of 50k and MSC of 20k and
opening 60k files
Memory Calculations
SharedSegment
• Pool 3 - “Token Memory”
– Scope: “Storage cluster” manager nodes
– Relevant parameters:
●
MFTC, MSC, number of nodes, “hot objects”
– Cost:
●
TokenMemPerFile ~= 464b, ServerBRTreeNode = 56b
●
BasicTokenMem = (MFTC+MSC)*NoOfNodes*TokenMemPerFile
●
ClusterBRTokens = NoOfHotBlocksRandomAccess * ServerBRTreeNode
●
TotalTokenMemory = BasicTokenMem + ClusterBRTokens
●
PerTokenServerMem = TotalTokenMemory / NoOfTokenServers
* The number of the byte range tokens, in the worse case scenario, would be one byte range per block for all the blocks being accessed randomly
Memory Calculations
SharedSegment
• Pool 3 - “Token Memory”
– Scope: “Storage cluster” manager nodes
– Relevant parameters:
●
MFTC, MSC, number of nodes, “hot objects”
– Cost:
●
TokenMemPerFile ~= 464b, ServerBRTreeNode = 56b
●
BasicTokenMem = (MFTC+MSC)*NoOfNodes*TokenMemPerFile
●
ClusterBRTokens = NoOfHotBlocksRandomAccess * ServerBRTreeNode
●
TotalTokenMemory = BasicTokenMem + ClusterBRTokens
●
PerTokenServerMem = TotalTokenMemory / NoOfTokenServers
* The number of the byte range tokens, in the worse case scenario, would be one byte range per block for all the blocks being accessed randomly
Overall number of cached objects in the cluster
( different nodes might cache different numbers) * 0.5K
+ monitor token memory usage
Memory Calculations
SharedSegment
0 10000 20000 30000 40000 50000 60000
200
400
600
800
1000
1200
Token Server Memory Use
Per cached file
SrvTM SrvTM*25
NoOfFiles
MB
• Pool 3 - “Shared Segment”
– As can be seen in the graph, single
client don’t usually large memory
foot print ( blue line)
– Using 25 clients, already brings us
closer to 1G
– There is no difference between the
“cost” of MFTC and MSC
For example, using MFTC of 50k and MSC of 20k and
opening 60k files
Memory Calculations
SharedSegment
• But...where can we find those numbers?
– Mmdiag is your friend
– “The Command-Who-Must-Not-Be
Named” can provide even more
details ( dump malloc)
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?
●
Others
Is that all?
Related OS memory usage
0 10000 20000 30000 40000 50000 60000
100
200
300
400
500
Memory Utilization
Shared2 dEntryOSz inodeOvSz
No Of Files
MB
• “Scale don’t use the operating system
cache”...well...sort of
– Directory Cache ( a.k.a dcache) - ~192b
– VFS inode cache - ~1K
• But is it really important? The SharedSeg
takes much more memory
Outline
●
Relevant Scale Basics
●
Scale Memory Pools
●
Memory Calculations
●
Is that all?( Linux memory)
●
Others
Others
Out of scope here
• TCP Memory
• AFM queue memory
• Policy ( and its affect on cached objects)
• Ganesha (MDCache)
• SAMBA
• Sysmon, zimon
• GUI
• TCT, Archive
• fsck, other maintenance commands
Thank
You

Weitere ähnliche Inhalte

Was ist angesagt?

Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...xKinAnx
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015Doug O'Flaherty
 
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...Sandeep Patil
 
Linux kernel memory allocators
Linux kernel memory allocatorsLinux kernel memory allocators
Linux kernel memory allocatorsHao-Ran Liu
 
IBM Spectrum Scale and Its Use for Content Management
 IBM Spectrum Scale and Its Use for Content Management IBM Spectrum Scale and Its Use for Content Management
IBM Spectrum Scale and Its Use for Content ManagementSandeep Patil
 
IBM Spectrum Scale Security
IBM Spectrum Scale Security IBM Spectrum Scale Security
IBM Spectrum Scale Security Sandeep Patil
 
K8s cluster autoscaler
K8s cluster autoscaler K8s cluster autoscaler
K8s cluster autoscaler k8s study
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionHemanth Venkatesh
 
Seamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodesSeamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodesMarko Bevc
 
Introduction to BTRFS and ZFS
Introduction to BTRFS and ZFSIntroduction to BTRFS and ZFS
Introduction to BTRFS and ZFSTsung-en Hsiao
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerYongseok Oh
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
 
High Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIOHigh Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIORebekah Rodriguez
 
Ceph Month 2021: RADOS Update
Ceph Month 2021: RADOS UpdateCeph Month 2021: RADOS Update
Ceph Month 2021: RADOS UpdateCeph Community
 
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introduction
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introductionACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introduction
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introductionProject ACRN
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
[IBM 서버] POWER9
[IBM 서버] POWER9[IBM 서버] POWER9
[IBM 서버] POWER9(Joe), Sanghun Kim
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM ShapeBlue
 
VM Job Queues in CloudStack
VM Job Queues in CloudStackVM Job Queues in CloudStack
VM Job Queues in CloudStackShapeBlue
 
Openstack Summit Vancouver 2018 - Multicloud Networking
Openstack Summit Vancouver 2018 - Multicloud NetworkingOpenstack Summit Vancouver 2018 - Multicloud Networking
Openstack Summit Vancouver 2018 - Multicloud NetworkingShannon McFarland
 

Was ist angesagt? (20)

Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
 
IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015IBM Spectrum Scale Overview november 2015
IBM Spectrum Scale Overview november 2015
 
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
 
Linux kernel memory allocators
Linux kernel memory allocatorsLinux kernel memory allocators
Linux kernel memory allocators
 
IBM Spectrum Scale and Its Use for Content Management
 IBM Spectrum Scale and Its Use for Content Management IBM Spectrum Scale and Its Use for Content Management
IBM Spectrum Scale and Its Use for Content Management
 
IBM Spectrum Scale Security
IBM Spectrum Scale Security IBM Spectrum Scale Security
IBM Spectrum Scale Security
 
K8s cluster autoscaler
K8s cluster autoscaler K8s cluster autoscaler
K8s cluster autoscaler
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emption
 
Seamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodesSeamless scaling of Kubernetes nodes
Seamless scaling of Kubernetes nodes
 
Introduction to BTRFS and ZFS
Introduction to BTRFS and ZFSIntroduction to BTRFS and ZFS
Introduction to BTRFS and ZFS
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
High Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIOHigh Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIO
 
Ceph Month 2021: RADOS Update
Ceph Month 2021: RADOS UpdateCeph Month 2021: RADOS Update
Ceph Month 2021: RADOS Update
 
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introduction
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introductionACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introduction
ACRN vMeet-Up EU 2021 - shared memory based inter-vm communication introduction
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
[IBM 서버] POWER9
[IBM 서버] POWER9[IBM 서버] POWER9
[IBM 서버] POWER9
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVM
 
VM Job Queues in CloudStack
VM Job Queues in CloudStackVM Job Queues in CloudStack
VM Job Queues in CloudStack
 
Openstack Summit Vancouver 2018 - Multicloud Networking
Openstack Summit Vancouver 2018 - Multicloud NetworkingOpenstack Summit Vancouver 2018 - Multicloud Networking
Openstack Summit Vancouver 2018 - Multicloud Networking
 

Ähnlich wie Spectrum Scale Memory Usage

Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Databricks
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)MongoDB
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Community
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment StrategiesMongoDB
 
SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)Lars Marowsky-BrĂŠe
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 
Driver development – memory management
Driver development – memory managementDriver development – memory management
Driver development – memory managementVandana Salve
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment StrategyMongoDB
 
Memory (Computer Organization)
Memory (Computer Organization)Memory (Computer Organization)
Memory (Computer Organization)JyotiprakashMishra18
 
Kinetic basho public
Kinetic basho publicKinetic basho public
Kinetic basho publicAnton Gerasimov
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Amazon Web Services
 
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)Suresh Kumar
 
Vmwareperformancetroubleshooting 100224104321-phpapp02
Vmwareperformancetroubleshooting 100224104321-phpapp02Vmwareperformancetroubleshooting 100224104321-phpapp02
Vmwareperformancetroubleshooting 100224104321-phpapp02Suresh Kumar
 
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Lars Marowsky-BrĂŠe
 
Mac Memory Analysis with Volatility
Mac Memory Analysis with VolatilityMac Memory Analysis with Volatility
Mac Memory Analysis with VolatilityAndrew Case
 
Windows memory manager internals
Windows memory manager internalsWindows memory manager internals
Windows memory manager internalsSisimon Soman
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterPatrick Quairoli
 
Deterministic Memory Abstraction and Supporting Multicore System Architecture
Deterministic Memory Abstraction and Supporting Multicore System ArchitectureDeterministic Memory Abstraction and Supporting Multicore System Architecture
Deterministic Memory Abstraction and Supporting Multicore System ArchitectureHeechul Yun
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...Chester Chen
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward
 

Ähnlich wie Spectrum Scale Memory Usage (20)

Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performance
 
Driver development – memory management
Driver development – memory managementDriver development – memory management
Driver development – memory management
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Memory (Computer Organization)
Memory (Computer Organization)Memory (Computer Organization)
Memory (Computer Organization)
 
Kinetic basho public
Kinetic basho publicKinetic basho public
Kinetic basho public
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
 
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)
Vmwareperformancetroubleshooting 100224104321-phpapp02 (1)
 
Vmwareperformancetroubleshooting 100224104321-phpapp02
Vmwareperformancetroubleshooting 100224104321-phpapp02Vmwareperformancetroubleshooting 100224104321-phpapp02
Vmwareperformancetroubleshooting 100224104321-phpapp02
 
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
 
Mac Memory Analysis with Volatility
Mac Memory Analysis with VolatilityMac Memory Analysis with Volatility
Mac Memory Analysis with Volatility
 
Windows memory manager internals
Windows memory manager internalsWindows memory manager internals
Windows memory manager internals
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage Cluster
 
Deterministic Memory Abstraction and Supporting Multicore System Architecture
Deterministic Memory Abstraction and Supporting Multicore System ArchitectureDeterministic Memory Abstraction and Supporting Multicore System Architecture
Deterministic Memory Abstraction and Supporting Multicore System Architecture
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
 

KĂźrzlich hochgeladen

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 

KĂźrzlich hochgeladen (20)

DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 

Spectrum Scale Memory Usage

  • 1. Spectrum Scale Memory Usage Tomer Perry Spectrum Scale Development <tomp@il.ibm.com>
  • 2. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all? ● Others
  • 3. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all? ● Others
  • 4. Some relevant Scale Basics ”The Cluster” • Cluster – A group of operating system instances, or nodes, on which an instance of Spectrum Scale is deployed • Network Shared Disk - Any block storage (disk, partition, or LUN) given to Spectrum Scale for its use • NSD server – A node making an NSD available to other nodes over the network • GPFS filesystem – Combination of data and metadata which is deployed over NSDs and managed by the cluster • Client – A node running Scale Software accessing a filesystem nsd nsd srv nsd nsd nsd srv nsd srv nsd srv CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC CCCCC
  • 5. Some relevant Scale Basics MultiCluster • A Cluster can share some or all of its filesystems with other clusters • Such cluster can be referred to as “Storage Cluster” • A cluster that don’t have any local filesystems can be referred to as “Client Cluster” • A Client cluster can connect to 31 clusters ( outbound) • A Storage cluster can be mounted by unlimited number of client clusters ( Inbound) – 16383 really. • Client cluster nodes “joins” the storage cluster upon mount CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCC
  • 6. Some relevant Scale Basics Node Roles • While the general concept in Scale is “all nodes were made equal” some has special roles Config Servers  Holds cluster configuration files  4.1 onward supports CCR - quorum  Not in the data path Cluster Manager  One of the quorum nodes  Manage leases, failures and recoveries  Selects filesystem managers  mm*mgr -c Filesystem Manager  One of the “manager” nodes  One per filesystem  Manage filesystem configurations ( disks)  Space allocation  Quota management Token Manager/s  Multiple per filesystem  Each manage portion of tokens for each filesystem based on inode number
  • 7. Some relevant Scale Basics Node Internal Structure User Mode Daemon: I/O Optimization, Communications, Clustering Kernel Module Pagepool (Buffer Pool) Pinned Memory Disk Descriptor File System Descriptor File metadata File Data Application Shared Storage Communication TCP/IP RDMA Sockets GPFS Commands /usr/lpp/mmfs/bin Configuration CCR Other Memory Non-pinned Memory Other Nodes Shell
  • 8. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all? ● Others
  • 9. Scale Memory Pools • At a high level, Scale is using several memory pools: • Pagepool data • Shared Segment “memory metadata”, shared ( U+K) • External/heap Used by “external” parts of Scale ( AFM queue, Ganesha, SAMBA etc.) External/Heap SharedSegment Pagepool
  • 10. Scale Memory Pools • At a high level, Scale is using several memory pools: • Pagepool data • Shared Segment “memory metadata”, shared ( U+K) • External/heap Used by “external” parts of Scale ( AFM queue, Ganesha, SAMBA etc.) External/Heap SharedSegment Pagepool
  • 11. Scale Memory Pools Pagepool • Pagepool size is statically defined using the pagepool config option. It is allocated on daemon startup and can be changed using mmchconfig with the -i option ( NUMA, fragmentation) • Pagepool stores data only • All pagepool content is referenced by the sharedSegment ( data is worthless without metadata…) • Not all the sharedSegment points to the pagepool SharedSegment Pagepool
  • 12. Scale Memory Pools SharedSegment • The shared segment has two major pools • Pool2 - “Shared Segment” - All references to data stored in the pagepool ( bufferDesc, Openfiles, IndBlocks, CliTokens) etc. - Statcache ( compactOpenfile) • Pool3 - “Token Memory” - Tokens on token servers - Some client tokens ( BR) Pool2 “UNPINNED” Pool3 TokenMemory
  • 13. Scale Memory Pools SharedSegment • What’s the difference between maxFilesToCache and maxStatCache? – maxFilesToCache: How many files data and metadata can we cache in the pagepool. Implies how many concurrent open files we can have – maxStatCache: If a file needs to be evicted from the pagepool due to MFTC above, we can still take its stat structure, compact it and store it here. So we won’t need an IOP for those stat calls maxStatCache maxFilesToCache
  • 14. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all? ● Others
  • 15. Memory Calculations ”So how much memory do I need?” • Pagepool - ”it depends...”: How much DATA do you need to cache? How much RAM can you spend? • Shared Segment - “it depends...” How many files do you need to cache? How many inodes do you need to cache? What is your access pattern ( BR) SharedSegment Pagepool
  • 16. Memory Calculations SharedSegment 0 10000 20000 30000 40000 50000 60000 100 200 300 400 500 Memory Use per cached files No Of Files MB • Pool 2 - “Shared Segment” – Scope: Almost completely local ( not much dependency on other nodes) – Relevant parameters: ● maxFilesToCache ● maxStatCache – Cost: ● Each MFTC =~ 10k ● Each MSC =~ 480b For example, using MFTC of 50k and MSC of 20k and opening 60k files
  • 17. Memory Calculations SharedSegment • Pool 3 - “Token Memory” – Scope: “Storage cluster” manager nodes – Relevant parameters: ● MFTC, MSC, number of nodes, “hot objects” – Cost: ● TokenMemPerFile ~= 464b, ServerBRTreeNode = 56b ● BasicTokenMem = (MFTC+MSC)*NoOfNodes*TokenMemPerFile ● ClusterBRTokens = NoOfHotBlocksRandomAccess * ServerBRTreeNode ● TotalTokenMemory = BasicTokenMem + ClusterBRTokens ● PerTokenServerMem = TotalTokenMemory / NoOfTokenServers * The number of the byte range tokens, in the worse case scenario, would be one byte range per block for all the blocks being accessed randomly
  • 18. Memory Calculations SharedSegment • Pool 3 - “Token Memory” – Scope: “Storage cluster” manager nodes – Relevant parameters: ● MFTC, MSC, number of nodes, “hot objects” – Cost: ● TokenMemPerFile ~= 464b, ServerBRTreeNode = 56b ● BasicTokenMem = (MFTC+MSC)*NoOfNodes*TokenMemPerFile ● ClusterBRTokens = NoOfHotBlocksRandomAccess * ServerBRTreeNode ● TotalTokenMemory = BasicTokenMem + ClusterBRTokens ● PerTokenServerMem = TotalTokenMemory / NoOfTokenServers * The number of the byte range tokens, in the worse case scenario, would be one byte range per block for all the blocks being accessed randomly Overall number of cached objects in the cluster ( different nodes might cache different numbers) * 0.5K + monitor token memory usage
  • 19. Memory Calculations SharedSegment 0 10000 20000 30000 40000 50000 60000 200 400 600 800 1000 1200 Token Server Memory Use Per cached file SrvTM SrvTM*25 NoOfFiles MB • Pool 3 - “Shared Segment” – As can be seen in the graph, single client don’t usually large memory foot print ( blue line) – Using 25 clients, already brings us closer to 1G – There is no difference between the “cost” of MFTC and MSC For example, using MFTC of 50k and MSC of 20k and opening 60k files
  • 20. Memory Calculations SharedSegment • But...where can we find those numbers? – Mmdiag is your friend – “The Command-Who-Must-Not-Be Named” can provide even more details ( dump malloc)
  • 21. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all? ● Others
  • 22. Is that all? Related OS memory usage 0 10000 20000 30000 40000 50000 60000 100 200 300 400 500 Memory Utilization Shared2 dEntryOSz inodeOvSz No Of Files MB • “Scale don’t use the operating system cache”...well...sort of – Directory Cache ( a.k.a dcache) - ~192b – VFS inode cache - ~1K • But is it really important? The SharedSeg takes much more memory
  • 23. Outline ● Relevant Scale Basics ● Scale Memory Pools ● Memory Calculations ● Is that all?( Linux memory) ● Others
  • 24. Others Out of scope here • TCP Memory • AFM queue memory • Policy ( and its affect on cached objects) • Ganesha (MDCache) • SAMBA • Sysmon, zimon • GUI • TCT, Archive • fsck, other maintenance commands