Quota enforcement for high-performance
                              distributed storage systems
 Kristal T. Pollack, Darrell D. E. Long, Richard A. Golding, Benjamin Reed, Ralph A. Becker-Szendy
                              IBM Almaden Research Center, San Jose, CA


Abstract

Storage systems manage quota to ensure that each user gets the storage they need, and that no one user can, even by accident, use up all available storage. This is difficult for large, distributed systems, especially those used for high-performance computing applications, because resource allocation occurs on many nodes concurrently. We present a scheme where quota is enforced asynchronously by intelligent storage servers: storage clients contact a shared management service to get vouchers, capability-like certificates that the clients can redeem at participating storage servers to allocate storage space. This approach produces low load on the shared management service, promotes good scaling, and allows the client to make decisions about which storage server(s) to use without communicating with the management service for further approval. Storage servers and the management service periodically reconcile voucher usage to ensure that clients do not cheat by spending the same voucher at multiple storage servers. We report on a simulation study that shows that this approach gives performance nearly as good as not enforcing quota at all, and that the load on the shared management server is remarkably low.

1 Introduction

Tracking and enforcing resource usage limits in a large distributed system is difficult because it requires maintaining a consistent view of total usage when consumption is occurring in several places concurrently. In a file system, for example, users must not use more than their storage quota. Many scientific applications involve tens of thousands of nodes all cooperating on a problem, all writing to shared files and consuming from the same pool of quota. The file system is typically built as a small cluster of metadata servers and a larger number of storage servers or disk arrays. We concentrate here on systems that use storage servers that provide intelligence similar to object storage, and so can track local storage allocation and enforce access control.

Existing quota systems trade off scalability and accuracy. A centralized quota tracking server can be byte-accurate as long as each client informs the server on each resource allocation, which inhibits scalable performance. Other systems use a centralized server but relax accuracy, either by tracking quota only in large granules or by using time-limited escrow mechanisms (which set aside a certain amount of resource for a client for a limited duration), both of which reduce the frequency with which a client must interact with the quota tracking server.

The existing quota systems also provide for a single quota-related policy for all clients. In a distributed file system like SanFS [10], for example, the centralized metadata server decides which logical disks in a storage pool a client should be allocated when it needs storage. This requires that the policy not only have an accurate record of how much quota each user has consumed, but also an accurate map of how much resource is available on every server on which that resource could be allocated.

We propose an alternative approach to tracking and enforcing resource limits. This approach borrows from microcash mechanisms: there is a centralized server that acts as a bank that issues vouchers to clients, which the client can spend to allocate resources on whatever server they want. The client can withdraw enough vouchers to cover their needs for some period, during which time the client does not need to contact the bank. Servers are able to check the vouchers for authenticity. The vouchers are valid for a limited time, in order to handle clients that fail, and servers periodically reconcile their transactions with the bank to check that clients have behaved correctly.

This approach provides a different tradeoff than other quota servers. It provides excellent scalability (in most cases indistinguishable from not tracking resource usage at all) while providing byte-accurate but temporally-coarse accuracy similar to time-limited escrow. It reduces load on the centralized tracking service well below that of other mechanisms. It also decouples quota tracking from allocation policy, so that the quota server only needs to track how much quota a user has consumed, and does not


need to be concerned with where that consumption has occurred, which reduces the load on the quota server and makes it easier to partition the quota tracking work across multiple servers. Further, each client can decide for itself where to allocate resources based on its own needs, which allows a client to customize its allocation based, for example, on how a particular file will be used.

[Diagram labels: Client; Authorization; Access; File management cluster; Storage servers; Disk.]
Figure 1: Basic distributed storage system architecture.

2 System context

Figure 1 shows the architecture of the distributed storage systems we are investigating. In these systems, clients act on behalf of users. The clients communicate with a file system management service cluster to locate files and authorize actions. The authorization includes both checking permission to access data and permission to consume resources. The authorization is expressed using location-independent capabilities [11] and vouchers, which encode the client's rights to access files and to allocate resources, respectively. Once the client has the capabilities and vouchers it needs, it communicates directly with storage servers to read and write data and to create and delete files. The storage server has the intelligence to manage internal resource allocation and to check capabilities for validity, similar to the object store model [8, 5].

We are investigating quota management as part of the K2 distributed storage system. For scalability reasons, K2 pushes decentralization as far as possible. Each node is an autonomous agent, acting in its own interest as much as possible while respecting community needs. We make this possible by each node acting as an enlightened rational agent, with algorithms that work to meet the node's needs while avoiding the "tragedy of the commons" [6], in which the limits on shared resources are not considered. In concrete terms for a storage system, this means that we want each client to be able to make its own allocation decisions that give the best result for the application the client is running (self-interest), while ensuring that users do not go over quota and that storage resources are not over-used (community interest).

Different files can have significantly different needs, and so the system allows a different layout for each file. One file may be located on a single storage server; another may be mirrored and striped across many storage servers. The client decides what the layout should be, based on expected needs derived either from application hints or inferred from file attributes. Peak file creation rates can be high in some scientific applications, and so it is important for good scalability to minimize the dependence on the shared file management service during file creation.

Scientific applications have characteristics different from what studies of end-user workstations have shown. The absolute numbers are several orders of magnitude larger: petabytes of data are being deployed now, with aggregate transfer rates of gigabytes per second, files in the terabytes, all being accessed by tens of thousands of clients. The clients are cooperating to run one application, and both read- and write-share files. The applications are bursty, as the clients synchronize as they move through phases of a computation and write out checkpoints concurrently or read the results of previous phases. Some of the files are only temporary, for communication between computation phases, while others are results of days of computation and must be carefully protected.

Because K2 is built for large distributed environments, its design works to minimize the trust required in any one component, in particular the client. While many clients may be part of a single homogeneous compute cluster, some clients will be different and potentially not under careful administrative care, for example user workstations used for visualizing results. The system assumes that clients authenticate the users that run on them, and that the clients can provide evidence of that authentication when communicating with the file management service and with storage servers [9]. Our design assumes that clients can crash-fail. While some clients can also be malicious, we do not focus on them; for example, we do not provide data access that can survive Byzantine behavior [1]. However, for resource quota management we do bound the effect that any client can have on the system.

3 Protocol operation

In this section we give an overview of the operation of the voucher-based quota system. Figure 2 shows the general flow of usage. A client first requests a voucher for storage resources from the quota server for the user, then sends IO requests to storage nodes, including the voucher when


[Figure: diagram with labels "file management (bank)", "client (for user)", "storage servers", and step number 1.]
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     s




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         e
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             t
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 o




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     r   v
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             r
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 a




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     e
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         g




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             r
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 e




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     2
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Getting vouchers. While a client could ask the man-
                                         r       e           q       u       e   s       t   (       u           s       e       r       i   d       ,       a       m       o               u               n               t           )
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             agement server for quota authorization on every I/O,
                                                                                                             v
                                                                                                                     o       u       c           h       e       r
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             this would put an unreasonable load on the management
                                                                                                                                                                                                                                                                 I               O




                                                                                                                                                                                                                                                                                                 r   e       q           u   e               s           t       (
                                                                                                                                                                                                                                                                                                                                                                             v
                                                                                                                                                                                                                                                                                                                                                                                         o           u           c       h                   e               r               )
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             server. Instead, the client maintains a pool of vouchers,
                                                                                                                                                                                                                                                                                                                 I   O




                                                                                                                                                                                                                                                                                                                                         r       e                   q           u           e           s           t       (
                                                                                                                                                                                                                                                                                                                                                                                                                                     v
                                                                                                                                                                                                                                                                                                                                                                                                                                                     o                   u               c           h           e           r   )




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             and only periodically communicates with the manage-
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             ment server. The client tries to maintain enough vouchers
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     a       l   l       o       c       a       t       e




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     r   e           s       o       u       r       c       e




                                                                                                                                                                                                                                                                                                                                 l   e               f       t           o                       e           r
                                                                                                                                                                                                                                                                                                                                                                                     v




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             to cover any allocation it expects to do in the near future,
                                                                                                                                                                         c           h           e               c               k                   u               s               a       g           e




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             while allowing for other clients to share quota. This ap-
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             proach reduces load on the management server and im-
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             proves client response latency.
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                The client has to decide when to request vouchers and
Figure 2: Sequence of operations. A client begins by ob-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     how much resource to ask for; the management server
taining a voucher for a user from the quota server, then                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     has to decide how much of that request to grant. The
spending that voucher during IO requests to different stor-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  management server must maintain the invariant that the
age nodes. Later, the quota server and storage nodes check                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   vouchers granted for a user to any client—which repre-
that the client did not overuse a voucher.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   sent potentially-used resources—plus the amount actually
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             allocated do not go over the user’s quota.
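
The invariant above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `UserQuota`, `QuotaManager`, `set_quota`, and `request` are hypothetical, amounts are plain integers (for example, bytes), and the policy of granting whatever headroom remains, possibly less than what was asked for, is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class UserQuota:
    """Hypothetical per-user bookkeeping on the management server."""
    quota: int            # total amount the user may hold
    allocated: int = 0    # amount known to be actually allocated
    outstanding: int = 0  # amount covered by granted, not yet reconciled vouchers

    def headroom(self) -> int:
        # Amount that can still be granted without breaking the invariant.
        return max(0, self.quota - self.allocated - self.outstanding)


class QuotaManager:
    """Illustrative management server that only hands out voucher amounts."""

    def __init__(self):
        self.users = {}  # maps user id (str) to UserQuota

    def set_quota(self, user_id, quota):
        self.users[user_id] = UserQuota(quota=quota)

    def request(self, user_id, amount):
        """Grant up to `amount`; return the granted amount (0 if none is left)."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        state = self.users[user_id]
        # Assumed policy: grant as much of the request as the headroom allows.
        granted = min(amount, state.headroom())
        state.outstanding += granted
        # Invariant from the text: vouchers granted + amount allocated <= quota.
        assert state.outstanding + state.allocated <= state.quota
        return granted


if __name__ == "__main__":
    mgr = QuotaManager()
    mgr.set_quota("alice", 100)
    print(mgr.request("alice", 60))  # first request fits within the quota
    print(mgr.request("alice", 60))  # second request is only partially granted
```

In this sketch a request is never rejected outright; it is trimmed to the remaining headroom, so a client that asks too late simply receives a smaller voucher. A real server could equally refuse, delay, or rebalance grants among clients.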
those requests may consume resources. If a client frees resources on a storage server, the storage server gives the client a “refund” voucher for the amount freed. The quota server and storage nodes periodically reconcile the set of vouchers that have been spent against those that have been issued, in order to detect clients that overuse a voucher.

   While there are many possible policies for deciding when and how much to ask for, we focus on a client requesting quota from the management server on a regular schedule. This generally makes the load on the management server proportional to the number of clients, rather than to the intensity of workload on those clients. The client uses its history of recent resource consumption to estimate how much it will likely use between the current
Vouchers. A voucher is a record of a decision to allow                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       request and the next request, and asks the management
a client to consume resources on behalf of a particular                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      server for the difference between the estimated usage and
user. It is represented as a cryptographically-protected se-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 the vouchers it already has on hand.
quence of bytes:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                If actual usage is higher than anticipated, then the client
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             will have to ask the management server for extra vouchers
                     {epoch, expiry, user, amount, serial}auth                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               before its next scheduled request. The client can estimate
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             the management server’s response time and the short-term
similar to capabilities used to authorize actions in
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             voucher usage rate to predict when to send a request to the
Amoeba [11] and the T10 OSD [8]. User and amount are
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             management server before the client runs out of vouchers.
obvious. The voucher has a unique serial number, which
is used when storage servers reconcile voucher usage with                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       The client may have extra vouchers when net consump-
the management server. Each voucher also records when                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        tion is lower than anticipated—perhaps because it has
it was issued (the epoch) and when it expires.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               been freeing rather than allocating storage. In that case
   The voucher includes a signature or MAC generated                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         the client can return some vouchers to the management
using a secret key known only to the management and                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          server, making the quota available for other clients. This
storage servers, which ensures that a client cannot forge                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    matters, for example, when one client is cleaning up old
a voucher. However, it does not prevent one client from                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      files while other clients are writing new data.
eavesdropping on another, and so vouchers must be trans-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        The management server determines how much to grant
mitted only over private channels. Issues like avoiding re-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  to a client based on its global information, including the
play attacks do not require special mechanisms in vouch-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     total amount of vouchers outstanding for a user and es-
ers if they are used in conjunction with authorization ca-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   timated demand from all clients consuming that user’s
pabilities that provide defense against replay.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              quota. Granting more to a client can reduce the number of
   Each voucher is valid for only a limited duration,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        request messages that the management server must pro-
as recorded in its expiry field. This is used in han-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         cess, but giving too much to one client can inhibit sharing
dling failure—if a client crashes while holding an un-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       across multiple clients. The management server must also
used voucher, other clients can use that quota once the                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      not issue enough vouchers that a user could go over quota.
voucher expires—and in reconciling storage and manage-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          One reasonable heuristic is for the management server
ment servers. These are discussed further below.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             to give each client an amount proportional to its con-


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         3
sumption rate, while reserving some quota in case new                                                                                                                                                                                                                                                                     K




                                                                                                                                                                                                                                                                                                                                                  u




clients begin using quota. The client policy discussed
                                                                                                                                                                                                                                                                                                                                                                      n           e                   x                           i   r           e                   d




                                                                                                                                                                                                                                                                                                                                                                                                                      p




                                                                                                                                                                                          e       x                   i       r           e                       d




                                                                                                                                                                                                      p




                                                                                                                                                                                                                                                                                                                                                                                                          X




above makes requests approximately proportional to con-
sumption rate, and so the management server can give
                                                                                              R           e       c       o       n           c   i       l       e       d           2       3                                       2                       4                                               2                       5                                                   2                   6                                               2                   7




each client the same fraction f of their requested amount.                                        (   j       u       s


                                                                                                                              t       t   o           t       a       l
                                                                                                                                                                              s   )
                                                                                                                                                                                                              U                   n
                                                                                                                                                                                                                                                  r
                                                                                                                                                                                                                                                                      e               c           o                   n                       c                   i   l   e           d
                                                                                                                                                                                                                                                                                                                                                                                                                                          C                       u                   r           r
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      e           n   t




When there is plenty of quota, f = 1. As the number of                                                                                                                                                                                                e                   p               o               c                   h                       s                                                                                               e                   p               o               c   h




clients increases or the amount of remaining quota de-                                                                                                                                                    (


                                                                                                                                                                                                                  t       o                   t           a                   l
                                                                                                                                                                                                                                                                                  s


                                                                                                                                                                                                                                                                                                      a                           n                       d                   v               o
                                                                                                                                                                                                                                                                                                                                                                                                                  u           c               h


                                                                                                                                                                                                                                                                                                                                                                                                                                                          e                   r
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              s




creases, each client gets a fraction f = r/((n + 1)r) of
                                                                                                                                                                                                                                                                                                                                                                                                  c               h       )


                                                                                                                                                                                                                                                                                              e                   r                       e                                   o




                                                                                                                                                                                                                                                                              p                                                                               p




                                                                              t   i   m   e




their requests, where r is this client’s consumption rate, n
is the number of clients consuming from that quota, and r
is the average consumption rate over all active clients.
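
The client and server policies described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the rate-times-interval usage estimate, the `plentiful` flag that stands in for "there is plenty of quota", and the cap at 1 are all assumptions made here. The grant fraction follows the formula in the text, with the denominator using the average consumption rate over all active clients.

```python
def request_amount(recent_rate, interval, on_hand):
    """Amount a client asks the management server for.

    Estimated usage until the next request (recent rate times the
    interval, an assumed estimator) minus the vouchers already held.
    Never negative: a surplus would be returned, not requested.
    """
    estimated = recent_rate * interval
    return max(0.0, estimated - on_hand)


def grant_fraction(r, n, r_avg, plentiful):
    """Fraction f of a request that the management server grants.

    r         -- this client's consumption rate
    n         -- number of clients consuming from the quota
    r_avg     -- average consumption rate over all active clients
    plentiful -- hypothetical flag: True while quota is not scarce
    """
    if plentiful or r_avg == 0:
        # Plenty of quota (or no measured demand yet): grant in full.
        return 1.0
    # Cap at 1 so a client never receives more than it requested
    # (the cap is an assumption of this sketch).
    return min(1.0, r / ((n + 1) * r_avg))
```

For example, with four clients that all consume at the average rate and scarce quota, each would be granted one fifth of its request, which leaves a share in reserve for a newly arriving client.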
[Figure 3 (diagram not recovered). Labels: a time axis with epochs 23 to 27; "Reconciled (just totals)"; "Unreconciled epochs (totals and vouchers)"; "Current epoch"; "expired"; "unexpired"; X; K.]

Figure 3: How the storage server maintains consumption information over multiple epochs. X is the number of epochs before vouchers expire; K is the number of epochs in the past when reconciliation happens.

Using vouchers. Once a client has obtained a voucher, it can use the voucher to consume resources. The client picks which storage server it will use; the problem of selecting the server is outside the scope of this paper.
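
Before honoring a voucher, a storage server has to confirm that it is genuine and has not expired. The sketch below shows one way this could look, using the voucher fields named earlier in the paper. The byte layout (five unsigned 64-bit integers), the choice of HMAC-SHA256, the measurement of expiry in epochs, and the function names are all assumptions for illustration; the paper says only that the voucher carries a signature or MAC under a key shared by the management and storage servers.

```python
import hashlib
import hmac
import struct

# Hypothetical wire layout: epoch, expiry, user id, amount, serial,
# each an unsigned 64-bit big-endian integer, followed by the MAC.
_FIELDS = struct.Struct(">QQQQQ")
_MAC_LEN = hashlib.sha256().digest_size


def issue_voucher(key, epoch, expiry, user, amount, serial):
    """Management-server side: pack the fields and append a MAC."""
    body = _FIELDS.pack(epoch, expiry, user, amount, serial)
    return body + hmac.new(key, body, hashlib.sha256).digest()


def check_voucher(key, voucher, current_epoch):
    """Storage-server side: return the fields, or None if rejected."""
    if len(voucher) != _FIELDS.size + _MAC_LEN:
        return None  # malformed
    body, mac = voucher[:_FIELDS.size], voucher[_FIELDS.size:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking MAC bytes via timing.
    if not hmac.compare_digest(mac, expected):
        return None  # forged or altered
    epoch, expiry, user, amount, serial = _FIELDS.unpack(body)
    if current_epoch > expiry:
        return None  # expired (expiry in epochs is an assumption)
    return {"epoch": epoch, "expiry": expiry, "user": user,
            "amount": amount, "serial": serial}
```

A MAC of this kind only prevents forgery. As the paper notes, it does not stop one client from reusing a voucher it overheard, so vouchers still need private channels and replay protection from the accompanying authorization capabilities.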
   In the simplest way of using vouchers, the client sends its I/O request to the storage server, along with one or more vouchers that will cover any resource allocations the I/O request might require. The storage server keeps track of how much resource was actually consumed by the request, and may send the client a new voucher for any balance in its reply. The storage server keeps track of how much each user has consumed, plus any recently-spent vouchers. The vouchers are periodically reconciled with the management server in order to handle failure or to catch cheaters, as discussed below.

   Consider a simple scenario: a client is trying to write 1 MB of data into an existing file. The client obtains a voucher for (say) 2 MB from the management server, then sends a write request to the storage server along with the voucher. The storage server determines how much resource is consumed. It might consume nothing, if the request only overwrites already-allocated blocks, or it might consume a full 1 MB, or something in between. The storage server will reply with a refund voucher for 2 MB minus the amount actually allocated.

   Vouchers can also be used in a somewhat different way to solve a long-standing problem in object storage systems: ensuring that an operation will succeed when multiple clients could be consuming resources in the storage server. A client can use a voucher to reserve resources

   The system divides time into epochs, as illustrated in Figure 3. Each voucher is associated with the epoch in which it was issued, and the storage server keeps a list of vouchers that it has received for each epoch. It also tracks how much resource was consumed against those vouchers in the epoch. At some point there can be no more activity associated with an epoch, because all vouchers from that epoch will have expired, and the storage server can then reconcile the list of vouchers used for that epoch with the management server. After reconciliation, the storage server can get rid of the list of vouchers and merge that epoch's consumption information into the record of overall reconciled consumption.

   We summarize the formal model for tracking and reconciling quota for a single user as follows. (The exposition for a single user is clearer than for multiple users, but the rules are the same.) The user has a quota $Q$, and the management server has authorized some allocation $A$; the system works to keep $A \le Q$. The allocation $A$ is the amount of storage used on all storage servers plus the amount of any unredeemed vouchers: $A = \sum_{\forall d} S_d + U$, where $S_d$ is the amount used on storage server $d$ and $U$ is the amount of unredeemed vouchers.

   For the management server to know $A$ accurately using this definition, it would need to be involved syn-
at the storage server to ensure that its later operations will
                                                      chronously on every resource allocation, which defeats
have the resources to complete. This is particularly im-
                                                      the intent of the voucher approach. Instead, the man-
portant for object stores because it is hard for a client to
                                                      agement server uses a conservative estimate of A: in-
predict how much resource any one operation will con- stead of the actual current usage ∑ Sd , it uses the us-
sume.                                                 age determined at the last reconciliation, and instead of
                                                      the actual unredeemed vouchers, it uses all vouchers is-
Tracking and reconciling usage. The storage server sued since the last reconciliation. Formally, the manage-
keeps track of how much a user has consumed, peri- ment server knows the amount of resource the user had
odically reconciles voucher usage with the management consumed as of the last reconciliation, which covered all
server to catch cheaters and recover from failures.                                                e
                                                      epochs up to and including epoch e: ∑∀d Sd , where e de-
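To illustrate the voucher redemption, refund, and per-epoch bookkeeping described in this section, the following is a minimal Python sketch and not the K2 implementation. The names `Voucher`, `EpochRecord`, `StorageServer`, `redeem`, `reconcile`, and `conservative_allocation` are hypothetical. The sketch makes several assumptions that the text does not state: the signature check on the voucher is omitted, `expiry` is taken to be the last epoch in which a voucher is valid, a refund voucher inherits the epoch and expiry of the voucher it replaces, and refund serial numbers come from a counter local to the storage server. The estimate function follows only the prose description of the conservative estimate of $A$, since the formal definition is cut off above.

```python
from dataclasses import dataclass, field

MB = 1 << 20  # bytes per megabyte, used in the usage example


@dataclass(frozen=True)
class Voucher:
    """Hypothetical in-memory voucher; the auth signature is omitted."""
    epoch: int    # epoch in which the voucher was issued
    expiry: int   # assumed: last epoch in which the voucher is valid
    user: str
    amount: int   # bytes the voucher covers
    serial: int


@dataclass
class EpochRecord:
    vouchers: list = field(default_factory=list)  # vouchers received for this epoch
    consumed: int = 0                             # bytes consumed against them


class StorageServer:
    def __init__(self):
        self.reconciled_total = 0  # consumption from epochs already reconciled
        self.unreconciled = {}     # issuing epoch -> EpochRecord
        self.next_serial = 0       # assumed local counter for refund vouchers

    def redeem(self, voucher, bytes_allocated, current_epoch):
        """Charge an allocation to a voucher; return a refund voucher or None."""
        if current_epoch > voucher.expiry:
            raise ValueError("voucher expired")
        if bytes_allocated > voucher.amount:
            raise ValueError("voucher does not cover the allocation")
        record = self.unreconciled.setdefault(voucher.epoch, EpochRecord())
        record.vouchers.append(voucher)
        record.consumed += bytes_allocated
        balance = voucher.amount - bytes_allocated
        if balance == 0:
            return None
        self.next_serial += 1
        # Assumption: the refund keeps the original epoch and expiry.
        return Voucher(voucher.epoch, voucher.expiry, voucher.user,
                       balance, self.next_serial)

    def reconcile(self, epoch, current_epoch, expiry_epochs):
        """Report one epoch's vouchers and fold its consumption into the total.

        expiry_epochs plays the role of X in Figure 3.
        """
        if current_epoch <= epoch + expiry_epochs:
            raise ValueError("vouchers from this epoch may still be valid")
        record = self.unreconciled.pop(epoch, EpochRecord())
        self.reconciled_total += record.consumed
        # The serial list and total would be sent to the management server.
        return [v.serial for v in record.vouchers], record.consumed


def conservative_allocation(reconciled_usage, issued_since_reconciliation):
    """Usage as of the last reconciliation plus all vouchers issued since."""
    return reconciled_usage + sum(issued_since_reconciliation)
```

With these definitions, the 1 MB write scenario becomes `StorageServer().redeem(Voucher(23, 25, "alice", 2 * MB, 1), 1 * MB, current_epoch=23)`, which returns a refund voucher for the remaining 1 MB. Keying the unreconciled records by the issuing epoch mirrors the description above, so that an entire epoch can be dropped once all of its vouchers have expired.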


A client can use a voucher to reserve resources ing this definition, it would need to be involved syn- at the storage server to ensure that its later operations will chronously on every resource allocation, which defeats have the resources to complete. This is particularly im- the intent of the voucher approach. Instead, the man- portant for object stores because it is hard for a client to agement server uses a conservative estimate of A: in- predict how much resource any one operation will con- stead of the actual current usage ∑ Sd , it uses the us- sume. age determined at the last reconciliation, and instead of the actual unredeemed vouchers, it uses all vouchers is- Tracking and reconciling usage. The storage server sued since the last reconciliation. Formally, the manage- keeps track of how much a user has consumed, peri- ment server knows the amount of resource the user had odically reconciles voucher usage with the management consumed as of the last reconciliation, which covered all server to catch cheaters and recover from failures. e epochs up to and including epoch e: ∑∀d Sd , where e de- 4