
Don’t stack your Log on my Log
Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman
SanDisk Corporation

© 2014 SanDisk Corporation. All rights reserved. INFLOW’14, October 5, Broomfield, CO, USA.
Log-structured applications and file systems have been used to achieve high write throughput by sequentializing writes. Flash-based storage systems, due to flash memory’s out-of-place update characteristic, have also relied on log-structured approaches. Our work investigates the impacts on performance and endurance in flash when multiple layers of log-structured applications and file systems are layered on top of a log-structured flash device. We show that multiple log layers affect sequentiality and increase write pressure to flash devices through randomization of workloads, unaligned segment sizes, and uncoordinated multi-log garbage collection. All of these effects can combine to negate the intended positive effects of using a log. In this paper we characterize the interactions between multiple levels of independent logs, identify issues that must be considered, and describe design choices to mitigate negative behaviors in multi-log configurations.
1 Introduction


Flash-based devices are frequently used in performance-sensitive applications ranging from databases to key-value stores to persistent messaging. In many of these environments, applications began by using flash as a fast disk and then made optimizations to better match the unique characteristics of flash. Since flash devices are known for asymmetric write performance and garbage collection (GC), a frequent application design pattern is to write in a log structure to optimize for flash devices. Recent examples include twitter fatcache [1], NILFS [11], F2FS [6], and SILT [12].

The log-structured write pattern has been adopted by both user-space applications and file systems. Such software runs atop the SSD’s log-structured or data-remapping layer – the Flash Translation Layer (FTL). Therefore, it is possible that two or more log-structured I/O patterns may become stacked on flash media. For example, it is possible to have an application like fatcache write a sequential stream atop a log-structured file system like F2FS, which in turn operates over a log-structured FTL on physical flash media.

While log-structured applications, file systems and log stacking are not new [5], log stacking on flash deserves special attention. First, since flash devices contain a remapping, log-like FTL, any log-structured application run atop a flash device creates a stacked-log scenario, making such scenarios now more common. Second, flash devices have limited endurance, and any additional writes caused by multiple log layers can impact device lifetime. Third, each layer’s log-remapping engine frequently reserves some capacity for GC and exposes only part of its usable capacity to the upper layer. Thus a large fraction of the, still relatively expensive, flash media can be consumed as reserve capacity by multiple logs stacked atop it. Fourth, the high performance of flash devices implies that log “aging”, or the need for GC to defragment the log, occurs quickly, frequently, and incoherently amongst all the logs involved. This combined incoherent GC behavior, across multiple log layers, critically impacts overall performance and endurance.

We focus on Log on Log – the issues that arise when two or more log layers are stacked on each other. At first glance, we observe that multiple layers of software performing the same function, i.e., data remapping and GC, seems redundant and suboptimal. In a multi-layer log configuration, there are further issues. Each log structure is unaware of the objectives and algorithms of those below or above it. Since each log operates independently towards its own objectives, its performance or efficiency goals can be undone by the other log layers. In addition, increased metadata, conflicting and incoherent GC strategies, and fragmentation of workloads all result in increased write pressure, which greatly impacts flash device performance and endurance. This can also greatly reduce the performance of the overall application using these multiple log layers.
This paper makes the following contributions:
1. We outline the architectural issues that can arise when one or more logs are stacked atop an FTL.
2. We demonstrate the impacts of these issues on flash devices using a combination of two techniques. First, we gather empirical results of workloads on log-structured software running atop a commercially available flash device. We then assess the issues in depth using a purpose-built log-on-log event-driven simulator. We measure the impact of multiple uncoordinated log activities and demonstrate that multi-layer log configurations introduce higher write pressure (up to 33%) from log metadata maintenance, and increased GC activity (up to 32%) due to decoupled cleaning.
3. We propose some optimizations to mitigate the issues found with multi-layer log configurations. We propose optimal sizing of log segments amongst layers and coordination of GC interactions. In addition, we discuss approaches to collapsing logs through new interface semantics.
This paper argues that the increasingly common practice of using log-structured writing to flash is fraught with complexities and opportunities for unpredictable behavior. We outline ways to both understand and mitigate the effects of log stacking, and discuss alternatives to stacked logs over flash.
2 Background
Log-structured data persistence has been employed in storage systems [4], file systems [16], databases [20] and other applications. Some stores are strictly log-structured and allow no update-in-place operations, while other stores are more write-anywhere in nature [5] and allow hole plugging. All such stores allow new writes to be directed to free space in the device, and all contain some form of GC (frequently called cleaning) to compact and reuse invalidated physical space. Substantial research has been done on optimizing log-structured stores, particularly for GC [2, 17, 21]. In this paper, we use the term “log-structuring” generally to mean stores with dynamic remapping of writes and GC. Specific configurations of such stores are defined and explored in detail in Sections 3 and 4.
Prior to the arrival of flash, a key motivation for log-structured stores was to accelerate write performance while allowing random reads to be serviced from DRAM cache. Log appends provide additional advantages, such as enabling snapshots, enabling transactional updates, and eliminating the small-write performance problem when used in RAID 5 configurations [5, 14].
Flash creates a new motivation for log structuring. Flash can only be erased in units of erase blocks, which are typically much larger than the write unit (e.g., 512 write pages per erase block). As such, all new writes must be directed to (freshly erased) blocks. Erased blocks are made available to satisfy new writes through GC. One or more erase blocks are garbage-collected together, making them conceptually similar to cleaned segments in a log-structured file system. Since flash has a limited number of program/erase cycles, flash GC has to balance the efficiency of cleaning with erase-block wear leveling to meet reliability requirements. Flash has additional requirements, such as read-disturb handling, which require rewrites to maintain data integrity. As such, while some of the factors that drive flash GC are similar to those driving cleaning in higher-level log stores, others are flash-media specific.
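To make the cleaning cost concrete, a standard back-of-the-envelope model (our illustration, not a result from this paper) relates GC write amplification to victim utilization. If GC selects a victim erase block (or segment) in which a fraction $u$ of the pages are still valid, those pages must be copied forward before the block can be erased, so

$$\mathrm{WA} \approx \frac{u + (1-u)}{1-u} = \frac{1}{1-u},$$

i.e., one full block’s worth of physical writes buys only a $(1-u)$ fraction of fresh space for user data. At $u = 0.8$, every user write costs roughly five physical writes, which is why both FTLs and higher-level logs work hard to select low-utilization victims.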
Recently, some efforts have been directed towards reducing the cost of journaling of journals (similar to a log-stacking model) between the application and file system layers [10, 18]. This work observed, as we do, the general inefficiency of having redundant work done in multiple log layers. Our work is complementary to these efforts in that we aim to understand the behavior of a more generalized multi-log stacking model and its impact on flash, focusing on write amplification, GC overhead and overall performance.
3 Approach
We start by outlining several different models of log stacking that can commonly occur with flash devices. We then define a number of architectural aspects of logging, GC and write amplification, which we use in the subsequent sections to analyze log-on-log interactions.
3.1 Log Stacking Models
Figure 1 outlines some of the log stacking configurations that can occur when log-structured applications meet log-structured file systems and/or flash devices. Figure 1a represents a single log-structured application (or file system) residing on a single FTL-based SSD. This is the most basic example of a log-on-log configuration. Some form of this configuration occurs every time a log-structured application runs on an SSD. The illustration demonstrates the potential complexities that can occur even in a simple log-on-log scenario. In this example, the upper-level log has three data types (data, metadata, and garbage collection) that are being written to three sequential streams. The underlying lower-level log has two sequential streams. Figure 1b outlines a configuration where a log-based application/filesystem and a non-log-based application share one FTL. This configuration can commonly occur when an SSD is divided into two partitions and one partition is used by a log-structured filesystem while the other is used by an application with a very different access pattern. Other configurations of multiple log layers include Figure 1c, where two or more log-structured applications share an FTL, and Figure 1d, where a log-structured application (such as a key-value store) resides on top of another log-structured software module (such as a file system) which itself is on top of an FTL.
Figure 1: Log-on-log structured approaches can be used in all levels of the storage stack.
3.2 Append Streams
Log stacking is further complicated as each log-structured application can have multiple streams over a single internal address space. Figure 1a shows an example of multiple sequential streams within each log layer. We call each such stream an Append Stream writing to an Append Point. An append stream is a sequential stream of writing and subsequent GC, similar to that used in [8]. We assume that all writes of an append stream occur at the head (the Append Point) for that stream, and that reads can occur from anywhere within the stream. In addition, GC can read from any part of the stream and write subsets of the data to the append point. While some log-structured architectures are strictly single append stream, implying that all writes – incoming, cleaning, and metadata – are driven to the same append point, others have multiple streams. F2FS [6], for example, has six logical append streams, twitter fatcache has one, and SILT has several. Similarly, the FTL within a flash device may have one or more append streams, depending on the design.
3.3 Write Amplification
As each log layer remaps and garbage collects its data, it generates its own write amplification (WA). The incoming data seen by each log layer includes the amplified writes generated by the log layers above. In this paper, we compute and refer to each log layer’s WA separately. Each layer’s WA is computed as the ratio of outgoing writes from that layer to the incoming writes of that layer. Below, we make clear which log layer’s WA is under discussion. The total combined write amplification (TCWA) is computed as the product of all of the involved write amplification factors.
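Restating these definitions in symbols (notation ours): for log layer $i$, let $W_i^{\mathrm{in}}$ be the writes arriving at that layer (already amplified by the layers above) and $W_i^{\mathrm{out}}$ the writes it issues downward. Then

$$\mathrm{WA}_i = \frac{W_i^{\mathrm{out}}}{W_i^{\mathrm{in}}}, \qquad \mathrm{TCWA} = \prod_i \mathrm{WA}_i,$$

which for a two-level stack reduces to $\mathrm{TCWA} = \mathrm{WA}_{\mathrm{upper}} \times \mathrm{WA}_{\mathrm{lower}}$, since the lower layer’s input is exactly the upper layer’s output.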
3.4 Evaluation Methodology
Armed with the above concepts, we explore a number of different log-on-log behaviors. We conduct two classes of experiments.

1. We use F2FS as an example of a flash-optimized log-structured file system with multiple append streams. We run experiments with F2FS on top of a commercially available SSD.

2. We developed and used a log-on-log simulator that implements a two-level log-on-log structure with up to two independent append streams at each layer. With the simulator, we measure and analyze in detail the WA generated by different log-on-log interactions. The simulator is independent of hardware and operating system configurations, so it can be abstracted as any two-layer log system.
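As a concrete illustration of this class of simulator, the sketch below implements a minimal two-level log-on-log model in Python. This is our simplified reconstruction, not the paper’s actual simulator: it uses a single append stream per layer and greedy cleaning, and all class names and parameters are illustrative. Each layer counts incoming versus outgoing writes, so per-layer WA and TCWA fall out directly.

```python
# Minimal two-level log-on-log simulator sketch (illustrative only):
# one append stream per layer, greedy segment cleaning, no TRIM.
import random


class Log:
    def __init__(self, name, logical_pages, seg_size, reserve_segs, lower=None):
        assert reserve_segs >= 2, "headroom so GC can always make progress"
        self.name, self.seg_size, self.lower = name, seg_size, lower
        self.total_segs = -(-logical_pages // seg_size) + reserve_segs
        self.free_segs = list(range(1, self.total_segs))  # seg 0 starts open
        self.open_seg, self.head = 0, 0                   # the append point
        self.fwd = {}     # logical page -> physical page (indirection map)
        self.live = {}    # physical page -> logical page (valid data only)
        self.in_writes = self.out_writes = 0

    @property
    def phys_pages(self):
        return self.total_segs * self.seg_size

    def write(self, lpage):
        """An incoming write from the user or from the layer above."""
        self.in_writes += 1
        while len(self.free_segs) <= 1:   # keep one spare seg for GC copies
            self._gc()
        self._append(lpage)

    def _append(self, lpage):
        if self.head == self.seg_size:    # open segment is full
            self.open_seg = self.free_segs.pop()
            self.head = 0
        phys = self.open_seg * self.seg_size + self.head
        self.head += 1
        old = self.fwd.get(lpage)
        if old is not None:
            del self.live[old]            # invalidate the stale copy
        self.fwd[lpage], self.live[phys] = phys, lpage
        self.out_writes += 1
        if self.lower is not None:        # our physical write is the lower
            self.lower.write(phys)        # log's logical write -- no TRIM!

    def _gc(self):
        """Greedy cleaning: evict the closed segment with fewest valid pages."""
        counts = {s: 0 for s in range(self.total_segs)
                  if s != self.open_seg and s not in self.free_segs}
        for phys in self.live:
            seg = phys // self.seg_size
            if seg in counts:
                counts[seg] += 1
        victim = min(counts, key=counts.get)
        for phys, lpage in list(self.live.items()):
            if phys // self.seg_size == victim:
                self._append(lpage)       # copy-forward: pure write pressure
        self.free_segs.append(victim)

    def wa(self):
        return self.out_writes / max(1, self.in_writes)


if __name__ == "__main__":
    random.seed(1)
    upper = Log("fs", logical_pages=1024, seg_size=32, reserve_segs=8)
    lower = Log("ftl", logical_pages=upper.phys_pages, seg_size=24,
                reserve_segs=16)
    upper.lower = lower
    for _ in range(30_000):               # uniform random overwrites
        upper.write(random.randrange(1024))
    print(f"upper WA={upper.wa():.2f}  lower WA={lower.wa():.2f}  "
          f"TCWA={upper.wa() * lower.wa():.2f}")
```

Even this toy model reproduces the qualitative effect: the lower log amplifies the upper log’s already-amplified output, and the product (TCWA) exceeds either layer’s WA alone.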
4 Scenarios and Results
In this section, we analyze simple and frequently deployed log-on-log scenarios and demonstrate some of the issues that arise. We characterize their impact on write pressure, endurance and capacity efficiency.
4.1 Metadata Footprint
The first topic we examine is metadata footprint. At first glance, log stacking is expected to increase metadata footprint, since each log layer will need to add its own metadata for the incoming data to track layout and persist indirection maps.
The amount of metadata added by a log-structured store depends heavily on the design of the store and the number of append streams within the store. To understand the potential metadata overhead of log stacking and multiple streams, we perform experiments on F2FS.

Figure 2: Metadata footprint increases as more append streams are used in the file system.

F2FS is designed to support up to 6 append streams, making it possible for the file system to identify hot/warm/cold data and separate them into different segments. We measured the total file system write bytes issued to the device under different workloads while varying the number of F2FS append points. We configured F2FS to have 2 and 6 append streams. With 2 append points, F2FS separates user data and metadata, while 6 append points further differentiate each type of data as hot/warm/cold. The workloads were generated using the FIO benchmark tool with various combinations of workload configurations – 1k vs. 4k I/O size, buffered vs. direct I/O, and random vs. sequential writes. As shown in Figure 2, with an application workload that writes a total of 8GiB, the file system generally writes more data to the device as the number of append points is increased (e.g., from 2 to 6). For example, the first column set shows the total number of file system writes issued to the device from an 8GiB random write workload with buffered I/O and a 1k I/O size. The file system amplifies the original writes due to file metadata and log metadata. Since the workload is the same for 2 and 6 append streams, we assume the file metadata used to maintain file status remains the same. Thus, the increased writes from 2 to 6 streams are the consequence of the additional logs’ metadata. Our experiment shows that the File System Write Amplification (FSWA) varies with the number of file system append points, increasing from 1.5 to 2.0 (up to 33%, for seq-4k-direct) when growing from 2 to 6 append points. While this is only one example, it does suggest that the number of append streams can be a factor in the WA generated by a log-structured store. FSWA is the amplification of the application workload by the filesystem alone; it is not the same as TCWA, which also includes the device’s amplification of the filesystem’s output.
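For clarity, the 33% figure is the relative growth of FSWA for the same application workload (our arithmetic): $(2.0 - 1.5)/1.5 \approx 33\%$ more file-system-issued writes.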
4.2 Fragmentation
A key goal of log-structured systems is sequentializing writes. However, if the FTL is shared by two log-structured applications (or even a single application with multiple append streams), the incoming data into the FTL is likely to look random or disjoint. Additionally, GC in the upper layer can further complicate the traffic stream seen by the lower layer.

Figure 3: Fragmented logs. (a) Data is written sequentially to the lower log. (b) Deleting one fsys seg spread across two dev segs.
Even when each log layer has exactly one append stream, complexities exist that can cause the underlying device to see non-sequential traffic. One such complexity is segment size mismatch – where the upper and lower logs both do GC, but at different segment boundaries and sizes. Figure 3 illustrates this issue with an example of two logs, each with one append stream but GC-ing at different segment sizes. Data from one upper-log segment is spread across two segments in the lower log. When GC occurs in the upper log stream, a deleted upper-log segment (fsys seg 2 in the example) results in partial invalidation of two lower-log segments. Reclaiming space in the lower log now requires GC of two dev segs, and results in higher WA in the lower log (see Section 4.5 for more detailed discussion on segment cleaning).
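The effect of mismatched segment sizes can be seen with simple address arithmetic. The helper below is illustrative (the function name and page counts are ours), and assumes upper-log data was written sequentially into the lower log as in Figure 3a:

```python
def lower_segs_for_upper_seg(u_idx, u_seg_pages, l_seg_pages):
    """Lower-log segments holding pages of upper-log segment u_idx,
    assuming upper pages were laid out sequentially in the lower log."""
    first = u_idx * u_seg_pages             # first page of the upper segment
    last = first + u_seg_pages - 1          # last page of the upper segment
    return list(range(first // l_seg_pages, last // l_seg_pages + 1))

# fsys segments of 6 pages over dev segments of 4 pages: deleting
# fsys seg 2 (pages 12..17) partially invalidates dev segs 3 and 4.
print(lower_segs_for_upper_seg(2, 6, 4))    # -> [3, 4]
```

Whenever the upper segment size is not a multiple of the lower one (or alignment drifts over time), a single upper-segment delete straddles multiple lower segments, each left partially valid.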
Figure 4: TPC-E: overall system WA (TCWA) varying log capacity ratio and segment size ratio. (a) upper/lower log capacity ratio 90%; (b) upper/lower log capacity ratio 70%.
When each log has a single append stream, this issue can be mitigated to some extent by matching segment sizes between upper and lower logs. We measured the impact of different upper/lower log segment size ratios using our log-on-log simulator. Figure 4 depicts one such result for a TPC-E-like workload trace. For each line of a fixed upper-log segment size in Figure 4, there is a dramatic change of slope when the lower-log segment size exceeds the upper-log segment size. This is because the reuse of upper segments (seen as invalidation by the lower log) cannot cover an entire lower-log segment, which causes data fragmentation in the lower layer. As GC at both layers becomes active, a large portion of the valid data in each lower segment is copied forward, resulting in higher lower-log WA and hence higher TCWA.
This result further demonstrates that segment sizes that are optimal for GC in a standalone log may not be optimal for the whole system in a log-on-log scenario. A segment is the smallest unit of GC processing. In a standalone log, smaller segment sizes generally provide more flexibility in GC victim selection and hence achieve lower WA. This is not true for a log-on-log configuration.
The above example illustrates the fragmentation and cleaning overheads that can result from two single-stream logs being stacked atop one another. If each log were to have multiple append streams, the situation worsens, since segments in the lower log are far more likely to have intermixed content from many upper-log segments. It is also not clear that segment size matching can overcome these issues, since data intermixing will still occur.
4.3 Aggregate Reserve Capacities
Since GC rearranges data, many GC designs rely upon some fraction of the underlying capacity being reserved. When logs are stacked, each log layer’s capacity reserve eats into the capacity available for user data. The behavior of a log-on-log configuration also depends on the capacity used (and reserved) by the GC at each level. Figure 4(b) shows the same log configuration but with more reserve capacity in the upper log. The change of slope in Figure 4(b) occurs at a larger lower-log segment size. This gives the upper log more flexibility in tuning its segment size to achieve lower FSWA. On the other hand, if each log’s reserve capacity ratio is low, the lower log has more spare capacity exposed to the upper log, and device GC is triggered less actively.
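The capacity cost compounds across layers. As a simple illustration (our arithmetic, with made-up reserve fractions): if each layer $i$ withholds a reserve fraction $r_i$, the user-visible capacity is

$$C_{\mathrm{user}} = C_{\mathrm{flash}} \prod_i (1 - r_i),$$

so two stacked logs that each reserve 10% leave only $0.9 \times 0.9 = 81\%$ of the, still relatively expensive, flash media for user data.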
Our analysis of log-structured applications like NILFS and F2FS, as well as FTLs, has shown that each log has its own metadata, which is invisible to the higher-level logs. Due to this metadata, a segment contains metadata intermixed with data from upper logs. Hence, cleaning segments at one log layer does not preclude the need to clean segments at another layer. Since our simulation is conducted with no other traffic and no log metadata, a real log-on-log system will be more complex and will introduce a higher degree of log fragmentation, making the impact of the size ratio between the two layers harder to predict.
4.4 Multiple Append Streams/Points
Fragmentation and associated complexity only worsen when each log layer has multiple append streams or points.
