Top Document: Comp.os.research: Frequently answered questions [2/3: l/m 13 Aug 1996]
Previous Document: [2.3] Modern Unix file and block sizes
Next Document: [2.3.2] Block sizes

From: Performance and workload studies

There is no such thing as an average file system. Some file systems have lots of little files. Others have a few big files. However, as a mental model the notion of an average file system is invaluable.

The following table gives a breakdown of file sizes and the amount of space they consume.

   file size    #files  %files  %files  disk space  %space  %space
(max. bytes)                      cum.        (Mb)            cum.
           0    147479     1.2     1.2         0.0     0.0     0.0
           1      3288     0.0     1.2         0.0     0.0     0.0
           2      5740     0.0     1.3         0.0     0.0     0.0
           4     10234     0.1     1.4         0.0     0.0     0.0
           8     21217     0.2     1.5         0.1     0.0     0.0
          16     67144     0.6     2.1         0.9     0.0     0.0
          32    231970     1.9     4.0         5.8     0.0     0.0
          64    282079     2.3     6.3        14.3     0.0     0.0
         128    278731     2.3     8.6        26.1     0.0     0.0
         256    512897     4.2    12.9        95.1     0.0     0.1
         512   1284617    10.6    23.5       566.7     0.2     0.3
        1024   1808526    14.9    38.4      1442.8     0.6     0.8
        2048   2397908    19.8    58.1      3554.1     1.4     2.2
        4096   1717869    14.2    72.3      4966.8     1.9     4.1
        8192   1144688     9.4    81.7      6646.6     2.6     6.7
       16384    865126     7.1    88.9     10114.5     3.9    10.6
       32768    574651     4.7    93.6     13420.4     5.2    15.8
       65536    348280     2.9    96.5     16162.6     6.2    22.0
      131072    194864     1.6    98.1     18079.7     7.0    29.0
      262144    112967     0.9    99.0     21055.8     8.1    37.1
      524288     58644     0.5    99.5     21523.9     8.3    45.4
     1048576     32286     0.3    99.8     23652.5     9.1    54.5
     2097152     16140     0.1    99.9     23230.4     9.0    63.5
     4194304      7221     0.1   100.0     20850.3     8.0    71.5
     8388608      2475     0.0   100.0     14042.0     5.4    77.0
    16777216       991     0.0   100.0     11378.8     4.4    81.3
    33554432       479     0.0   100.0     11456.1     4.4    85.8
    67108864       258     0.0   100.0     12555.9     4.8    90.6
   134217728        61     0.0   100.0      5633.3     2.2    92.8
   268435456        29     0.0   100.0      5649.2     2.2    95.0
   536870912        12     0.0   100.0      4419.1     1.7    96.7
  1073741824         7     0.0   100.0      5004.5     1.9    98.6
  2147483647         3     0.0   100.0      3620.8     1.4   100.0

A number of observations can be made:
- the distribution is heavily skewed towards small files
- but it has a very long tail
- the average file size is 22k
- pick a file at random: it is probably smaller than 2k
- pick a byte at random: it is probably in a file larger than 512k
- 89% of files take up 11% of the disk space
- 11% of files take up 89% of the disk space

Such a heavily skewed distribution of file sizes suggests that, if one were to design a file system from scratch, it might make sense to employ radically different strategies for small and large files.

The seductive power of mathematics allows us to treat a 200 byte and a 2MB file in the same way. But do we really want to? Are there any problems in engineering where the same techniques would be used in handling physical objects that span 6 orders of magnitude? A quote from sci.physics that has stuck with me: `When things change by 2 orders of magnitude, you are actually dealing with fundamentally different problems'.

People I trust say they would have expected the tail of the above distribution to have been even longer. There are at least some files in the 1-2G range. They point out that DBMS shops with really large files might have been less inclined to respond to a survey like this than some other sites. This would bias the disk space figures, but it would have no appreciable effect on file counts. The results gathered would still be valuable because many static disk layout issues are determined by the distribution of small files and are largely independent of the potential existence of massive files.

(It should be noted that many popular DBMSs, such as Oracle, Sybase, and Informix, use raw disk partitions instead of Unix file systems for storing data, hence the difficulty in gathering data about them in a uniform way.)
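The observations listed under the table can be re-derived from its #files and disk space columns. The Python sketch below is not part of the original FAQ. It copies those two columns from the table and assumes that "Mb" means 2**20 bytes; the names ROWS, summarize, first_bucket_reaching and share_up_to are made up for illustration.

```python
# (max file size in bytes, number of files, disk space in Mb), copied from the table.
ROWS = [
    (0, 147479, 0.0), (1, 3288, 0.0), (2, 5740, 0.0), (4, 10234, 0.0),
    (8, 21217, 0.1), (16, 67144, 0.9), (32, 231970, 5.8), (64, 282079, 14.3),
    (128, 278731, 26.1), (256, 512897, 95.1), (512, 1284617, 566.7),
    (1024, 1808526, 1442.8), (2048, 2397908, 3554.1), (4096, 1717869, 4966.8),
    (8192, 1144688, 6646.6), (16384, 865126, 10114.5), (32768, 574651, 13420.4),
    (65536, 348280, 16162.6), (131072, 194864, 18079.7),
    (262144, 112967, 21055.8), (524288, 58644, 21523.9),
    (1048576, 32286, 23652.5), (2097152, 16140, 23230.4),
    (4194304, 7221, 20850.3), (8388608, 2475, 14042.0),
    (16777216, 991, 11378.8), (33554432, 479, 11456.1),
    (67108864, 258, 12555.9), (134217728, 61, 5633.3),
    (268435456, 29, 5649.2), (536870912, 12, 4419.1),
    (1073741824, 7, 5004.5), (2147483647, 3, 3620.8),
]

MB = 2 ** 20  # assumption: the table's "Mb" is 2**20 bytes
FILES, SPACE = 1, 2  # column indexes into each row


def summarize(rows):
    """Return (total files, total Mb, mean file size in bytes)."""
    total_files = sum(row[FILES] for row in rows)
    total_mb = sum(row[SPACE] for row in rows)
    return total_files, total_mb, total_mb * MB / total_files


def first_bucket_reaching(rows, fraction, column):
    """Smallest size bucket at which the cumulative share of `column`
    (FILES or SPACE) reaches `fraction` of the total."""
    total = sum(row[column] for row in rows)
    running = 0.0
    for row in rows:
        running += row[column]
        if running >= fraction * total:
            return row[0]
    return rows[-1][0]


def share_up_to(rows, max_bytes, column):
    """Fraction of `column` held by files no larger than `max_bytes`."""
    total = sum(row[column] for row in rows)
    return sum(row[column] for row in rows if row[0] <= max_bytes) / total


if __name__ == "__main__":
    total_files, total_mb, mean_bytes = summarize(ROWS)
    print("files:", total_files, "space (Mb):", round(total_mb, 1))
    print("mean file size (bytes):", round(mean_bytes))
    print("median file falls in bucket:", first_bucket_reaching(ROWS, 0.5, FILES))
    print("median byte falls in bucket:", first_bucket_reaching(ROWS, 0.5, SPACE))
    print("files <= 16k:", round(share_up_to(ROWS, 16384, FILES), 3),
          "of files,", round(share_up_to(ROWS, 16384, SPACE), 3), "of space")
```

Because each row is a bucket labelled by its maximum size, the sketch can only locate the bucket that holds the median file or the median byte, not an exact size. That is enough to check the "smaller than 2k", "larger than 512k" and 89%/11% observations against the cumulative columns.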
Send corrections/additions to the FAQ Maintainer: os-faq@cse.ucsc.edu
Last Update March 27 2014 @ 02:12 PM