File Systems: Fundamentals



Files

What is a file?
  - A named collection of related information recorded on secondary storage (e.g., disks)
File attributes
  - Name, type, location, size, protection, creator, creation time, last-modified time, ...
File operations
  - Create, Open, Read, Write, Seek, Delete, ...
How does the OS allow users to use files?
  - Open a file before use
  - The OS maintains an open file table per process; a file descriptor is an index into this table
  - Sharing is supported by maintaining a system-wide open file table
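The relationship between the per-process table and the system-wide table can be sketched as follows. This is a minimal illustration, not how any particular kernel does it: the names `OpenFile`, `Process`, `open_file`, and `share_fd` are hypothetical, and the entry fields (offset, reference count) are assumptions about what such a table might hold.

```python
class OpenFile:
    """One entry in the system-wide open file table (assumed layout)."""
    def __init__(self, inode_number):
        self.inode_number = inode_number
        self.offset = 0      # current read/write position
        self.ref_count = 0   # how many descriptors refer to this entry


class Process:
    """Holds the per-process table; a file descriptor is an index into it."""
    def __init__(self):
        self.fd_table = []


system_open_files = []  # the system-wide open file table


def open_file(proc, inode_number):
    """Create a system-wide entry and return the new descriptor for proc."""
    entry = OpenFile(inode_number)
    entry.ref_count = 1
    system_open_files.append(entry)
    proc.fd_table.append(entry)
    return len(proc.fd_table) - 1


def share_fd(owner, other, fd):
    """Let another process refer to the same system-wide entry."""
    entry = owner.fd_table[fd]
    entry.ref_count += 1
    other.fd_table.append(entry)
    return len(other.fd_table) - 1
```

Because both descriptors refer to one shared entry, a change of offset made through one process is visible through the other, which is the sharing behavior the system-wide table exists to provide.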

Fundamental Ontology of File Systems

Metadata
  - The index node (inode) is the fundamental data structure
  - The superblock also has important file system metadata, like block size
Data
  - The contents that users actually care about
Files
  - Contain data and have metadata like creation time, length, etc.
Directories
  - Map file names to inode numbers

Basic data structures

Disk
  - An array of blocks, where a block is a fixed-size data array
File
  - Sequence of blocks (fixed-length data array)
Directory
  - Creates the namespace of files
      - Hierarchical: traditional file names and GUI folders
      - Flat: like the "all songs" list on an iPod
Design issues: representing files, finding file data, finding free blocks

Block vs. Sector

The operating system may choose to use a larger block size than the sector size of the physical disk. Each block consists of consecutive sectors. Why?
  - A larger block size increases the transfer efficiency (why?)
  - It can be convenient to have block size match (a multiple of) the machine's page size (why?)
Some systems allow transferring of many sectors between interrupts.
Some systems interrupt after each sector operation (rare these days)
  - "Consecutive" sectors may mean every other physical sector, to allow time for the CPU to start the next transfer before the head moves over the desired sector
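As a small illustration of "each block consists of consecutive sectors", the sketch below maps a block number to its sector numbers. The 512-byte sector and 4 KB block sizes are assumed example values, and `block_to_sectors` is a hypothetical helper; it also ignores the interleaving mentioned above.

```python
SECTOR_SIZE = 512   # assumed example value
BLOCK_SIZE = 4096   # assumed example value


def block_to_sectors(block_number, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return the range of logical sector numbers that make up one block."""
    if block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = block_size // sector_size
    first = block_number * sectors_per_block
    return range(first, first + sectors_per_block)
```

With these sizes one block request covers eight sectors, so a single transfer moves eight times as much data as a sector-at-a-time request.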

File System Functionality and Implementation

File system functionality:
  - Pick the blocks that constitute a file.
      - Must balance locality with expandability.
      - Must manage free space.
  - Provide file naming organization, such as a hierarchical name space.
File system implementation:
  - File header (descriptor, inode): owner id, size, last modified time, and location of all data blocks.
      - OS should be able to find metadata block number N without a disk access (e.g., by using math or a cached data structure).
  - Data blocks.
      - Directory data blocks (human-readable names)
      - File data blocks (data)
  - Superblocks, group descriptors, other metadata

File System Properties

Most files are small.
  - Need strong support for small files.
  - Block size can't be too big.
Some files are very large.
  - Must allow large files (64-bit file offsets).
  - Large file access should be reasonably efficient.
Most systems fit the following profile:
  1. Most files are small.
  2. Most disk space is taken up by large files.
  3. I/O operations target both small and large files.
--> The per-file cost must be low, but large files must also have good performance.
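One way to see why the block size can't be too big when most files are small is to compute the space lost in a file's last block. This is a rough sketch; `wasted_bytes` is a hypothetical helper and the sizes used with it are arbitrary examples.

```python
def wasted_bytes(file_size, block_size):
    """Bytes allocated but unused in the final block of a file."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size - file_size
```

For a 100-byte file, a 4096-byte block wastes 3996 bytes while a 512-byte block wastes 412, so the per-file overhead grows with the block size even though large files would benefit from bigger blocks.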

If my file system only has lots of big video files, what block size do I want?
  1. Large
  2. Small

How do we find and organize files on the disk?

The information that we need: the file header points to data blocks
  fileid 0, Block 0 --> Disk block 19
  fileid 0, Block 1 --> Disk block 4,528
Key performance issues:
  1. We need to support sequential and random access.
  2. What is the right data structure in which to maintain file location information?
  3. How do we lay out the files on the physical disk?

File Allocation Methods: Contiguous allocation

File header specifies starting block & length
Placement/allocation policies
  - First-fit, best-fit, ...
Pluses
  - Best file read performance
  - Efficient sequential & random access
Minuses
  - Fragmentation!
  - Problems with file growth
      - Pre-allocation?
      - On-demand allocation?
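A minimal sketch of contiguous allocation, assuming free space is tracked as a list of (start, length) extents. The helpers `first_fit` and `contiguous_lookup` are hypothetical names; the first implements the first-fit placement policy, the second shows why random access is cheap when the header stores only a start and a length.

```python
def first_fit(free_extents, length):
    """Allocate `length` blocks from the first extent that is big enough.

    free_extents is a list of (start, size) tuples and is updated in place.
    Returns the starting block, or None if no single extent is large enough.
    """
    for i, (start, size) in enumerate(free_extents):
        if size >= length:
            if size == length:
                free_extents.pop(i)
            else:
                free_extents[i] = (start + length, size - length)
            return start
    return None


def contiguous_lookup(header_start, header_length, logical_block):
    """Translate a file-relative block number to a disk block number."""
    if not 0 <= logical_block < header_length:
        raise IndexError("block is outside the file")
    return header_start + logical_block
```

Note that a request can fail even when enough blocks are free in total, because they are split across extents; that is the fragmentation problem listed under the minuses.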

File Allocation Methods: Linked allocation

Files stored as a linked list of blocks
File header contains a pointer to the first and last file blocks
Pluses
  - Easy to create, grow & shrink files
  - No external fragmentation
Minuses
  - Impossible to do true random access
  - Reliability
      - Break one link in the chain and...

File Allocation Methods: Linked allocation with a File Allocation Table (FAT) (Win9x, OS/2)

Create a table with an entry for each block
  - Overlay the table with a linked list
  - Each entry serves as a link in the list
  - Each table entry in a file has a pointer to the next entry in that file (with a special EOF marker)
  - A 0 in the table entry → free block
Comparison with linked allocation
  - If the FAT is cached → better sequential and random access performance
      - How much memory is needed to cache the entire FAT? 400 GB disk, 4 KB/block → 100M entries in FAT → 400 MB
      - Solution approaches
          - Allocate larger clusters of storage space
          - Allocate different parts of the file near each other → better locality for FAT
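The table-as-linked-list idea can be sketched with a plain Python list standing in for the FAT. The marker values (`FREE = 0`, `EOF = -1`) and the helper names are assumptions for illustration and do not match the on-disk encoding of any real FAT variant. The 4-byte entry size in `fat_size_bytes` is likewise an assumption, chosen because it reproduces the 400 MB figure on the slide.

```python
FREE = 0   # a 0 in the table entry means the block is free
EOF = -1   # assumed end-of-file marker for this sketch


def file_blocks(fat, first_block):
    """Follow the chain from first_block and list every block of the file."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Find the n-th block of a file; needs n table lookups, no data reads."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("block is past the end of the file")
    return block


def fat_size_bytes(disk_bytes, block_bytes, entry_bytes=4):
    """Memory needed to cache the whole table, one entry per block."""
    return (disk_bytes // block_bytes) * entry_bytes
```

Random access still walks the chain, but when the table is cached the walk touches memory rather than the data blocks themselves, which is the improvement over plain linked allocation.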

File Allocation Methods: Direct allocation

File header points to each data block
Pluses
  - Easy to create, grow & shrink files
  - Little fragmentation
  - Supports direct access
Minuses
  - Inode is big or variable size
  - How to handle large files?

File Allocation Methods: Indexed allocation

Create a non-data block for each file called the index block
  - A list of pointers to file blocks
File header contains the index block
Pluses
  - Easy to create, grow & shrink files
  - Little fragmentation
  - Supports direct access
Minuses
  - Overhead of storing index when files are small
  - How to handle large files?

Indexed Allocation: Handling large files

  - Linked index blocks
  - Multilevel index blocks
[Figure: diagrams of linked and multilevel index blocks; not recoverable from the transcript]

Why bother with index blocks?
  A. Allows greater file size.
  B. Faster to create files.
  C. Simpler to grow files.
  D. Simpler to prepend and append to files.

Multi-level Indirection in Unix

File header contains 13 pointers
  - 10 pointers to data blocks; 11th pointer → indirect block; 12th pointer → doubly-indirect block; and 13th pointer → triply-indirect block
Implications
  - Upper limit on file size (~2 TB)
  - Blocks are allocated dynamically (allocate indirect blocks only for large files)
Features
  - Pros
      - Simple
      - Files can easily expand
      - Small files are cheap
  - Cons
      - Large files require many seeks to access indirect blocks
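To make the 10 direct + 3 indirect pointer scheme concrete, the sketch below works out which pointer a given file block falls under and which slot to follow at each indirection level. The function names and the parameter `ptrs_per_block` (how many block pointers fit in one indirect block) are hypothetical; the actual value depends on block and pointer sizes, which the slide does not state.

```python
def locate(logical_block, ptrs_per_block, n_direct=10):
    """Return (level, indices) for a file-relative block number.

    level 0 = direct pointer, 1 = single, 2 = double, 3 = triple indirect.
    indices gives the slot to follow at each step below the file header.
    """
    n = ptrs_per_block
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < n_direct:
        return 0, (logical_block,)
    rem = logical_block - n_direct
    if rem < n:
        return 1, (rem,)
    rem -= n
    if rem < n * n:
        return 2, (rem // n, rem % n)
    rem -= n * n
    if rem < n ** 3:
        return 3, (rem // (n * n), (rem // n) % n, rem % n)
    raise ValueError("beyond the maximum file size")


def max_file_blocks(ptrs_per_block, n_direct=10):
    """Largest number of data blocks one file header can reach."""
    n = ptrs_per_block
    return n_direct + n + n ** 2 + n ** 3
```

The level returned is also the number of extra blocks that must be read before the data block itself, which is the source of the seek cost listed under the cons.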

Indexed Allocation in UNIX: multilevel, indirection, index blocks

[Figure: an inode pointing directly to 10 data blocks, to a 1st-level indirection block covering n data blocks, to a 2nd-level indirection block covering n^2 data blocks, and to a 3rd-level indirection block covering n^3 data blocks]

How big is an inode?
  A. 1 byte
  B. 16 bytes
  C. 128 bytes
  D. 1 KB
  E. 16 KB

Allocate from a free list

Need a data block
  - Consult list of free data blocks
Need an inode
  - Consult a list of free inodes
Why do inodes have their own free list?
  A. Because they are fixed size
  B. Because they exist at fixed locations
  C. Because there are a fixed number of them

Free list representation

Represent the list of free blocks as a bit vector: 111111111111111001110101011101111...
  - If bit i = 0 then block i is free; if bit i = 1 then it is allocated
Simple to use, and the vector is compact: a 1 TB disk with 4 KB blocks needs 2^28 bits, or 32 MB
If free sectors are uniformly distributed across the disk, then the expected number of bits that must be scanned before finding a 0 is n/r, where n = total number of blocks on the disk and r = number of free blocks
If a disk is 90% full, then the average number of bits to be scanned is 10, independent of the size of the disk
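The bit-vector scan and the two pieces of arithmetic on this slide can be checked with a short sketch. The bitmap is modeled as a Python list of 0/1 values rather than packed bits, and the helper names are hypothetical.

```python
def first_free(bitmap):
    """Index of the first free block (bit 0), or None if the disk is full."""
    for i, bit in enumerate(bitmap):
        if bit == 0:
            return i
    return None


def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bit vector: one bit per block, eight bits per byte."""
    return (disk_bytes // block_bytes) // 8


def expected_scan(total_blocks, free_blocks):
    """Expected bits scanned before a 0 when free blocks are uniform: n/r."""
    return total_blocks / free_blocks
```

For a 1 TB disk with 4 KB blocks, `bitmap_bytes(2**40, 4096)` gives 2^25 bytes, which is the 32 MB quoted above; `expected_scan` depends only on the ratio n/r, which is why the 90%-full result of 10 does not change with disk size.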

Deleting a file is a lot of work

Data blocks back to free list
  - Coalescing free space
Indirect blocks back to free list
  - Expensive for large files, an ext3 problem
Inodes cleared (makes data blocks "dead")
Inode free list written
Directory updated
The order of updates matters!
  - Can put a block on the free list only after no inode points to it

Naming and Directories

Files are organized in directories
  - Directories are themselves files
  - Contain a <name, pointer to file header> table
Only the OS can modify a directory
  - Ensures integrity of the mapping
  - Application programs can read a directory (e.g., ls)
Directory operations:
  - List contents of a directory
  - Search (find a file)
      - Linear search
      - Binary search
      - Hash table
  - Create a file
  - Delete a file

Every directory has an inode
  A. True
  B. False

Given only the inode number (inumber), the OS can find the inode on disk
  A. True
  B. False

Directory Hierarchy and Traversal

Directories are often organized in a hierarchy
Directory traversal:
  - How do you find blocks of a file? Let's start at the bottom
      - Find file header (inode): it contains pointers to file blocks
      - To find file header (inode), we need its I-number
      - To find I-number, read the directory that contains the file
      - But wait, the directory itself is a file
      - Recursion!!
  - Example: Read file /A/B/C
      - C is a file
      - B/ is a directory that contains the I-number for file C
      - A/ is a directory that contains the I-number for file B
      - How do you find the I-number for A? / is a directory that contains the I-number for file A
      - What is the I-number for /? In Unix, it is 2

Directory Traversal (Cont'd.)

How many disk accesses are needed to access file /A/B/C?
  1. Read I-node for / (root) from a fixed location
  2. Read the first data block for root
  3. Read the I-node for A
  4. Read the first data block of A
  5. Read the I-node for B
  6. Read the first data block of B
  7. Read I-node for C
  8. Read the first data block of C
Optimization:
  - Maintain the notion of a current working directory (CWD)
  - Users can now specify relative file names
  - OS can cache the data blocks of CWD
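The eight accesses listed above can be reproduced with a toy model in which the "disk" is a dictionary from I-number to inode. Everything here is a simplifying assumption: the dictionary layout, the name `read_path`, the I-numbers other than the root's 2, and the premise that each directory fits in one data block and nothing is cached.

```python
def read_path(path, inodes, root_inumber=2):
    """Resolve an absolute path; return (I-number, number of disk reads)."""
    components = [c for c in path.split("/") if c]
    inumber = root_inumber
    reads = 1                      # I-node for / from its fixed location
    node = inodes[inumber]
    for name in components:
        if node["type"] != "dir":
            raise NotADirectoryError(name)
        reads += 1                 # first data block of this directory
        inumber = node["entries"][name]
        reads += 1                 # I-node of the next component
        node = inodes[inumber]
    reads += 1                     # first data block of the final file
    return inumber, reads


# Hypothetical contents: /A/B/C with made-up I-numbers 7, 9, and 12.
example_inodes = {
    2: {"type": "dir", "entries": {"A": 7}},
    7: {"type": "dir", "entries": {"B": 9}},
    9: {"type": "dir", "entries": {"C": 12}},
    12: {"type": "file"},
}
```

Calling `read_path("/A/B/C", example_inodes)` walks the same eight steps; caching the data blocks of the current working directory would remove the reads for every component above it.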

Naming and Directories

Once you have the file header, you can access all blocks within a file
  - How to find the file header? Inode number + layout.
Where are file headers stored on disk?
  - In early Unix:
      - Special reserved array of sectors
      - Files are referred to with an index into the array (I-node number)
      - Limitations: (1) header is not near data; (2) fixed size of array → fixed number of files on disk (determined at the time of formatting the disk)
  - Berkeley fast file system (FFS): distribute the file header array across cylinders.
  - Ext2 (Linux): put inodes in the block group header.
How do we find the I-node number for a file?
  - Solution: directories and name lookup
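"Inode number + layout" means the location of a file header can be computed rather than searched for. The sketch below assumes the early-Unix style layout of one contiguous inode array; the 128-byte inode size, 4 KB block size, starting I-number of 1, and the name `inode_location` are all illustrative assumptions, not values taken from any specific file system.

```python
def inode_location(inumber, table_start_block,
                   inode_size=128, block_size=4096, first_inumber=1):
    """Return (disk block, byte offset within that block) for an inode."""
    index = inumber - first_inumber
    if index < 0:
        raise ValueError("invalid I-number")
    inodes_per_block = block_size // inode_size
    block = table_start_block + index // inodes_per_block
    offset = (index % inodes_per_block) * inode_size
    return block, offset
```

Because the result comes from arithmetic alone, no disk access is needed to learn where the header lives; FFS and ext2 keep this property but compute the position relative to a cylinder or block group instead of a single global array.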

A corrupt directory can make a file system useless
  A. True
  B. False

Other Free List Representations

  - In-situ linked lists
  - Grouped lists
[Figure: diagrams of both representations; legend: next group block, allocated block, empty block]