程序代写案例-DSCI 551

  • February 17, 2021

File Systems DSCI 551 Wensheng Wu 1 Roadmap • Files and directories – CRUD operations via system calls • Implementing CRUD – Data structures, e.g., organization of blocks – Access methods: turn system calls into operations on data structures 2 Files and directories • File content stored in blocks on storage device – Has user defined name: hello.txt – & low-level name, e.g., inode number: 410689 • Files are organized into directories (folders) – each may have a list of files and/or subdirectories – That is, directories can be nested 3 Example 4 Root directory Empty directory Operations on files • Create – open(), write() • Read – open(), read(), lseek() • Update – write(), lseek() • Delete – unlink() 5 System calls: calls to functions in the API provided by OS Create • User interface via GUI or touch command in Linux • Implementation, e.g., via a C program with a system call: open() • int fd = open(“foo”, O_CREAT | O_WRONLY | O_TRUNC); – Open with flags indicating the specifics – O_CREAT: create a file – O_WRONLY: write only – O_TRUNC: remove existing contents if exits 6 |: Bitwise OR operator File descriptor • Note open() returns a file descriptor – Typically an integer – Reserved fds: stdin 0, stdout, 1, stderr 2 7 Read • read(fd, buffer, size) – Read from file “fd” number of bytes – And store them in buffer • Read starts from the current offset of fd – Initially 0 8 Write • write(fd, buffer, size) – Write to file fd number of bytes stored in buffer – Also start writing from the current offset 9 Random read and write • off_t lseek(int fd, off_t offset, int whence) – If whence is SEEK_SET, the offset is set to bytes from the beginning of file – If whence is SEEK_CUR, the offset is set to its current location plus bytes – If whence is SEEK_END, the offset is set to the size of the file plus bytes (typically offset is negative, e.g., -8 for 8 bytes from the end) • whence: from where 10 11 Copy a file “0” starts an octal number => permissions: 110 (owner) rw- 100 (group) r– 100 (others) r– Pointer to a character array File permission mode 12 rw-r–r– => 110 (owner permission) 100 (group) 100 (others) Resources for system calls • https://en.wikipedia.org/wiki/System_call • open: https://en.wikipedia.org/wiki/Open_(system_call ) • read: https://en.wikipedia.org/wiki/Read_(system_call) • write: https://en.wikipedia.org/wiki/Write_(system_call ) • close: https://en.wikipedia.org/wiki/Close_(system_call) 13 Resources for system calls • man –S 2 read – Find it in the Section 2 of the manual 14 Install gcc on EC2 • sudo yum install gcc • Usage: – gcc -o copy2 copy2.c 15 File and directory • When creating a file – Bookkeeping data structure (inode) created: recording size of file, location of its blocks, etc. – Linking a human-readable name to the file – Putting the link in a directory 16 Info about file (stored in inode) struct stat { dev_t st_dev; /* ID of device containing file */ ino_t st_ino; /* inode number */ mode_t st_mode; /* protection */ nlink_t st_nlink; /* number of (hard) links */ uid_t st_uid; /* user ID of owner */ gid_t st_gid; /* group ID of owner */ dev_t st_rdev; /* device ID (if special device file, e.g., /etc/tty) */ off_t st_size; /* total size, in bytes */ blksize_t st_blksize; /* blocksize for filesystem I/O */ blkcnt_t st_blocks; /* number of blocks allocated */ time_t st_atime; /* last time file content was accessed */ time_t st_mtime; /* last time file content was modified */ time_t st_ctime; /* last time inode was changed */ }; Execute “man -S 2 stat” for more details… 17 inode • Stores metadata/attributes about the file • Also stores locations of blocks holding the content of the file 18 Example • a.txt abc def abc def abc def 19 Block size # of blocks allocated Device id Inode # User id Group id # of (hard) links Access permission noatime 20 Hard links 21 Symbolic links 22 Working with directories • Create: mkdir() system call – Used to implement command, e.g., mkdir xyz • Read: opendir(), readdir(), closedir() – ls xyz • Delete: rmdir() 23 Roadmap • Files and directories – CRUD operations • Implementation – Data structures: how to organize the blocks – Access methods: turn system calls into operations on data structures 24 Organization of blocks • Array-based – Disk consists of a list of blocks – We will assume this • Tree-based, e.g., SGI XFS – Blocks are organized into variable-length extents – Use B+-tree to quickly find free extents 25 Blocks • Consider a disk with 64 blocks – 4KB/block – 512B/sector (we assume this in this lecture) • So there are 212/29 = 23 = 8 sectors/block – Capacity of disk = 64 * 4KB = 256KB 26 Data region • 56 blocks (#8-63) – used to store data/content of files – but see later: some blocks may store pointers 27 Metadata • For each file, file system records its metadata – Information in the “stat” struct – Location of blocks that stores the content of file 28 Metadata of files stored in inodes • Index nodes • Stored in blocks #3 — #7 (i.e., 5 blocks) • Together they are called the “inode” table 29 How many inodes are there? • 256 bytes/inode • 5 blocks, 4KB/block => 16 inodes/block (4K/256 = 212/28 ) => 5 blocks, 5 * 16 = 80 inodes => File system can store at most 80 files 30 Free space management using bitmaps • Bitmap: a vector of bits – 0 for free (inode/block), 1 for in-use • Inode bitmap (imap) – keep track of which inodes in the inode table are available • Data bitmap (dmap) – Keep track of which blocks in data region are available 31 Bitmaps • Each bitmap is stored in a block – Block “i”: keep track of 80 inodes (could track 32K) – Block “d”: keep track of the 56 data blocks 32 Superblock • Track where i/d blocks and inode table are – E.g., inode table starts at block 3; there are 80 inodes and 56 data blocks, etc. • Indicate type of FS & inumber of its root dir • Will be read first when file system is mounted 33 inumber • Each inode is identified by a number – Low-level number of file name • Can figure out location of inode from inumber 34 inumber => location • inumber = 32 => address: offset in bytes from the beginning => which sector? 35 inumber => location of inode • Address: 12K + 32 * 256 = 20K • Sector #: 20K/512 = 40 – more generally – ہ ۂ (inodeStartAddress + inumber ∗ inode size)/sector size 36 inode => location of data blocks • A number of direct pointers – E.g., 8 pointers, each points to a data block – Enough for 8*4K = 32K size of file • Also has a slot for indirect pointer – Pointing to a data block storing direct pointers – Assume 4 bytes for block address (e.g., represented in CHS), so 1024 pointers/block – Now file can have (8 + 1024) blocks or 4,128KB 37 Multi-level index • Pointers may be organized into multiple levels – Indirect pointer (as in previous slide) • Inode (pointer1, pointer2, …, indirect pointer) • Indirect pointer -> a block of direct pointers – Double indirect pointers • Inode (pointer1, pointer2, …, indirect pointer) • Indirect pointer -> a block of indirect pointers instead -> each points to a block of direct pointers – Triple indirect pointers • Indirect pointer -> a block of indirect pointers -> each points to a block of indirect pointers -> each points to a block of direct pointers 38 Double Indirect Pointers 39 inode Block of indirect pointers Indirect pointer Indirect pointer Block of direct pointers direct pointers Advantages of multi-level index • Grow to more levels as needed • Direct pointers handle most of the cases – Many files are small 40 Directory organization • Directory itself stored as a file • For each file in the directory, it stores: – name, inumber, record length, string length 41 Actual length Record length vs string length • String length = # of characters in file name + 1 (for \0: end of string) • Record length >= string length – Due to entry reuse 42 Reusing directory entries • If file is deleted (using rm command) or a name is unlinked (using unlink command) – File is finally deleted when its last (hard) link is removed • Then inumber in its directory entry set to 0 (reserved for empty entry) – So we know it can be reused 43 Storing a directory • Also as a file with its own inode + data block • inode: – file type: directory (instead of regular file) – pointer to block(s) in data region storing directory entries 44 Roadmap • Files and directories – CRUD operations • Implementation – Data structures: how to organize blocks, e.g., into array/tree – Access methods: turn system calls to operations on data structures 45 Open for read • fd = open(“/foo/bar”, O_RDONLY) 46 Open for read • fd = open(“/foo/bar”, O_RDONLY) – Need to locate inode of the file “/foo/bar” – Assume inumber of root, say 2, is known (e.g., when the file system is mounted) 47 Open for read 1. Read inode and content of / (2 reads) – Look for “foo” in / -> foo’s inumber 2. Read inode and content of /foo (2 reads) – Look for “bar” in /foo -> bar’s inumber 3. Read inode of /foo/bar (1 read) – Permission check + allocate file descriptor 48 Cost of open() • Need 5 reads of inode/data block 49 File-open table per process File descriptor File name Inumber Position offset … 3 /foo/bar 32382 0 4 /foo/more 48482 512 … 50 Reading the file • read(fd, buffer, size) – Note fd is maintained in per-process open-file table – Table translates fd -> inumber of file 51 Reading the file • read(fd, buffer, size) 1. Consult bar’s inode to locate a block 2. Read the block 3. Update inode with newest file access time 4. Update open-file table with new offset 5. Repeat above steps until done (with reading data of given size) 52 Cost for reading a block • 3 I/O’s: – read inode, read data block, write inode 53 2 Open for write • int fd = open(“/foo/bar”, O_WRONLY) – Or int fd = create((“/foo/bar”) – Assume bar is a new file under foo – (note the difference from reading chapter!) 54 Open for write • int fd = open(“/foo/bar”, O_WRONLY) 1. Read ‘/’ inode & content – obtain foo’s inumber 2. Read ‘/foo’ inode & content – check if bar exists 55 Open for write 3. Read imap, to find a free inode for bar 4. Update imap, setting 1 for allocated inode 5. Write bar’s inode 56 Open for write 6. Update foo’s content block – Adding an entry for bar 7. Update foo’s inode – Update its modification time 57 Cost for “open for write” • int fd = open(“/foo/bar”, O_WRONLY) • Need 9 I/O’s 58 data bitmap inode bitmap root inode foo inode bar inode root data foo data bar data[0] bar data[1] bar data[2] read read read read read write write write write create() Writing the file: /foo/bar 1. Read inode of bar (by first looking up its inumber in the file-open table) 2. Allocate new data block – Read and write bmap 3. Write to data block of bar 4. Update bar inode – new modification time, add pointer to block 59 Cost of writing /foo/bar • 5 I/O’s for write a block 60 data bitmap inode bitmap root inode foo inode bar inode root data foo data bar data[0] bar data[1] bar data[2] read read read read read write write write write read read write write write create() write() Caching for read • First read may be slow – But subsequent ones will speed up • Good idea to cache popular blocks – e.g., determined via LRU strategy 61 Buffering for delayed write • Improve write performance via: – Batching (e.g., two updates to the same imap) – Scheduling (reordering for better performance) – Avoiding writes (if file created, then quickly deleted) • Problem: update may be lost when system crashes 62 Example file systems • NTFS – New technology file system, Microsoft proprietary • FAT – File allocation table – FAT 16, 32, … – 32 bits = # of sectors a file can occupy 512B/sector => 2TB limit on file size 4KB/sector => 16TB limit • Ext4 – fourth extended file system, common in Linux 63 欢迎咨询51作业君