Last Updated: 2023-04-18 Tue 09:35

CSCI 2021 Lab13: Memory Mapped Files

CODE DISTRIBUTION: lab13-code.zip

CHANGELOG: Empty

1 Rationale

Memory Mapping files via the mmap() function provides a powerful way to treat the contents of a file as an in-memory data structure. This is particularly useful for binary data comprised of structs. The techniques to handle such binary files are explored in this lab.

Grading Policy

Credit for this Lab is earned by completing the exercises here and submitting a Zip of the work to Gradescope. Students are responsible to check that the results produced locally via make test are reflected on Gradescope after submitting their completed Zip. Successful completion earns 1 Engagement Point.

Lab Exercises are open resource/open collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.

See the full policies in the course syllabus.

2 Codepack

The codepack for this lab contains the following files.

File   Description
QUESTIONS.txt EDIT Questions to answer: fill in the multiple choice selections in this file.
print_department.c Examine Code to analyze to complete the QUIZ and assist with the CODE
email_lookup.c EDIT C code to complete for the CODE portion by filling in ??? items
make_dept_directory.c Optional Code that generates the binary data structure
Makefile Build Enables make test and make zip
QUESTIONS.txt.bk Backup Backup copy of the original file to help revert if needed
QUESTIONS.md5 Testing Checksum for answers in questions file
test_quiz_filter Testing Filter to extract answers from Questions file, used in testing
test_lab13.org Testing Tests for this lab
testy Testing Test running scripts

3 QUESTIONS.txt File Contents

Below are the contents of the QUESTIONS.txt file for the lab. Follow the instructions in it to complete the QUIZ and CODE questions for the lab.

                           __________________

                            LAB 13 QUESTIONS
                           __________________





Lab Instructions
================

  Follow the instructions below to experiment with topics related to
  this lab.
  - For sections marked QUIZ, fill in an (X) for the appropriate
    response in this file. Use the command `make test-quiz' to see if
    all of your answers are correct.
  - For sections marked CODE, complete the code indicated. Use the
    command `make test-code' to check if your code is complete.
  - DO NOT CHANGE any parts of this file except the QUIZ sections as it
    may interfere with the tests otherwise.
  - If your `QUESTIONS.txt' file seems corrupted, restore it by copying
    over the `QUESTIONS.txt.bk' backup file.
  - When you complete the exercises, check your answers with `make test'
    and if all is well, create a zip file with `make zip' and upload it
    to Gradescope. Ensure that the Autograder there reflects your local
    results.
  - IF YOU WORK IN A GROUP only one member needs to submit and then add
    the names of their group.


QUIZ: print_department.c
========================

  The print_department.c code is provided as an example of how to use
  the `mmap()' function to memory map a file with binary data in
  it. Review this code as many of the techniques demonstrated in it will
  be used to complete the CODE portion of this lab.

  At the top of `print_department.c' there are a series of structs
  defined and associated documentation of them and an example of a
  binary file which contains these structs. Which of the following best
  conveys the arrangement of these structs in the binary file described.
  - ( ) Files start with an array of file_header_t structs followed by
    one dept_offest_t struct and then an array of contact_t structs
  - ( ) Files start with one file_header_t struct, are followed by an
    array of dept_offset_t structs, and then an array of contact_t
    structs
  - ( ) Files start with one dept_offset_t structs, then an array of
    file_header_t struts, then an array of contact_t structs
  - ( ) Files start with an array of contact_t structs and then have an
    array of dept_offset_t structs without any file_header_t

  Examine the call to `mmap()' in the `main()' function of
  `print_department.c'.  One of its arguments is `PROT_READ'. Do an
  internet search to determine what this argument means.
  - ( ) PROT_READ means all permissions to read/write/execute the memory
    are granted
  - ( ) PROT_READ means the memory is protected from reading so cannot
    be read
  - ( ) PROT_READ means that the memory is protected so cannot be
    accessed except with supervisor permission
  - ( ) PROT_READ means that the mapped memory may be read / accessed

  Near beginning of `main()' in `print_department.c', code like the
  following appears.
  ,----
  | file_header_t *header = (file_header_t *) file_bytes;
  `----

  Which the below best describes what this line is doing?
  - ( ) It sets a pointer to the same location as file_bytes but
    indicates that the data there should be treated as a file_header_t.
  - ( ) It does pointer arithmetic between file_bytes (pointer to
    beginning of the file) and offset to find the location where
    name/emails for an individual department are located in the file.
  - ( ) It changes the memory mapped file to be at a different location
    in memory and refer to a different type; subsequently it will be
    treated as file_header_t data.
  - ( ) It alters the header of the binary file by assigning new data to
    the mapped location in the file.

  NOTE: Lab demoers may use the provided `diagram.txt' file to
  illustrate some of the pointers that are being created. Students may
  find it useful to examine this as well.

  One part of the program print_department.c uses a line that looks like
  the following
  ,----
  | contact_t *dept_contacts = (contact_t *) (file_bytes + offset);
  `----

  Which the below best describes what this line is doing?
  - ( ) It does pointer arithmetic between file_bytes (pointer to
    beginning of the file) and offset to find the location where
    name/emails for an individual department are located in the file.
  - ( ) It does pointer arithmetic to convert character data (a string)
    to be reformatted as a contact_t; this converts binary data to
    printable data.
  - ( ) It calculates how much of the file remains by summing the number
    of bytes in the file and the current offset; this is then
    dereferenced to get a struct pointer.
  - ( ) It changes the memory mapped file to be at a different location
    in memory and refer to a different type; subsequently it will be
    treated as contact_t data.


  Compile and run the `print_department' program as shown below.
  ,----
  | >> make print_department
  | gcc -Wall -Werror -g -Og -c print_department.c
  | gcc -Wall -Werror -g -Og -o print_department print_department.o
  | 
  | >> ./print_department
  | usage: ./print_department ...
  | ...
  | 
  | >> ./print_department cse_depts.dat ???
  | ...
  `----
  Fill in the `...' with appropriate text. Which of the following best
  describes the usage and effect of the `print_department' program.
  - ( ) It accepts a binary file and prints out all contacts from all
    departments in the binary file.
  - ( ) It accepts binary file and a department name and prints a short
    description of that department
  - ( ) It accepts a binary file to process and the name of a department
    like CS or EE and prints out all of the contacts in that department
    only.
  - ( ) It allows one to search for a name in the binary file and prints
    out contact information for the person with that name.


CODE: Complete `email_lookup.c'
===============================

  Once you feel confident about the flow of code in
  `print_department.c', work to complete the code in the
  `email_lookup.c'. This file which is intended to accept an email
  address on the command line and lookup the Department / Name of the
  person to which it belongs.  Much of the code is provided but some
  lines are marked with ??? and need code filled in. Many of the
  techniques required have analogs in `print_department.c' so adapt the
  code provided there as appropriate.

  When complete, the `email_lookup' program will run as follows.
  ,----
  | >> ./email_lookup cse_depts.dat follstad@umn.edu
  | Found follstad@umn.edu: IT Department --> Carl Follstad
  | 
  | >> ./email_lookup cse_depts.dat sjguy@umn.edu
  | Found sjguy@umn.edu: CS Department --> Stephen Guy
  | 
  | >> ./email_lookup cse_depts.dat heimdahl@umn.edu
  | Found heimdahl@umn.edu: CS Department --> Mats Heimdahl
  | 
  | >> ./email_lookup cse_depts.dat linus@umn.edu
  | Email 'linus@umn.edu' not found
  `----

4 Submission

Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed lab zip to Gradescope.


Author: Chris Kauffman (kauffman@umn.edu)
Date: 2023-04-18 Tue 09:35