HW 2: The Social Setwork

The goal of this assignment is to get some practice using the set data structure.

Setup

Download the following stencil files:

On your desktop, create a cs0112 folder. This will hold your future assignments, labs, and projects. In the cs0112 folder, create a hw2 folder, and place the setwork.py and setwork_test.py files there.

To open the hw2 folder in VSCode, start by opening VSCode and then opening the Explorer tab. The Explorer can be opened with the top-most button on the left sidebar which looks like this:

Image of the explorer icon.

Once you’ve opened the explorer, your VSCode window should look something like this:

Image of a VSCode window. The Explorer tab is open, with a button that says Open Folder.

Click on the button which says “Open a Folder” and find the hw2 folder. Open that folder, and your VSCode window will look something like this:

Image of a VSCode window. The Explorer tab is open, with a list of files: hw2_revised.md, setwork_test.py, and setwork.py.

Double-click the file names to open them in new tabs. You can also click the Explorer icon again to close that sidebar.

Since you opened the hw2 folder with VSCode, the VSCode terminal will automatically be in the hw2 directory. This means that you can use the terminal to run pytest without having to navigate to this directory.

Helpful Things

Documentation

Staff Assistance

Your friendly TAs and Professor are here to help with this assignment! You can find our schedule for office hours here.

The Assignment

Part 1:

Image of social media as cameras.

The Premise: Social Media Privacy and Tracking

In 2017, an article in The Economist reported that the world’s most valuable resource was no longer oil but data. For technology and marketing companies, social media is one of the most efficient ways to collect massive amounts of data. Social media sites such as Instagram, Facebook, Reddit, YouTube, Twitter, and more track your likes and comments, infer your interests, and even store your messages to provide optimized marketing campaigns and custom ads.

2017 was before some of you were in high school. This is not a new situation.

In this assignment, you’re going to reflect on the power that you, as programmers, have in these privacy situations, as well as your thoughts on social media monitoring.

Before continuing on with the assignment, make sure you read and complete the tasks below.

Note: you should not need a subscription to access these articles. You should be able to access them by logging in with your Brown Google Account.

  1. Read these two articles:
  2. Next, download your Instagram data following the instructions below. If you do not have Instagram, we will provide a file that you can work with for this homework. Make sure you start this early, as getting the link to your data may take up to a couple of hours.

How to Download Your Instagram Data On Mobile (Recommended; reduces the time it takes)

  1. Tap the profile icon or your profile picture in the bottom right to go to your profile.
  2. Tap the hamburger menu in the top right, then tap Your activity.
  3. Below Information you shared with Instagram, tap Download your information.
  4. Tap Request Download.
  5. Tap Select Types of information
  6. Choose the following:
    • Your topics
    • Information About You
    • Ads and Topics
    • Advertising
    • Comments
    • Likes
  7. Tap Next
  8. Change the format to JSON
  9. Submit request
  10. You’ll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.

How to Download Your Instagram Data On Desktop

  1. Click the hamburger menu in the bottom left, then click Your Activity.
  2. Click Download your information.
  3. Enter the email address where you’d like to receive a link to your data.
  4. Click JSON as the format you’d like to receive your data in, then click Next.
  5. Enter your Instagram account password and click Request download. You’ll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.

See your Google Ad Assumptions

Make sure you use a .edu email.

  1. Click the “Your Google Account” button in the top right corner and log in
  2. On the left-hand sidebar, click “Data and Privacy”
  3. Scroll down, and in the “Things you’ve done and places you’ve been” section, under “Personalized ads” click “My Ad Center”

Part 2: Setwork

Tim’s recent talk on courses at Brown has been on the Top 10 talks list for weeks! With Tim’s Talks gaining fame, you’re assigned to create a fan networking site called TimNet. Due to its popularity, companies want to advertise their products on it!

The website’s data is organized as a dictionary of users, where the keys are usernames (strings) and the values are sets of interests (also strings). For example, the data might look like this:

{
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
    "ben": {"chocolate", "pretzels", "chocolate-covered pretzels"}
}

As a programmer for this website, you’ve been asked to implement and test several functions that either query or modify the user data in this format in preparation for advertising.

Each function you’re implementing takes a users dictionary (formatted as above) as its first argument.
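As a quick illustration of working with data in this shape, the sketch below looks up one user's interests and checks membership. The variable names are made up for this example and are not part of the stencil.

```python
# Example data in the format described above.
users = {
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
    "ben": {"chocolate", "pretzels", "chocolate-covered pretzels"},
}

# Look up one user's set of interests by username.
tim_interests = users["tim"]

# "in" works on the dictionary (checks keys) and on each set (checks elements).
has_ashley = "ashley" in users              # "ashley" is a key
tim_likes_piano = "piano" in users["tim"]   # "piano" is not in tim's set

# Iterating over a dictionary yields its keys; sorted() gives a stable order.
usernames = sorted(users)
```

Keep in mind that sets are unordered, so printing a set may show its elements in a different order from the one you typed.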

Implement the functions in setwork.py. You should test all of your functions in setwork_test.py. Both of these stencils can be downloaded in the Setup section of this handout.

Make sure to test edge cases! Writing a good set of tests for each function before you start implementing it is always a good idea–that way, you can ensure that you understand what your function is supposed to do.

Using your data:

When implementing these functions, you are going to be using them on your very own Instagram data! If you don’t have Instagram, we will provide you with a file to work with.

*For those with Instagram:*

Go to the your_topics folder and find the your_topics.json file. Drag the file into the directory you will be working in. Do not edit any of the json package code or the name of the file.

Next, where it says users = load_json_to_dict(data, "Your Favorite Artists Name Here"), write your favorite artist’s name. Run the setwork.py file and verify that a dictionary with your favorite artist’s name and your interests is printed in the terminal.

*For those who are using the provided file:*

If you don’t have Instagram, use the attached your_topics.json file. Drag the file into the directory you will be working in. We strongly recommend using your own data should you have Instagram.

Answer all the questions denoted with ‘Q’ in your README file:

Look at the dictionary with your interests printed in your terminal. Based on your activity, these sites have inferred that you are interested in these topics.

Q: Do your inferred topics accurately reflect your real interests?

If you don’t have Instagram, go to your Google “My Ad Center” linked at the top of this handout. Under “Manage Privacy”, look at the guesses Google made about you in “Your Google Account info” and “Categories used to show you ads”.

Q (only if you don’t have Instagram): Were Google’s guesses about you correct?

Q: Who do you think benefits most from interest monitoring–users, companies, both, neither?

Functions that modify the dictionary

Now, you will write functions and test them directly on your personal user dictionary.

add_user

A new user wants to join you on TimNet, and you need to add them to your dictionary.
The add_user function should add a user with the given name to the dictionary with no interests. If there is already a user with that name, it should not modify the dictionary.

Note that in Python, set() is the empty set (whereas {} is the empty dictionary).
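The difference between the two empty literals is easy to check directly. This is a small illustration only, with made-up variable names:

```python
interests = set()   # an empty set
lookup = {}         # an empty dictionary, NOT an empty set

# Record the type names so the difference is visible.
kinds = (type(interests).__name__, type(lookup).__name__)

# Sets support .add(); dictionaries have no such method.
interests.add("logic")
dict_has_add = hasattr(lookup, "add")
```

After running this, kinds holds the names "set" and "dict", and interests contains the single element "logic".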

add_interest

Now, in order to advertise on TimNet effectively, we need to track what users are interested in. The add_interest function should add the provided interest for the given user. If the user does not exist, the function should add a user with that name and then add the provided interest.

copy_interests

A friend of yours liked a couple of the same posts as you. Now third-party advertisers on TimNet want to add all your interests to your friend’s in hopes of extending their reach. Implement a copy_interests function to achieve this.

The copy_interests function should add all of name_from’s interests to the interests of name_to. If a user named name_to does not exist, it should be created. If a user named name_from does not exist, pretend it’s a user with no interests (i.e., don’t modify name_to’s interests).
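Two built-in features are relevant to this kind of merging: set union, and dictionary lookups with a fallback value. The sketch below shows them on toy data (the names a, b, and toy are made up for illustration); it is not an implementation of copy_interests.

```python
a = {"reading", "logic"}
b = {"logic", "cooking"}

# Union builds a NEW set; a and b are left unchanged.
combined = a | b

# update() changes the set in place (a |= b is equivalent).
a.update(b)

# dict.get returns a fallback when the key is missing,
# without adding the key to the dictionary.
toy = {"x": {"one"}}
missing = toy.get("y", set())
```

Whether you build a new set or update one in place matters here, because the functions in this section are expected to modify the dictionary they are given.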

Q: What are some issues that could arise from copying all the interests from one user to another user? Should users have explicit consent when interests are added to their account, or are the implications minimal?

Functions that query the dictionary

interest_exists

Before choosing to advertise on TimNet, third-party companies want to see if any users are interested in a given topic. Implement a function that will check if any user is interested in a given interest.

The interest_exists function should return True if any user is interested in interest, and False otherwise.
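Python's built-in any() is one way to ask "is this true for at least one item?". The sketch below applies it to a toy list of sets (groups is a made-up name); adapting the idea to the users dictionary is left to you.

```python
groups = [{"red", "blue"}, {"green"}, set()]

# any() stops at the first True it encounters.
found_green = any("green" in group for group in groups)
found_pink = any("pink" in group for group in groups)

# any() over an empty collection is False, a useful edge case to test.
found_in_nothing = any("green" in group for group in [])
```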

interests_match

When seeing that an interest does in fact exist within TimNet’s users, third-party companies now want to see which users share common interests; implement a function to achieve this goal.

The interests_match function should return a set of users (names) who share at least n interests with the user named name. If user name is not present, it should return the empty set. If n is greater than the number of interests for a specified name, an empty set should be returned.

Note that every user will share all their interests with themselves.
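Counting shared elements is a natural fit for set intersection. The following is a small illustration on two toy sets, not an implementation of interests_match:

```python
a = {"reading", "cooking", "logic"}
b = {"cooking", "logic", "piano"}

# Intersection keeps only the elements present in both sets.
shared = a & b
shared_count = len(shared)

# A set intersected with itself is the same set, which is why
# every user shares all of their interests with themselves.
self_shared = a & a

# Intersecting with the empty set always gives the empty set.
none_shared = a & set()
```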

Q: What are some positive and negative effects that could arise from third-party companies having access to user interest data and the interest_exists and interests_match functions? Think about the Facebook Cambridge Analytica article you read at the beginning of this assignment.

remove_user

Oh no! You now have a user who wants all their tracked data removed from TimNet completely, and now you need to implement a remove_user function. The remove_user function should remove the given name from the user dictionary. If said name does not exist, make no modifications to the data structure.
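When removing keys from a dictionary, it helps to know how Python behaves if the key is absent. The sketch below uses a toy dictionary (toy is a made-up name) to contrast pop with a default against del:

```python
toy = {"x": 1, "y": 2}

# pop with a default removes the key if present...
toy.pop("x", None)
# ...and quietly does nothing if the key is absent.
toy.pop("zzz", None)

# del on a missing key raises KeyError instead.
try:
    del toy["zzz"]
    raised = False
except KeyError:
    raised = True
```

Checking membership with "in" before deleting is another reasonable option.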

Q: What is another function you could implement or edit to protect user privacy and rights on social media? There is no need to write code for this question.

Further exploration

Q: In this assignment, you were programming functions that would theoretically collect user data for targeted media. Given this context, to what extent do you believe programmers have a moral obligation to safeguard user rights? Conversely, should their primary allegiance lie with protecting their company’s interests?

Look more into your data folders–there are lots of interesting folders to look into! We have some recommended folders for you if you don’t know where to start.

Recommended Instagram folders to explore: ads_information, ads_and_topics, comments, likes, login_and_account_creation, messages, personal_information, recent_searches

Google Ads: Go to your “Data and Privacy” center in your Google account settings and scroll down to “Third-party apps and services.”

Q: Reflect upon your findings below! What surprised you, and what did not, when exploring this vast data collection? To what extent do you believe companies should have the right to track user data on social media platforms, and what are your thoughts on users’ rights to full and complete privacy on these sites?

Extra content:

General tips

The set documentation will be useful–some of the problems can be solved in fewer lines of code with a judicious use of the operations described!

Note that sets do not contain duplicates, i.e., {"A", "B"} and {"A", "B", "A", "A"} are the same set. How can you use this property while testing?
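To see this property in action, the sketch below compares a few set literals. The test function name is made up for illustration; pytest collects any function whose name starts with test_.

```python
with_duplicates = {"A", "B", "A", "A"}
without_duplicates = {"A", "B"}

# Equality between sets ignores both duplicates and ordering.
duplicates_ignored = with_duplicates == without_duplicates
order_ignored = {"A", "B"} == {"B", "A"}
size = len(with_duplicates)


def test_set_literals_compare_equal():
    # A pytest-style test: plain assert statements, no return value.
    assert {"A", "B"} == {"B", "A", "A"}
    assert {"A", "B"} != {"A"}
```

This means an expected value in a test can be written as a set literal in any order and still compare equal to the actual result.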

Testing and Running your Code

Make sure to test individual functions in your setwork_test.py! We encourage you to test using the data from your_topics.json, but this is not required.

If you opened your hw2 directory through VSCode using the instructions in the Setup section of this handout, then your VSCode terminal will already be in the correct directory.

To run setwork.py, you can use VSCode’s run button or type python3 setwork.py in the terminal. To use pytest to run your tests, run pytest setwork_test.py in the terminal. Alternatively, if that command doesn’t work, try python3 -m pytest setwork_test.py

To open the Terminal in VSCode, click the button which looks something like this:

Image of a button from the sidebar on the bottom of the VSCode window. This button has an icon saying how many warnings and how many errors there are, and opens a sidebar containing the terminal.

This will open a sidebar on the bottom of your screen which looks like this:

Image of a sidebar from a VSCode window. It has several tabs, called problems, output, terminal, and debug console.

Click on the Terminal tab to open your terminal.

README

In your README.txt, include answers to the following questions:

Handin

You may submit as many times as you want. Only your latest submission will be graded. This means that if you submit after the deadline, you will be using a late day – so do NOT submit after the deadline unless you plan on using late days.

Because all we need are the above files, you do not need to submit the whole project folder. As long as the files you create have the same name as what needs to be submitted, you’re good to go!

If you are using late days, make sure to make a note of that in your README. Remember, you may only use a maximum of 3 late days per assignment. If the assignment is late (and you do NOT have any more late days) no credit will be given.

Please don’t put your name anywhere in any of the handin files–we grade assignments anonymously!

Hand in your work on Gradescope. Use the Gradescope account associated with your anonymous ID which you set up during HW0. The guide for setting that up and enrolling in Gradescope can be found in the Setup guide here.