HW 2: The Social Setwork

Due: Tue, Sep 17 2024 at 9:00 PM EST

Released: Wed, Sep 11 2024

The goal of this assignment is to get some practice using the set data structure.

Setup

Download the following stencil files:

Items labelled as Question should be answered in the README file.

The Assignment

Part 1: Social Media Privacy and Tracking

Image of social media as cameras.

In 2017, an article in The Economist reported that the world’s most valuable resource was no longer oil but data. For technology and marketing companies, social media is one of the most efficient ways to gather massive amounts of data. Social media sites such as Instagram, Facebook, Reddit, YouTube, Twitter, and more track your likes and comments, infer your interests, and even store your messages to provide optimized marketing campaigns and custom ads.

2017 was before some of you were in high school. This is not a new situation.

In this assignment, you’re going to reflect on the power that you, as programmers, have in these privacy situations, as well as your thoughts on social media monitoring.

Note: You should not need to have a subscription to access these articles. You should be able to access these articles by logging in with your Brown Google Account.

Task 1: Read these two articles:
What You Don’t Know About How Facebook Uses Your Data
Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens

Part 2: Download Your Data

Task 2: Download your Instagram data by following the instructions below. If you do not have Instagram, we will provide a file that you can work with for this homework. Make sure you start this early, as getting the link to your data may take up to a couple of hours.

How to Download Your Instagram Data on Mobile (Recommended; reduces the time it takes)
  1. Tap your profile picture in the bottom right to go to your profile.
  2. Tap the hamburger menu in the top right, then tap Your activity.
  3. Below Information you shared with Instagram, tap Download your information.
  4. Tap Request Download.
  5. Tap Select Types of information
  6. Choose the following:
    • Your topics
    • Information About You
    • Ads and Topics
    • Advertising
    • Comments
    • Likes
  7. Tap Next
  8. Change the format to JSON
  9. Submit request
  10. You'll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.

Official Instagram Mobile Download Instructions

How to Download Your Instagram Data On Desktop
  1. Click the hamburger menu in the bottom left, then click Your Activity.
  2. Click Download your information.
  3. Enter the email address where you'd like to receive a link to your data.
  4. Click JSON as the format you'd like to receive your data in, then click Next.
  5. Enter your Instagram account password and click Request download.
  6. You'll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.

Official Instagram Desktop Download Instructions

See your Google Ad Assumptions

Make sure you use an .edu email

  1. Click the "Your Google Account" button in the top right corner and log in
  2. On the left-hand sidebar, click "Data and Privacy"
  3. Scroll down, and in the "Things you've done and places you've been" section, under "Personalized ads" click "My Ad Center"

Google Ad Assumptions Page

Part 3: Implementing a Social Network

Tim’s recent talk on courses at Brown has been on the Top 10 talks list for weeks! With Tim’s Talks gaining fame, you’re assigned to create a fan networking site called TimNet. Due to its popularity, companies want to advertise their products on it!

The website’s data is organized as a dictionary of users, where the keys are usernames (strings) and the values are sets of interests (also strings). For example, the data might look like this:

{
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
    "ben": {"chocolate", "pretzels", "chocolate-covered pretzels"}
}

As a programmer for this website, you’ve been asked to implement and test several functions that either query or modify the user data in this format in preparation for advertising. Each function you’re implementing takes a users dictionary (formatted as above) as its first argument.
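
To make the format concrete, here is a small sketch that builds the example data above and pokes at it with built-in Python. The variable name users follows the handout; the other variable names are just for illustration and are not part of the stencil.

```python
# Example data in the handout's format: usernames map to sets of interests.
users = {
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
    "ben": {"chocolate", "pretzels", "chocolate-covered pretzels"},
}

# Looking up a user gives back a set, so membership tests are direct.
tim_likes_logic = "logic" in users["tim"]

# Sets ignore duplicates: adding an interest that is already present
# leaves the set unchanged.
users["tim"].add("logic")
tim_interest_count = len(users["tim"])

print(tim_likes_logic)     # True
print(tim_interest_count)  # 3
```

Because the values are sets rather than lists, you never need to check for duplicates yourself before adding an interest.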

Part 4: Using Your Data

When implementing these functions, you are going to be using them on your very own Instagram data! If you don’t have Instagram, we will provide you with a file to work with.

For those with Instagram: Go to the your_topics folder and find the your_topics.json file. Drag the file into the directory you will be working in. Do NOT edit any of the json package code or the name of the file. Next, write your favorite artist’s name here:
users = load_json_to_dict(data, "Your Favorite Artists Name Here")
Run the setwork.py file and verify that a dictionary with your favorite artist’s name and your interests is printed in the terminal.
For those who are using the provided file: If you don’t have Instagram, use the attached your_topics.json file. Drag the file into the directory you will be working in. We strongly recommend using your own data if you have Instagram.

Task 3: After making the changes, run the setwork.py file and verify that a dictionary with your favorite artist’s name and your interests is printed in the terminal.

Look at the dictionary of your interests printed in your terminal. Based on your activity, these sites have inferred that you are interested in these topics.

Question: Do your inferred topics accurately reflect your real interests?
If you don’t have Instagram, go to your Google “My Ad Center” linked at the top of this handout. Under “Manage Privacy”, look at the guesses Google made about you in “Your Google Account info” and “Categories used to show you ads”.

Question (only if you don’t have Instagram): Were Google’s guesses about you correct?

Question: Who do you think benefits most from interest monitoring: users, companies, both, or neither?

Part 5: Functions that Modify the Dictionary

A new user wants to join you on TimNet, and you need to add them to your dictionary.

Task 4: Implement the add_user function in setwork.py. This function should add a user with the given name to the dictionary with no interests. If there is already a user with that name, it should not modify the dictionary. Note that in Python, set() is the empty set (whereas {} is the empty dictionary).
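
The set() versus {} distinction is easy to trip over, so here is a short sketch of it, along with the "check before you assign" pattern that leaves an existing entry alone. This only shows built-in behavior on a made-up dictionary; it is not the stencil's code, and how you structure add_user is up to you.

```python
empty_set = set()   # an empty set
empty_dict = {}     # an empty dictionary, NOT an empty set

print(type(empty_set).__name__)   # set
print(type(empty_dict).__name__)  # dict

# Checking for a key before assigning keeps an existing entry untouched.
users = {"tim": {"logic"}}
if "tim" not in users:
    users["tim"] = set()      # skipped: "tim" is already a key
if "ashley" not in users:
    users["ashley"] = set()   # runs: "ashley" is new

print(users["tim"])     # {'logic'}
print(users["ashley"])  # set()
```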

Now, in order to advertise on TimNet effectively, we need to track what users are interested in.

Task 5: Implement the add_interest function in setwork.py. This function should add the provided interest for the given user. If the user does not exist, the function should add a user with that name and then add the provided interest.

A friend of yours liked a couple of the same posts as you. Now third-party advertisers on TimNet want to add all of your interests to your friend’s interests in hopes of extending their reach.

Task 6: Implement the copy_interests function in setwork.py. This function should add all of name_from’s interests to the interests of name_to. If a user named name_to does not exist, it should be created. If a user named name_from does not exist, pretend it’s a user with no interests (i.e., don’t modify name_to’s interests).
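
Two built-ins are worth knowing for this task. The sketch below shows them on made-up data (the names source, target, and missing are illustrative only): set.update adds every element of one set to another in place, and dict.get with a default is one way to treat a missing user as having no interests.

```python
source = {"reading", "cooking"}
target = {"cooking", "piano"}

# update() adds every element of `source` to `target` in place.
# Shared elements are not duplicated, and `source` is left unchanged.
target.update(source)

print(sorted(target))  # ['cooking', 'piano', 'reading']
print(sorted(source))  # ['cooking', 'reading']

# dict.get with a default returns the default when the key is missing,
# without adding the key to the dictionary.
users = {"tim": {"logic"}}
missing = users.get("nobody", set())

print(missing)            # set()
print("nobody" in users)  # False
```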

Question: What are some issues that could arise from copying all the interests from one user to another user? Should users have to give explicit consent when interests are added to their account, or are the implications minimal?

Part 6: Functions that Query the Dictionary

Before choosing to advertise on TimNet, third-party companies want to see if any users are interested in a given topic. Implement a function that will check if any user is interested in a given interest.

Task 7: Implement the interest_exists function in setwork.py. This function should return True if any user is interested in interest, and False otherwise.
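
One way to express "does any user have this interest" is Python's built-in any() over the dictionary's values. The sketch below tries it on made-up data; the variable names are illustrative, and a plain for loop that returns early works just as well.

```python
users = {
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
}

# any() is True as soon as one user's set contains the interest.
has_piano = any("piano" in interests for interests in users.values())
has_chess = any("chess" in interests for interests in users.values())

# With no users at all, any() over an empty sequence is False.
has_piano_when_empty = any("piano" in interests for interests in {}.values())

print(has_piano)             # True
print(has_chess)             # False
print(has_piano_when_empty)  # False
```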

Having seen that an interest does in fact exist among TimNet’s users, third-party companies now want to see which users share common interests. Implement a function to achieve this goal.

Task 8: Implement the interests_match function in setwork.py. This function should return a set of users (names) who share at least n interests with the user named name. If user name is not present, it should return the empty set. If n is greater than the number of interests for a specified name, an empty set should be returned.

Note that every user will share all their interests with themselves.
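
Counting shared interests comes down to set intersection. Here is a small sketch on made-up data (sam is a hypothetical user, not part of the handout's example) showing the & operator, and why a user always matches themselves whenever n is no larger than their number of interests.

```python
tim = {"reading", "cooking", "logic"}
sam = {"cooking", "logic", "piano"}  # hypothetical user for illustration

# & builds a new set containing only the elements found in both sets.
shared = tim & sam

print(sorted(shared))  # ['cooking', 'logic']
print(len(shared))     # 2

# A set intersected with itself is the whole set, so a user shares
# all of their interests with themselves.
print(len(tim & tim))  # 3
```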

Question: What are some positive and negative effects that could arise from third-party companies having access to user interest data and the interest_exists and interests_match functions? Think about the Facebook Cambridge Analytica article you read at the beginning of this assignment.

Oh no! You now have a user who wants all their tracked data removed from TimNet completely, and now you need to implement a remove user function.

Task 9: Implement the remove_user function in setwork.py. This function should remove the given name from the user dictionary. If said name does not exist, make no modifications to the data structure.
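
Removing a key that might not exist is a common source of KeyError. The sketch below shows one built-in way around it on made-up data: dict.pop with a default. Checking with in before using del is an equally reasonable approach.

```python
users = {"tim": {"logic"}, "ashley": {"piano"}}

removed = users.pop("ashley", None)  # key present: removes it, returns its value
absent = users.pop("nobody", None)   # key missing: returns None, no KeyError

print(removed)      # {'piano'}
print(absent)       # None
print(list(users))  # ['tim']
```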

Question: What is another function you could implement or edit to protect user privacy and rights on social media? There is no need to write code for this question.

Part 7: Further Exploration

Question: In this assignment, you were programming functions that would theoretically collect user data for targeted media. Given this context, to what extent do you believe programmers have a moral obligation to safeguard user rights? Conversely, should their primary allegiance lie with protecting their company’s interests?

Look more into your data folders — there are lots of interesting folders to look into! We recommend you look into the following folders: ads_information, ads_and_topics, comments, likes, login_and_account_creation, messages, personal_information, and recent_searches.

For Google Ads, you can go to your “Data and Privacy” center in your Google account settings and scroll down to “Third-party apps and services.”

Question: Reflect upon your findings below! What did and did not surprise you when exploring this vast data collection? To what extent do you believe companies should have the right to track user data on social media platforms, and what are your thoughts on users’ rights to full and complete privacy on these sites?

Extra content:

Part 8: Testing

Task 10: Write tests for each of the functions you implemented in setwork.py. You can find the test file in setwork_test.py. We encourage you to test using the data from your_topics.json, but this is not required.
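
As a rough idea of what a set of tests might look like, here is a sketch for add_user that covers a new name, an existing name, and an empty dictionary. The add_user defined here is only a stand-in so the example runs on its own, and its (users, name) signature is an assumption; in setwork_test.py you would test your own version from setwork.py and follow whatever testing conventions the course uses.

```python
def add_user(users, name):
    """Stand-in for illustration only; test your own setwork.py version."""
    if name not in users:
        users[name] = set()


def test_add_user_new_name():
    users = {"tim": {"logic"}}
    add_user(users, "ashley")
    assert users == {"tim": {"logic"}, "ashley": set()}


def test_add_user_existing_name_is_unchanged():
    users = {"tim": {"logic"}}
    add_user(users, "tim")
    assert users == {"tim": {"logic"}}


def test_add_user_on_empty_dictionary():
    users = {}
    add_user(users, "tim")
    assert users == {"tim": set()}


# Plain calls so the sketch also runs without a test runner.
test_add_user_new_name()
test_add_user_existing_name_is_unchanged()
test_add_user_on_empty_dictionary()
```

Each test builds its own small dictionary, so one test modifying the data cannot affect another.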

Part 9: README

In your README file, answer the questions marked throughout the assignment.

Submission

Please follow the design and clarity guide; part of your grade will be for code style and clarity. Additionally, you should be adhering to the course design recipe. After completing the homework, you will submit:

Please don’t put your name in your code files, as we grade anonymously. If you have any questions about the assignment, please post on Ed.

You can use a maximum of 3 late days per assignment. If the assignment is late (and you do NOT have any more late days), no credit will be given.