HW 2: The Social Setwork
Due: Tue, Sep 17 2024 at 9:00 PM EST
Released: Wed, Sep 11 2024
The goal of this assignment is to get some practice using the set data structure.
Setup
Download the following stencil files:
Items labelled as Question should be answered in the README file.
The Assignment
Part 1: Social Media Privacy and Tracking
In 2017, an article in The Economist reported that the world’s most valuable resource was no longer oil but data. One of the most efficient ways for technology and marketing companies to collect massive amounts of data is through social media. Social media sites such as Instagram, Facebook, Reddit, YouTube, Twitter, and more track your likes and comments, infer your interests, and even store your messages in order to deliver optimized marketing campaigns and custom ads.
2017 was before some of you were in high school. This is not a new situation.
In this assignment, you’re going to reflect on the power that you, as programmers, have in these privacy situations, as well as your thoughts on social media monitoring.
Note: You should not need to have a subscription to access these articles. You should be able to access these articles by logging in with your Brown Google Account.
Task 1: Read these two articles:
What You Don’t Know About How Facebook Uses Your Data
Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens
Part 2: Download Your Data
Task 2: Download your Instagram data by following the instructions below. If you do not have Instagram, we will provide a file that you can work with for this homework. Make sure you start this early, as getting the link to your data may take up to a couple of hours.
How to Download Your Instagram Data On Mobile (Recommended, Reduces the time it takes)
- Tap profile or your profile picture in the bottom right to go to your profile.
- Tap in the top right, then tap Your activity.
- Below Information you shared with Instagram, tap Download your information.
- Tap Request Download.
- Tap Select Types of information
- Choose the following:
  - Your topics
  - Information About You
  - Ads and Topics
  - Advertising
  - Comments
  - Likes
- Tap Next
- Change the format to JSON
- Submit request
- You'll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.
Official Instagram Mobile Download Instructions
How to Download Your Instagram Data On Desktop
- Click menu in the bottom left, then click Your Activity.
- Click Download your information.
- Enter the email address where you'd like to receive a link to your data.
- Click JSON as the format you'd like to receive your data in, then click Next.
- Enter your Instagram account password and click Request download.
- You'll soon receive an email titled Your Instagram Data with a link to your data. Click Download data and follow the instructions to finish downloading your information.
Official Instagram Desktop Download Instructions
See your Google Ad Assumptions
Make sure you use an .edu email
- Click the "Your Google Account" button in the top right corner and log in
- On the left-hand sidebar, click "Data and Privacy"
- Scroll down, and in the "Things you've done and places you've been" section, under "Personalized ads" click "My Ad Center"
Part 3: Implementing a Social Network
Tim’s recent talk on courses at Brown has been on the Top 10 talks list for weeks! With Tim’s Talks gaining fame, you’re assigned to create a fan networking site called TimNet. Due to its popularity, companies want to advertise their products on it!
The website’s data is organized as a dictionary of users, where the keys are usernames (strings) and the values are sets of interests (also strings). For example, the data might look like this:
{
"tim": {"reading", "cooking", "logic"},
"ashley": {"books", "piano", "animation"},
"ben": {"chocolate", "pretzels", "chocolate-covered pretzels"}
}
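As a minimal sketch of how this structure behaves in Python (the variable name users follows the handout; the specific lookups are only illustrative and are not part of the stencil):

```python
# Example data in the format described above:
# usernames (strings) map to sets of interests (strings).
users = {
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
    "ben": {"chocolate", "pretzels", "chocolate-covered pretzels"},
}

# "in" on a dictionary checks its keys (the usernames).
print("tim" in users)               # True

# "in" on a set checks a single user's interests.
print("piano" in users["ashley"])   # True

# Sets ignore duplicates, so re-adding an existing interest changes nothing.
users["ben"].add("chocolate")
print(len(users["ben"]))            # 3
```

Because sets are unordered, the order in which interests print may differ from the order in which they were written.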
As a programmer for this website, you’ve been asked to implement and test several functions that either query or modify the user data in this format in preparation for advertising.
Each function you’re implementing takes a users dictionary (formatted as above) as its first argument.
Part 4: Using Your Data
When implementing these functions, you are going to be using them on your very own Instagram data! If you don’t have Instagram, we will provide you with a file to work with.
For those with Instagram:
Go to the your_topics folder and find the your_topics.json file. Drag the file into the directory in which you will work. Do NOT edit any of the json package code or the name of the file. Next, write your favorite artist’s name here: users = load_json_to_dict(data, "Your Favorite Artists Name Here")
Run the setwork.py file and verify that a dictionary with your favorite artist’s name and your interests is printed in the terminal.
For those who are using the provided file:
If you don’t have Instagram, use the attached your_topics.json file. Drag the file into the directory in which you will work. We strongly recommend using your own data if you have Instagram.
Task 3: After making the changes, run the setwork.py file and verify that a dictionary with your favorite artist’s name and your interests is printed in the terminal.
Look at the dictionary with your interests being outputted in your terminal. Based on your activity, these sites have inferred that you are interested in these topics.
Question: Do your inferred topics accurately reflect your real interests?
If you don’t have Instagram, go to your Google “My Ad Center” linked at the top of this handout. Under “Manage Privacy”, look at the guesses Google made about you in “Your Google Account info” and “Categories used to show you ads”.
Question (only if you don’t have Instagram): Were Google’s guesses about you correct?
Question: Who do you think benefits most from interest monitoring–users, companies, both, neither?
Part 5: Functions that Modify the Dictionary
A new user wants to join you on TimNet, and you need to add them to your dictionary.
Task 4: Implement the add_user function in setwork.py. This function should add a user with the given name to the dictionary with no interests. If there is already a user with that name, it should not modify the dictionary. Note that in Python, set() is the empty set (whereas {} is the empty dictionary).
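The difference between set() and {} can be checked directly. The sketch below also shows dict.setdefault, one standard way to store a value only when a key is missing; the dictionary named demo is made up for illustration, and your add_user does not have to use this approach.

```python
empty_set = set()
empty_dict = {}

print(type(empty_set).__name__)    # set
print(type(empty_dict).__name__)   # dict

# Hypothetical data, only for this illustration.
demo = {"tim": {"logic"}}

# setdefault leaves an existing entry alone...
demo.setdefault("tim", set())
# ...and stores the default only when the key is missing.
demo.setdefault("ashley", set())

print(demo == {"tim": {"logic"}, "ashley": set()})   # True
```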
Now, in order to advertise on TimNet effectively, we need to track what users are interested in.
Task 5: Implement the add_interest function in setwork.py. This function should add the provided interest for the given user. If the user does not exist, the function should add a user with that name and then add the provided interest.
A friend of yours liked a couple of the same posts as you. Now third-party advertisers on TimNet want to add all your interests to your friend’s interests in hopes of extending their reach.
Task 6: Implement the copy_interests function in setwork.py. This function should add all of name_from’s interests to the interests of name_to. If a user named name_to does not exist, it should be created. If a user named name_from does not exist, pretend it’s a user with no interests (i.e., don’t modify name_to’s interests).
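Adding every element of one set to another is a set union. The sketch below shows the in-place form on two stand-alone sets, plus dict.get with a default for a missing key. The variable names are illustrative, and this is not the required implementation.

```python
source = {"reading", "logic"}
target = {"piano", "logic"}

# In-place union: target gains everything in source; source is unchanged.
# target.update(source) does the same thing.
target |= source
print(sorted(target))   # ['logic', 'piano', 'reading']
print(sorted(source))   # ['logic', 'reading']

# dict.get with a default treats a missing user as having no interests.
users = {"tim": {"reading"}}
missing = users.get("nobody", set())
print(missing)          # set()
```

Note that users.get does not add "nobody" to the dictionary; it only supplies a fallback value for the lookup.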
Question: What are some issues that could arise from copying all the interests from one user to another user? Should users have explicit consent when interests are added to their account, or are the implications minimal?
Part 6: Functions that Query the Dictionary
Before choosing to advertise on TimNet, third-party companies want to see if any users are interested in a given topic. Implement a function that checks whether any user has a given interest.
Task 7: Implement the interest_exists function in setwork.py. This function should return True if any user is interested in interest, and False otherwise.
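One possible shape for this kind of "does any user match" query uses Python's built-in any over the dictionary's values. The helper name anyone_likes is hypothetical and chosen only for this sketch; a plain for loop works just as well.

```python
def anyone_likes(users, interest):
    """Hypothetical helper: True if at least one user's set contains interest."""
    return any(interest in interests for interests in users.values())


users = {
    "tim": {"reading", "cooking", "logic"},
    "ashley": {"books", "piano", "animation"},
}

print(anyone_likes(users, "piano"))    # True
print(anyone_likes(users, "skiing"))   # False
print(anyone_likes({}, "piano"))       # False (no users at all)
```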
When seeing that an interest does in fact exist within TimNet’s users, third-party companies now want to see which users share common interests; implement a function to achieve this goal.
Task 8: Implement the interests_match function in setwork.py. This function should return a set of users (names) who share at least n interests with the user named name. If user name is not present, it should return the empty set. If n is greater than the number of interests for a specified name, an empty set should be returned.
Note that every user will share all their interests with themselves.
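The interests two users share are the intersection of their sets. A small sketch of that operation on two stand-alone sets (the names a and b are illustrative only):

```python
a = {"reading", "cooking", "logic"}
b = {"cooking", "logic", "piano"}

# Intersection keeps only the elements present in both sets.
shared = a & b
print(sorted(shared))         # ['cooking', 'logic']

# Counting the shared elements tells you whether a threshold n is met.
print(len(shared) >= 2)       # True
print(len(shared) >= 3)       # False

# A set intersected with itself is the whole set, which is why every
# user shares all of their interests with themselves.
print(a & a == a)             # True
```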
Question: What are some positive and negative effects that could arise from third-party companies having access to user interest data and the interest_exists and interests_match functions? Think about the Facebook Cambridge Analytica article you read at the beginning of this assignment.
Oh no! You now have a user who wants all their tracked data removed from TimNet completely, and now you need to implement a remove user function.
Task 9: Implement the remove_user function in setwork.py. This function should remove the given name from the users dictionary. If said name does not exist, make no modifications to the data structure.
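Deleting a key that may or may not be present is a common dictionary task. The sketch below uses dict.pop with a default so that a missing key does not raise KeyError; the data is made up, and checking with "in" before using del is an equally valid approach.

```python
users = {"tim": {"reading"}, "ben": {"pretzels"}}

# pop removes the key and returns its value.
removed = users.pop("ben", None)
print(removed)    # {'pretzels'}

# With a default supplied, popping a missing key is a no-op.
missing = users.pop("nobody", None)
print(missing)    # None

print(users)      # {'tim': {'reading'}}
```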
Question: What is another function you could implement or edit to protect user privacy and rights on social media? There is no need to write code for this question.
Part 7: Further Exploration
Question: In this assignment, you were programming functions that would theoretically collect user data for targeted media. Given this context, to what extent do you believe programmers have a moral obligation to safeguard user rights? Conversely, should their primary allegiance lie with protecting their company’s interests?
Look more into your data folders — there are lots of interesting folders to look into! We recommend you look into the following folders: ads_information, ads_and_topics, comments, likes, login_and_account_creation, messages, personal_information, and recent_searches.
For Google Ads, you can go to your “Data and Privacy” center in your Google account settings and scroll down to “Third-party apps and services.”
Question: Reflect upon your findings below! What did and did not surprise you when exploring this vast data collection? To what extent do you believe companies should have the right to track user data on social media platforms, and what are your thoughts on users’ rights to full and complete privacy on these sites?
Extra content:
- The State of Consumer Data Privacy Laws in the US (And Why It Matters)
- How to Protect Your Digital Privacy - The Privacy Project Guides - The New York Times
Part 8: Testing
Task 10: Write tests for each of the functions you implemented in setwork.py. You can find the test file in setwork_test.py. We encourage you to test using the data from your_topics.json, but this is not required.
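A sketch of what tests for one function might look like, assuming pytest-style test functions and assuming add_user takes the users dictionary and a name and modifies the dictionary in place. Both are assumptions, so check the stencil for the actual signature and testing setup. The add_user defined here is only a stand-in so the example runs on its own; in setwork_test.py you would import your own version instead.

```python
def add_user(users, name):
    """Stand-in for illustration only; test your own setwork.py version."""
    if name not in users:
        users[name] = set()


def test_add_user_new_name():
    # A new name is added with an empty set of interests.
    users = {"tim": {"logic"}}
    add_user(users, "ashley")
    assert users == {"tim": {"logic"}, "ashley": set()}


def test_add_user_existing_name():
    # An existing user keeps their interests untouched.
    users = {"tim": {"logic"}}
    add_user(users, "tim")
    assert users == {"tim": {"logic"}}
```

Building a small dictionary inside each test keeps the tests independent, since several of the functions modify the dictionary they are given.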
Part 9: README
In your README file, answer the questions marked throughout the assignment.
Submission
Please follow the design and clarity guide; part of your grade will be for code style and clarity. Additionally, you should adhere to the course design recipe. After completing the homework, you will submit:
README.txt
setwork.py
setwork_test.py
your_topics.json
Please don’t put your name in your code files, as we grade anonymously. If you have any questions about the assignment, please post on Ed.
You can use a maximum of 3 late days per assignment. If the assignment is late (and you do NOT have any more late days), no credit will be given.