PyPI is Reducing Stored IP Address Data

Sunday, May 28, 2023, 16:34, by Slashdot
The PyPI registry of open source Python packages 'began evaluating ways to reduce the amount of identifying information that it stores,' reports the Register, 'even before the U.S. Justice Department came asking for data on suspect users.'

But now, 'the Python community package registry wants developers to understand that it's working to minimize the user data that it stores.'
The goal is not to be unable to respond to lawful requests for information; rather it's to store only the minimum amount of data necessary so as not to expose users to unnecessary privacy intrusion. Coincidentally, data minimization may prevent organizations from becoming a preferred source of on-demand surveillance: having excessive amounts of information about users invites legal demands, which staff then have to handle...

Mike Fiedler, a member of the PyPI admin team, said in a statement on Friday that the organization's effort to improve user privacy and security dates back to 2020. Since the receipt of the subpoenas in March and April, that effort has been reinvigorated.

Much of the concern focuses on IP address data, which gets stored in conjunction with web log access; user events such as logins; project events including uploads; events associated with recently introduced organizations; and administrative PyPI journal entries. According to Fiedler, PyPI was able to stop storing IP data for journal entries (an append-only transaction log) because these were only exposed to administrators...

To obscure IP addresses, PyPI is salting them (adding an arbitrary value) and then hashing them (running the data through a one-way scrambling function that creates a value called a hash). This provides a way to store a reference to potentially identifying data without actually storing raw data...

PyPI has been using its CDN provider Fastly to pass along a salted hash of the IP address for requests via a custom header, along with broad GeoIP data (the country and city where the user is located), and is using that instead of the raw IP address. In April, the registry adopted code changes for hashing and salting IP addresses for requests that PyPI handles directly in Warehouse, the web application that implements the official Python package index.
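
As a rough illustration of the salting and hashing described above, here is a minimal Python sketch. It is not Warehouse's actual code: the function name `hash_ip`, the choice of SHA-256, the normalization step, and the example salt are all assumptions made for illustration.

```python
import hashlib
import ipaddress


def hash_ip(ip: str, salt: bytes) -> str:
    """Return a salted SHA-256 hex digest of an IP address.

    Hypothetical helper for illustration only; the hash function and
    the way the salt is combined are assumptions, not PyPI's method.
    """
    # Normalize first so that equivalent spellings of the same address
    # (for example, expanded and compressed IPv6 forms) hash identically.
    normalized = ipaddress.ip_address(ip).compressed
    # Prepend the salt (the "arbitrary value"), then apply the one-way hash.
    return hashlib.sha256(salt + normalized.encode("ascii")).hexdigest()


# Example salt for demonstration only; a real deployment would load a
# secret value from configuration rather than hard-coding it.
EXAMPLE_SALT = b"example-salt-not-for-production"

stored_reference = hash_ip("192.0.2.1", EXAMPLE_SALT)
```

The stored value still lets two events from the same address be correlated, since the same input and salt always produce the same digest, but the raw address is not kept. Because the IPv4 address space is small enough to enumerate, a scheme like this only obscures addresses for as long as the salt stays secret.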

And over the past few days, it has been replacing IP addresses in the PyPI user interface with geolocation data. PyPI still relies on IP address information to identify abuse (the creation of malicious packages, harassment, and so on), but Fiedler says even that is being looked at. 'We're thinking about how to manage that without storing IP data, but we're not there yet,' he said. Fiedler says the PyPI team will be weighing whether it can remove IP data from event history records after a period of time and whether the service can handle all its requests via CDN.

Read more of this story at Slashdot.
https://developers.slashdot.org/story/23/05/27/2122238/pypi-is-reducing-stored-ip-address-data?utm_s...
News copyright owned by their original publishers | Copyright © 2004 - 2024 Zicos / 440Network