When news of donk_enby's archival efforts broke, several viral tweets, Reddit posts, and Facebook posts claimed that she had captured private information, scans of driver's licenses and IDs, and other highly sensitive data. She said those posts are “not at all” accurate.
“Everything we grabbed was publicly available on the web, we just made a permanent public snapshot of it,” donk_enby told me.
Nevertheless, with the FBI, state and local law enforcement, and open-source investigators looking for media from Wednesday's attack, the archive could be highly useful to a whole host of people.
...
When word of donk_enby’s project broke online, competing theories circulated about what information had actually been pulled. What donk_enby actually did was an old-school scrape of already publicly available information. Using a jailbroken iPad and Ghidra, a piece of reverse-engineering software designed and publicly released by the National Security Agency, donk_enby managed to exploit weaknesses in the website’s design to pull the URLs of every single public post on Parler in sequential order, from the very first to the very last, allowing her to then capture and archive the contents.
...
donk_enby had originally intended to grab data only from the day of the Capitol takeover, but found that the poor construction and security of Parler allowed her to capture, essentially, the entire website. That ended up being 56.7 terabytes of data covering every public post on Parler: 412 million files in all, including 150 million photos and more than 1 million videos. Each of these had embedded metadata such as date, time and GPS coordinates. Unlike most social media sites, Parler does not strip metadata from the media its users upload, which, crucially, could be useful for law enforcement and open-source investigators.