Dan Froomkin reports today at the Intercept:
Most people don’t realize that the words they speak are not so private anymore, either.
Top-secret documents from the archive of former NSA contractor Edward Snowden show the National Security Agency can now automatically recognize the content within phone calls by creating rough transcripts and phonetic representations that can be easily searched and stored.
The documents show NSA analysts celebrating the development of what they called “Google for Voice” nearly a decade ago.
The Snowden documents describe extensive use of keyword searching as well as computer programs designed to analyze and “extract” the content of voice conversations, and even use sophisticated algorithms to flag conversations of interest.
By leveraging advances in automated speech recognition, the NSA has entered the era of bulk listening.
And this has happened with no apparent public oversight, hearings or legislative action. Congress hasn’t shown signs of even knowing that it’s going on.
Jennifer Granick, civil liberties director at the Stanford Center for Internet and Society, told The Intercept … “Once you have this capability, then the question is: How will it be deployed? Can you temporarily cache all American phone calls, transcribe all the phone calls, and do text searching of the content of the calls?” she said. “It may not be what they are doing right now, but they’ll be able to do it.”
And, she asked: “How would we ever know if they change the policy?”
Indeed, NSA officials have been secretive about their ability to convert speech to text, and how widely they use it, leaving open any number of possibilities.
That secrecy is the key, Granick said. “We don’t have any idea how many innocent people are being affected, or how many of those innocent people are also Americans.”
Voice communications can be collected by the NSA whether they are being sent by regular phone lines, over cellular networks, or through voice-over-internet services.
(Anyone who has tried dictating into their smart phone – or who uses modern dictation software – knows how incredibly accurate this kind of technology can be.)
The NSA refused to tell the Intercept how widely the speech-to-text programs are used on Americans.
But high-level NSA whistleblowers Bill Binney, Thomas Drake and Russell Tice have all told Washington’s Blog that NSA is recording the content – and not just the metadata – of Americans’ phone calls. And see this.
Indeed, any claim that content is stored only for a short period of time is moot, since transcripts of that content can easily be stored forever … without taking up much space.
Of course, the NSA voluntarily shares the raw data it collects on American citizens with Israel and likely also with Australia, Canada, New Zealand and the UK. And see this. So those countries might be doing voice-to-text transcription on Americans' conversations as well.
Update: After reviewing this story, NSA whistleblower Bill Binney told Washington’s Blog:
It sounds like they have achieved at least a working level of success at automatically translating speech. This means to me that they use this capability to do a rough scan of unlimited numbers of phone calls to sample what is being said. Add to that the ability to do digital recordings at the same time and keep for a short period (20-30 days) gives their analysts (as well as FBI/DEA/DOJ/IRS/DHS/CIA/etc.) a window of opportunity to select the recordings and have people do a final transcription for the record and files/storage. Now they need to do a similar thing for video. For Americans, this has major implications when applied to the NSA FAIRVIEW program with its 80 to 100 taps on the fiber lines inside the lower 48 states. FAIRVIEW would enable them to capture rough translations of most calls made in the US.
As Binney previously explained:
BILL BINNEY: Fairview is the program they use that produces most of the content and metadata on US citizens. Note the distribution of tap points in the lower 48 states.
WASHINGTON’S BLOG: The US 990-Fairview slide certainly shows NSA sucking up a lot of data from within America.
But the press characterizes Fairview as gathering info solely on foreigners. Sounds like this is false?
In other words, Fairview sucks up information – content and metadata – on Americans and foreigners, and then NSA simply retains and stores the info?
BILL BINNEY: If NSA was after only foreigners, then they would have collection points on the east and west coasts at points where the transoceanic cables surface. Anything other than that is collecting domestic communications – the PSTN phone network and the world wide web.
You could argue that Stormbrew [another “upstream” collection program … see below] is targeted at foreign by the distribution. But, some of that is also questionable. This does not count input from cooperative countries (second and third parties) on domestic activity collected by them as well. To the point, they have done nothing but lie to us.
In other words, the Stormbrew map is a pretty good proxy for what foreign surveillance locations should look like. But the Fairview map shows many more collection points all over the country … proving that the NSA is specifically collecting information on Americans living on U.S. soil.